1. 2018-04-17 - Dashboard with id x not found; Tags: Dashboard with id x not found

    Dashboard with id x not found

    X-Pack Reporting allows you to automate and generate daily reports on pre-existing dashboards or visualizations in Kibana. To keep security tight I created a dedicated reporting user. The first run with the reporting user gave me a mystery: Reporting complained Dashboard with id 'AWLOnWVZLaWygeBEGxLJ' not found. I did some digging and found the reason, which I will elaborate on in this post.

    Use-Case Scenario

    My watch watches for failed watches. Watcher itself is powerful, but IMHO not very suitable for business users: you need at least average, ideally deep, knowledge of Elasticsearch to get it right. We have a couple of watches which fail from time to time, and as Elasticsearch admin I have to keep an eye on that as part of my automation.

    The watch:

    GET /_xpack/watcher/watch/failed_watches
    
    {
    "watch": {
        "trigger": {
          "schedule": {
            "daily": {
              "at": [
                "01:30"
              ]
            }
          }
        },
        "input": {
          "none": {}
        },
        "condition": {
          "always": {}
        },
        "actions": {
          "email_developers": {
            "email": {
              "profile": "standard",
              "attachments": {
                "count_report.pdf": {
                  "reporting": {
                    "url": "https://cinhtau.net/api/reporting/generate/dashboard/AWLOnWVZLaWygeBEGxLJ?_g=(time:(from:now-1d%2Fd,mode:quick,to:now))",
                    "auth": {
                      "basic": {
                        "username": "reporting_wotscher",
                        "password": "guess-what"
                      }
                    }
                  }
                }
              },
              "to": [
                "le-mapper@cinhtau.net"
              ],
              "subject": "Failed Watches"
            }
          }
        }
      }
    }
    

    Kibana Object

    The first thought was: ok, let's check whether a dashboard with the respective id exists. Search your Kibana index; the default is .kibana.

    POST /.kibana/_search
    {
      "query": { "ids": { "values": ["AWLOnWVZLaWygeBEGxLJ" ] } }
    }
    

    If you get a similar output, the dashboard exists.

    {
      "took": 2,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
          {
            "_index": ".kibana",
            "_type": "dashboard",
            "_id": "AWLOnWVZLaWygeBEGxLJ",
            "_score": 1,
            "_source": {
              "title": "Report Failed Watches",
              "hits": 0,
              "description": "",
              "panelsJSON": """[{"size_x":3,"size_y":3,"panelIndex":1,"type":"visualization","id":"Watcher-Duration","col":4,"row":1},{"size_x":6,"size_y":3,"panelIndex":2,"type":"visualization","id":"fa7e4420-4080-11e7-ab57-7554a52ae433","col":7,"row":1},{"size_x":3,"size_y":3,"panelIndex":3,"type":"visualization","id":"Watches-Done","col":1,"row":1}]""",
              "optionsJSON": """{"darkTheme":false}""",
              "uiStateJSON": """{"P-1":{"vis":{"defaultColors":{"0 - 100":"rgb(0,104,55)"}}},"P-2":{"vis":{"params":{"sort":{"columnIndex":null,"direction":null}}}},"P-3":{"vis":{"defaultColors":{"0 - 100":"rgb(0,104,55)"}}}}""",
              "version": 1,
              "timeRestore": false,
              "kibanaSavedObjectMeta": {
                "searchSourceJSON": """{"id":"AWLOeppmLaWygeBEGxLI","filter":[{"meta":{"index":".watcher-*","type":"phrase","key":"state","value":"failed","disabled":false,"negate":false,"alias":null},"query":{"match":{"state":{"query":"failed","type":"phrase"}}},"$state":{"store":"appState"}},{"query":{"match_all":{}}}],"highlightAll":true,"version":true}"""
              }
            }
          }
        ]
      }
    }
    

    X-Pack Security

    The next checkpoint is to look at security, i.e. the granted permissions. The official docs use the superuser elastic, which is not recommended.

    The reporting user must have the following roles:

    • reporting_user in order to execute report generation
    • watcher_user in order to read the watch data
    • kibana_user in order to access the Kibana objects

    A quick check with Kibana Console:

    GET /_xpack/security/user/reporting_wotscher
    
    {
      "reporting_wotscher": {
        "username": "reporting_wotscher",
        "roles": [
          "monitoring_user",
          "reporting_user",
          "watcher_user"
        ],
        "full_name": "Elastic Wotscher",
        "email": "le-mapper@cinhtau.net",
        "metadata": {},
        "enabled": true
      }
    }
    

    kibana_user is missing. Without this role the user cannot read the .kibana index and thus cannot find the dashboard object with the given id.

    Add the missing role with the Kibana Console:

    PUT /_xpack/security/user/reporting_wotscher
    {
        "username": "reporting_wotscher",
        "password: "guess-it",
        "roles": [
          "monitoring_user",
          "reporting_user",
          "watcher_user",
          "kibana_user"
        ],
        "full_name": "Elastic Wotscher",
        "email": "le-mapper@cinhtau.net",
        "metadata": {},
        "enabled": true  
    }
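
    To verify that the change took effect, you can look up the user again, this time via curl from the shell. This is a sketch; the hostname and the authenticating admin user are assumptions, adjust them to your environment:

    curl -u elastic -XGET "https://localhost:9200/_xpack/security/user/reporting_wotscher?pretty"

    The roles array in the response should now contain kibana_user, and the next report run can resolve the dashboard id.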
    



  2. 2018-04-03 - HTTP Input for Elasticsearch Watcher; Tags: HTTP Input for Elasticsearch Watcher

    HTTP Input for Elasticsearch Watcher

    Elasticsearch X-Pack Alerting, aka Watcher, offers the capability to alert on specific events or constellations in the Elasticsearch data. Watcher can retrieve data from the cluster where it runs (on the master node), or fetch data from RESTful web services via the http input. If you run a production cluster, you should preferably ship the monitoring data to a dedicated Elasticsearch monitoring cluster. This monitoring cluster can also run watches. The watch I'm going to introduce is the cluster health watch.

    Purpose

    The health of your production cluster is of utmost importance. Elasticsearch provides the _cluster/health endpoint, which returns one of three states: green, yellow or red. Depending on severity, you might consider everything that is not green an alert. Having a dedicated monitoring cluster allows you to run watches, provided you have a license subscription. Knowing when your cluster is in trouble gives you the necessary time to act accordingly, or at least to react faster. Watcher can give you that time.

    The cluster health check does not have to be performed by Elasticsearch Watcher; a Jenkins job or a cron job is also a viable option. Watcher, however, is an integrated all-in-one solution in the Elastic Stack: it documents every watch execution and lets you compare or scroll through the history. Regardless of how you do it, it must be done.

    Watch Definition

    Find below the watch definition with example data. Adjust it to your needs.

    The watch definition:

    • input
      • requests the input from the production cluster over the http endpoint
      • uses basic auth, which is optional if the endpoint is not protected by X-Pack Security or Nginx
    • condition
      • everything that is not green triggers the action
      • you might invert the condition by asking for a health state equal to red
    • action
      • sends an email to the Elasticsearch administrator, or an SMS through a mail gateway

    {
      "trigger": {
        "schedule": {
          "interval": "5m"
        }
      },
      "input": {
        "http": {
          "request": {
            "scheme": "http",
            "host": "elasticsearch",
            "port": 9200,
            "method": "get",
            "path": "/_cluster/health",
            "params": {},
            "headers": {},
            "auth": {
              "basic": {
                "username": "healthcheck",
                "password": "check_it_out"
              }
            }
          }
        }
      },
      "condition": {
        "compare": {
          "ctx.payload.status": {
            "not_eq": "green"
          }
        }
      },
      "actions": {
        "notify_admins": {
          "email": {
            "profile": "standard",
            "from": "watcher@cinhtau.net",
            "reply_to": [
              "le_mapper@cinhtau.net"
            ],
            "to": [
              "le_mapper@cinhtau.net"       
            ],
            "subject": "Status  detected for Production Cluster.",
            "body": {
              "html": "Please check cluster! This watch is deactivated during maintenance!"
            }
          }
        }
      }
    }
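
    To actually run this, the definition has to be registered with the Watcher API. A minimal sketch with curl, assuming the JSON above is stored in cluster-health-watch.json and cluster_health is the watch id of your choice (hostname and credentials are placeholders):

    # store the watch under the chosen id
    curl -u elastic -H 'Content-Type: application/json' \
      -XPUT "http://monitoring-cluster:9200/_xpack/watcher/watch/cluster_health" \
      -d @cluster-health-watch.json

    # execute it once manually to verify input, condition and action wiring
    curl -u elastic -XPOST "http://monitoring-cluster:9200/_xpack/watcher/watch/cluster_health/_execute"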
    

    Summary

    • The provided example can also be applied to any other web service.
    • You might also consider a Statuspage if you want to provide the information to your customers/users.
    • The http input allows you to query data from other endpoints.
    • The response must be JSON or YAML for Watcher to process it.



  3. 2018-03-27 - Shard Allocation in an Elasticsearch Cluster; Tags: Shard Allocation in an Elasticsearch Cluster

    Shard Allocation in an Elasticsearch Cluster

    Shards are parts of an Apache Lucene index, the storage unit of Elasticsearch. An index may consist of more than one shard. Elasticsearch distributes the storage across its nodes. In the regular case each primary shard has a replica, and primary and replica are never stored on the same node. If a node fails, the replica takes over as primary and Elasticsearch tries to allocate a new replica shard on the remaining cluster nodes. Cluster shard allocation is a pretty decent mechanism to ensure high availability. This post gives some insights and recipes on how to deal with cluster shard allocation in a hot-warm architecture.

    Index Details

    Since this is about an index, use the cat indices API to check how many primaries and replicas an index has.

    >GET _cat/indices/metricbeat-6.2.2-2018.03.27?v

    The output as plain text with headers.

    health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   metricbeat-6.2.2-2018.03.27 H_sYkWaiRmS2H4dJesu9Pg   1   1    2024496            0      5.2gb          2.6gb
    

    The above example shows one primary and one replica for the index metricbeat-6.2.2-2018.03.27.

    Cluster Scenario

    Cluster shard allocation is a pretty cool feature of Elasticsearch. If you have a hot-warm architecture you can, for example, prioritize data regarding IO. Hot and warm are just semantic labels describing Elasticsearch nodes with respect to IO performance: hot is super fast (SSD or SAN), warm is slower (magnetic drives). My cluster setup regarding data nodes:

    • 5 hot nodes
    • 2 warm nodes
    • 1 volatile node

    The volatile node is for insignificant data, like Machine Learning (currently in evaluation :smile:). You can define custom attributes in elasticsearch.yml, for example:

    node:
      master: false
      data:   true
      ingest: true
      ml:     false
      attr:
        rack: "with-nas"
        box_type: "hot"
    

    and use box_type as discriminator for data nodes in a hot-warm architecture.
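
    To double-check which attributes your nodes actually expose, the cat nodeattrs API is handy. A quick curl sketch (hostname and credentials are placeholders):

    curl -u elastic -s "http://localhost:9200/_cat/nodeattrs?v"

    Every data node should list its box_type value here; otherwise the allocation filters below will silently not match.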

    Analyze Index Settings

    If you want to check whether an allocation filter exists on the index:

    GET metricbeat-6.2.2-2018.03.27/_settings
    

    The output

    {
      "metricbeat-6.2.2-2018.03.27": {
        "settings": {
          "index": {
            "routing": {
              "allocation": {
                "include": {
                  "box_type": "volatile,hot"
                }
              }
            },
            "mapping": {
              "total_fields": {
                "limit": "10000"
              }
            },
            "refresh_interval": "30s",
            "number_of_shards": "1",
            "provided_name": "metricbeat-6.2.2-2018.03.27",
            "creation_date": "1522101603804",
            "unassigned": {
              "node_left": {
                "delayed_timeout": "30m"
              }
            },
            "number_of_replicas": "1",
            "uuid": "H_sYkWaiRmS2H4dJesu9Pg",
            "version": {
              "created": "5060899"
            }
          }
        }
      }
    }
    

    "box_type": "volatile,hot" on the include part means that the index can reside on a volatile or hot node. The combination is a OR conjunction.

    Explain API

    Elasticsearch offers a powerful explain API, to check that:

    GET /_cluster/allocation/explain
    {
      "index": "metricbeat-6.2.2-2018.03.27",
      "shard": 0,
      "primary": true
    }
    

    Basically, the shortened output tells you about the decisions the cluster (i.e. the master node) has made. In node_allocation_decisions you find all the details.

    {
      "index": "metricbeat-6.2.2-2018.03.27",
      "shard": 0,
      "primary": true,
      "current_state": "started",
      "current_node": {
        "name": "machine-learning-master",
        "attributes": {
          "box_type": "hot"
        },
        "weight_ranking": 2
      },
      "can_remain_on_current_node": "yes",
      "can_rebalance_cluster": "throttled",
      "can_rebalance_cluster_decisions": [],
      "can_rebalance_to_other_node": "throttled",
      "rebalance_explanation": "rebalancing is throttled",
      "node_allocation_decisions": [
        {
          "node_name": "machine-learning-slave",
          "node_attributes": {
            "box_type": "volatile"
          },
          "node_decision": "throttled",
          "weight_ranking": 1,
          "deciders": [
            {
              "decider": "throttling",
              "decision": "THROTTLE",
              "explanation": "reached the limit of incoming shard recoveries [2], cluster setting [cluster.routing.allocation.node_concurrent_incoming_recoveries=2] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
            }
          ]
        },
        {
          "node_name": "hot-node",
          "node_attributes": {
            "box_type": "hot"
          },
          "node_decision": "no",
          "weight_ranking": 6,
          "deciders": [
            {
              "decider": "same_shard",
              "decision": "NO",
              "explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[metricbeat-6.2.2-2018.03.27][0], node[p87j8OP3R4GzTrgau_chsw], [R], s[STARTED], a[id=NeGf1la8TC-czqIMLspnXg]]"
            }
          ]
        },
        {
          "node_name": "warm-node",
          "node_attributes": {
            "box_type": "warm"
          },
          "node_decision": "no",
          "weight_ranking": 7,
          "deciders": [
            {
              "decider": "filter",
              "decision": "NO",
              "explanation": "node does not match index setting [index.routing.allocation.include] filters [box_type:\"volatile OR hot\"]"
            }
          ]
        }
        //..
      ]
    }
    

    Change Shard Allocation

    If you want a different allocation, the following template lists all available options:

    PUT metricbeat-6.2.2-2018.03.27/_settings
    {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "box_type": "hot"
            },
            "exclude": {
              "box_type": "warm"
            },
            "require": {
              "box_type": ""
            }
          }
        }
      }
    }
    

    A typical use case for custom routing is performance. For instance: you have to investigate two past days/indices residing on the warm nodes; you can allocate them to the hot nodes for the duration of the investigation and let Elasticsearch Curator (the housekeeping tool) move them back automatically afterwards.

    Include

    The above example would allocate the index on hot nodes only. If you add multiple values like volatile,hot, Elasticsearch will also use the volatile node.

    Exclude

    In this example the warm nodes are excluded. To unset the filter use "". As tested in v5.6.8, null does not work.

    Require

    If you choose require instead of include, multiple values are AND joined. If you use

    PUT metricbeat-6.2.2-2018.03.27/_settings
    {
      "index.routing.allocation.require.box_type": "volatile,hot"
    }
    

    a node must have both attributes. In my above scenario, that won’t work. To unset it:

    PUT metricbeat-6.2.2-2018.03.27/_settings
    {
      "index.routing.allocation.require.box_type": ""
    }
    

    Cluster Reroute

    If you want to intervene in the master node's decisions, the reroute command allows you to explicitly execute cluster reroute allocation commands. The example below is a simple replica allocation to a specific data node. Elasticsearch lets you use node names here; node ids are unique, but hard to remember :wink:.

    POST /_cluster/reroute
    {
      "commands": [
        {
          "allocate_replica": {
            "index": "metricbeat-6.2.2-2018.03.27",
            "shard": 0,
            "node": "hot-node-3"
          }
        }
      ]
    }
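
    If you are unsure about the effect of a command, the reroute API also accepts a dry_run flag, which calculates the resulting cluster state without applying it. A sketch with curl (hostname and credentials are placeholders):

    curl -u elastic -H 'Content-Type: application/json' \
      -XPOST "http://localhost:9200/_cluster/reroute?dry_run=true" -d '
    {
      "commands": [
        {
          "allocate_replica": {
            "index": "metricbeat-6.2.2-2018.03.27",
            "shard": 0,
            "node": "hot-node-3"
          }
        }
      ]
    }'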
    

    Summary

    • Elasticsearch does a wonderful job keeping your indices distributed among data nodes.
    • If you need to check the cluster routing, examine the index settings.
    • Specific cluster filtering with include, exclude and require allows fine-grained control over the distribution among data nodes.
    • The Cluster Allocation Explain API gives you detailed information about the allocation decisions.
    • You can adjust or intervene in the allocation with the Cluster Reroute commands.




  4. 2018-03-19 - Elasticsearch Certificates; Tags: Elasticsearch Certificates

    Elasticsearch Certificates

    Since version 6, X-Pack Security for Elasticsearch requires node-to-node encryption to secure the Elasticsearch cluster. The main reason is that no unknown node can join the cluster and receive data through shard allocation. Over the course of v6.0, v6.1 and v6.2 the tool certgen became deprecated and was replaced by certutil. My use case scenario: I had created certificates with certgen for my cluster and needed to generate a new certificate for a new data node.

    Baseline

    I have in total three clusters. yosemite is my monitoring cluster.

    tan@omega:/opt/elasticsearch-6.0.0> ls -l *.yml
    -rw-r--r-- 1 elastic elastic  1152 Dec  1 12:41 prod-instances.yml
    -rw-r--r-- 1 elastic elastic   604 Dec  1 12:54 test-instances.yml
    -rw-r--r-- 1 elastic elastic   399 Nov 29 13:49 yosemite-instances.yml
    

    The YAML definition is just an input for the certificate generation.

    tan@omega:/opt/elasticsearch-6.0.0> cat yosemite-instances.yml
    instances:
      - name: "Taft Point"
        ip: "10.22.62.137"
        dns:
          - "taft-point"
          - "taft-point.cinhtau.net"
      - name: "Setinal Rock"
        ip: "10.22.63.221"
        dns:
          - "sentinal-rock"
          - "sentinal-rock.cinhtau.net"
      - name: "El Capitan"
        ip: "10.123.19.11"
        dns:
          - "el-capitan"
          - "el-capitan.cinhtau.net"
    

    certutil

    certutil basic help.

    tan@omega:/opt/elasticsearch-6.2.2> bin/x-pack/certutil --help
    Simplifies certificate creation for use with the Elastic Stack
    
    Commands
    --------
    csr - generate certificate signing requests
    cert - generate X.509 certificates and keys
    ca - generate a new local certificate authority
    
    Non-option arguments:
    command
    
    Option         Description
    ------         -----------
    -h, --help     show help
    -s, --silent   show minimal output
    -v, --verbose  show verbose output
    

    For generating a certificate:

    tan@omega:/opt/elasticsearch-6.2.2> bin/x-pack/certutil cert --help
    generate X.509 certificates and keys
    
    Option               Description
    ------               -----------
    -E <KeyValuePair>    Configure a setting
    --ca                 path to an existing ca key pair (in PKCS#12 format)
    --ca-cert            path to an existing ca certificate
    --ca-dn              distinguished name to use for the generated ca. defaults
                           to CN=Elastic Certificate Tool Autogenerated CA
    --ca-key             path to an existing ca private key
    --ca-pass            password for an existing ca private key or the generated
                           ca private key
    --days <Integer>     number of days that the generated certificates are valid
    --dns                comma separated DNS names
    -h, --help           show help
    --in                 file containing details of the instances in yaml format
    --ip                 comma separated IP addresses
    --keep-ca-key        retain the CA private key for future use
    --keysize <Integer>  size in bits of RSA keys
    --multiple           generate files for multiple instances
    --name               name of the generated certificate
    --out                path to the output file that should be produced
    --pass               password for generated private keys
    --pem                output certificates and keys in PEM format instead of
                           PKCS#12
    -s, --silent         show minimal output
    -v, --verbose        show verbose output
    

    To generate a new certificate, I assemble this command:

    bin/x-pack/certutil cert \
      --ca-cert /tmp/ca.crt --ca-key /tmp/ca.key \
      --name "machine-learning-master" \
      --ip "10.22.61.131" \
      --dns "ml-master,ml-master.cinhtau.net" \
      --pem -v
    

    Some notes:

    • ca.crt and ca.key are the pre-existing root certificate authority
    • instead of the PKCS#12 format, --pem produces PEM output matching the previously generated files
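
    To sanity-check the result, you can unpack the generated bundle and inspect the certificate with openssl. A sketch, assuming the command above was extended with --out certs.zip; the file name and the per-instance folder layout inside the zip are assumptions:

    unzip -o certs.zip -d certs
    # verify subject alternative names (IP and DNS entries) and validity period
    openssl x509 -in certs/machine-learning-master/machine-learning-master.crt -noout -text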



  5. 2018-03-15 - Using Sidecar Container for Elasticsearch Configuration; Tags: Using Sidecar Container for Elasticsearch Configuration

    Using Sidecar Container for Elasticsearch Configuration

    Applications shipped in Docker containers are a major game changer, especially when running an Elasticsearch cluster. My production cluster consists of 11 nodes. At the core, Elasticsearch is the same everywhere; each node, though, has its specific configuration, settings and purpose. On top of that, Elasticsearch X-Pack Security in version 6 requires that the communication within the cluster runs encrypted. This is accomplished with SSL certificates: each node has its own private key and certificate. So I was facing the problem of how to ship the node-specific parts along with the core Elasticsearch container. Use the core container as baseline and copy the configuration and certificate into the container? This would result in 11 node-specific images, which is not in the spirit of reusability. :thinking: The better answer came from remembering the tech talk Docker Patterns by Roland Huss, given at the Java conference Javaland 2016. Use a configuration container as a sidecar!

    Concept

    The basic question behind the pattern is: how do you configure containerized applications for different environments?

    Solution Patterns:

    • Env-Var Configuration
    • Configuration Container
    • Configuration Service

    ENV-VAR Configuration

    The bullet points from Docker Patterns:

    • Standard configuration method for Docker container
    • Specified during build or run time
    • Universal

    We can define environment variables and configure the dockerized application to use them. Environment variables can be overridden by passing them at run time.

    Elasticsearch already does that, for example:

    >sudo docker run -it \ -v /etc/timezone:/etc/timezone -v /etc/localtime:/etc/localtime \ -v /srv/nas/elasticsearch/config/test:/opt/elasticsearch/config \ -v /var/opt/elasticsearch:/var/opt/elasticsearch \ -v /var/log/elasticsearch:/var/log/elasticsearch \ -v /srv/nas/elasticsearch/backup:/srv/nas/elasticsearch/backup \ --name="elasticsearch" \ --cap-add=IPC_LOCK --ulimit memlock=-1:-1 --ulimit nofile=65536:65536 \ -e TZ=Europe/Zurich \ -e ES_JAVA_OPTS="-Xms8g -Xmx8g" \ elasticsearch:latest bin/elasticsearch \ -E network.host=node-1 \ -E network.publish_host=node-1 \ -E node.name=alpha \ -E path.conf=/opt/elasticsearch/config

    We pass several environment settings in the above example:

    • For instance, we set Docker environment variables like the timezone (TZ).
    • The -E node.name=alpha is an Elasticsearch argument.

    The Elasticsearch node configuration itself was identical; only the node-specific information was provided per node.

    I used this approach in the past. It worked until node certificates became a requirement, because it does not work for certificate files. Let's take a look at the next approach.

    Configuration container

    The bullet points from Docker Patterns:

    • Configuration in extra Docker container
    • Volume linked during runtime

    Pros

    • Flexible
    • Explicit

    Cons

    • Static
    • Maintenance overhead
    • Custom file layout

    The basic idea is to run the 11 Elasticsearch containers and just attach or link them to the configuration container (symbolically, the sidecar).

    Configuration Service

    The third approach is using some kind of central configuration registry or service. Consul, etcd or Apache Zookeeper are viable solutions, but not applicable in my Elasticsearch scenario.

    So Sidecar it is!

    Sidecar

    This pattern is named Sidecar because it resembles a sidecar attached to a motorcycle. In the pattern, the sidecar is attached to a parent application and provides supporting features for the application. The sidecar also shares the same lifecycle as the parent application, being created and retired alongside the parent. The sidecar pattern is sometimes referred to as the sidekick pattern and is a decomposition pattern.

    Motorcycle with Sidecar

    Creating the Sidecar Container

    I will demonstrate how I applied the sidecar pattern for my elasticsearch test-cluster of 3 nodes.

    First my project layout:

    $ ll
    total 78
    -rw-r--r-- 1 tan 1049089   463 Mar  1 12:50 Dockerfile
    drwxr-xr-x 1 tan 1049089     0 Feb 28 18:01 node-1/
    drwxr-xr-x 1 tan 1049089     0 Feb 28 18:00 node-2/
    drwxr-xr-x 1 tan 1049089     0 Feb 28 18:00 node-3/
    -rw-r--r-- 1 tan 1049089  1513 Feb 28 17:28 Jenkinsfile
    

    Each node has the following node-specific configuration. Mandatory for Elasticsearch are elasticsearch.yml and the certs folder for X-Pack Security.

    $ ls -lR node-1/
    node-1/:
    total 20
    drwxr-xr-x 1 tan 1049089    0 Mar  1 12:57 certs/
    -rw-r--r-- 1 tan 1049089 1700 Feb 28 17:12 elasticsearch.yml
    -rw-r--r-- 1 tan 1049089 1920 Feb 28 17:02 jvm.options
    -rw-r--r-- 1 tan 1049089 4459 Feb 28 16:53 log4j2.properties
    drwxr-xr-x 1 tan 1049089    0 Feb 28 18:02 scripts/
    
    node-1/certs:
    total 16
    -rw-r--r-- 1 tan 1049089 1314 Feb 28 16:53 ca.crt
    -rw-r--r-- 1 tan 1049089 2985 Feb 28 16:53 ElasticCloudCaChain.pem
    -rw-r--r-- 1 tan 1049089 1346 Feb 28 17:10 node-1.crt
    -rw-r--r-- 1 tan 1049089 1679 Feb 28 17:10 node-1.key
    
    node-1/scripts:
    total 1
    -rw-r--r-- 1 tan 1049089 60 Feb 28 18:02 README.md
    

    Dockerfile

    The Dockerfile to build the sidecar container.

    FROM alpine
    
    MAINTAINER Vinh Nguyen <le-mapper@cinhtau.net>
    LABEL es.version="5.6.8"
    
    ENV http_proxy=${http_proxy:-http://localhost:3128} \
        https_proxy=${https_proxy:-https://localhost:3128}
    
    RUN apk --no-cache add shadow && \
        adduser -D -u 1000 elasticsearch
    
    COPY . /config/
    
    RUN chown -R elasticsearch:elasticsearch /config
    
    VOLUME /config
    

    The COPY command copies the configuration into the container's /config/ directory. Pay attention to excluding unwanted files via the .dockerignore file.
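
    Building and tagging the sidecar image is then a plain docker build. A sketch, assuming it is run from the project root and that elasticsearch-config-test is the image name used later in the deployment step:

    docker build -t elasticsearch-config-test .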

    Check contents

    You can inspect the sidecar container by executing the list directory command ls. After the execution, the temporary container is automatically removed thanks to --rm.

    docker run -it --volumes-from es_config --rm alpine /bin/sh -c "ls -lR /config"
    

    Deployment

    On the Docker host we don't need to run the sidecar container. Just creating the container makes the Docker volume available.

    docker pull elasticsearch-config-test && \
    docker create --name es_config elasticsearch-config-test /bin/true
    

    Usage

    Using Elasticsearch with the configuration: to use the configuration volume from our sidecar container, add the option --volumes-from es_config, where es_config is the name of the sidecar container.

    sudo docker run -d \
     --volumes-from es_config \
     -v /etc/timezone:/etc/timezone -v /etc/localtime:/etc/localtime \ 
     -v /var/opt/elasticsearch:/var/opt/elasticsearch  \
     -v /var/log/elasticsearch:/var/log/elasticsearch  \
     -v /srv/nas/elasticsearch/backup:/srv/nas/elasticsearch/backup \ 
     -v /srv/nas/elasticsearch/security:/config/node-1/x-pack \
     -v /srv/nas/elasticsearch/ingest-geoip:/config/node-1/ingest-geoip \
     --log-driver none \
     --net=host --name="elasticsearch" \
     --cap-add=IPC_LOCK --ulimit memlock=-1:-1 --ulimit nofile=65536:65536 \
     --restart on-failure:10 \
     -e TZ=Europe/Zurich \
     -e ES_JAVA_OPTS="-Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128 -Dhttps.proxyHost=localhost -Dhttps.proxyPort=3128 -Dhttp.nonProxyHosts=\"localhost|127.0.0.1|.cinhtau.net|.aws.com\"" \
     elasticsearch:latest bin/elasticsearch \
     -E path.conf=/config/node-1
    

    Container Lifecycle

    The sidecar container may be removed during some cleanup, but the volume still exists, since it is used by the Elasticsearch application container. A docker inspect of the application container will give you the name of the volume in use.

    docker inspect elasticsearch
    

    Output shortened:

    {
      "Mounts": [
        {
          "Source": "/srv/nas/six/fo/elasticsearch/security",
          "Destination": "/config/itu/x-pack",
          "Mode": "",
          "RW": true,
          "Propagation": "rslave"
        },
        {
          "Source": "/etc/timezone",
          "Destination": "/etc/timezone",
          "Mode": "",
          "RW": true,
          "Propagation": "rslave"
        },  
        {
          "Name": "f89912de22e2b34170e0b331c8a5e25b00f921f4e2417c6b140382389fadee7e",
          "Source": "/var/lib/docker/volumes/f89912de22e2b34170e0b331c8a5e25b00f921f4e2417c6b140382389fadee7e/_data",
          "Destination": "/config",
          "Driver": "local",
          "Mode": "",
          "RW": true,
          "Propagation": ""
        }]
    }
    

    The volume name f89912de22e2b34170e0b331c8a5e25b00f921f4e2417c6b140382389fadee7e is used.

    To check:

    docker volume inspect f89912de22e2b34170e0b331c8a5e25b00f921f4e2417c6b140382389fadee7e
    [
        {
            "Name": "f89912de22e2b34170e0b331c8a5e25b00f921f4e2417c6b140382389fadee7e",
            "Driver": "local",
            "Mountpoint": "/var/lib/docker/volumes/f89912de22e2b34170e0b331c8a5e25b00f921f4e2417c6b140382389fadee7e/_data"
        }
    ]
    

    As you can see, the volume name is hard for humans to remember. As a cherry on top, Docker lets you create named volumes.

    # create named volume
    docker volume create --name es_config_test
    
    # populate volume use one-shot sidecar
    docker run --rm -v es_config_test:/config elasticsearch-config-test /bin/true
    
    # check volume using shell
    docker run --rm -v es_config_test:/config alpine /bin/sh -c 'ls -laR /config'
    
    # show volume
    docker volume inspect es_config_test
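
    With the named volume in place, the application container can mount it directly instead of using --volumes-from. A shortened sketch of the earlier run command under that assumption:

    sudo docker run -d \
     -v es_config_test:/config \
     --net=host --name="elasticsearch" \
     elasticsearch:latest bin/elasticsearch \
     -E path.conf=/config/node-1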
    

    Summary

    • The sidecar pattern is very useful.
    • Sidecar containers have multiple purposes:
      • Configuration
      • Proxy (using Nginx as Load Balancer or Proxy)
      • Logging (running log shipper for centralized logging)
      • every other abstraction
    • A sidecar container's lifecycle is tied to its application container(s).



  6. 2018-02-02 - Jaegertracing with Elasticsearch Storage; Tags: Jaegertracing with Elasticsearch Storage

    Jaegertracing with Elasticsearch Storage

    Distributed tracing with Jaeger by Uber Technologies is pretty impressive. By default, Jaeger uses Apache Cassandra as storage, but it is also capable of using Elasticsearch 5/6. It took me some time and some code diving on GitHub to find the respective options for Elasticsearch, but I finally got it together in this docker-compose.yml. My Elasticsearch cluster runs with a commercial X-Pack license, so we have to pass some authentication.

    Jaeger Components

    This image from Uber Technologies illustrates the components:

    Jaeger

    1. Agent = a daemon that runs on every host and receives tracing information submitted by applications via the Jaeger client libraries.
    2. Collector = the agent sends the traces to the collector, which is responsible for writing them into the storage (Cassandra by default). In our case it is Elasticsearch.
    3. Query = a service that retrieves traces from storage and hosts a UI to display them.

    Elasticsearch Options

    Since the collector and the query service access Elasticsearch as storage, I found the respective configuration options in options.go. In detail:

    • es.server-urls = A list of servers or simplify it with a load balancer address.
    • es.username = If you have secured Elasticsearch with X-Pack Security or Basic Auth Proxy.
    • es.password = The credential for the respective user.
    • es.num-shards = Creates new indices with the given number of shards.

    Docker Compose

    version: "3"
    
    services:
      collector:
        image: jaegertracing/jaeger-collector
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
        ports:
          - "14269"
          - "14268:14268"
          - "14267"
          - "9411:9411"
        restart: on-failure
        command: ["/go/bin/collector-linux", "--es.server-urls=http://es-loadbalancer:9200", "--es.username=jaeger_remote_agent", "--es.password=HunterSpir!t", "--es.num-shards=1", "--span-storage.type=elasticsearch", "--log-level=error"]
    
      agent:
        image: jaegertracing/jaeger-agent
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
        command: ["/go/bin/agent-linux", "--collector.host-port=collector:14267"]
        ports:
          - "5775:5775/udp"
          - "6831:6831/udp"
          - "6832:6832/udp"
          - "5778:5778"
        restart: on-failure
        depends_on:
          - collector
    
      query:
        image: jaegertracing/jaeger-query
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
          - no_proxy=localhost
        ports:
          - "16686:16686"
          - "16687"
        restart: on-failure
        command: ["/go/bin/query-linux", "--es.server-urls=http://es-loadbalancer:9200", "--span-storage.type=elasticsearch", "--log-level=debug", "--es.username=jaeger_remote_agent", "--es.password=HunterSpir!t", "--query.static-files=/go/jaeger-ui/"]
        depends_on:
          - agent

    You could easily add an Elasticsearch service to the above docker-compose file as well.

    If you run docker-compose up, all services start nicely.

    Starting collector_1 ... done
    Starting agent_1 ... done
    Starting query_1 ... done
    Attaching to collector_1, agent_1, query_1
    agent_1      | {"level":"info","ts":1517582471.4691374,"caller":"tchannel/builder.go:89","msg":"Enabling service discovery","service":"jaeger-collector"}
    agent_1      | {"level":"info","ts":1517582471.4693034,"caller":"peerlistmgr/peer_list_mgr.go:111","msg":"Registering active peer","peer":"collector:14267"}
    agent_1      | {"level":"info","ts":1517582471.4709918,"caller":"agent/main.go:64","msg":"Starting agent"}
    query_1      | {"level":"info","ts":1517582471.871992,"caller":"healthcheck/handler.go:99","msg":"Health Check server started","http-port":16687,"status":"unavailable"}
    query_1      | {"level":"info","ts":1517582471.9079432,"caller":"query/main.go:126","msg":"Registering metrics handler with HTTP server","route":"/metrics"}
    query_1      | {"level":"info","ts":1517582471.908044,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
    query_1      | {"level":"info","ts":1517582471.9086466,"caller":"query/main.go:135","msg":"Starting jaeger-query HTTP server","port":16686}
    agent_1      | {"level":"info","ts":1517582472.472438,"caller":"peerlistmgr/peer_list_mgr.go:157","msg":"Not enough connected peers","connected":0,"required":1}
    agent_1      | {"level":"info","ts":1517582472.4725628,"caller":"peerlistmgr/peer_list_mgr.go:166","msg":"Trying to connect to peer","host:port":"collector:14267"}
    agent_1      | {"level":"info","ts":1517582472.4746702,"caller":"peerlistmgr/peer_list_mgr.go:176","msg":"Connected to peer","host:port":"[::]:14267"}

    Dependencies

    Jaeger has the amazing feature to draw the dependency diagram for your traces.

    Jaeger

    If you want this amazing feature in a productive setup, you need to start the spark-dependencies job. Apache Spark is a fast engine for large-scale data processing, which a collection of spans certainly is. I was a little bit afraid that Elasticsearch wasn't supported, but luckily I was mistaken. The options are very well documented. A basic template for starting spark-dependencies as a Docker container with Elasticsearch as storage backend:

    docker run \
    --env STORAGE=elasticsearch \
    --env ES_NODES=http://elasticsearch-server1:9200,http://elasticsearch-server2:9200 \
    --env ES_USERNAME=elastic \
    --env ES_PASSWORD=IHaveChangedMyPassword \
    jaegertracing/spark-dependencies

    This resulted in:

    tan@omega3:~> docker run --env STORAGE=elasticsearch --env ES_NODES=http://elastic-lb:9200 --env ES_USERNAME=elastic --env ES_PASSWORD=IHaveChangedMyPassword jaegertracing/spark-dependencies
    
    18/02/08 10:50:49 INFO ElasticsearchDependenciesJob: Running Dependencies job for 2018-02-08T00:00Z, reading from jaeger-span-2018-02-08 index, result storing to jaeger-dependencies-2018-02-08/dependencies
    18/02/08 10:50:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    18/02/08 10:50:55 INFO ElasticsearchDependenciesJob: Done, 4 dependency objects created

    In order to keep the dependencies up to date, we need some kind of scheduler. A cron job is one possibility; if you have the luxury of a cluster manager, that would also suffice.
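
    A minimal sketch of such a cron entry, reusing the docker run command from above and running it once per day at night; the schedule, the log path and the added --rm flag are assumptions:

    # /etc/crontab: run spark-dependencies daily at 01:15
    15 1 * * * root docker run --rm --env STORAGE=elasticsearch --env ES_NODES=http://elastic-lb:9200 --env ES_USERNAME=elastic --env ES_PASSWORD=IHaveChangedMyPassword jaegertracing/spark-dependencies >> /var/log/spark-dependencies.log 2>&1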



  7. 2017-12-15 - Pretty print duration; Tags: Pretty print duration

    Pretty print duration

    Performing a reindex job in Elasticsearch gives you the time the job took.

    curl -XPOST "http://localhost:9200/_reindex" -H 'Content-Type: application/json' -d'
    > {
    >   "source": {
    >     "index": "fo-log-2017.05.05"
    >   },
    >   "dest": {
    >     "index": "fo-log-fix-2017.05.05",
    >      "pipeline": "kt-improve"
    >   }
    > }'

    The outcome.

    {
    	"took": 13147934,
    	"timed_out": false,
    	"total": 15855322,
    	"updated": 0,
    	"created": 15855322,
    	"deleted": 0,
    	"batches": 15856,
    	"version_conflicts": 0,
    	"noops": 0,
    	"retries": {
    		"bulk": 0,
    		"search": 0
    	},
    	"throttled_millis": 0,
    	"requests_per_second": -1.0,
    	"throttled_until_millis": 0,
    	"failures": []
    }

    13147934 milliseconds is a little bit cryptic. A simple way to give your customers a more human-readable representation is the JavaScript pretty-ms package and its CLI.

    npm install --global pretty-ms
    npm install --global pretty-ms-cli

    The CLI usage prints 3 hours, 39 minutes and 7.9 seconds:

    pretty-ms 13147934
    3h 39m 7.9s
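
    If Node.js is not at hand, a rough shell-only fallback (a sketch that ignores days and larger units) can be done with awk:

    # convert milliseconds to hours, minutes and seconds
    awk -v ms=13147934 'BEGIN { s = ms/1000; printf "%dh %dm %.1fs\n", s/3600, (s%3600)/60, s%60 }'
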
  8. 2017-10-23 - Aggregate Data in Elasticsearch Part 2; Tags: Aggregate Data in Elasticsearch Part 2

    Aggregate Data in Elasticsearch Part 2

  9. 2017-10-23 - Timestamps in Painless; Tags: Timestamps in Painless

    Timestamps in Painless

    In short: Converting a UTC timestamp to a local timestamp (in Switzerland).

    Long story: I was facing the use case that, for a statistical analysis, I aggregated data from a per-minute basis to an hourly basis. The date aggregation itself rendered UTC timestamps, since everything in Elasticsearch is stored in UTC. I have to calculate my local timestamp in Switzerland in order to display the correct occurrence in Kibana. My solution uses Painless in the transform part of the watch. Painless scripting in Elasticsearch allows the usage of the Java Time API introduced in Java 8.

    Therefore we can test the code in plain Java first:

    Instant myInstant = Instant.ofEpochMilli(1503176400000L);
    System.out.println(myInstant);
    
    ZoneId switzerland = ZoneId.of("Europe/Zurich");
    LocalDateTime localDateTime = LocalDateTime.ofInstant(myInstant, switzerland);
    System.out.println(localDateTime);
    
    System.out.println(LocalDateTime.ofInstant(Instant.ofEpochMilli(1503176400000L), ZoneOffset.UTC)
        .atZone(ZoneId.of("Europe/Zurich"))
        .toInstant().toEpochMilli());
    

    The console output will give us:

    2017-08-19T21:00:00Z
    2017-08-19T23:00
    1503169200000
    

    item.key contains the aggregated timestamp. I create an Instant based on milliseconds and retrieve the milliseconds for my respective time zone.

    "transform": {
        "script": {
          "lang": "painless",
          "source": """
    def docs=[];
    def id='';
    def value=0;
    for(item in ctx.payload.aggregations.trx_over_time.buckets) {
      def document = [
        '_id': item.key,
        '@timestamp': LocalDateTime.ofInstant(Instant.ofEpochMilli(item.key), ZoneOffset.UTC)
                      .atZone(ZoneId.of("Europe/Zurich")).toInstant().toEpochMilli(),
        'value': item.sum_trx.value,
        'logger': 'STA9101',
        'channel': 'Issuing',
        'ingest.time': ctx.execution_time,
        'ingest.agent': 'watcher'];
      docs.add(document);
    }
    return ['_doc': docs];
    """
        }
    }
    
  10. 2017-10-20 - Aggregate data in Elasticsearch Part 1; Tags: Aggregate data in Elasticsearch Part 1

    Aggregate data in Elasticsearch Part 1

    Elasticsearch with its Query DSL allows powerful aggregations, which can be used to reduce the number of stored documents and save disk space. After a certain period of time a certain level of detail is no longer needed. For instance, I collect statistical data about fraud prevention services on a daily basis.

    GET _cat/indices/fraud?v&s=index:asc
    
    health status index            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   fraud-2017.08.17 OlwgIRenQ_2dyiKm-Aapkw   2   1       1680            0      1.2mb          634kb
    green  open   fraud-2017.08.18 0aJcUMbFQSa3DpGtg1l5iw   2   1      20160            0     12.8mb          6.4mb
    green  open   fraud-2017.08.19 pQCsW7NpSZe5UuJT5vcIvQ   2   1      20160            0     12.6mb          6.3mb
    green  open   fraud-2017.08.20 G4qG8L5HRKGHddx9jyYbvQ   2   1      26160            0       15mb          7.5mb
    green  open   fraud-2017.08.21 UNkfZaXISj-p2fOBUxor0Q   2   1      92789            0     45.3mb         22.7mb
    green  open   fraud-2017.08.22 bz8vqtC2RUW1YjnN2L2oQw   2   1      88361            0     44.2mb           22mb
    green  open   fraud-2017.08.23 8AtPnSy-TVu0fMEQ3lbcsw   2   1      80999            0     40.6mb         20.3mb
    green  open   fraud-2017.08.24 3GIiEC7aRCOvFFzM0_Fi2w   2   1     194570            0     70.4mb         35.1mb
    green  open   fraud-2017.08.25 og1Xx9XITJa1gKBxdQMYLQ   2   1     234934            0     83.8mb         41.8mb
    green  open   fraud-2017.08.26 AUwvDE0JR0aKIHkKGTseSg   2   1     235553            0     84.4mb         42.2mb
    green  open   fraud-2017.08.27 MdLO0ULoQn6al0MSdPu8IA   2   1     275991            0     93.1mb         46.5mb
    green  open   fraud-2017.08.28 93nZmpkOSWOcUbm1CxHtCQ   2   1     324153            0      106mb           53mb
    green  open   fraud-2017.08.29 Nm021E6sTFi-9DmlGqQEkA   2   1     315797            0    103.4mb         51.7mb
    green  open   fraud-2017.08.30 NCJbV-uLSzGbjjRQMvRqug   2   1     283340            0     96.1mb           48mb
    green  open   fraud-2017.08.31 AQAWdCqDT3ehvjyn4FbLwg   2   1     332613            0    115.4mb         57.6mb
    green  open   fraud-2017.09.01 fNdnrAzIRbOhwuHMsBH-KA   2   1     305892            0    109.7mb           55mb
    green  open   fraud-2017.09.02 Z9ynOZfhQgeIi8EH9VUzTg   2   1     276176            0    103.6mb           52mb
    green  open   fraud-2017.09.03 IUE7xIf0RFyqfxqeTmDtTQ   2   1     231013            0       91mb         45.6mb
    green  open   fraud-2017.09.04 bwyZp5eMTa-9tZa2dGIw6g   2   1     268054            0    100.9mb         50.2mb
    green  open   fraud-2017.09.05 tuGZS68IT6aQV5fQbwkL1A   2   1     235889            0     92.9mb         46.4mb
    green  open   fraud-2017.09.06 DS2syWlHSSKzzwCqm2ImAA   2   1     227299            0     89.8mb           45mb
    green  open   fraud-2017.09.07 PwZ39BHVRDekpe3Eklapgg   2   1     251881            0     92.9mb         46.4mb
    green  open   fraud-2017.09.08 tkLcydSoT9KIBTjjEFXH1A   2   1     175374            0     68.5mb         34.3mb
    green  open   fraud-2017.09.09 IaRhV8MHTaO8WHCDYqyf7g   2   1     184333            0     79.3mb         39.6mb
    green  open   fraud-2017.09.10 5Kc-F3omQHiA1YzU4U5Y4Q   2   1     161799            0     70.6mb         35.4mb
    green  open   fraud-2017.09.11 Ajbw9XnNTga66bN-U7IgTA   2   1     205447            0     83.1mb         41.5mb
    green  open   fraud-2017.09.12 8DrE-dZ_TQmr1Boor09BKw   2   1     187816            0     70.2mb           35mb
    green  open   fraud-2017.09.13 hkQ3WQ49SM-rggxqQonfiw   2   1     234633            0     88.5mb         44.3mb
    green  open   fraud-2017.09.14 Q39tR7sKSHqgbEJpJTQiZQ   2   1     230865            0     87.9mb         43.9mb
    green  open   fraud-2017.09.15 ebpcLvWnSkSLen7OP7w14Q   2   1     188488            0     78.3mb         39.2mb
    green  open   fraud-2017.09.16 FBFbgbadQg-oTMDfNoob0g   2   1     224340            0       96mb           48mb
    green  open   fraud-2017.09.17 6z6TpLq7TXuM64K1hYlaDQ   2   1     239607            0     98.6mb         49.1mb
    green  open   fraud-2017.09.18 Zq-KBD2sTd2Wnj00Eqv2qQ   2   1     207967            0     90.4mb         45.2mb
    green  open   fraud-2017.09.19 VaZmTvtqRY6UmF779jb3TA   2   1     209122            0     77.5mb         38.8mb
    green  open   fraud-2017.09.20 avVtOhkqSZuccPkYuOmU5g   2   1     203056            0     74.6mb         37.3mb
    green  open   fraud-2017.09.21 gakY_3jHSUq1maHR-4wLXA   2   1     127662            0       55mb         27.4mb
    green  open   fraud-2017.09.22 zMDUizWVREq9590wUsJdqQ   2   1     127546            0     55.5mb         27.7mb
    green  open   fraud-2017.09.23 ptPaU1WZSKa57jflLpdvNA   2   1      91948            0     44.9mb         22.4mb
    green  open   fraud-2017.09.24 uOi464xxTBeoUGDGMQAckQ   2   1     104120            0     47.3mb         23.6mb
    green  open   fraud-2017.09.25 tHMnRTu9R3W_woxsKS2qAA   2   1      98119            0     46.3mb           23mb
    green  open   fraud-2017.09.26 XgHv3j9ASwq0q_U4otsSFw   2   1     118299            0       52mb           26mb
    green  open   fraud-2017.09.27 CeY_Qw1eQ1yEi7WalM_Zlg   2   1     135067            0     61.1mb         30.5mb
    green  open   fraud-2017.09.28 5SRhQvB1RdeLPy8WiDQjGA   2   1     121341            0     56.8mb         28.7mb
    green  open   fraud-2017.09.29 L2zfdZZCR9e-pQnQ9e5I1A   2   1     136221            0     63.1mb         31.2mb
    green  open   fraud-2017.09.30 oncWQAIcSzmBPRt2wlHvTA   2   1     165502            0     80.9mb         40.5mb
    green  open   fraud-2017.10.01 OOg1SZH1Qjmo85NSxbDfjg   2   1     162648            0       77mb         38.6mb
    green  open   fraud-2017.10.02 wc6l_5WDRVCHMPiUx9BBRQ   2   1     177023            0     82.5mb         41.4mb
    green  open   fraud-2017.10.03 6GYS6z8hSqynFFI9GxyWTA   2   1     186684            0       72mb           36mb
    green  open   fraud-2017.10.04 _ZkXUpbbRO-euZv_8Vatlw   2   1     177498            0     69.5mb         34.7mb
    green  open   fraud-2017.10.05 6G1OZobKTbKQ9MdncVhtPQ   2   1     180769            0     70.3mb         35.1mb
    green  open   fraud-2017.10.06 leXb6SkhQASzcZ164hSksg   2   1     194112            0     74.3mb         37.2mb
    green  open   fraud-2017.10.07 4rvy0nWWRZGf42eyPBECPg   2   1     181823            0     70.3mb         35.1mb
    green  open   fraud-2017.10.08 9rTk6wO_ThWAOntg98qIYg   2   1     125629            0     54.4mb         27.2mb
    green  open   fraud-2017.10.09 opTUqawdTzqeFDIs5ouAfw   2   1     144947            0     59.3mb         29.6mb
    green  open   fraud-2017.10.10 xvQlqnhlSSiwmJU345_tNA   2   1     141745            0     58.4mb         29.1mb
    green  open   fraud-2017.10.11 NODK8l5WS06iYsZ94Ui-AA   2   1     132986            0     56.5mb         28.2mb
    green  open   fraud-2017.10.12 aEwb4ihqQ7iKeLRm_ALB6w   2   1     135184            0     57.2mb         28.6mb
    green  open   fraud-2017.10.13 WJHzV1RzR2SazRfN5P_8Bw   2   1     143217            0     59.5mb         29.7mb
    green  open   fraud-2017.10.14 qQmNX0sySxG7ow21vn3Bnw   2   1     133659            0       57mb         28.5mb
    green  open   fraud-2017.10.15 xch4F_E_Rhi8eX1CQlLyCQ   2   1     121647            0     53.1mb         26.5mb
    green  open   fraud-2017.10.16 5bk-GdujRyOpK2JbSvjdOA   2   1     141811            0     58.5mb         29.2mb
    green  open   fraud-2017.10.17 wU1uLfkETgaTTyuAGQjxhw   2   1     173206            0     66.1mb           33mb
    green  open   fraud-2017.10.18 1wwFZwfjSLmVNxhzo3-3Rw   2   1     142172            0     59.5mb         29.7mb
    green  open   fraud-2017.10.19 ZzITeyNaSfWAvv9AiKNr9A   2   1     126300            0     55.8mb         27.8mb
    green  open   fraud-2017.10.20 BbwIASphRXe3jbE-hgammg   2   1      46324            0     20.8mb         10.4mb
    

    Statistical values are logged every minute. If we look at one logger for a whole day:

    Get count

    GET fraud-2017.08.19/_search
    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "channel.keyword": "Issuing"
              }
            },
            {
              "match": {
                "logger.keyword": "STA9101"
              }
            }
          ]
        }
      }
    }
    
    {
      "took": 14,
      "timed_out": false,
      "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1440,
        "max_score": 0,
        "hits": []
      }
    }
    

    1440 documents = 1 doc/minute × 60 minutes × 24 hours = 1440 metric documents

    Aggregate

    1440 metric documents can be reduced to 24 documents on an hourly basis.

    Chain aggregations: date histogram and metric aggregation

    GET fraud-2017.08.19/_search
    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "channel.keyword": "Issuing"
              }
            },
            {
              "match": {
                "logger.keyword": "STA9101"
              }
            }
          ]
        }
      },
      "aggs": {
        "trx_over_time": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "1h"
          },
          "aggs": {
            "sum_trx": {
              "sum": {
                "field": "value"
              }
            }
          }
        }
      }
    }
    

    Get 24 hours

    {
      "took": 2,
      "timed_out": false,
      "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1440,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "trx_over_time": {
          "buckets": [
            {
              "key_as_string": "2017-08-19T00:00:00.000Z",
              "key": 1503100800000,
              "doc_count": 60,
              "sum_trx": {
                "value": 6742
              }
            },
            {
              "key_as_string": "2017-08-19T01:00:00.000Z",
              "key": 1503104400000,
              "doc_count": 60,
              "sum_trx": {
                "value": 4734
              }
            },
            {
              "key_as_string": "2017-08-19T02:00:00.000Z",
              "key": 1503108000000,
              "doc_count": 60,
              "sum_trx": {
                "value": 3752
              }
            },
            {
              "key_as_string": "2017-08-19T03:00:00.000Z",
              "key": 1503111600000,
              "doc_count": 60,
              "sum_trx": {
                "value": 5408
              }
            },
            {
              "key_as_string": "2017-08-19T04:00:00.000Z",
              "key": 1503115200000,
              "doc_count": 60,
              "sum_trx": {
                "value": 13376
              }
            },
            {
              "key_as_string": "2017-08-19T05:00:00.000Z",
              "key": 1503118800000,
              "doc_count": 60,
              "sum_trx": {
                "value": 34932
              }
            },
            {
              "key_as_string": "2017-08-19T06:00:00.000Z",
              "key": 1503122400000,
              "doc_count": 60,
              "sum_trx": {
                "value": 93086
              }
            },
            {
              "key_as_string": "2017-08-19T07:00:00.000Z",
              "key": 1503126000000,
              "doc_count": 60,
              "sum_trx": {
                "value": 163467
              }
            },
            {
              "key_as_string": "2017-08-19T08:00:00.000Z",
              "key": 1503129600000,
              "doc_count": 60,
              "sum_trx": {
                "value": 230601
              }
            },
            {
              "key_as_string": "2017-08-19T09:00:00.000Z",
              "key": 1503133200000,
              "doc_count": 60,
              "sum_trx": {
                "value": 264623
              }
            },
            {
              "key_as_string": "2017-08-19T10:00:00.000Z",
              "key": 1503136800000,
              "doc_count": 60,
              "sum_trx": {
                "value": 248176
              }
            },
            {
              "key_as_string": "2017-08-19T11:00:00.000Z",
              "key": 1503140400000,
              "doc_count": 60,
              "sum_trx": {
                "value": 238703
              }
            },
            {
              "key_as_string": "2017-08-19T12:00:00.000Z",
              "key": 1503144000000,
              "doc_count": 60,
              "sum_trx": {
                "value": 248056
              }
            },
            {
              "key_as_string": "2017-08-19T13:00:00.000Z",
              "key": 1503147600000,
              "doc_count": 60,
              "sum_trx": {
                "value": 247916
              }
            },
            {
              "key_as_string": "2017-08-19T14:00:00.000Z",
              "key": 1503151200000,
              "doc_count": 60,
              "sum_trx": {
                "value": 216478
              }
            },
            {
              "key_as_string": "2017-08-19T15:00:00.000Z",
              "key": 1503154800000,
              "doc_count": 60,
              "sum_trx": {
                "value": 160784
              }
            },
            {
              "key_as_string": "2017-08-19T16:00:00.000Z",
              "key": 1503158400000,
              "doc_count": 60,
              "sum_trx": {
                "value": 107450
              }
            },
            {
              "key_as_string": "2017-08-19T17:00:00.000Z",
              "key": 1503162000000,
              "doc_count": 60,
              "sum_trx": {
                "value": 86520
              }
            },
            {
              "key_as_string": "2017-08-19T18:00:00.000Z",
              "key": 1503165600000,
              "doc_count": 60,
              "sum_trx": {
                "value": 68501
              }
            },
            {
              "key_as_string": "2017-08-19T19:00:00.000Z",
              "key": 1503169200000,
              "doc_count": 60,
              "sum_trx": {
                "value": 55975
              }
            },
            {
              "key_as_string": "2017-08-19T20:00:00.000Z",
              "key": 1503172800000,
              "doc_count": 60,
              "sum_trx": {
                "value": 40971
              }
            },
            {
              "key_as_string": "2017-08-19T21:00:00.000Z",
              "key": 1503176400000,
              "doc_count": 60,
              "sum_trx": {
                "value": 27974
              }
            },
            {
              "key_as_string": "2017-08-19T22:00:00.000Z",
              "key": 1503180000000,
              "doc_count": 60,
              "sum_trx": {
                "value": 18237
              }
            },
            {
              "key_as_string": "2017-08-19T23:00:00.000Z",
              "key": 1503183600000,
              "doc_count": 60,
              "sum_trx": {
                "value": 13241
              }
            }
          ]
        }
      }
    }
    

    Source document

    If we look into the source document, which is parsed by an ingest pipeline, we see a lot of information that isn't related to the statistical data. By aggregating, we not only save documents but also reduce the number of fields.

    {
      "_index": "fraud-2017.10.20",
      "_type": "stats",
      "_id": "AV84tDR8IIOyJsb0pJ3m",
      "_score": 1,
      "_source": {
        "instance": 0,
        "offset": 1050043,
        "level": "I",
        "logger": "STA9101",
        "channel": "Issuing",
        "input_type": "log",
        "logmessage": "2290 transactions since 2017-10-20 09:33:10, next statistical log at: 2017-10-20 09:35:10",
        "index": "fraud",
        "source": "/var/log/RiskShield/iss/prd/cur/2017-10-20_DecisionServer_Stats.log",
        "type": "stats",
        "tags": [
          "beats_input_codec_plain_applied"
        ],
        "environment": "prd",
        "@timestamp": "2017-10-20T07:34:10.000Z",
        "application": "RiskShield",
        "@version": "1",
        "beat": {
          "hostname": "fraud-detect",
          "name": "fraud-detect",
          "version": "5.5.2"
        },
        "host": "fraud-detect",
        "value": 2290
      },
      "fields": {
        "@timestamp": [
          1508484850000
        ]
      }
    }
    

    Using Watcher

    One way to automate this is Elasticsearch Watcher. If you don't have a commercial license, you could also accomplish this task with Spring Batch and a custom Tasklet implementation using the official Elasticsearch Java REST client libraries.

    For demonstration purposes, the following index and mapping will be used:

    DELETE test
    PUT test
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "stats": {
           "_all": {
            "enabled": false
           },
          "properties": {
            "@timestamp": {
              "type": "date"
            },
            "channel": {
              "type": "keyword"
            },
            "logger": {
              "type": "keyword"
            },
            "value": {
              "type": "integer"
            },
            "ingest.agent": {
              "type": "keyword"
            },
            "ingest.time": {
              "type": "date"
            }
          }
        }
      }
    }
    

    Now the tricky part: as input source I choose a specific index for testing; use an alias in production instead. The action part is more interesting: the search aggregation results are used as index payload to write new documents.

    Define the watch for testing purposes on a specific index

    PUT /_xpack/watcher/watch/fraud-issuing-aggregations
    {
      "input": {
        "search": {
          "request": {
            "indices": [
              "fraud-2017.08.20"
            ],
            "types": [
              "stats"
            ],
            "body": {
              "size": 0,
              "query": {
                "bool": {
                  "must": [
                    {
                      "match": {
                        "channel.keyword": "Issuing"
                      }
                    },
                    {
                      "match": {
                        "logger.keyword": "STA9101"
                      }
                    }
                  ]
                }
              },
              "aggs": {
                "trx_over_time": {
                  "date_histogram": {
                    "field": "@timestamp",
                    "interval": "1h"
                  },
                  "aggs": {
                    "sum_trx": {
                      "sum": {
                        "field": "value"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      "trigger": {
        "schedule": {
          "interval": "1d"
        }
      },
      "actions": {
        "index_payload": {
          "transform": {
            "script": {
              "lang": "painless",
              "source": """
                def docs = [];
                def id = '';
                def value = 0;
                for (item in ctx.payload.aggregations.trx_over_time.buckets) {
                  def document = [
                    '_id': item.key,
                    '@timestamp': LocalDateTime.ofInstant(Instant.ofEpochMilli(item.key), ZoneOffset.UTC).atZone(ZoneId.of("Europe/Zurich")).toInstant().toEpochMilli(),
                    'value': item.sum_trx.value,
                    'logger': 'STA9101',
                    'channel': 'Issuing',
                    'ingest.time': ctx.execution_time,
                    'ingest.agent': 'watcher'
                  ];
                  docs.add(document);
                }
                return ['_doc' : docs];
              """
            }
          },
          "index": {
            "index": "test",
            "doc_type": "stats"
          }
        }
      }
    }
    

    Execute watch manually

    POST _xpack/watcher/watch/fraud-issuing-aggregations/_execute
    

    Query aggregated documents

    GET test/_search
    {
      "query": {"match_all": {}}
    }
    

    If everything works, we can adjust the watch to process indices older than two weeks and remove the old indices afterwards. Elasticsearch Curator comes in handy here, for instance by assigning indices older than two weeks to a dedicated alias, as sketched below.
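
    A minimal sketch of such a Curator action file, assuming the daily indices follow the fraud-YYYY.MM.DD naming used above and that the alias is called archive (both names are illustrative):

    actions:
      1:
        action: alias
        description: "Add fraud indices older than 14 days to the archive alias"
        options:
          name: archive
          warn_if_no_indices: True
        add:
          filters:
          - filtertype: pattern
            kind: prefix
            value: fraud-
          - filtertype: age
            source: name
            direction: older
            timestring: '%Y.%m.%d'
            unit: days
            unit_count: 14
    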

  11. 2017-10-16 - Analyze Cluster Reroute; Tags: Analyze Cluster Reroute

    Analyze Cluster Reroute

    My test cluster health was yellow. X-Pack Monitoring pointed to some indices that were yellow.

    Monitoring Shard Allocation

    If I check the shards on a specific index

    GET _cat/shards/ep2-2017.10.15?v
    

    Result

    index          shard prirep state        docs   store ip           node
    ep2-2017.10.15 1     p      STARTED    453095 236.6mb 10.22.62.135 etu
    ep2-2017.10.15 1     r      UNASSIGNED                             
    ep2-2017.10.15 0     p      STARTED    454530 237.2mb 10.22.191.23 itu-dc2
    ep2-2017.10.15 0     r      STARTED    454530 237.2mb 10.22.62.130 itu
    

    If you try to allocate the unassigned replica:

    POST /_cluster/reroute
    {
      "commands": [
        {
          "allocate_replica": {
            "index": "ep2-2017.10.15",
            "shard": 1,
            "node": "etu-dc2"
          }
        }
      ]
    }
    

    We get an extended error reason.

    {
      "error": {
        "root_cause": [
          {
            "type": "remote_transport_exception",
            "reason": "[itu-dc2][10.22.191.23:9300][cluster:admin/reroute]"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "[allocate_replica] allocation of [ep2-2017.10.15][1] on node {etu-dc2}{aeT1BPu2SjW1g3A18RnuTA}{QxY1ImcVQuuFFGjE31bOzw}{mtlplfohap05}{10.22.191.14:9300}{ml.max_open_jobs=10, rack=with-nas, box_type=hot, ml.enabled=true} is not allowed, reason: [YES(shard has no previous failures)][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][NO(target node version [5.6.1] is older than the source node version [5.6.3])][YES(the shard is not being snapshotted)][YES(node passes include/exclude/require filters)][YES(the shard does not exist on the same node)][YES(enough disk for shard on node, free: [183.5gb], shard size: [0b], free after allocating shard: [183.5gb])][YES(below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2])][YES(total shard limits are disabled: [index: -1, cluster: -1] <= 0)][YES(allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it)]"
      },
      "status": 400
    }
    

    Check for the NO condition:

    [NO(target node version [5.6.1] is older than the source node version [5.6.3])]
    

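    Since Elasticsearch 5.x you can also ask the cluster allocation explain API why a shard stays unassigned; a minimal sketch for the replica above:

    GET _cluster/allocation/explain
    {
      "index": "ep2-2017.10.15",
      "shard": 1,
      "primary": false
    }
    
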
    After I checked my nodes, I saw there was a partial cluster upgrade.

    GET /_cat/nodes?v&h=version,name,jdk
    
    version name    jdk
    5.6.1   itu     1.8.0_141
    5.6.1   etu-dc2 1.8.0_141
    5.6.3   etu     1.8.0_141
    5.6.3   dev     1.8.0_141
    5.6.1   itu-dc2 1.8.0_141
    

    After the upgrade everything worked fine and the cluster health was back to green.

  12. 2017-10-09 - Reset Persistent Elasticsearch Cluster Setting; Tags: Reset Persistent Elasticsearch Cluster Setting

    Reset Persistent Elasticsearch Cluster Setting

    If you set up Elasticsearch to report to a dedicated monitoring cluster

    PUT _cluster/settings
    {
      "persistent": {
        "xpack.monitoring.exporters.cloud_monitoring.type": "http",
        "xpack.monitoring.exporters.cloud_monitoring.host": "MONITORING_ELASTICSEARCH_URL",
        "xpack.monitoring.exporters.cloud_monitoring.auth.username": "cloud_monitoring_agent",
        "xpack.monitoring.exporters.cloud_monitoring.auth.password": "MONITORING_AGENT_PASSWORD"
      }
    }
    

    you can unset or reset it by passing null.

    PUT _cluster/settings
    {
      "persistent": {
        "xpack.monitoring.exporters.cloud_monitoring.type": null,
        "xpack.monitoring.exporters.cloud_monitoring.host": null,
        "xpack.monitoring.exporters.cloud_monitoring.auth.username": null,
        "xpack.monitoring.exporters.cloud_monitoring.auth.password": null
      }
    }
    

    In the logs of the master node, the following log message will appear:

    [2017-10-09T15:30:40,007][INFO ][o.e.c.s.ClusterSettings  ] [master-one] updating [xpack.monitoring.exporters.] from [{"cloud_monitoring":{"host":"monitornode","type":"http","auth":{"password":"mapperking","username":"remote_monitor"}}}] to [{}]
    
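
    To verify that the settings are gone, query the cluster settings; the persistent section should no longer contain the exporter keys.

    GET _cluster/settings
    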
  13. 2017-10-05 - Ship Monit logs with Filebeat; Tags: Ship Monit logs with Filebeat

    Ship Monit logs with Filebeat

    A quick recipe on how to ship Monit logs to Elasticsearch. Some initial configuration was in place, but I ran into some trouble.

    Problems

    1. Monit logs disappeared on the 1st of October
    2. Multiline messages haven’t been properly processed

    Example logs

    [CEST Oct  5 16:54:19] error    : 'batch_healthcheck' failed protocol test [HTTP] at [0.0.0.0]:80/batch/systemStatus [TCP/IP] -- Connection refused
    [CEST Oct  5 16:54:19] info     : 'batch_healthcheck' exec: /bin/bash
    [CEST Oct  5 16:54:19] error    : 'imp4' process is not running
    [CEST Oct  5 16:54:19] info     : 'imp4' trying to restart
    [CEST Oct  5 16:54:19] info     : 'imp4' start: /opt/six/fo/jboss/bin/jboss-opr.sh
    [CEST Oct  5 16:54:52] error    : 'imp4' failed to start (exit status 1) -- /opt/six/fo/jboss/bin/jboss-opr.sh: tput: No value for $TERM and no -T specified
    tput: No value for $TERM and no -T specified
    2017-10-05 16:54:19 root jboss.sh: imp4 not known
    2017-10-05 16:54:19 root jboss.sh: imp4 could not be started
    

    1. Monit logs disappeared

    The reason is a simple one: Monit just doesn't log the day of the month with a leading zero.

    [CEST Oct  5 16:54:19] error    : 'imp4' process is not running
    

    Just add a date pattern for single-digit days to the date processor formats.

    {
      "date": {
        "field": "jesus",
        "target_field": "datetime",
        "formats": [
          "MMM dd HH:mm:ss",
          "MMM  d HH:mm:ss"
        ],
        "timezone": "Europe/Zurich"
      }
    }
    

    2. Multiline messages

    Filebeat sends multiline messages with this configuration:

    multiline.pattern: '^\['
    multiline.negate: false
    multiline.match: before
    

    In the ingest pipeline the message is truncated, since the newline characters collide with the grok processor. With the gsub processor you can replace the newlines so that the message is properly parsed by the grok processor.

    {
      "gsub": {
        "field": "message",
        "pattern": "\n",
        "replacement": " "
      }
    }
    

    Solution

    Extend pipeline

    PUT _ingest/pipeline/monit_logs

    {
      "description": "grok pipeline for monit logs",
      "processors": [
        {
          "gsub": {
            "field": "message",
            "pattern": "\n",
            "replacement": " "
          }
        },
        {
          "grok": {
            "field": "message",
            "patterns": [
              """\[%{WORD} %{GREEDYDATA:jesus}\] %{WORD:level} %{SPACE} : \'%{GREEDYDATA:service}\' %{GREEDYDATA:logmessage}"""
            ]
          }
        },
        {
          "date": {
            "field": "jesus",
            "target_field": "datetime",
            "formats": [
              "MMM dd HH:mm:ss",
              "MMM  d HH:mm:ss"
            ],
            "timezone": "Europe/Zurich"
          }
        },
        {
          "remove": {
            "field": [
              "message",
              "jesus"
            ]
          }
        }
      ],
      "on_failure": [
        {
          "set": {
            "field": "error",
            "value": " on operation: "
          }
        }
      ]
    }
    

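    Before wiring Filebeat to the pipeline, it can be tested with the simulate API, for example with one of the log lines from above:

    POST _ingest/pipeline/monit_logs/_simulate
    {
      "docs": [
        {
          "_source": {
            "message": "[CEST Oct  5 16:54:19] error    : 'imp4' process is not running"
          }
        }
      ]
    }
    
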
    Configure filebeat

    # monit logs
    - input_type: log
      paths:
         - /var/log/monit.log
      exclude_files: [".gz$"]
      fields:
        type: "logs"
        host: "${beat.hostname:dhost}"
        application: "monit"
        environment: "${FO_ENV:default}"
      fields_under_root: true
      multiline.pattern: '^\['
      multiline.negate: false
      multiline.match: before
      pipeline: "monit_logs"
    
  14. 2017-09-01 - Remove field from Elasticsearch document; Tags: Remove field from Elasticsearch document

    Remove field from Elasticsearch document

    Let's assume you have some unwanted field in a document. In my case this is an error field from a pipeline.

    GET ems/_search
    {
      "query": {
        "simple_query_string": {
          "query": "_exists_:error"
        }
      }
    }
    

    To get rid of it in a single document:

    POST /fo-ems-2017.04/logs/AVuzBCphEwjYNH5brv6M/_update
    {
      "script": "ctx._source.remove(\"error\")"
    }
    

    We can also utilize Update By Query to remove it from all documents that have the error field.

    POST ems/logs/_update_by_query?wait_for_completion=false&conflicts=proceed
    {
      "script": {
        "inline": """ctx._source.remove("error")""",
        "lang": "painless"
      },
      "query": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "error"
              }
            }
          ]
        }
      }
    }
    
  15. 2017-08-30 - Update Documents By Query; Tags: Update Documents By Query

    Update Documents By Query

    I had a use case where I needed to grok some text, so I created this exemplary pipeline.

    curl -XPUT "http://localhost:9200/_ingest/pipeline/ems_flooding" -H 'Content-Type: application/json' -d'
    {
      "description" : "grok the flood counters of an ems message",
       "processors" : [
        {
          "grok" : {
            "field": "event",
            "patterns": ["%{GREEDYDATA}\\(\\<\\<%{DATA:flood.data}\\>\\>\\)\\? %{GREEDYDATA}"],
            "ignore_missing": true,
            "ignore_failure" : true
          }
        }
      ],
       "on_failure" : [
              {
                "set" : {
                  "field" : "error",
                  "value" : ""
                }
              }
            ]
    }'
    

    This pipeline can be used in an Update By Query, which will apply the pipeline to every matching document.

    curl -XPOST "http://localhost:9200/ems/_update_by_query?pipeline=ems_flooding&conflicts=proceed&pretty" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "foapplication.keyword": "AOME2PPP"
              }
            },
            {
              "query_string": {
                "fields": [
                  "event"
                ],
                "query": "MSGPA"
              }
            }
          ]
        }
      }
    }'
    

    To check the current task:

    curl -XGET 'localhost:9200/_tasks?detailed=true&actions=*byquery&pretty'
    
  16. 2017-08-30 - Add automatic timestamp to new documents; Tags: Add automatic timestamp to new documents

    Add automatic timestamp to new documents

    In the past, Elasticsearch could automatically add a timestamp field. Since Elasticsearch 5.x I have to use a pipeline to ingest that timestamp field into the document. As a major change, the internal `` value has also changed.

    In short

    • Old Timestamp: 2017-09-04T15:48:52.560+0000
    • New Timestamp: Mon Sep 04 15:48:52 CEST 2017 or Mon Sep 04 15:48:52 UTC 2017

    For the new timestamp it results in a new date format, that contains zone names.

    Zone names: Time zone names ('z') cannot be parsed → see Joda Time DateTimeFormat: http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html

    The workaround is simply to take the zone name as a literal and match the date with the respective timezone. This results in the following pipeline definition:

    PUT _ingest/pipeline/timestamp
    {
      "description": "add timestamp field to the document, requires a datetime field date mapping",
      "processors": [
        {
          "set": {
            "field": "datetime",
            "value": ""
          },
          "date" : {
            "field" : "datetime",
            "formats" : ["EEE MMM dd HH:mm:ss 'UTC' yyyy", "EEE MMM dd HH:mm:ss 'CEST' yyyy"],
            "timezone" : "Europe/Zurich",
            "target_field": "datetime"
          }
        }
      ]
    }
    

    To use this pipeline

    PUT test/logs/vinh4711?pipeline=timestamp
    {
      "message": "Hi cinhtau!"
    }
    

    Query the result

    GET test/logs/vinh4711
    

    The new ingested date

    {
      "_index": "test",
      "_type": "logs",
      "_id": "vinh4711",
      "_version": 1,
      "found": true,
      "_source": {
        "datetime": "2017-09-04T15:54:29.000+02:00",
        "message": "Hi cinhtau!"
      }
    }
    
  17. 2017-08-28 - HTTPS monitoring with Heartbeat; Tags: HTTPS monitoring with Heartbeat

    HTTPS monitoring with Heartbeat

    Heartbeat is still beta, but it is worth a try. If you have an external REST endpoint and need a history of whether the endpoint is available, Heartbeat is one eligible solution.

    Configuration

    First, let’s define the endpoint in the heartbeat.yml

    heartbeat.monitors:
    - type: http
    
      urls: ["https://monitoring-test.cinhtau.net","https://monitoring-prod.cinhtau.net"]
      schedule: '@every 60s'
      timeout: 2m
      ssl:
        certificate_authorities: ['/home/tan/ssl/ca.crt']
        supported_protocols: ["TLSv1.2"]
      check.request:
        method: GET
        headers:
          'Authorization': 'Basic bWFwcGVyOmtpbmc='
      check.response:
        status: 200
    

    Monitor Endpoints

    The urls field contains all the http endpoints.

    urls: ["https://monitoring-test.cinhtau.net","https://monitoring-prod.cinhtau.net"] 
    

    TLS

    Since the endpoint is HTTPS, you have to provide the TLS information. In my case I needed to add the issuer certificate authorities, which is Symantec. The certificates are available on their support site.

    Just concatenate all certificates into one ca.crt file. Without this information, you will get an X509 certificate error → unknown certificate authority.

    ssl:
      certificate_authorities: ['/home/tan/ssl/ca.crt']
      supported_protocols: ["TLSv1.2"]
    

    Security

    Since Elasticsearch is protected with basic authentication, I add the auth header to the check request.

    check.request:
      method: GET
      headers:
        'Authorization': 'Basic bWFwcGVyOmtpbmc='
    

    Heartbeat checks for the HTTP response code 200 (OK). We could also check the response body, but since it is subject to change with every Elasticsearch upgrade, checking the response code is sufficient.

    check.response:
      status: 200
    

    TCP Monitoring

    To demonstrate TCP monitoring, the following config checks whether Logstash has started the Beats input plugin on port 5044.

    - type: tcp
      schedule: '@every 1m'
      hosts: ["localhost:5044"]  # Logstash Beats input
    

    Additional Information

    To add custom fields or custom values to the tags field, add them in the General section.

    #================================ General =====================================
    
    name: "le-mapper"
    tags: ["mapper-king", "web-tier"]
    fields:
      env: staging
    

    Reporting Output

    The data can be sent to Logstash or directly to Elasticsearch.

    #================================ Outputs =====================================
    
    output.elasticsearch:
      # Array of hosts to connect to.
      hosts: ["localhost:9200"]
    
      # Optional protocol and basic auth credentials.
      #protocol: "https"
      username: "elastic"
      password: "secret"
    

    Logging Output

    Use the logging section to define the internal output for debugging.

    #================================ Logging =====================================
    
    logging.level: info
    logging.to_files: true
    logging.to_syslog: false
    logging.files:
      path: /var/log/beats
      name: heart-beat.log
      keepfiles: 7
    

    A regular output:

    2017-09-04T11:36:14+02:00 INFO Setup Beat: heartbeat; Version: 5.5.2
    2017-09-04T11:36:14+02:00 INFO Loading template enabled. Reading template file: /home/tan/heartbeat-5.5.2-linux-x86_64/heartbeat.template.json
    2017-09-04T11:36:14+02:00 INFO Loading template enabled for Elasticsearch 2.x. Reading template file: /home/tan/heartbeat-5.5.2-linux-x86_64/heartbeat.template-es2x.json
    2017-09-04T11:36:14+02:00 INFO Loading template enabled for Elasticsearch 6.x. Reading template file: /home/tan/heartbeat-5.5.2-linux-x86_64/heartbeat.template-es6x.json
    2017-09-04T11:36:14+02:00 INFO Elasticsearch url: http://localhost:9200
    2017-09-04T11:36:14+02:00 INFO Activated elasticsearch as output plugin.
    2017-09-04T11:36:14+02:00 INFO Publisher name: le-mapper
    2017-09-04T11:36:14+02:00 INFO Flush Interval set to: 1s
    2017-09-04T11:36:14+02:00 INFO Max Bulk Size set to: 50
    2017-09-04T11:36:14+02:00 WARN Beta: Heartbeat is beta software
    2017-09-04T11:36:14+02:00 INFO Select (active) monitor http
    2017-09-04T11:36:14+02:00 INFO Select (active) monitor tcp
    2017-09-04T11:36:14+02:00 INFO heartbeat start running.
    2017-09-04T11:36:14+02:00 INFO heartbeat is running! Hit CTRL-C to stop it.
    2017-09-04T11:36:44+02:00 INFO No non-zero metrics in the last 30s
    2017-09-04T11:37:14+02:00 INFO No non-zero metrics in the last 30s
    2017-09-04T11:37:15+02:00 INFO Connected to Elasticsearch version 5.5.2
    2017-09-04T11:37:15+02:00 INFO Trying to load template for client: http://localhost:9200
    2017-09-04T11:37:15+02:00 INFO Template already exists and will not be overwritten.
    2017-09-04T11:37:44+02:00 INFO Non-zero metrics in the last 30s: libbeat.es.call_count.PublishEvents=1 libbeat.es.publish.read_bytes=972 libbeat.es.publish.write_bytes=2374 libbeat.es.published_and_acked_events=3 libbeat.publisher.messages_in_worker_queues=3 libbeat.publisher.published_events=3
    

    Data in Elasticsearch

    Heartbeat will write this kind of data.

    {
      "_index": "heartbeat-2017.09.04",
      "_type": "doc",
      "_id": "AV5MQFLFT-rF7Tttya86",
      "_score": 1,
      "_source": {
        "@timestamp": "2017-09-04T09:37:14.247Z",
        "beat": {
          "hostname": "omega",
          "name": "le-mapper",
          "version": "5.5.2"
        },
        "duration": {
          "us": 155771
        },
        "fields": {
          "env": "staging"
        },
        "host": "monitoring.cinhtau.six-group.net",
        "http_rtt": {
          "us": 36136
        },
        "ip": "10.22.12.118",
        "monitor": "http@https://monitoring.cinhtau.six-group.net",
        "port": 443,
        "resolve_rtt": {
          "us": 60807
        },
        "response": {
          "status": 200
        },
        "rtt": {
          "us": 94785
        },
        "scheme": "https",
        "tags": [
          "mapper-king",
          "web-tier"
        ],
        "tcp_connect_rtt": {
          "us": 10313
        },
        "tls_handshake_rtt": {
          "us": 47684
        },
        "type": "http",
        "up": true,
        "url": "https://monitoring.cinhtau.six-group.net"
      }
    }
    

    The Kibana Dashboard

    A preset dashboard is shipped within heartbeat.

    Heartbeat Dashboard

  18. 2017-07-26 - Reindex Subset Data in Elasticsearch; Tags: Reindex Subset Data in Elasticsearch

    Reindex Subset Data in Elasticsearch

    The Elasticsearch Reindex API is a powerful way to index a subset of existing data. If you think of a long-term statistics solution, you can aggregate data and store the aggregated values instead of the atomic details. In my company we have an index that contains approximately 150 fields in each document. For a long-term solution only 30 are relevant. The Reindex API can fetch just the 30 desired fields and store them in a new index.

    The reindex template

    curl -XPOST "http://elasticsearch:9200/_reindex" -H 'Content-Type: application/json' -d'{
      "source": {
        "index": "source-index-2017.07.26",
        "_source": [
          "field_1",
          "field_2",
          ..
          "field_30"
        ],
        "query": {
          "match_all": {}
        }
      },
      "dest": {
        "index": "target-index-2017.07.26"
      }
    }'
    

    The general approach is to use source filtering in the reindex action.
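
    A concrete, runnable sketch of the same request in the Kibana Console, with made-up field names standing in for the 30 relevant fields:

    POST _reindex
    {
      "source": {
        "index": "source-index-2017.07.26",
        "_source": [
          "@timestamp",
          "channel",
          "value"
        ],
        "query": {
          "match_all": {}
        }
      },
      "dest": {
        "index": "target-index-2017.07.26"
      }
    }
    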

  19. 2017-07-25 - Elasticsearch Range Query; Tags: Elasticsearch Range Query

    Elasticsearch Range Query

    An accident in the Elasticsearch universe: instead of writing to a daily index, data was indexed into a yearly index. Now I had to check the date range of the documents. Elasticsearch Date Math is a great help for the Range Query.

    Detect Boundaries

    First check lower and upper bound

    Getting lower bound with sorting on date field

    GET fo-log-2017/_search
    {
      "_source": "datetime_host", 
      "size": 1,
       "sort": [
        {
          "datetime_host": {
            "order": "asc"
          }
        }
      ]
    }
    

    Getting upper bound

    GET fo-log-2017/_search
    {
      "_source": "datetime_host", 
      "size": 1,
       "sort": [
        {
          "datetime_host": {
            "order": "desc"
          }
        }
      ]
    }
    

    Get Docs Count

    Check how many documents exist for a specific day

    GET fo-log-2017/_search
    {
      "size": 0, 
      "query": {
        "range": {
          "datetime_host": {
            "gte": "2017-07-24 00:00",
            "lte": "2017-07-25 00:00",
            "format": "yyyy-MM-dd HH:mm"
          }
        }
      }
    }
    

    Example output

    {
      "took": 59,
      "timed_out": false,
      "_shards": {
        "total": 2,
        "successful": 2,
        "failed": 0
      },
      "hits": {
        "total": 9576222,
        "max_score": 0,
        "hits": []
      }
    }
    

    Using Date Math

    GET fo-log-2017/_search
    {
      "size": 0, 
      "query": {
        "range": {
          "datetime_host": {
            "gte": "now/d",
            "lte": "now+1d/d",
            "format": "yyyy-MM-dd"
          }
        }
      }
    }
    
    {
      "took": 9,
      "timed_out": false,
      "_shards": {
        "total": 2,
        "successful": 2,
        "failed": 0
      },
      "hits": {
        "total": 1627667,
        "max_score": 0,
        "hits": []
      }
    }
    

    Reindex with Range Query

    Now use it to transfer the data to the daily index

    POST _reindex
    {
      "source": {
        "index": "fo-log-2017",
        "query": {
          "range": {
            "datetime_host": {
              "gte": "2017-07-25 00:00",
              "lte": "2017-07-26 00:00",
              "format": "yyyy-MM-dd HH:mm"
            }
          }
        }
      },
      "dest": {
        "index": "fo-log-2017.07.25"
      }
    }
    

    Delete with Range Query

    The range query can also be utilized in the Delete By Query API, for example to delete documents that were indexed into the wrong month.

    curl -XPOST "http://localhost:9200/fo-log-2017.05.24/_delete_by_query" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "range": {
          "datetime_host": {
            "gte": "2017-07-24 00:00",
            "lte": "2017-07-25 00:00",
            "format": "yyyy-MM-dd HH:mm"
          }
        }
      }
    }'
    
  20. 2017-07-19 - Elasticsearch Date Processor Pipeline; Tags: Elasticsearch Date Processor Pipeline

    Elasticsearch Date Processor Pipeline

    I write some configuration documents with the Elasticsearch low-level Java REST client. The documents are missing a timestamp, so I define a simple pipeline which adds the @timestamp field to my documents.

    Definition

    Create pipeline

    PUT _ingest/pipeline/timestamp
    {
      "description" : "add timestamp field to the document",
       "processors" : [
        {
          "date" : {
            "field" : "timestamp",
            "formats" : ["yyyyMMddHHmm"],
            "timezone" : "Europe/Zurich"
          }
        }
      ]
    }
    

    Test

    To test the pipeline, we take test data from an existing document.

    GET blackops/logstream/dev-F
    

    The output

    {
      "_index": "blackops",
      "_type": "logstream",
      "_id": "dev-F",
      "_version": 93,
      "found": true,
      "_source": {
        "logfile": "$POSDAT.DVTKSMDL.LF000007",
        "logfilePosition": 1546188226561,
        "timestamp": "201707191542",
        "logstrom": "F"
      }
    }
    

    Simulate with the test data

    POST _ingest/pipeline/timestamp/_simulate
    {
      "docs": [
        {
          "_source": {
            "logfile": "$POSDAT.DVTKSMDL.LF000007",
            "logfilePosition": 1546188226561,
            "timestamp": "201707191542",
            "logstrom": "F"
          }
        }
      ]
    }
    

    Output with the new timestamp field

    {
      "docs": [
        {
          "doc": {
            "_index": "_index",
            "_id": "_id",
            "_type": "_type",
            "_source": {
              "@timestamp": "2017-07-19T15:42:00.000+02:00",
              "logfile": "$POSDAT.DVTKSMDL.LF000007",
              "logfilePosition": 1546188226561,
              "logstrom": "F",
              "timestamp": "201707191542"
            },
            "_ingest": {
              "timestamp": "2017-07-19T13:49:15.480Z"
            }
          }
        }
      ]
    }
    

    REST Endpoint

    Use the pipeline by passing the pipeline parameter:

    PUT blackops/logstream/dev-F?pipeline=timestamp
    
  21. 2017-07-15 - Import Currency codes into Elasticsearch; Tags: Import Currency codes into Elasticsearch

    Import Currency codes into Elasticsearch

    Working in the financial business requires having the currency code master data accessible for various reasons. The ISO 4217 currency codes can be obtained from the ISO Organization website. This post uses Logstash and the csv filter plugin to process the data and import it into Elasticsearch. Elasticsearch itself provides the REST interface, so every micro-service or web service can access the desired data.

    Export the data from Excel to CSV. This Logstash configuration reads the CSV data, converts it, and ships it to Elasticsearch. Adjust the values to your scenario.

    input {
      file {
        path => "/tmp/currencies.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
      }
    }
    filter {
      csv {
        columns => ["entity", "currency", "alphaCode", "id", "numericCode", "minorUnit"]
        separator => ";"
      }
      mutate {
        remove_field => "message"
        convert => {
            "numericCode" => "integer"
            "minorUnit" => "integer"
        }
        add_field => {
            "[meta][edition]" => "ISO 4217:2015"
        }
        replace => {
            "id" => "%{alphaCode}-%{entity}"
        }
      }
    }
    output {
      stdout { codec => "rubydebug" }
      elasticsearch {
        hosts => [ "elasticsearch:9200" ]
        user => "elastic"
        password => "changeme"
        index => "masterdata"
        document_type => "currency"
        document_id => "%{id}"
      }
    }
    
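
    Once imported, any service can look up the master data via the REST interface. A minimal sketch, assuming the document id CHF-SWITZERLAND results from the %{alphaCode}-%{entity} pattern above, and a term query on the numeric code (756 is the ISO numeric code for CHF):

    GET masterdata/currency/CHF-SWITZERLAND
    
    GET masterdata/currency/_search
    {
      "query": {
        "term": {
          "numericCode": 756
        }
      }
    }
    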
  22. 2017-07-14 - Reindex Watcher Indices with Curator; Tags: Reindex Watcher Indices with Curator

    Reindex Watcher Indices with Curator

    Elasticsearch Alerting with X-Pack (formerly known as Watcher) writes its watch executions into daily indices. If you don't keep an eye on that, you waste a lot of shards on small indices. Curator offers a reindex action, i.e. it can write data from a daily index into a monthly or yearly index. This post contains an example for Elasticsearch v5.4.3 and Elasticsearch Curator v5.1.1.

    The actionfile in yaml

    actions:
      1:
        description: "Create target index as named"
        action: create_index
        options:
          name: '.watcher-history-3-2017'
      2:
        description: "Reindex daily watcher index into monthly index"
        action: reindex
        options:
          disable_action: False
          wait_interval: 9
          max_wait: -1
          request_body:
            source:
              index: REINDEX_SELECTION
            dest:
              index: .watcher-history-3-2017
        filters:
        - filtertype: pattern
          kind: prefix
          value: .watcher-history-3-2017.
      3:
        description: >-
          WATCHER: Delete indices older than 1 day
        action: delete_indices
        options:
          ignore_empty_list: True
          timeout_override:
          continue_if_exception: False
          disable_action: False
        filters:
        - filtertype: pattern
          kind: prefix
          value: .watcher-history-3-2017.
          exclude:
        - filtertype: age
          source: name
          direction: older
          timestring: '%Y.%m.%d'
          unit: days
          unit_count: 1
          exclude:

    The actions explained

    1. If the target index does not exist, it will be created. If it exists, nothing will happen :wink:.
    2. The reindex action will take all daily indices and reindex them into the target index.
    3. After the reindex, the daily indices are deleted, since the data is then redundant.

    Curator is a great tool for tending to Elasticsearch indices, but for the reindex action I miss a little bit of flexibility. So far no date pattern can be used to substitute the year or current month. If you reindex the data into a yearly index, you don't have to touch the action file so often.

  23. 2017-07-08 - Evaluating Elasticsearch Watcher Cron Expression; Tags: Evaluating Elasticsearch Watcher Cron Expression

    Evaluating Elasticsearch Watcher Cron Expression

    Working with Elasticsearch Watcher enables you to put a cron schedule into the trigger. This is not an ordinary Linux cron expression; it follows the Quartz syntax. If you want to test the correctness of the cron expression, use the croneval utility shipped in the installed X-Pack directory.

    tknga@omega:/opt/elasticsearch-5.4.3/bin/x-pack> ./croneval "0 */5 5-21 * * ?"
    Valid!
    Now is [Fri, 7 Jul 2017 11:41:35]
    Here are the next 10 times this cron expression will trigger:
    1.      Fri, 7 Jul 2017 11:45:00
    2.      Fri, 7 Jul 2017 11:50:00
    3.      Fri, 7 Jul 2017 11:55:00
    4.      Fri, 7 Jul 2017 12:00:00
    5.      Fri, 7 Jul 2017 12:05:00
    6.      Fri, 7 Jul 2017 12:10:00
    7.      Fri, 7 Jul 2017 12:15:00
    8.      Fri, 7 Jul 2017 12:20:00
    9.      Fri, 7 Jul 2017 12:25:00
    10.     Fri, 7 Jul 2017 12:30:00
    
  24. 2017-06-19 - Elasticsearch Nodes Memory Usage Watcher; Tags: Elasticsearch Nodes Memory Usage Watcher

    Elasticsearch Nodes Memory Usage Watcher

    TL;DR: if you have a dedicated monitoring cluster for your Elasticsearch clusters, you should at least monitor the memory usage of each node. This is very helpful. Instead of fetching the data from within the cluster, we query the monitoring cluster remotely. This watch was created on Elasticsearch with X-Pack v5.4.3. Note that some values are tweaked and not realistic for production scenarios: the interval, for instance, is set to 6 hours since we will execute this watch manually, and we choose 60% as the threshold, whereas 75% or 80% would be more realistic for a warning scenario.

    The following watch was developed in conjunction with our Elasticsearch support engineers. They provided the groundwork, since Painless is not painless IMHO. I know this is very opinionated; you can take me on via Twitter or mail me about that. We took the example from Elasticsearch Watcher (version 2.3) and adjusted it to the new dedicated monitoring cluster.

    The Watcher Skeleton

    PUT _xpack/watcher/watch/mem-watch
    {
      "metadata": {
        "threshold": 60
      },
      "trigger": {
        "schedule": {
          "interval": "6h"
        }
      },
      "input": {
        "http": {
          "request": {
            "scheme": "http",
            "host": "your-monitoring-server",
            "port": 9200,
            "method": "get",
            "path": ".monitoring-es-2-*/node_stats/_search",
            "params": {},
            "headers": {},
            "body": """{"size": 0, "query":{"bool":{"filter": [{"range":{"timestamp":{"from": "now-10m", "to": "now", "include_lower": true, "include_upper": true, "boost": 1}}}], "disable_coord": false, "adjust_pure_negative": true, "boost": 1}}, "aggregations":{"minutes":{"date_histogram":{"field": "timestamp", "interval": "minute", "offset": 0, "order":{"_key": "asc"}, "keyed": false, "min_doc_count": 0}, "aggregations":{"nodes":{"terms":{"field": "source_node.name", "size": 10, "min_doc_count": 1, "shard_min_doc_count": 0, "show_term_doc_count_error": false, "order": [{"memory": "desc"},{"_term": "asc"}]}, "aggregations":{"memory":{"avg":{"field": "node_stats.jvm.mem.heap_used_percent"}}}}}}}, "ext":{}}"""
          }
        }
      },
      "condition": {
        "script": {
           "inline": "if (ctx.payload.aggregations.minutes.buckets.size() == 0) return false; def latest = ctx.payload.aggregations.minutes.buckets[-1]; def node = latest.nodes.buckets[0]; return node?.memory?.value >= ctx.metadata.threshold;"
        }
      },
      "actions": {
        "send_mem_warning": {
          "transform": {
            "script": {
              "lang": "painless",
              "inline": "def latest = ctx.payload.aggregations.minutes.buckets[-1]; return latest.nodes.buckets.stream().filter(item -> item.memory.value >= ctx.metadata.threshold).collect(Collectors.toList());"
            }
          },
          "email": {
            "profile": "standard",
            "from": "watcher@your-company.com",
            "reply_to": [
              "your-email@your-company.com"
            ],
            "to": [
              "le-mapper@cinhtau.net"
            ],
            "cc": [
              "my-buddies@your-company.com"
            ],
            "subject": "Watcher Notification - HIGH MEMORY USAGE",
            "body": {
              "html": {
                "stored": "mem-watch-warning",
                "lang": "mustache"
              }
            }
          }
        }
      }
    }
    

    Using Metadata

    Instead of hardcoding the threshold of 60, use the metadata capability of Elasticsearch Watcher v5.4.

    Put this json before trigger:

    "metadata": {
      "threshold" : 60
    } 
    

    Replace the literal 60 with ctx.metadata.threshold.

    "inline": "if (ctx.payload.aggregations.minutes.buckets.size() == 0) return false; def latest = ctx.payload.aggregations.minutes.buckets[-1]; def node = latest.nodes.buckets[0]; return node?.memory?.value >= ctx.metadata.threshold;"
    

    Schedule Trigger

    The trigger section defines how often the watch shall be executed. Below the interval is five minutes.

    "trigger": {
        "schedule": {
          "interval": "5m"
        }
    }
    

    Remote Input

    Since the input is an HTTP request to the monitoring server, the path is the REST endpoint:

    • index pattern → .monitoring-es-2-* (the x-pack monitoring data)
    • document type → node_stats
    • operation → _search

    The tricky part is the body, which contains the JSON request body. This is Elasticsearch Query DSL. In short, the watch looks back over the last 10 minutes and aggregates the average memory usage for each Elasticsearch node. You could restrict it to data nodes only, but why? It doesn't hurt to monitor the master and client nodes. As you can see, it is quite ugly when minified. You should always develop this part separately and in a readable format.

    A great help are these tools:

    You should always test the query first before putting it into the watch.
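
    For reference, this is an equivalent, readable form of the minified request body, which you can run directly against the monitoring cluster in the Kibana Console:

    GET .monitoring-es-2-*/node_stats/_search
    {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            { "range": { "timestamp": { "gte": "now-10m", "lte": "now" } } }
          ]
        }
      },
      "aggs": {
        "minutes": {
          "date_histogram": { "field": "timestamp", "interval": "minute" },
          "aggs": {
            "nodes": {
              "terms": {
                "field": "source_node.name",
                "size": 10,
                "order": { "memory": "desc" }
              },
              "aggs": {
                "memory": { "avg": { "field": "node_stats.jvm.mem.heap_used_percent" } }
              }
            }
          }
        }
      }
    }
    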

    Condition Trigger

    Based on a condition, an action is executed. This complicated-looking expression is in fact just a comparison: it takes the latest bucket, picks the node with the highest average memory usage, and checks whether that value is above the threshold.

    "condition": {
          "script": {
          "inline": "if (ctx.payload.aggregations.minutes.buckets.size() == 0) return false; def latest = ctx.payload.aggregations.minutes.buckets[-1]; def node = latest.nodes.buckets[0]; return node?.memory?.value >= ctx.metadata.threshold;",
          "lang": "painless"
        }
    }
    

    Action - Notification

    The body.html message part is empty, since I kind of like to have a separate mustache template for that.

    This simple mustache template will do. It lists every node name and its aggregated value. _value is the collection created by the Painless transform script in the action.

    <h2>Nodes with HIGH MEMORY</h2>
    
    Usage (above 60%):
    <ul>
    {{#ctx.payload._value}}
    <li>"{{key}}" - Memory Usage is at {{memory.value}}</li>
    {{/ctx.payload._value}}
    </ul>
    

    Minify the above template to store it in Elasticsearch.

    POST _scripts/mem-watch-warning
    {
      "script": {
        "lang": "mustache",
        "code": "<h2>Nodes with HIGH MEMORY</h2>Usage (above 60%):<ul>{{#ctx.payload._value}}<li>\"{{key}}\" - Memory Usage is at {{memory.value}}</li>{{/ctx.payload._value}}</ul>"
      }
    }
    

    In the action section of the watch, reference the stored script.

     "actions": {
        "send_mem_warning": {
          "transform": {
            "script": {
              "lang": "painless",
              "inline": "def latest = ctx.payload.aggregations.minutes.buckets[-1]; return latest.nodes.buckets.stream().filter(item -> item.memory.value >= ctx.metadata.threshold).collect(Collectors.toList());"
            }
          },
          "email": {
            "profile": "standard",
            "from": "watcher@cinhtau.net",
            "to": [
              "le-mapper@cinhtau.net"
            ],
            "subject": "Watcher Notification - HIGH MEMORY USAGE",
            "body": {
              "html": {
                "stored": "mem-watch-warning"
              }
            }
          }
        }
      }
    

    If we want to change the notification message, I don't have to touch the watch anymore, so I can alter the look and feel as I like.

    You can see the outcome by executing the watch manually instead of waiting for the trigger.

    POST _xpack/watcher/watch/mem-watch/_execute
    

    If you have set xpack.notification.email.html.sanitization.enabled: false in the elasticsearch.yml you can have colorful warnings.

    Memory Watch Warning

  25. 2017-06-14 - Elasticsearch Hot Warm Architecture; Tags: Elasticsearch Hot Warm Architecture

    Elasticsearch Hot Warm Architecture

    Running an Elasticsearch cluster can be an easy task. If you need to store data for a long time but know the data is infrequently requested, you may think of a hot-warm architecture. This post is a brief summary of my setup for my company at work. At that time Elasticsearch v5.4.1 was running.

    Terminology

    In terms of speed, in an ideal world a hot node is a data node with high IO throughput, basically an SSD RAID 0. Warm nodes are typically magnetic HDD RAID 0 with significantly greater capacity. Elasticsearch is already distributed and redundant, so utilize the striping speed of RAID 0.

    The above description does not have to match your setup exactly. Just to be clear: a hot node is a data node with fast storage that contains the newest data, e.g. the last two weeks or one month.

    A warm node is just a data node with slower storage, because it contains old data that isn't accessed as often. Nothing is written to it anymore, so it basically comes down to read-only operations.

    Setup

    For this concept to work, you need a discriminator for each data node. In elasticsearch.yml, add the box_type attribute to the node attributes section. You are free to choose whatever attribute name you like.

    node:
      master: false
      data: true
      ingest: true
      attr:
        rack: "with-nas"
        box_type: "hot"
    

    You can also pass the attribute on the command line, overriding the value in elasticsearch.yml. For instance:

    bin/elasticsearch -d -E node.attr.box_type="warm"
    

    Delay Allocation

    One important behaviour of indices is that missing shards are relocated to other data nodes. For a cluster upgrade, rolling restarts are mandatory. You can either disable shard allocation or just put a timeout in the index settings. The recipe below sets a timeout on all indices.

    curl -XPUT "http://fo-itu:9200/_all/_settings" -H 'Content-Type: application/json' -d'
    {
      "settings": {
        "index.unassigned.node_left.delayed_timeout": "10m"
      }
    }' -u tknga
    

    The allocation will be delayed for 10 minutes while the missing data node recovers and the shards become available again. You save some network and disk IO. Read more on Delaying Allocation.

    Restart Node

    For the hot-warm architecture to work, the data node must be restarted after changing the attribute. Otherwise you won't be able to allocate anything.

    Check with the Nodes API if the attribute was applied to the respective data node. Use the Console to list only data nodes

    GET _nodes/data:true
    

    The response body should contain something like this:

    "attributes": {
        "box_type": "warm",
        "ml.enabled": "false"
    }
    

    Allocation

    You can set routing for all indices with this request. The recipe below forces allocation to hot nodes only.

    PUT _all/_settings
    {
      "settings": {
        "index.routing.allocation.require.box_type": "hot"
      }
    }
    

    Adjust Index Template

    If new indices are created, add the attributes to your index templates.

    "settings": {
        "index": {
            "number_of_shards": "1",
            "refresh_interval": "5s",
            "routing.allocation.require.box_type": "hot",
            "unassigned.node_left.delayed_timeout": "10m"
        }
    }
    
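
    For example, a minimal index template for Elasticsearch 5.x could carry these settings; the template name fo-hot and the pattern fo-* are assumptions for illustration:

    PUT _template/fo-hot
    {
      "template": "fo-*",
      "settings": {
        "index": {
          "number_of_shards": "1",
          "refresh_interval": "5s",
          "routing.allocation.require.box_type": "hot",
          "unassigned.node_left.delayed_timeout": "10m"
        }
      }
    }
    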

    Curator

    This action file template contains in logical order the allocation of indices from hot nodes to warm nodes.

    • First we set the replicas to 0. This might be a danger if we lose a data node permanently, but speeds up the allocation.
    • Based on a file name pattern all eligible indices are moved to the warm node.
    • After the allocation we reduce segments, to free some resources.
    actions:
      1:
        action: replicas
        description: >-
          Reduce the replica count to 0 for fo- prefixed indices older than
          4 days (based on index creation_date)
        options:
          count: 0
          wait_for_completion: True
          disable_action: True
        filters:
        - filtertype: pattern
          kind: prefix
          value: fo-
        - filtertype: age
          source: creation_date
          direction: older
          unit: days
          unit_count: 14
      2:
        action: allocation
        description: "Apply shard allocation filtering rules to the specified indices"
        options:
          key: box_type
          value: warm
          allocation_type: require
          wait_for_completion: true
          timeout_override:
          continue_if_exception: false
          disable_action: false
        filters:
        - filtertype: pattern
          kind: prefix
          value: fo-
        - filtertype: age
          source: name
          direction: older
          timestring: '%Y.%m.%d'
          unit: days
          unit_count: 14
      3:
        action: forcemerge
        description: "Perform a forceMerge on selected indices to 'max_num_segments' per shard"
        options:
          max_num_segments: 1
          delay:
          timeout_override: 21600
          continue_if_exception: false
          disable_action: false
        filters:
        - filtertype: pattern
          kind: prefix
          value: fo-
        - filtertype: age
          source: name
          direction: older
          timestring: '%Y.%m.%d'
          unit: days
          unit_count: 14
    
  26. 2017-05-02 - Move documents to another Index in Elasticsearch; Tags: Move documents to another Index in Elasticsearch

    Move documents to another Index in Elasticsearch

    If you run into the situation that documents were written to the wrong index, you can use the Reindex API to copy the documents to the desired index. You can remove them afterwards with the Delete By Query API.

    First of all, a simple search query to find the affected documents.

    GET prod/_search
    {
      "query": {
        "match": {
          "application": "ep-deux"
        }
      }
    }
    

    Copy only queried data

    POST _reindex
    {
      "source": {
        "index": "fo-prod-2017.05.01",
        "type": "json",
        "query": {
          "match": {
            "application": "ep-deux"
          }
        }
      },
      "dest": {
        "index": "fix-2017.05.01"
      }
    }
    

    Check with the Task API the status of the reindex action.

    GET _tasks?detailed=true&actions=*reindex
    

    Delete old documents

    POST fo-prod-2017.05.01/json/_delete_by_query?conflicts=proceed
    {
      "query": {
        "match": {
          "application": "ep-deux"
        }
      }
    }
    
  27. 2017-05-01 - Coercion in Elasticsearch; Tags: Coercion in Elasticsearch

    Coercion in Elasticsearch

    If a field is defined with a datatype in the mapping, e.g. duration as integer, Elasticsearch coerces by default if the value sent for the duration field is a string. The string will be stored in the source, but indexed as an integer. This can be a little misleading if you only look at the document.

    The field duration uses the default behavior: coercion is on.

    PUT vinh
    {
      "mappings": {
        "logs": {
          "properties": {
            "duration": {
              "type": "integer"
            }
          }
        }
      }
    }
    

    Create two documents (the first with an integer, the second with a string) with the Bulk API. The order does not matter.

    POST _bulk
    { "index" : { "_index" : "vinh", "_type" : "logs", "_id" : "1" } }
    { "duration" : 10 }
    { "index" : { "_index" : "vinh", "_type" : "logs", "_id" : "2" } }
    { "duration" : "7" }
    

    If you query the data, you still see the string "7" in the source.

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "vinh",
            "_type": "logs",
            "_id": "2",
            "_score": 1,
            "_source": {
              "duration": "7"
            }
          },
          {
            "_index": "vinh",
            "_type": "logs",
            "_id": "1",
            "_score": 1,
            "_source": {
              "duration": 10
            }
          }
        ]
      }
    }
    

    But for Kibana it is a number and thus usable for visualizations.

    Coercion on duration

    Some might argue this must be expensive for Elasticsearch or Kibana. It is important to either ensure the data is reported correctly or to turn off coercion, so that the party generating the indexed events has to write the correct data type.
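
    Turning coercion off is a mapping parameter on the field. A minimal sketch (the index name vinh-strict is made up); indexing "7" as duration would then be rejected with a mapper_parsing_exception:

    PUT vinh-strict
    {
      "mappings": {
        "logs": {
          "properties": {
            "duration": {
              "type": "integer",
              "coerce": false
            }
          }
        }
      }
    }
    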

  28. 2017-05-01 - Reindex Data with Pipeline in Elasticsearch; Tags: Reindex Data with Pipeline in Elasticsearch

    Reindex Data with Pipeline in Elasticsearch

    Data is not always clean. Depending on how it is produced, a number might be rendered in the JSON body as a true JSON number, e.g. 10, but it might also be rendered as a string, e.g. “10”. Some developers use MDC to pass metadata into Elasticsearch. If you have the data as a string and want to use Kibana for visualizations, you need a fix. The only way to fix that is to reindex the data. Using the Reindex API together with a pipeline ensures that the data has the correct data type.

    Reindex Data

    If you have a strict dynamic mapping or turned off coercion (forcing a string into an integer), the index operation will fail.

    "failures": [
    {
      "index": "fo-prod-fix-2017.04.28",
      "type": "json",
      "id": "AVuzeFooEwjYNH5b13lU",
      "cause": {
    	"type": "mapper_parsing_exception",
    	"reason": "failed to parse [duration]",
    	"caused_by": {
    	  "type": "illegal_argument_exception",
    	  "reason": "Integer value passed as String"
    	}
      },
      "status": 400
    }

    Create Pipeline

    Ingest nodes in Elasticsearch can perform the necessary conversion. I create a pipeline named counter-string and use the convert processor to convert the string into an integer.

    PUT _ingest/pipeline/counter-string
    {
      "description": "convert from string into number converter",
        "processors": [      
          {
            "convert": {
              "field": "duration",
              "type": "integer",
              "ignore_missing": true
            }
          }
        ]
    }

    Read Pipeline Settings

    You may check pipeline details at any time.

    GET _ingest/pipeline/counter-string
    

    The response with the pipeline object.

    {
      "counter-string": {
        "description": "convert from string into number converter",
        "processors": [
          {
            "convert": {
              "field": "duration",
              "type": "integer",
              "ignore_missing": true
            }
          }
        ]
      }
    }

    The ignore_missing option ensures that the negative case (the field is missing) won't break the index operation.

    Test the Pipeline

    Pipelines can be tested with the simulate operation. Pay attention to the last case: if the data is already an integer, the index action must not fail.

    POST _ingest/pipeline/counter-string/_simulate
    {
      "docs": [
        {
          "_source": {
            "duration": "10"
          }
        },
        {
          "_source": {
            "duration": "1"
          }
        },
        {
          "_source": {
            "duration": 67
          }
        }
      ]
    }

    The pipeline output

    {
      "docs": [
        {
          "doc": {
            "_id": "_id",
            "_index": "_index",
            "_type": "_type",
            "_source": {
              "duration": 10
            },
            "_ingest": {
              "timestamp": "2017-05-01T08:00:08.200Z"
            }
          }
        },
        {
          "doc": {
            "_id": "_id",
            "_index": "_index",
            "_type": "_type",
            "_source": {
              "duration": 1
            },
            "_ingest": {
              "timestamp": "2017-05-01T08:00:08.200Z"
            }
          }
        },
        {
          "doc": {
            "_id": "_id",
            "_index": "_index",
            "_type": "_type",
            "_source": {
              "duration": 67
            },
            "_ingest": {
              "timestamp": "2017-05-01T08:00:08.200Z"
            }
          }
        }
      ]
    }

    Reindex Data

    Use the pipeline in the reindex action:

    POST _reindex
    {
      "source": {
        "index": "fo-prod-2017.04.28"
      },
      "dest": {
        "index": "fo-prod-fix-2017.04.28",
        "pipeline": "counter-string"
      }
    }
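
    A quick way to verify the conversion is to pull one document from the new index and check that duration now comes back as a number; a hedged sanity check:

    GET fo-prod-fix-2017.04.28/_search
    {
      "size": 1,
      "query": {
        "exists": {
          "field": "duration"
        }
      }
    }
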
  29. 2017-04-28 - Shard Allocation Filtering; Tags: Shard Allocation Filtering

    Shard Allocation Filtering

    If you run Elasticsearch and use Kibana for various reasons, you had better ensure that you perform automatic backups. The time spent on searches, visualizations and dashboards should be worth it. If an Elasticsearch upgrade goes south, you are happy to have a backup. One of the main advantages of an Elasticsearch cluster is that you can join and remove additional nodes, which may differ in their resources and capacity. That’s the situation I constantly deal with at work. Shard allocation filtering helps to set up smart rules, for example a hot-warm architecture.

    Elasticsearch can back up the kibana index to various repositories. The only possibility for me at work is to store the backup on a shared NAS, i.e. the Elasticsearch file storage repository. My clusters consist of multiple nodes, and some nodes are not attached to the NAS. If the kibana index has shards on nodes without the attached NAS mountpoint, the backup fails. Shard allocation filtering enables me to mark indices to reside only on nodes with specific attributes.

    In the elasticsearch configuration - elasticsearch.yml - I put the node attribute named rack with the value with-nas.

    cluster.name: test
    node.name: omega
    node.master: true
    node.data: true
    node.ingest: true
    node.attr.rack: "with-nas"
    

    In the Kibana Console (Sense) I can change the settings of the .kibana index.

    PUT .kibana/_settings
    {
      "index.routing.allocation.include.rack": "with-nas"
    }
    

    After that the master node will ensure that the .kibana index will be put on the nodes with the respective value.
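
    To check that the shards actually moved, the cat shards API lists the node each shard ended up on; a hedged example:

    GET _cat/shards/.kibana?v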

  30. 2017-04-19 - Use usermod and groupmod in Alpine Linux Docker Images; Tags: Use usermod and groupmod in Alpine Linux Docker Images

    Use usermod and groupmod in Alpine Linux Docker Images

    Elastic uses Alpine Linux as the base image for their Elasticsearch docker images. Since v5.3.1 there are no more supported Debian-based docker images on dockerhub, which made it necessary to rewrite my docker files for my company. One obstacle is to assign the elasticsearch user to a specific uid and gid on the docker host system.

    Deprecated images

    Since I’m not familiar with Alpine Linux I had to investigate a little. To have usermod and groupmod, I have to install the shadow package.

    # change uid and gid for elasticsearch user
    RUN apk --no-cache add shadow && \
        usermod -u 2500 elasticsearch && \
        groupmod -g 2500 elasticsearch
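
    After building, the new ids can be verified inside a running container; a hedged check (the container name elasticsearch is an assumption), which should print something like:

    docker exec elasticsearch id elasticsearch
    uid=2500(elasticsearch) gid=2500(elasticsearch) groups=2500(elasticsearch)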
    
  31. 2017-04-19 - Delete Elasticsearch documents by query in Version 5; Tags: Delete Elasticsearch documents by query in Version 5

    Delete Elasticsearch documents by query in Version 5

    Deleting documents from an index has changed in version 5. A little example of how to delete documents in Elasticsearch v5.1.x, how to monitor the status and how to free up the disk space.

    Warning: There are significant differences between version 2 and 5.

    Search Query

    Check for log messages of application ep2-batch

    GET logs-2017.02.07/logs/_search
    {
      "query": {
        "term": {
          "application": {
            "value": "ep2-batch"
          }
        }
      },
      "size": 0,
      "aggs": {
        "levels": {
          "terms": {
            "field": "level"
          }
        }
      }
    }

    Too many log messages with DEBUG

    {
      "took": 2137,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "failed": 0
      },
      "hits": {
        "total": 44582853,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "levels": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "DEBUG",
              "doc_count": 24501347
            },
            {
              "key": "INFO",
              "doc_count": 20075370
            },
            {
              "key": "ERROR",
              "doc_count": 5225
            },
            {
              "key": "WARN",
              "doc_count": 911
            }
          ]
        }
      }
    }
    

    Delete Query

    POST logs-2017.02.07/logs/_delete_by_query?conflicts=proceed
    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "application": {
                  "value": "ep2-batch"
                }
              }
            }
          ],
          "filter": {
            "term": {
              "level": "DEBUG"
            }
          }
        }
      }
    }
    

    Tip: run this in a console!

    curl -XPOST "http://elasticsearch:9200/logs-2017.02.07/logs/_delete_by_query?conflicts=proceed" -d'
    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "application": {
                  "value": "ep2-batch"
                }
              }
            }
          ],
          "filter": {
            "term": {
              "level": "DEBUG"
            }
          }
        }
      }
    }' -u tan
    

    Check task status

    Since the task itself may run a long time, you can check the status with the task API.

    GET _tasks?actions=indices:data/write/delete/byquery
    
    {
      "nodes": {
        "UIETB7IDTUa7-vZMb3F11g": {
          "name": "kibana-lb",
          "transport_address": "10.22.62.141:9300",
          "host": "elasticsearch",
          "ip": "10.22.62.141:9300",
          "roles": [],
          "tasks": {
            "UIETB7IDTUa7-vZMb3F11g:2866377": {
              "node": "UIETB7IDTUa7-vZMb3F11g",
              "id": 2866377,
              "type": "transport",
              "action": "indices:data/write/delete/byquery",
              "start_time_in_millis": 1486545212270,
              "running_time_in_nanos": 574241493292,
              "cancellable": true
            }
          }
        }
      }
    }
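
    Once the task has finished, a count with the same query should come back empty; a hedged sanity check:

    GET logs-2017.02.07/logs/_count
    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "application": {
                  "value": "ep2-batch"
                }
              }
            }
          ],
          "filter": {
            "term": {
              "level": "DEBUG"
            }
          }
        }
      }
    }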
    

    Free disk space

    The index itself won’t be truncated or optimized by the delete. The force merge API allows you to force the merging of one or more indices. The merge relates to the number of segments a Lucene index holds within each shard. The force merge operation reduces the number of segments by merging them.

    POST logs-2017.02.07/_forcemerge?only_expunge_deletes=true
    

    You can also check the force merge task :wink:

    GET _tasks?actions=indices:admin/forcemerge*
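
    To see the effect on disk, the cat indices API shows the deleted document count and the store size; a hedged example:

    GET _cat/indices/logs-2017.02.07?v&h=index,docs.count,docs.deleted,store.size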
    
  32. 2017-04-19 - Aggregations in the Elasticsearch Query DSL; Tags: Aggregations in the Elasticsearch Query DSL

    Aggregations in the Elasticsearch Query DSL

    A small example how to count documents in Elasticsearch using the Query DSL Aggregations.

    GET ep2-itu-2017.02.06/ep2/_search
    {
      "size": 0, 
      "aggs": {
        "applications": {
          "terms": {
            "field": "application"
          }
        }
      }
    }
    

    This is the result:

    {
      "took": 1565,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "failed": 0
      },
      "hits": {
        "total": 46331398,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "applications": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "ep2-batch",
              "doc_count": 46170714
            },
            {
              "key": "ep2-gateway",
              "doc_count": 145087
            },
            {
              "key": "ep2-proxy",
              "doc_count": 14956
            },
            {
              "key": "ep2-scs",
              "doc_count": 46
            }
          ]
        }
      }
    }
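
    By default the terms aggregation returns only the top 10 buckets. If there are more applications than that, the size parameter on the aggregation raises the limit; a hedged sketch:

    GET ep2-itu-2017.02.06/ep2/_search
    {
      "size": 0,
      "aggs": {
        "applications": {
          "terms": {
            "field": "application",
            "size": 20
          }
        }
      }
    }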
    
  33. 2017-01-24 - Fix timestamp parse failure in Elasticsearch; Tags: Fix timestamp parse failure in Elasticsearch

    Fix timestamp parse failure in Elasticsearch

    If you use logstash to send logs to Elasticsearch in JSON format, you may experience a timestamp parse failure for a date format like [24/Jan/2017:09:04:07 +0100]. This date format is often used in the access.log of e.g. JBoss EAP. To solve that, just add an additional date format pattern to the index template.

    The date [24/Jan/2017:09:04:07 +0100] has the following format: [dd/MMM/yyyy:HH:mm:ss Z]. The format is derived from JodaTime. We can add the new pattern to the index template that defines the data mapping.

    A demonstration: Create index for test.

    PUT testdata
    {
      "settings": {
        "number_of_shards": 1
      },
      "mappings": {
        "_default_" :{
          "properties": {
            "@timestamp": {
              "type":   "date",
              "format": "[dd/MMM/yyyy:HH:mm:ss Z]"
            }
          }
        }
      }
    }
    

    Now we create a test entry, and look if Elasticsearch complains.

    POST testdata/logs
    {
      "@timestamp": "[24/Jan/2017:09:04:07 +0100]",
      "message" : "Salut Phillipe"
    }
    

    Now we query the created log entry.

    GET testdata/logs/_search
    {
      "query": { "match_all": {}}
    }
    

    And the result:

    TODO
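
    For real log data the same pattern would go into an index template, so every new daily index picks it up. A hedged sketch (template name and index pattern are assumptions); multiple formats can be combined with ||:

    PUT _template/accesslogs
    {
      "template": "accesslog-*",
      "mappings": {
        "_default_": {
          "properties": {
            "@timestamp": {
              "type": "date",
              "format": "[dd/MMM/yyyy:HH:mm:ss Z]||strict_date_optional_time||epoch_millis"
            }
          }
        }
      }
    }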
    
  34. 2017-01-19 - The all meta field in Elasticsearch; Tags: The all meta field in Elasticsearch

    The all meta field in Elasticsearch

    Are you asking yourself how Elasticsearch finds the text that you are searching for? Learn more about the _all meta field.

    Definition

    The _all field is a special catch-all field which concatenates the values of all of the other fields into one big string, using space as a delimiter, which is then analyzed and indexed, but not stored. This means that it can be searched, but not retrieved. Elasticsearch Reference 5.1

    Demo

    Let’s look at some examples.

    Setup Index

    First create the testdata index. _all is enabled by default anyway, but this is a demo :smirk:.

    PUT testdata
    {
      "settings": {
        "index": {
          "number_of_shards": 1
        }
      },
      "mappings": {
        "logs": {
          "_all": {
            "enabled": true
          }
        }
      }
    }

    Add some test data

    Some fancy data to search for. The data contains PANs (Primary Account Numbers or Credit Card Numbers). As you can see, the PAN occurs in different fields.

    POST testdata/logs
    {
      "pan" : "4000000000000002",
      "firstname": "John",
      "lastname": "Legend",
      "profession": "Musician"
    }
    POST testdata/logs
    {
      "fistname" : "4026000000000002",
      "lastname": "Danger"
    }
    POST testdata/logs
    {
      "merchant": "tancomat 3000",
      "firstname": "Beat",
      "lastname": "Sommer",
      "profession": "issuer",
      "comment": "5100000000000008"
    }
    

    Search for PANs

    To search for a PAN we just use the query_string search.

    When not explicitly specifying the field to search on in the query string syntax, the `index.query.default_field` will be used to derive which field to search on. It defaults to `_all` field.

    Query String Query

    GET testdata/_search
    {
      "query": {
        "query_string": {
          "query": "/[3-9][0-9]{13,18}/"
        }
      }
    }
    

    This will give you all documents, that matches the regexp for a PAN.

    {
      "took": 8,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "failed": 0
      },
      "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
          {
            "_index": "testdata",
            "_type": "logs",
            "_id": "AVm2NAF6bFiGlYewuu9_",
            "_score": 1,
            "_source": {
              "pan": "4000000000000002",
              "firstname": "John",
              "lastname": "Legend",
              "profession": "Musician"
            }
          },
          {
            "_index": "testdata",
            "_type": "logs",
            "_id": "AVm2NCI_bFiGlYewuu-F",
            "_score": 1,
            "_source": {
              "fistname": "4026000000000002",
              "lastname": "Danger"
            }
          },
          {
            "_index": "testdata",
            "_type": "logs",
            "_id": "AVm2NEF0bFiGlYewuu-L",
            "_score": 1,
            "_source": {
              "merchant": "tancomat 3000",
              "firstname": "Beat",
              "lastname": "Sommer",
              "profession": "issuer",
              "comment": "5100000000000008"
            }
          }
        ]
      }
    }
    

    Exclude fields

    You may want to exclude certain fields from the _all search. In the index type mapping, just configure the field not to be included in the _all field. See below the example for the date field.

    PUT my_index
    {
      "mappings": {
        "my_type": {
          "properties": {
            "title": { 
              "type": "text"
            },
            "content": { 
              "type": "text"
            },
            "date": { 
              "type": "date",
              "include_in_all": false
            }
          }
        }
      }
    }
    

    Summary

    • _all is a metadata field
    • all values are appended into one field value as concatenated String
    • is enabled by default

    :warning:️ But remember: :warning:

    The _all field is not free: it requires extra CPU cycles and uses more disk space. If not needed, it can be completely disabled or customised on a per-field basis.
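
    If the _all field is not needed at all, it can be disabled in the type mapping. A minimal sketch, assuming placeholder index and type names:

    PUT my_index
    {
      "mappings": {
        "my_type": {
          "_all": {
            "enabled": false
          }
        }
      }
    }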

  35. 2017-01-17 - Backup your Elasticsearch data with Amazon S3; Tags: Backup your Elasticsearch data with Amazon S3

    Backup your Elasticsearch data with Amazon S3

    As I mentioned before, it is easy to back up your Elasticsearch data with the snapshot and restore API. Today’s post demonstrates how to back up the data to the Amazon S3 file storage.

    Install the plugin

    First you need to install the elasticsearch plugin for that:

    sudo bin/elasticsearch-plugin install repository-s3
    

    Create a user for S3

    There are various ways how to access the S3 storage. I demonstrate the simplest one.

    Log in into your AWS console and go to Security. Create a user with an access key. I named the user elasticsearch.

    Add AWS S3 user

    And assign the user to the group backup with permission for S3.

    User permissions

    After that AWS will generate the access and secret key for the Elasticsearch user.

    Configure your S3 access

    You need to alter your elasticsearch.yml for that. The following settings are exemplary and don’t represent real values.

    cloud:
        aws:
            access_key: AAAABBBB1234CCCC5678
            secret_key: Ahfk380HqZR7sUYdeH2Xw*ZxyY8fwlF5QVQoxiJ$
            s3.region: eu-central
    

    Do the backup

    Following steps demonstrates the backup process.

    Check plugin

    In the Kibana console you can check if the repository-s3 is installed.

    GET _cat/plugins
    
    23Y9vRH repository-s3 5.1.2
    23Y9vRH x-pack        5.1.2
    

    Now we can define the S3 bucket. Replace my-s3-bucket with your bucket name and adjust the region as you need it. The repository will also be verified on creation.

    PUT _snapshot/s3
    {
      "type": "s3",
      "settings": {
        "bucket": "my-s3-bucket",
        "compress": true,
        "region": "eu-central-1"
      }
    }
    

    To verify it manually:

    POST /_snapshot/s3/_verify
    

    Backup Kibana

    We do a backup and name the snapshot upgrade_512.

    PUT _snapshot/s3/upgrade_512
    {
      "indices": ".kibana",
      "include_global_state": false
    }
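
    By default the snapshot request returns immediately and runs in the background. If you prefer to block until it is done, wait_for_completion can be appended; a hedged variant:

    PUT _snapshot/s3/upgrade_512?wait_for_completion=true
    {
      "indices": ".kibana",
      "include_global_state": false
    }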
    

    Check snapshot status

    GET _snapshot/s3/_all
    {
      "snapshots": [
        {
          "snapshot": "upgrade_512",
          "uuid": "RPUl1FXuRzyt6_pRxEcgWw",
          "version_id": 5010299,
          "version": "5.1.2",
          "indices": [
            ".kibana"
          ],
          "state": "SUCCESS",
          "start_time": "2017-01-15T20:37:47.333Z",
          "start_time_in_millis": 1484512667333,
          "end_time": "2017-01-15T20:37:50.838Z",
          "end_time_in_millis": 1484512670838,
          "duration_in_millis": 3505,
          "failures": [],
          "shards": {
            "total": 1,
            "failed": 0,
            "successful": 1
          }
        }
      ]
    }
    

    Looking into S3 I notice that Elasticsearch wrote the snapshot into the root of the bucket.

    Wrong backup

    This was never intended. We can easily delete it with:

    DELETE _snapshot/s3/upgrade_512
    

    The right definition is to add a base_path to it. This will place the data into the folder elasticsearch.

    PUT _snapshot/s3
    {
      "type": "s3",
      "settings": {
        "bucket": "my-s3-bucket",
        "compress": true,
        "base_path": "elasticsearch",
        "region": "eu-central-1"
      }
    }
    
  36. 2017-01-15 - Snapshot and Restore Elasticsearch Indices; Tags: Snapshot and Restore Elasticsearch Indices

    Snapshot and Restore Elasticsearch Indices

    As written previously about backing up the Kibana index, there is more to the snapshot and restore API. A pretty cool feature is the capability to back up to Amazon S3 or a Hadoop FS; you need to install the respective plugins for that. This post demonstrates only a snapshot on a shared filesystem, which doesn’t require a plugin, plus some commands that were useful during my cluster upgrade.

    Define a snapshot destination

    PUT _snapshot/nas
    {
      "type": "fs",
        "settings": {
          "compress": "true",
          "location": "/var/opt/elasticsearch/repo/prod"
        }
    }
    

    List all snapshot destinations

    GET _snapshot/_all
    

    You might see something like this

    {
      "nas": {
        "type": "fs",
        "settings": {
          "compress": "true",
          "location": "/var/opt/elasticsearch/repo/prod"
        }
      }
    }
    

    Do a snapshot for Kibana

    PUT _snapshot/nas/upgrade_512
    {
      "indices": ".kibana",
      "ignore_unavailable": "true",
      "include_global_state": false
    }
    

    Query all existing snapshots in this repository and pay attention to the state. If it isn’t SUCCESS, you had better investigate the backup problem.

    GET _snapshot/nas/_all
    
    {
      "snapshots": [
        {
          "snapshot": "upgrade_512",
          "uuid": "ruBTqGRHQMuyokU3OovSiw",
          "version_id": 5010299,
          "version": "5.1.2",
          "indices": [
            ".kibana"
          ],
          "state": "SUCCESS",
          "start_time": "2017-01-15T08:04:54.216Z",
          "start_time_in_millis": 1484467494216,
          "end_time": "2017-01-15T08:04:54.528Z",
          "end_time_in_millis": 1484467494528,
          "duration_in_millis": 312,
          "failures": [],
          "shards": {
            "total": 1,
            "failed": 0,
            "successful": 1
          }
        },
        {
          "snapshot": "snapshot_7",
          "uuid": "7ZdZOvTlQ2SrdjjbgcjqSw",
          "version_id": 5010299,
          "version": "5.1.2",
          "indices": [
            ".kibana"
          ],
          "state": "SUCCESS",
          "start_time": "2017-01-15T08:08:41.391Z",
          "start_time_in_millis": 1484467721391,
          "end_time": "2017-01-15T08:08:41.918Z",
          "end_time_in_millis": 1484467721918,
          "duration_in_millis": 527,
          "failures": [],
          "shards": {
            "total": 1,
            "failed": 0,
            "successful": 1
          }
        }
      ]
    }
    

    If you want to restore the index, you may have to close it before and then use the restore action.
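
    A hedged sketch of such a restore for the .kibana index from the upgrade_512 snapshot: close the index first, then run the restore action.

    POST /.kibana/_close

    POST /_snapshot/nas/upgrade_512/_restore
    {
      "indices": ".kibana",
      "include_global_state": false
    }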

  37. 2017-01-10 - List and sort Elasticsearch Indices; Tags: List and sort Elasticsearch Indices

    List and sort Elasticsearch Indices

    This weekend I did the migration of several Elasticsearch clusters from v2.4.3 to v5.1.1. The cat API for indices has new features: you can sort the columns.

    See below an example sorted by the count of primary shards (output shortened).

    [vinh@localhost ~]$ curl "localhost:9200/_cat/indices?v&s=pri:desc"
    health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   fo-dev-2017.01.09               lhf2liV8Spej8wH-CfTlUw   5   1   10800448            0      9.5gb          4.7gb
    green  open   elk-etu-2017.01.10              d9xtBETKQ0OkKHqye3wXXQ   5   1         48            0    342.5kb          180kb
    green  open   tandem-2017.01                  -X1gUa_YTju8FCHyA3xtrw   2   1     444368            0    140.4mb         70.2mb
    green  open   fo-log-2016                     W-KA23jCTGO5-_fLGc4Qmw   2   1      47187            0       95mb         47.5mb
    green  open   .watcher-history-2-2016.11.22   Z5gq1NEqSQaBJ99LPPlbfA   1   1          2            0     19.7kb          9.8kb
    green  open   fo-etu-2017.01.07               IDNa1g17QY2ZR9-oKbzt_g   1   1    2751560            0      1.2gb        644.6mb
    green  open   fo-dev-2017.01.10               sBiZBcmQQfqbWAUO1mC3UA   1   1    3780704            0      2.3gb          1.2gb
    green  open   fo-itu-2017.01                  Ot9BuOjZQvemz_Z-45sbuA   1   1   15325866            0     14.6gb          7.3gb
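
    Other columns work the same way; for example, sorting by store size in descending order (a hedged example):

    curl "localhost:9200/_cat/indices?v&s=store.size:desc"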
    
  38. 2016-12-22 - Import and remove gpg key with rpm; Tags: Import and remove gpg key with rpm

    Import and remove gpg key with rpm

    I got to check out Elasticsearch Curator. Elastic uses the PGP key D88E42B4. Since I am working with RHEL 7, I had to use rpm and yum for installing packages.

    Install or import the public key into the keyring:

    rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
    

    After the Proof of Concept I perform a cleanup and remove the key.

    Query installed keys:

    root@omega:~# rpm -q gpg-pubkey
    gpg-pubkey-fd431d51-4ae0493b
    gpg-pubkey-352c64e5-52ae6884
    gpg-pubkey-5ce2d476-50be41ba
    gpg-pubkey-2fa658e0-45700c69
    gpg-pubkey-b1275ea3-546d1808
    gpg-pubkey-bfdbb0c6-54c7ee8b
    gpg-pubkey-de57bfbe-53a9be98
    gpg-pubkey-d88e42b4-52371eca
    

    The last key in the list is the imported one of elastic. Remove it with:

    rpm --erase --allmatches gpg-pubkey-d88e42b4-52371eca
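
    If you are unsure which pseudo-package belongs to which vendor, rpm can print the key details; a hedged example:

    rpm -qi gpg-pubkey-d88e42b4-52371eca

    The Summary line of the output should name the key owner, in this case Elasticsearch.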
    
  39. 2016-11-08 - Monitor Elasticsearch in Docker with Monit; Tags: Monitor Elasticsearch in Docker with Monit

    Monitor Elasticsearch in Docker with Monit

    Running Elasticsearch as docker container is straightforward. If you don’t have a cluster manager like Kubernetes, monit can help you to keep track of the container lifecycle.

    An exemplary monit configuration:

    CHECK PROCESS elasticsearch WITH MATCHING "org.elasticsearch.bootstrap.Elasticsearch"
    CHECK PROGRAM elasticsearch_container WITH PATH "/usr/bin/docker top elasticsearch"
      if status != 0 then alert
        alert warning@cinhtau.net
      group elkstack
    CHECK HOST elasticsearch_healthcheck WITH ADDRESS cinhtau.net
      if failed url http://cinhtau.net:9200 for 5 cycles
        then alert
          alert warning@cinhtau.net BUT not on { action, instance }
      depends on elasticsearch_container
      group elkstack
    CHECK FILE elasticsearch_logfile with path /var/log/elasticsearch/test-cluster.log
      if match "ERROR" for 2 times within 5 cycles then alert
        alert elasticsearch@cinhtau.net BUT not on { action, instance, nonexist }
      depends on elasticsearch_container
      group elkstack
    

    Pay attention to the nonexist option. Monit does an implicit check whether the logfile exists. Elasticsearch writes a log file, and our housekeeping (logrotate or some kind of janitor script) renames, compresses or deletes this file. Without the option, monit would complain as soon as the file is missing. If the file doesn’t exist, which is basically good for prod, you don’t want to be notified or warned. No logs, no errors, no worries.
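
    After a configuration change, reload monit and check that the checks are registered; a hedged example:

    monit reload
    monit summary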

  40. 2016-11-08 - Get distinct field values in Elasticsearch; Tags: Get distinct field values in Elasticsearch

    Get distinct field values in Elasticsearch

    The aggregations framework helps provide aggregated data based on a search query. It is based on simple building blocks called aggregations, that can be composed in order to build complex summaries of the data. There are several types of aggregations. The cardinality aggregation is the exact match for distinct field values.

    A single-value metrics aggregation that calculates an approximate count of distinct values. Values can be extracted either from specific fields in the document or generated by a script.

    Elasticsearch Reference Cardinality Aggregation

    Example: Terminals may have more than one transaction. We want to look for all terminals that have made transactions in the last 15 minutes.

    GET transactions/_search
    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            { "match": { "origin": "EPDEUX" }},
            { "match": { "respCode": "00" }},
            { "range": { "@timestamp": { "gte": "now-15m" }}}
          ]
        }
      },
      "aggs" : {
            "distinct_terminals" : {
                "cardinality" : {
                  "field" : "terminalId"
                }
            }
        }
    }
    

    The response looks like this. From 84075 hits/transactions there were 29700 terminals involved.

    {
      "took": 602,
      "timed_out": false,
      "_shards": {
        "total": 33,
        "successful": 33
      },
      "hits": {
        "total": 84075,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "distinct_terminals": {
          "value": 29700
        }
      }
    }
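
    The cardinality aggregation is approximate. If accuracy matters for smaller cardinalities, the precision_threshold setting trades memory for accuracy; a hedged sketch:

    GET transactions/_search
    {
      "size": 0,
      "aggs": {
        "distinct_terminals": {
          "cardinality": {
            "field": "terminalId",
            "precision_threshold": 40000
          }
        }
      }
    }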
    
  41. 2016-10-25 - Monitoring of TCP connections with collectd and Elasticsearch; Tags: Monitoring of TCP connections with collectd and Elasticsearch

    Monitoring of TCP connections with collectd and Elasticsearch

    If you have an application which does distributed computing, i.e. connects to other servers and sends data, it is interesting to monitor the connection handling. For that, collectd provides the plugin tcpconns, which allows you to monitor dedicated ports. This data can be sent to logstash, which can have graphite or elasticsearch as output. Having the data in a metrics storage, visualization with Kibana or Grafana is a piece of cake.

    Collectd

    Collectd is an open-source daemon which collects system and application performance metrics periodically. The first step is to install collectd.

    # deb based install
    # apt-get install collectd collectd-utils
    
    # rpm based install
    # yum install collectd collectd-utils
    

    Minimal System Configuration

    Collectd is very efficient. If we have no other areas of interest like CPU, memory, etc., this is the minimal setup for collectd:

    /etc/collectd.conf

    LoadPlugin syslog
    Include "/etc/collectd.d"
    

    Add your Configuration

    The folder /etc/collectd.d should contain our configuration for application and the output destination.

    Monitoring Configuration

    Example config for tcpconns

    LoadPlugin tcpconns
    <Plugin "tcpconns">
      ListeningPorts false
      # Outbound connections; i.e. the servers we're connecting to:
      RemotePort "2251"
      RemotePort "3513"
      RemotePort "3504"
      # Locally listening ports; i.e. the servers we're running:
      LocalPort "22"
      LocalPort "25"
      LocalPort "8080"
      LocalPort "8181"
    </Plugin>
    

    Output Configuration

    collectd sends the collected data to UDP port 25826 in its binary protocol. As the receiving end I chose logstash with its collectd input plugin. Replace the IP address and port according to your needs.

    out-network.conf

    LoadPlugin network
    <Plugin network>
        <Server "10.22.12.121" "25826">
        </Server>
    </Plugin>
    

    Start collecting data

    First we have to enable the service

    # systemctl enable collectd.service
    

    Next is to start the service

    # systemctl start collectd.service
    

    Check with status whether the service is running properly. As you can see, the monitoring takes only 508 KB of memory.

    root@alpha:/etc/collectd.d# systemctl status collectd.service
    ● collectd.service - Collectd statistics daemon
       Loaded: loaded (/usr/lib/systemd/system/collectd.service; enabled; vendor preset: disabled)
       Active: active (running) since Mon 2016-10-24 13:31:46 CEST; 22min ago
         Docs: man:collectd(1)
               man:collectd.conf(5)
     Main PID: 26116 (collectd)
       Memory: 508.0K
       CGroup: /system.slice/collectd.service
               └─26116 /usr/sbin/collectd
    Oct 24 13:31:46 alpha systemd[1]: Starting Collectd statistics daemon...
    Oct 24 13:31:46 alpha collectd[26116]: supervised by systemd, will signal readyness
    Oct 24 13:31:46 alpha systemd[1]: Started Collectd statistics daemon.
    Oct 24 13:31:46 alpha collectd[26116]: Initialization complete, entering read-loop.
    Oct 24 13:31:46 alpha collectd[26116]: tcpconns plugin: Reading from netlink succeeded. Will use the netlink method from now on.
    

    Logstash

    On the receiving end of collectd is logstash. Logstash needs an input configuration for collectd.

    100_input_collectd.conf

    input {
      udp {
        port => 25826
        buffer_size => 1452
        codec => collectd { }
      }
    }
    

    Logstash can read the collectd data, but must also decide where to store the data. As fitting endpoint I take Elasticsearch.

    The collected data shall be written to the metrics index.

    900_output_es.conf

    output {
    if [collectd_type] =~ /.+/ {
    	elasticsearch {
    		hosts => [ "es-1", "es-2", "es-aws" ]
    		index => "metrics-%{+xxxx.ww}"
    	}
    }}
    

    Notice that the if condition is not strictly necessary; I only ensure that all events containing the field collectd_type are written to the metrics index.
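
    A quick way to confirm that events arrive is to check for the weekly metrics index; a hedged example:

    GET _cat/indices/metrics-*?v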

    Once the data is in Elasticsearch, we can visualize it in Kibana; see the example visualization for SSH. This can be embedded into existing dashboards.

    Summary

    collectd is an easy and lightweight method to monitor network connections. If you keep a low profile and do not monitor CPU, memory and many other things, you have simple, dead-simple and robust monitoring.

  42. 2016-10-14 - Controlling Elasticsearch Marvel Data Collection; Tags: Controlling Elasticsearch Marvel Data Collection

    Controlling Elasticsearch Marvel Data Collection

    Marvel is the monitoring plugin for Elasticsearch and Kibana. If you do maintenance in Elasticsearch, and therefore close indices, you might stumble over some ERROR messages in the elasticsearch log. (Update: elastic rebranded it as x-pack monitoring)

    collector [index-recovery-collector] - failed collecting data
    ClusterBlockException[blocked by: [FORBIDDEN/4/index closed]
    

    In this case you can update the cluster settings, and tell marvel from which indices to collect data. You can explicitly include or exclude indices by prepending + to include the index, or - to exclude the index. Note the wildcard.

    PUT _cluster/settings
    {
      "transient": {
          "marvel.agent.indices" : ["dev,itu,etu,-prd*"]
      }
    }
    

    This should return the result:

    {
      "acknowledged": true,
      "persistent": {},
      "transient": {
        "marvel": {
          "agent": {
            "indices": [
              "dev,itu,etu,-prd*"
            ]
          }
        }
      }
    }
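
    Once the maintenance is done, the transient setting can be reset by setting it to null; a hedged example (older 2.x releases may not support unsetting a transient setting this way):

    PUT _cluster/settings
    {
      "transient": {
        "marvel.agent.indices": null
      }
    }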
    
  43. 2016-09-25 - Correct type mapping in index template for Elasticsearch; Tags: Correct type mapping in index template for Elasticsearch

    Correct type mapping in index template for Elasticsearch

    If you use Dropwizard Metrics and the Metrics Reporter, you might run into the situation that the max value is not reported as a long value. If it is reported as a double, Elasticsearch will complain that you have an invalid mapping type, since a previous one has the type long. To avoid this situation, you can define the type in the index template that Elasticsearch uses for new indices created from that template.

    This mapping declares that for the document types timer and histogram the field max is of type double. The template field contains the index name pattern, so a new index metrics-2016.09.26 will use this template.

    PUT /_template/metrics
    {
      "template": "metrics-*",
      "mappings": {
        "timer": {
          "properties": {
            "max": {
              "type": "double"
            }
          }
        },
        "histogram": {
          "properties": {
            "max": {
              "type": "double"
            }
          }
        }
      },
      "settings": {
        "number_of_shards": 1
      },
      "aliases" : {
            "metrics" : {}
      }
    }
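
    Whether the template is in place can be checked at any time; a hedged example:

    GET /_template/metrics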
    
  44. 2016-09-19 - Reindex data in Elasticsearch; Tags: Reindex data in Elasticsearch

    Reindex data in Elasticsearch

    Today we have reached more than 3000 shards in our elasticsearch clusters. Digging a little deeper, that is definitely too much. Since a shard (primary or replica) is a Lucene index, it consumes file handles, memory, and CPU resources. Each search request will touch a copy of every shard in the index, which isn’t a problem when the shards are spread across several nodes. Contention arises and performance decreases when the shards are competing for the same hardware resources. If you keep the logstash daily default, you will come into this situation very soon. I now choose a monthly basis: the latest 2-3 months are kept and old indices are deleted. The goal is to merge all daily indices of a month into one big index. For that, the Elasticsearch Reindex API is very useful.

    A simple example

    curl -XPOST "http://alpha:9200/_reindex" -d'
    {
       "source": {
         "index": "metrics-2016.08.*"
       },
       "dest": {
         "index": "metrics-2016.08"
       }
    }'
    

    The source contains an asterisk (for each day). Depending on the data, it can take a while. Therefore you can check the current status of the reindexing with the Task API.

    GET /_tasks/?pretty&detailed=true&actions=*reindex
    {"nodes": {
        "VrNH--qmTBOr2bwqzXdHqQ": {
          "name": "delta",
          "transport_address": "10.25.23.47:9300",
          "host": "10.25.23.47",
          "ip": "10.25.23.47:9300",
          "attributes": {
            "master": "false"
          },
          "tasks": {
            "VrNH--qmTBOr2bwqzXdHqQ:754011": {
              "node": "VrNH--qmTBOr2bwqzXdHqQ",
              "id": 754011,
              "type": "transport",
              "action": "indices:data/write/reindex",
              "status": {
                "total": 4411100,
                "updated": 0,
                "created": 3162000,
                "deleted": 0,
                "batches": 3163,
                "version_conflicts": 0,
                "noops": 0,
                "retries": 0,
                "throttled_millis": 0,
                "requests_per_second": "unlimited",
                "throttled_until_millis": 0
              },
              "description": "",
              "start_time_in_millis": 1474309906870,
              "running_time_in_nanos": 440069372981
            }}}}}
    

    Each batch consists of 1000 documents (the default size). After the task has finished, the response looks like this:

    {"took":605281,"timed_out":false,"total":4411100,"updated":0,"created":4411100,"batches":4412,"version_conflicts":0,"noops":0,"retries":0,"throttled_millis":0,"requests_per_second":"unlimited","throttled_until_millis":0,"failures":[]}
    

    The number itself is in milliseconds and the re-indexing took approximately 10 minutes.
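
    After verifying that the document counts match, the daily indices can be removed; a hedged example (wildcard deletes only work if action.destructive_requires_name is not enabled):

    GET metrics-2016.08/_count
    GET metrics-2016.08.*/_count

    DELETE metrics-2016.08.*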

  45. 2016-09-15 - Handling logstash input multiline codec; Tags: Handling logstash input multiline codec

    Handling logstash input multiline codec

    The multiline codec will collapse multiline messages and merge them into a single event. The default limit is 500 lines. If more than 500 lines are appended, the multiline codec splits the message and starts a new event with the next 500 lines, and so forth. This post demonstrates how to deal with this situation. Elasticsearch receives the tag multiline_codec_max_lines_reached in the tags field.

        "tags" => [
            [0] "multiline",
            [1] "multiline_codec_max_lines_reached"
        ]
    

    Basically you can use a regular expression to handle these lines. As testing basis I just take some java application logs, e.g. JBoss EAP.

    2016-09-14 16:47:12,845 INFO  [default-threads - 22] [497a1672-52ff-42b7-b53c-df202834c2f5] [APP] [] [] [TERMINAL] (TxManager) Final response message with escaped sensitive data: ...
    

    Let’s assume a regular log line always starts with an ISO timestamp. For demonstration purposes I lower the default limit to 25 lines.

    input {
      file {
        path => "/var/log/test/server.log"
        start_position => beginning
        codec => multiline {
          pattern => "^%{TIMESTAMP_ISO8601}"
          negate => true
          what => previous
    	  max_lines => 25
        }
        sincedb_path => "/dev/null"
      }
    }
    output { stdout { codec => "rubydebug" } }
    

    You can use a regular expression (line starts with an ISO timestamp) to properly grok the message, and in the else branch do anything you like, e.g. drop the message.

    filter {
      if [message] =~ "^([0-9]{4})-?(1[0-2]|0[1-9])-?(3[01]|0[1-9]|[12][0-9])" {
    	grok {
    	  match => { "message" => "%{TIMESTAMP_ISO8601:datetime}\s%{WORD:level}\s*\[%{DATA:thread}\]\s\[%{DATA:requestid}?\]\s+\[%{WORD:appid}?\]\s+\[%{DATA:sessionid}?\]\s+\[%{DATA:trxid}?\]\s+\[%{DATA:terminalid}?\]\s+\(%{DATA:class}\)\s+%{GREEDYDATA:logmessage}?" }
    	}
      }
      else {
    	# drop continued multilines
    	drop { }
      }
    }
    

    If you keep the continued lines, just ensure that you either don’t grok the message or grok it properly.

  46. 2016-09-08 - Migrate elasticsearch indices from different clusters with logstash; Tags: Migrate elasticsearch indices from different clusters with logstash

    Migrate elasticsearch indices from different clusters with logstash

    I got an exceptional case in the office. Some application logs, which belong to a dev and testing environment, were stored or reported in the elasticsearch production cluster. Therefore a cleanup or migration was necessary.

    logstash is an easy solution for migrating data from cluster a to cluster b, in my case from the production cluster to the test cluster. logstash provides elasticsearch as input and output plugin. The input queries elasticsearch and retrieves the documents as json. The output writes the json to the target elasticsearch cluster. An example configuration:

    vinh@omega:~/logstash-2.4.0> cat copy-data.conf
    input{
        elasticsearch {
            hosts => [ "prod-dc1", "prod-dc2", "alpha-dc2", "beta-dc2" ]
            index => "trx-*"
            user => "admin"
            password => "SiriSingsRiri"
        }
    }
    output {
        stdout { codec => "rubydebug" }
        elasticsearch {
            hosts => [ "dev", "delta", "gamma" ]
            index => "trx-%{+YYYY.MM.dd}"
        }
    }
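
    The migration then runs as a one-shot logstash invocation; a hedged example (paths depend on your installation):

    bin/logstash -f copy-data.conf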
    
  47. 2016-09-05 - Alerting with Elasticsearch Watcher; Tags: Alerting with Elasticsearch Watcher

    Alerting with Elasticsearch Watcher

    Watcher is a commercial plugin for alerting based on elasticsearch documents. The required knowledge may seem overwhelming, but it is rather straightforward and pretty simple once you understand the fundamental concepts. This post will give you a simple watch definition to grasp the concept. If you have application logs and store them in elasticsearch, you want to be alerted if a log entry with log level ERROR is reported. Let’s do this.

    Preconditions

    This demo requires a simple elasticsearch instance with the watcher plugin installed. A fresh installation grants you a 30 day trial period for watcher. Just take my elasticsearch docker image for the demo. The run command disables the security module shield, so you can use elasticsearch without user authentication.

    tan@omega:~$ mkdir -p /tmp/data /tmp/logs
    tan@omega:~$ sudo docker run -it --net=host \
        -v /tmp/data:/elasticsearch/data \
        -v /tmp/logs:/elasticsearch/logs \
        cinhtau/elasticsearch:latest \
        -Des.shield.enabled=false

    Test data

    I will create an elasticsearch document that represents a log entry indicating an application error.

    Setup test index

    First of all we create a test index for storing the elasticsearch documents or log entries. For each event we enable the timestamp to be set automatically. The following commands are RESTful HTTP requests done with Sense - a Kibana UI plugin for the elasticsearch REST API (see screenshot). Use curl if no Sense is available. (Update: Sense is now the Kibana Console with the x-pack rebranding)

    Kibana Sense

    PUT test
    {
      "mappings": {
        "logs": {
          "_timestamp": {
            "enabled": true
          }
        }
      }
    }
    

    Create test log entry

    We create a new document (log entry); the document type is logs :-) . A unique message id would be assigned by elasticsearch if we didn’t provide one. In this case the id 1 is assigned.

    POST test/logs/1
    {
      "path": "/var/log/myapp.log",
      "host": "omega",
      "application": "p2-fear",
      "environment": "prd",
      "level": "ERROR",
      "thread": "MSC service thread 1-57",
      "logmessage": "MessageQueue is full.",
      "seq": 572431,
      "exchangeId": "4711",
      "transactionId": "DHF720l0S",
      "tags": [
        "critical", "infrastructure"
      ]
    }
    

    Result

    {
      "_index": "test",
      "_type": "logs",
      "_id": "1",
      "_version": 1,
      "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
      },
      "created": true
    }
    

    Create a watch definition

    The elastic source pretty much sums it up:

    At a high-level, a typical watch is built from four simple building blocks:
    • Schedule - Define the schedule on which to trigger the query and check the condition.
    • Query - Specify the query to run as input to the condition. Watcher supports the full Elasticsearch query language, including aggregations.
    • Condition - Define your condition to determine whether to execute the actions. You can use simple conditions (always true), or use scripting for more sophisticated scenarios.
    • Actions - Define one or more actions, such as sending email, pushing data to 3rd party systems via webhook, or indexing the results of your query.

    Schedule

    The schedule is every 15 minutes.

    "trigger" : {
      "schedule" : { "interval" : "15m" }
    }
    

    Design the query

    The hardest part of the watcher definition is to build the search query.

    GET test/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "application": "p2-fear" }},
            { "match": { "level": "ERROR" }},
            { "range": { "_timestamp": {
                "gte": "now-15m",
                "lte": "now"}}}
          ]
        }
      }
    }
    

    The most important definition is the time range; it should be identical to your watch schedule. You want to check whether there was an error in the last 15 minutes, otherwise you would report the same errors again. Executing the search should give you one hit :wink: . As a side note, if you are dealing with logs sent by logstash, the _timestamp field is @timestamp. :-o Adapt your search query if that is the case!

    {
      "took": 23,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1.0701059,
        "hits": [
          {
            "_index": "test",
            "_type": "logs",
            "_id": "1",
            "_score": 1.0701059,
            "_timestamp": 1473102364442,
            "_source": {
              "path": "/var/log/myapp.log",
              "host": "omega",
              "application": "p2-fear",
              "environment": "prd",
              "level": "ERROR",
              "thread": "MSC service thread 1-57",
              "logmessage": "MessageQueue is full.",
              "seq": 572431,
              "exchangeId": "4711",
              "transactionId": "DHF720l0S",
              "tags": [
                "critical",
                "infrastructure"
              ]
            }
          }
        ]
      }
    }
    

    So the input definition looks like this:

    "input" : {
      "search" : {
        "request" : {
          "indices" : [ "logs" ],
          "body" : {
            "query": {
              "bool": {
                "must": [
                  { "match": { "application": "p2-fear" }},
                  { "match": { "level": "ERROR" }},
                  { "range": { "_timestamp": {
                      "gte": "now-15m",
                      "lte": "now"}}}]
              }
            }
          }
        }
      }
    }
    

    Watch Condition

    The condition in the watch definition, defines under which circumstances an action should be executed. In our case, execute an action if the search results are greater than 0.

    "condition" : {
      "compare" : { "ctx.payload.hits.total" : { "gt" : 0 }}
    }
    

    Action or Alerting

    Our action is an email notification. This assumes you have configured an SMTP account in the elasticsearch.yml for watcher. The action is straightforward: an action named email_users is defined and an HTML mail is composed. Using HTML in JSON requires escaping special characters.

    "actions" : {
      "email_users" : {
        "email": {
          "to": "'Jason Bourne <jason.bourne@example.com>'",
          "subject": "{{ctx.watch_id}} executed",
          "body": {
            "html": "<b>{{ctx.watch_id}}<\/b> executed with {{ctx.payload.hits.total}} hits"
          }
        }
      }
    }
    

    Complete Watch Definition

    The complete definition looks like this. Pay attention to replace the mail address with a valid one.

    PUT /_watcher/watch/log_error_watch
    {
      "trigger" : {
        "schedule" : { "interval" : "15m" }
      },
      "input" : {
        "search" : {
          "request" : {
            "indices" : [ "logs" ],
            "body" : {
              "query": {
                "bool": {
                  "must": [
                    { "match": { "application": "p2-fear" }},
                    { "match": { "level": "ERROR" }},
                    { "range": { "_timestamp": {
                        "gte": "now-15m",
                        "lte": "now"}}}
                  ]
                }
              }
            }
          }
        }
      },
      "actions" : {
        "email_users" : {
          "email": {
            "to": "'Jason Bourne <jason.bourne@example.com>'",
            "subject": "{{ctx.watch_id}} executed",
            "body": {
              "html": "<b>{{ctx.watch_id}}<\/b> executed with {{ctx.payload.hits.total}} hits"
            }
          }
        }
      }
    }
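
    Instead of waiting for the 15-minute schedule, the watch can be triggered immediately with the execute API; a hedged example for this Watcher version:

    POST /_watcher/watch/log_error_watch/_execute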
    

    Administering Watch Definition

    List all watch ids

    GET .watches/_search
    {
      "fields" : [],
      "query" : {"match_all" : { } }
    }
    

    The watch statistics itself can be queried with:

    GET /_watcher/watch/log_error_watch
    

    Since I haven’t configured an email account, an IllegalStateException is raised. But the interesting part is when the watch was executed and whether the conditions were met.

      "found": true,
      "_id": "log_error_watch",
      "_status": {
        "version": 2,
        "state": {
          "active": true,
          "timestamp": "2016-09-05T19:26:59.421Z"
        },
        "last_checked": "2016-09-05T19:42:00.182Z",
        "last_met_condition": "2016-09-05T19:42:00.182Z",
        "actions": {
          "email_users": {
            "ack": {
              "timestamp": "2016-09-05T19:26:59.421Z",
              "state": "awaits_successful_execution"
            },
            "last_execution": {
              "timestamp": "2016-09-05T19:42:00.182Z",
              "successful": false,
              "reason": "IllegalStateException[cannot find default email account as no accounts have been configured]"
            }
          }
        }
      }
    ..
    

    To delete the watch:

    DELETE /_watcher/watch/log_error_watch
    
  48. 2016-09-02 - Using the native realm in Elasticsearch Shield; Tags: Using the native realm in Elasticsearch Shield

    Using the native realm in Elasticsearch Shield

    Shield is the security plugin for Elasticsearch. Security in Elasticsearch is based on users with associated roles. A quick demonstration how to use it.

    First you need to setup the realm in the elasticsearch.yml configuration. Find below a custom test configuration:

    cluster.name: demo
    #
    node:
      name: master
      master: true
      data: true
    #
    path:
      data: /var/opt/es/data
      logs: /var/log/es
    #
    network.host: alpha
    network.bind_host:
      - _local_
      - _bond0:ipv4_
    http.port: 3333
    shield:
      enabled: true
      authc:
        realms:
          file:
            type: file
            order: 0
          native:
            type: native
            order: 1
    

    The native realm stores the security data in elasticsearch itself.

    Create a user

    curl -XPOST -u admin http://alpha:3333/_shield/user/ironman -d '
    {
      "password" : "frontoff!ce-f0reve3",
      "roles" : [ "devops" ],
      "full_name" : "Michel Erard",
      "email" : "er7@not-real.org",
      "metadata" : {
        "intelligence" : 7
      }
    }'
    

    Log entry in the elasticsearch log

    [2016-08-11 13:48:21,460][INFO ][shield.action.user       ] [client] added user [ironman]
    

    Show created user

    vinh@alpha:~> curl -XGET -u admin http://alpha:3333/_shield/user
    Enter host password for user 'admin':
    {"ironman":{"username":"ironman","roles":["devops"],"full_name":"Michel Erard","email":"er7@acme.com","metadata":{"intelligence":7}}}
    

    Query es as user ironman

    vinh@alpha:~> curl -XGET -u ironman http://alpha:3333
    Enter host password for user 'ironman':
    {
      "name" : "master",
      "cluster_name" : "demo",
      "version" : {
        "number" : "2.3.3",
        "build_hash" : "218bdf10790eef486ff2c41a3df5cfa32dadcfde",
        "build_timestamp" : "2016-05-17T15:40:04Z",
        "build_snapshot" : false,
        "lucene_version" : "5.5.0"
      },
      "tagline" : "You Know, for Search"
    }
    

    Delete user

    vinh@alpha:~> curl -XDELETE -u admin http://alpha:4444/_shield/user/ironman
    Enter host password for user 'admin':
    {"found":true}
    
  49. 2016-09-02 - Use Travis CI in Github to build and deploy to dockerhub; Tags: Use Travis CI in Github to build and deploy to dockerhub

    Use Travis CI in Github to build and deploy to dockerhub

    I love reveal.js - The HTML Presentation Framework. Attending the Javaland 2016 conference I saw an awesome usage of reveal.js within a docker container in the Docker Patterns talk by Roland Huß. Curious and eager to know more, I explored his github account. Mr. Huß offers the basics in the docker-reveal repository. Using github for docker builds is a great idea. Then I started to play around with docker myself, mostly to maintain and ease administering multiple elasticsearch nodes in a cluster. I felt using github offers me the opportunity to use Travis CI to build the docker image and deploy it to dockerhub - the docker image storage. It was easier than I thought and is much better than building it manually every time. This post covers the progress and results.

    Basic Steps

    The general roadmap:

    • Create a public docker repository at dockerhub, to push the docker images to it :-)
    • Create a github repository, to maintain the Dockerfile and source
    • Set up a continuous build: use Travis CI in github to build and push the docker image to dockerhub
    • Run (pull the docker image from dockerhub) and have fun.

    Dockerhub

    Dockerhub offers unlimited public repositories free of charge for storing the docker images of your software projects. If you don’t want to expose your software to the public, choose a private repository.

    Github

    GitHub is a Git repository hosting service that provides free repositories for the public, mostly Open Source development. I simply take my existing docker project, improved-docker-elasticsearch, a custom tailored elasticsearch instance.

    Travis CI

    The Travis CI integration is very simple. You only need to create a .travis.yml file that contains the build definition; it will be explained in detail.

    Prior conditions

    We want to use the docker service in Travis CI. Docker runs as root, so you need sudo permissions.

    sudo: required
    services:
      - docker
    

    Build the docker image

    I put the build instruction into before_install and check whether the image was built correctly in the script section.

    before_install:
      - docker build -t cinhtau/elasticsearch .
    script:
      - docker images cinhtau/elasticsearch
    

    Deploy to Dockerhub

    The last section contains the push instruction to dockerhub; it runs only if the image was built correctly.

    after_success:
      - if [ "$TRAVIS_BRANCH" == "master" ]; then
        docker login -e="$DOCKER_EMAIL" -u="$DOCKER_USERNAME" -p="$DOCKER_PASSWORD";
        docker push cinhtau/elasticsearch;
        fi
    

    The environment variables are set up in the repository settings within Travis CI (see screenshot).

    travis-ci-docker-variables

    Run it anywhere

    Having the docker image in dockerhub, I can use my elasticsearch image on any computer that can pull the image from the dockerhub repository and has the right requirements to run the image. No need to build it locally anymore :-) .

    tan@omega:~$ sudo docker pull cinhtau/elasticsearch
    Using default tag: latest
    latest: Pulling from cinhtau/elasticsearch
    8ad8b3f87b37: Already exists
    751fe39c4d34: Already exists
    b165e84cccc1: Already exists
    acfcc7cbc59b: Already exists
    04b7a9efc4af: Already exists
    b16e55fe5285: Already exists
    8c5cbb866b55: Already exists
    e4412b99da57: Pull complete
    60fa44913e1f: Pull complete
    593bcc8c9106: Pull complete
    b065e784dc32: Pull complete
    10cc1e0e4dd9: Pull complete
    093a531dbb6f: Pull complete
    Digest: sha256:c90986a7f3799cdabc7c62ef7f576ed97a3d6648fb5c80a984312b26ec0375ea
    Status: Downloaded newer image for cinhtau/elasticsearch:latest
    
  50. 2016-09-01 - Resolve critical elasticsearch cluster health; Tags: Resolve critical elasticsearch cluster health

    Resolve critical elasticsearch cluster health

    From time to time you need to perform a cluster upgrade in elasticsearch. During an upgrade, the cluster health usually turns from green to yellow. If it turns red, it is a critical state. One reason might be that elasticsearch can’t allocate some data shards because the replicas are gone or lost. Using the ES health REST API allows you to identify the corrupt indices and delete them.

    First query the cluster health; the example below has status red. BTW, it was done with Sense; you can derive the respective curl command for the HTTP request yourself.

    GET _cluster/health
    {
      "cluster_name": "prod",
      "status": "red",
      "timed_out": false,
      "number_of_nodes": 5,
      "number_of_data_nodes": 4,
      "active_primary_shards": 336,
      "active_shards": 675,
      "relocating_shards": 0,
      "initializing_shards": 0,
      "unassigned_shards": 4,
      "delayed_unassigned_shards": 0,
      "number_of_pending_tasks": 0,
      "number_of_in_flight_fetch": 0,
      "task_max_waiting_in_queue_millis": 0,
      "active_shards_percent_as_number": 99.41089837997055
    }
    

    Dig deeper on index level with GET _cluster/health?level=indices. This will give you a large result set. Filter it for unassigned_shards > 0 or simply "status": "red".

        ".marvel-es-1-2016.07.18": {
          "status": "red",
          "number_of_shards": 1,
          "number_of_replicas": 1,
          "active_primary_shards": 0,
          "active_shards": 0,
          "relocating_shards": 0,
          "initializing_shards": 0,
          "unassigned_shards": 2
        },
        ".marvel-es-1-2016.07.21": {
          "status": "red",
          "number_of_shards": 1,
          "number_of_replicas": 1,
          "active_primary_shards": 0,
          "active_shards": 0,
          "relocating_shards": 0,
          "initializing_shards": 0,
          "unassigned_shards": 2
        },
    
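
    Depending on your Elasticsearch version, the _cat API can filter by health directly, which lists only the problematic indices without scrolling through the whole cluster health output:

    GET _cat/indices?v&health=red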

    Delete the indices and the cluster will turn green again. In this example:

    DELETE .marvel-es-1-2016.07.18
    DELETE .marvel-es-1-2016.07.21
    
  51. 2016-08-12 - Debug Active Directory security within Elasticsearch Shield; Tags: Debug Active Directory security within Elasticsearch Shield

    Debug Active Directory security within Elasticsearch Shield

    Shield offers the capability to allow authentication with LDAP or the Windows Active Directory. This post explains a simple method to analyze the authentication process.

    There are several files involved:

    • elasticsearch.yml
    • shield/roles.yml
    • shield/role_mapping.yml

    The elasticsearch.yml holds the active directory configuration, for instance:

    shield:
      enabled: true
      authc:
        realms:
          file:
            type: file
            order: 0
          native:
            type: native
            order: 1
          active_directory:
            type: active_directory
            order: 2
            domain_name: ldap.cinhtau.net
            url: ldaps://ldap.cinhtau.net:636
            unmapped_groups_as_roles: false
            group_search.base_dn: "OU=Security,DC=cinhtau,DC=net"
      ssl:
        keystore:
          path: /home/tan/omega.jks
          password: 8eAx89lJ7
        truststore:
          path: /home/tan/trust.jks
          password: 7k-LDPsbZs8d
    

    If you don't use mutual SSL, the URL would be ldap://ldap.cinhtau.net:389. Replace ldap.cinhtau.net with your ldap hostname. Note that I use a three-level realm setup; usually it is not necessary to set the order to zero. The roles.yml should contain your permissions for all indices and Kibana 4. See below the role devops, which is sufficient for a Kibana 4 user.

    devops:
      cluster:
          - monitor
      indices:
        - names: '*'
          privileges:
            - view_index_metadata
            - read
        - names: '.kibana*'
          privileges:
            - manage
            - read
            - index
    

    The role_mapping.yml should contain the group or user distinguished name (DN), mapped to the role devops.

    # Role mapping configuration file which has elasticsearch roles as keys
    # that map to one or more user or group distinguished names
    #roleA:   this is an elasticsearch role
    #  - groupA-DN  this is a group distinguished name
    #  - groupB-DN
    #  - user1-DN   this is the full user distinguished name
    power_user:
      - "CN=vinh,OU=Development,DC=cinhtau,DC=net"
    devops:
      - "CN=ApplicationEngineering,OU=Zuerich,OU=File Systems,OU=Security,OU=Control Groups,DC=cinhtau,DC=net"
    

    Add the logger to the logging.yml in the logging section:

    shield.authc.activedirectory: TRACE
    
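
    To trigger an authentication attempt and produce the trace output, any authenticated request will do, for example (credentials are placeholders):

    curl -u vinh:secret 'http://localhost:9200/_cluster/health?pretty'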

    When a user authenticates, you will see log messages similar to these:

    [2016-08-12 09:11:46,530][DEBUG][shield.authc.activedirectory] [zh2-lb] user not found in cache, proceeding with normal authentication
    [2016-08-12 09:11:46,573][DEBUG][shield.authc.activedirectory] [zh2-lb] found these groups [[CN=..]
    [2016-08-12 09:11:46,577][DEBUG][shield.authc.activedirectory] [zh2-lb] authenticated user [vinh], with roles [[devops, power_user]]
    
  52. 2016-08-10 - Remove password from private ssl key; Tags: Remove password from private ssl key

    Remove password from private ssl key

    In the kibana.yml configuration, I set up the mandatory settings for SSL.

    server.ssl.key: "/opt/kibana/latest/ssl/key.pem"
    server.ssl.cert: "/opt/kibana/latest/ssl/cert.pem"
    

    Kibana can't handle a passphrase-protected private SSL key (key.pem).

    tail -f /var/log/kibana/error.log
    FATAL [Error: error:0907B068:PEM routines:PEM_READ_BIO_PRIVATEKEY:bad password read]
    

    Therefore I had to remove the passphrase in order to use the existing private key. We just export the key into a new key file.

    openssl rsa -in key.pem -out newkey.pem
    

    The new file should contain the following beginning and end:

    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----
    
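
    As a quick sanity check (not strictly necessary), openssl can verify that the exported key is consistent and no longer prompts for a passphrase:

    openssl rsa -in newkey.pem -check -noout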
  53. 2016-05-23 - Backup Kibana; Tags: Backup Kibana

    Backup Kibana

    Kibana is the visual web interface for elasticsearch. You can create searches, visualizations and dashboards. Sometimes you put a lot of valuable work into them. Therefore it is essential to have some kind of backup for Kibana. The Kibana data itself is stored in Elasticsearch in the .kibana index. One way is to use the snapshot and restore capability of Elasticsearch.

    I created two scripts that run in Jenkins. The first script takes a daily snapshot and the second one a monthly snapshot. From there we are able to restore a snapshot on a daily or monthly basis. These scripts only back up kibana, which is not very large; other indices aren't in scope.
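
    Both scripts assume that the snapshot repositories kibana and kibana_prod already exist. Registering a filesystem repository looks roughly like this; the location is only an example and must be allowed via path.repo on the nodes:

    curl -XPUT "http://localhost:9200/_snapshot/kibana" -d '{ "type": "fs", "settings": { "location": "/backup/kibana" } }'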

    The daily snapshot:

    SNAPSHOT=snapshot_$(date +%u)
    # Test
    echo -n "Delete previous snapshot from test-cluster"
    curl -XDELETE "http://localhost:9200/_snapshot/kibana/$SNAPSHOT" -s
    echo -n "Create new snapshot for test-cluster"
    curl -XPUT "http://localhost:9200/_snapshot/kibana/$SNAPSHOT" -s -d '{ "indices": ".kibana",  "ignore_unavailable": "true",  "include_global_state": false }'
    # Prod
    echo -n "Delete previous snapshot from prod-cluster"
    curl -XDELETE "http://elasticsearch:9200/_snapshot/kibana_prod/$SNAPSHOT" -s
    echo -n "Create new snapshot for prod-cluster"
    curl -XPUT "http://elasticsearch:9200/_snapshot/kibana_prod/$SNAPSHOT"  -s -d '{ "indices": ".kibana",  "ignore_unavailable": "true",  "include_global_state": false }'
    

    The Jenkins schedule, which e.g. would have last run at Sunday, May 22, 2016 11:12:54 PM CEST and would next run at Monday, May 23, 2016 11:12:54 PM CEST:

    H 23 * * *
    

    The monthly snapshot:

    SNAPSHOT=backup_$(date +%m)
    # Test
    echo -n "Create monthly snapshot for test-cluster"
    curl -XDELETE "http://localhost:9200/_snapshot/kibana/$SNAPSHOT" -s
    curl -XPUT "http://localhost:9200/_snapshot/kibana/$SNAPSHOT" -s -d '{ "indices": ".kibana",  "ignore_unavailable": "true",  "include_global_state": false }'
    # Prod
    echo -n "Create monthly snapshot for prod-cluster"
    curl -XDELETE "http://elasticsearch:9200/_snapshot/kibana_prod/$SNAPSHOT" -s
    curl -XPUT "http://elasticsearch:9200/_snapshot/kibana_prod/$SNAPSHOT" -s -d '{ "indices": ".kibana",  "ignore_unavailable": "true",  "include_global_state": false }'
    

    The Jenkins schedule, which e.g. would have last run at Sunday, May 1, 2016 12:16:07 AM CEST and would next run at Wednesday, June 1, 2016 12:16:07 AM CEST:

    H 0 1 * *
    
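
    Restoring works against the same repository; a minimal sketch, assuming you want Monday's daily snapshot back (the existing .kibana index has to be closed or deleted first):

    curl -XPOST "http://localhost:9200/_snapshot/kibana/snapshot_1/_restore" -d '{ "indices": ".kibana", "include_global_state": false }'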
  54. 2016-03-30 - Logging from HP NonStop to Elasticsearch cluster; Tags: Logging from HP NonStop to Elasticsearch cluster

    Logging from HP NonStop to Elasticsearch cluster

    This article demonstrates the fundamental milestones to get decent log reporting from the HP NonStop into an Elasticsearch cluster. With OSS, the HP NonStop offers a minimal Unix-like environment on top of the Guardian layer. The following covers the configuration on the HP NonStop (sending party) and on the Linux server that runs Logstash and Elasticsearch (receiving party). We will also call the HP NonStop Tandem, for clarification.

    The scenario

    This article needs a basic understanding of Logstash and HP NonStop OSS. The circumstances are: my company has an HP NonStop (Itanium architecture). On the Tandem machine, several tomcat web applications are running and logging. Viewing the log files with tail under OSS is a pain in the .. you know where :wink: . So the basic idea is to ship the log files to elasticsearch and view them with Kibana. The HP NonStop isn't capable of running logstash (problems with JRuby), logstash-forwarder or filebeat (written in Go). There is, however, an unofficial logstash forwarder implementation on GitHub. This programme was originally written for IBM AIX and, being plain Java, also fits the purpose of running on the Itanium architecture.

    Getting started

    Before we begin we need to create a self-signed SSL certificate, which is essential for the lumberjack protocol used by the logstash forwarder and for the logstash input configuration. Logstash supports all certificates, including self-signed ones. To generate a certificate, we run the following command on the Linux server (receiving party):

    openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout logstash-forwarder.key -out logstash-forwarder.crt -days 365

    This will generate a key at logstash-forwarder.key and a certificate valid for one year at logstash-forwarder.crt. Both the server that runs the logstash forwarder and the logstash instance receiving the logs require these files on disk to verify the authenticity of messages. That means we also have to distribute them to the Tandem (the sending party). The logstash forwarder additionally needs a Java keystore; we create a new one containing the self-signed certificate:

    keytool -importcert -trustcacerts -file logstash-forwarder.crt -alias ca -keystore keystore.jks
    

    The command will ask for a password; just use the default changeit for simplicity. You may choose another password, but make sure you remember it.

    Configure logstash

    Logstash, that runs on the Linux Server, needs a lumberjack input configuration:

    input {
      lumberjack {
        port => 5400
        ssl_certificate => "/opt/logstash-2.2.1/logstash-forwarder.crt"
        ssl_key => "/opt/logstash-2.2.1/logstash-forwarder.key"
      }
    }
    

    We just choose the free port 5400 for simplicity. The output may be elasticsearch or, for testing, just stdout.

    output {
        elasticsearch {
            host => "10.24.62.120"
            protocol => "http"
            port => 9200
            index => "tandem-%{+YYYY.MM.dd}"
        }
        stdout {
            codec => rubydebug
        }
    }
    

    Of course you can also apply custom filters, but for simplicity I leave them out of the equation.

    The HP NonStop side

    The first obstacle under OSS is to setup the correct Java environment:

    export JAVA_HOME=/usr/tandem/java7.0
    export PATH=$PATH:$JAVA_HOME/bin
    

    Allowing programmes to use the TCP/IP stack is a special case and has to be configured explicitly:

    add_define =tcpip^process^name class=map file=\$ZKIP
    

    We assign the current OSS session to the TCP/IP process name $ZKIP, which allows us to talk to the Linux server on the outgoing side. You may have to replace it with the respective process name on your Tandem/HP NonStop. Download the latest release from the GitHub repository mentioned above and upload it to the HP NonStop.

    Configure the forwarder

    I put the SSL certificates in the same folder as the logstash-forwarder. The forwarder needs a configuration that defines which files it should tail and where to forward them. An example:

    {
       "network": {
         "servers": [ "10.24.62.120:5400" ],
         "ssl certificate": "/opt/logstash-forwarder/logstash-forwarder.crt",
         "ssl key": "/opt/logstash-forwarder/logstash-forwarder.key",
         "ssl ca": "/opt/logstash-forwarder/keystore.jks",
         "timeout": 15
       },
       "files": [
         {
           "paths": [
             "/var/dev/log/tomcat-server/-*.log"
           ],
           "fields": { "type": "logs" }
         }, {
           "paths": [
             "/var/dev/log/java/*.log"
           ],
           "fields": { "type": "logs" }
         }
       ]
     }
    

    Start the forwarder

    After that we can start the java logstash forwarder with the defined configuration:

    nohup java -jar logstash-forwarder-java-0.2.3.jar -config config > forwarder.log 2> error.log &
    

    On the receiving side, or in Kibana, you should see the incoming messages flying in.
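
    You can also verify on the Elasticsearch side that the daily index exists and is growing, for example (host taken from the output configuration above):

    curl -s 'http://10.24.62.120:9200/_cat/indices/tandem-*?v'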

    Final steps

    After successfully testing the log forwarding, you may configure a new pathway server to run the application automatically.

  55. 2016-03-29 - Enable and disable Elasticsearch cluster shard allocation; Tags: Enable and disable Elasticsearch cluster shard allocation

    Enable and disable Elasticsearch cluster shard allocation

    There will come a time when you need to perform a rolling restart of your cluster—keeping the cluster online and operational, but taking nodes offline one at a time. .. By nature, Elasticsearch wants your data to be fully replicated and evenly balanced. If you shut down a single node for maintenance, the cluster will immediately recognize the loss of a node and begin rebalancing.

    Disable shard allocation. This prevents Elasticsearch from rebalancing missing shards until you tell it otherwise.

    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
        "transient" : {
            "cluster.routing.allocation.enable" : "none"
        }
    }'
    

    Reenable shard allocation as follows:

    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
        "transient" : {
            "cluster.routing.allocation.enable" : "all"
        }
    }'
    
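
    Whether the transient setting is currently active can be checked at any time:

    curl -XGET 'http://localhost:9200/_cluster/settings?pretty'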
  56. 2016-03-19 - Delete Elasticsearch index-name with special characters; Tags: Delete Elasticsearch index-name with special characters

    Delete Elasticsearch index-name with special characters

    If you are working with logstash and set the index name dynamically from variables, you might run into situations where the variable is not substituted due to errors. You can get rid of these bad indices by deleting them with the REST API of elasticsearch. The challenge lies in escaping the special characters.

    An example what might happen:

    curl -s http://localhost:9200/_cat/indices
    green open  .marvel-es-2016.03.16        1 1  750253 5159 717.1mb 358.5mb
    green open  fo-%{environment}-2016.03.19 5 1     186    0   346kb   173kb
    green open  .marvel-es-2016.03.17        1 1  772223 5592 726.5mb   367mb
    green open  .marvel-es-2016.03.18        1 1  753400 3932   777mb 388.4mb
    

    To delete the second index we have to escape the following characters:

    • % → %25
    • { → %7B
    • } → %7D

    curl -XDELETE http://localhost:9200/fo-%25%7Benvironment%7D-2016.03.19
    
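
    If you don't want to escape by hand, any URL-encoding helper does the job; for instance with jq (if it is available on your machine) the escaped name can be generated on the fly:

    curl -XDELETE "http://localhost:9200/$(jq -rn --arg i 'fo-%{environment}-2016.03.19' '$i|@uri')"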
  57. 2016-03-17 - Elasticsearch Reporter for Dropwizard Metrics; Tags: Elasticsearch Reporter for Dropwizard Metrics

    Elasticsearch Reporter for Dropwizard Metrics

    This post describes how to export Dropwizard Metrics to elasticsearch. Instead of using logstash to parse application log files, the metrics can be exported directly within any Java application. This plugin is maintained by elastic.co.

    See https://github.com/elastic/elasticsearch-metrics-reporter-java. For the impatient: Add to your pom.xml

    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>metrics-elasticsearch-reporter</artifactId>
      <version>2.2.0</version>
    </dependency>
    

    Start reporting to the elasticsearch cluster, e.g. in main():

    ElasticsearchReporter reporter = ElasticsearchReporter
        .forRegistry(registry())
        .hosts("elasticsearch-node1:9200", "elasticsearch-node2:9200")
        .index("metrics")
        .indexDateFormat(null) //no date suffix
        .build();
    reporter.start(10, TimeUnit.SECONDS);
    
  58. 2016-03-04 - Create alias for indices in Elasticsearch; Tags: Create alias for indices in Elasticsearch

    Create alias for indices in Elasticsearch

    If you have multiple indices and build your visualizations and dashboards in Kibana, you might consider using an alias. The advantage is that the physical index can always be removed, renamed or reindexed. The alias keeps your Kibana objects consistent while, in the background, you have the freedom to reorganize. This post illustrates two examples.

    The first example is an environment case: in your development landscape you have a dev, a test, a staging and a production system. The following curl command with JSON data creates the alias dev for the indices fo-dev-*; wildcards can be used.

    curl -XPOST 'http://localhost:9200/_aliases' -d '
    {
        "actions" : [
            { "add" : { "index" : "fo-dev-*", "alias" : "dev" } }
        ]
    }'
    {"acknowledged":true}
    

    The second case is the alias stats for the indices metrics-*.

    curl -XPOST 'http://localhost:9200/_aliases' -d '
    {
        "actions" : [
            { "add" : { "index" : "metrics-*", "alias" : "stats" } }
        ]
    }'
    {"acknowledged":true}
    
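
    Since all actions within one _aliases call are applied atomically, the same API can later be used to switch an alias to a new index without a gap; a sketch with hypothetical index names:

    curl -XPOST 'http://localhost:9200/_aliases' -d '
    {
        "actions" : [
            { "remove" : { "index" : "metrics-old", "alias" : "stats" } },
            { "add" : { "index" : "metrics-new", "alias" : "stats" } }
        ]
    }'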
  59. 2016-02-12 - Disable shard allocation of your Elasticsearch nodes; Tags: Disable shard allocation of your Elasticsearch nodes

    Disable shard allocation of your Elasticsearch nodes

    If you are going to upgrade your cluster or restart a node, it is wise to disable shard allocation first. This prevents Elasticsearch from rebalancing missing shards until you tell it otherwise. This post demonstrates how to disable and enable shard allocation.

    Disable it:

    vinh@cinhtau:~> curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
        "transient" : {
            "cluster.routing.allocation.enable" : "none"
        }
    }'
    {"acknowledged":true,"persistent":{},"transient":{"cluster":{"routing":{"allocation":{"enable":"none"}}}}}
    

    Perform the upgrade and after the restart enable it:

    vinh@cinhtau:~> curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
        "transient" : {
            "cluster.routing.allocation.enable" : "all"
        }
    }'
    
  60. 2016-02-11 - Check for running Elasticsearch instance in a shell script; Tags: Check for running Elasticsearch instance in a shell script

    Check for running Elasticsearch instance in a shell script

    This small shell script checks if an Elasticsearch instance is running by querying the REST API.

    #!/usr/bin/env bash
    PORT=9200
    URL="http://localhost:$PORT"
    # Check that Elasticsearch is running and reachable
    curl -s "$URL" > /dev/null 2>&1
    if [ $? != 0 ]; then
        echo "Unable to contact Elasticsearch on port $PORT."
        echo "Please ensure Elasticsearch is running and can be reached at $URL"
        exit 1
    fi
    
  61. 2016-02-10 - Parsing output with multiple whitespace; Tags: Parsing output with multiple whitespace

    Parsing output with multiple whitespace

    This post demonstrates how to parse output whose columns are separated by multiple whitespace characters in the bash/shell.

    I have to implement some elasticsearch curator functions in the shell, since python is not an option on my machine :-( . I query elasticsearch for the catalog of indices.

    vinh@cinhtau:~> curl -s http://localhost:9200/_cat/indices?v
    health status index                  pri rep docs.count docs.deleted store.size pri.store.size
    green  open   logstash-2016.02.06      5   1    1899524      1077536      4.4gb          2.2gb
    green  open   logstash-2016.02.05      5   1    3051521      1078468      6.1gb            3gb
           close  logstash-2016.02.04
           close  logstash-2016.02.03
    green  open   logstash-2016.02.09      5   1    3571320      1077284      6.1gb            3gb
    green  open   logstash-2016.02.08      5   1    3854980      1076828      8.3gb          4.1gb
    green  open   logstash-2016.02.07      5   1    1384753      1077256      3.5gb          1.7gb
    green  open   .marvel-es-2016.02.10    1   1     415332         2970    393.9mb        196.9mb
    green  open   .kibana                  1   1         53            4    245.3kb        122.1kb
    green  open   .marvel-es-2016.02.08    1   1     113514          850     97.4mb         48.7mb
    green  open   .marvel-es-2016.02.09    1   1     348231         2682    332.2mb          166mb
    green  open   logstash-2016.02.12      5   1    1623111            0      5.9gb          2.8gb
    green  open   logstash-2016.02.11      5   1    2748311        42212      5.9gb          2.9gb
    green  open   logstash-2016.02.10      5   1    4494718      1021304      8.3gb          4.1gb
    ..
    

    If you try cut with the delimiter ' ', it won't work because of the multiple spaces between the status and the index name. In this case you can use awk with the field separator regex ' +' (one or more spaces). A nice side effect: for closed indices the leading spaces produce an empty first field, so the index name always ends up in column 3:

    vinh@cinhtau:~> curl -s http://localhost:9200/_cat/indices | awk -F ' +' '{print $3}'
    logstash-2016.02.06
    logstash-2016.01.15
    logstash-2016.01.16
    logstash-2016.02.05
    logstash-2016.02.04
    logstash-2016.01.13
    logstash-2016.02.03
    logstash-2016.02.09
    logstash-2016.02.08
    logstash-2016.01.17
    logstash-2016.01.18
    logstash-2016.02.07
    .marvel-es-2016.02.10
    .kibana
    
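
    Depending on your Elasticsearch version there is an even simpler way: the _cat endpoints accept an h parameter to select output columns, which avoids the whitespace parsing entirely:

    curl -s 'http://localhost:9200/_cat/indices?h=index'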
  62. 2016-02-09 - Delete multiple documents from an Elasticsearch Index; Tags: Delete multiple documents from an Elasticsearch Index

    Delete multiple documents from an Elasticsearch Index

    This post demonstrates how to delete documents from an index in Elasticsearch that meet the search criteria of a query. You may have situations where you are reporting to the wrong index. Therefore this post gives you a solution for cleaning up your index.

    The following example deletes all documents of type books from the index logstash-2016.02.10.

    vinh@cinhtau:/var/elasticsearch> curl -XDELETE 'http://localhost:9200/logstash-2016.02.10/_query' -d '
    {
      "query": {
        "match": {
          "type": {
            "query": "books",
            "type": "phrase"
          }
        }
      }
    }'
    {"took":69645,"timed_out":false,"_indices":{"_all":{"found":176720,"deleted":176720,"missing":0,"failed":0},"logstash-2016.02.10":{"found":176720,"deleted":176720,"missing":0,"failed":0}},..
    

    Keep in mind that elasticsearch running in a cluster has no problems with deleting multiple documents. In the past the API could crash a single running instance of elasticsearch v1.x. In Elasticsearch 2 this functionality was reworked and moved out of the elasticsearch core; since Elasticsearch 2.2 delete by query is provided by the delete-by-query plugin and is much more mature. To check for plugins, just get the node information and look for the plugin. If it is installed, it will be listed in the result.
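
    Either of the following requests should list the plugin if it is installed:

    curl -s 'http://localhost:9200/_nodes/plugins?pretty'
    curl -s 'http://localhost:9200/_cat/plugins?v'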

  63. 2016-02-08 - Installing Elasticsearch Plugins behind a proxy or offline; Tags: Installing Elasticsearch Plugins behind a proxy or offline

    Installing Elasticsearch Plugins behind a proxy or offline

    This post demonstrates how to install elasticsearch or kibana plugins behind a proxy. There are several possibilities to achieve an online or offline installation. The following example installs the delete-by-query plugin for elasticsearch v2.2. It is a great improvement over the previous buggy and non-performant implementation, which elastic has removed from the elasticsearch core.

    Usually you would install it online the regular way

    vinh@cinhtau:/opt/elasticsearch> bin/plugin install delete-by-query
    

    Let's see what the plugin command answers:

    vinh@cinhtau:/opt/elasticsearch> bin/plugin install delete-by-query --verbose
    -> Installing delete-by-query...
    Trying https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin/delete-by-query/2.2.0/delete-by-query-2.2.0.zip ...
    Failed: ConnectException[Connection refused]
    ERROR: failed to download out of all possible locations..., use --verbose to get detailed information
    

    To see the options for install:

    vinh@cinhtau:/opt/elasticsearch> bin/plugin install -h
    

    If you have to go through a proxy, you can use the following Java system properties to download it online. The installation requires an https connection.

    • https.proxyHost
    • https.proxyPort
    • https.proxyUser (required if auth is mandatory)
    • https.proxyPassword (required if auth is mandatory)

    Example for Elasticsearch on Windows

    C:\TEMP\elasticsearch-2.2.0>bin\plugin.bat -Dhttps.proxyPort=8080 -Dhttps.proxyHost=proxy.cinhtau.net -Dhttps.proxyUser=vinh -Dhttps.proxyPassword=secret install delete-by-query
    -> Installing delete-by-query...
    Trying https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin/delete-by-query/2.2.0/delete-by-query-2.2.0.zip ...
    Downloading ..DONE
    Verifying https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin/delete-by-query/2.2.0/delete-by-query-2.2.0.zip checksums if available ...
    Downloading .DONE
    Installed delete-by-query into C:\TEMP\elasticsearch-2.2.0\plugins\delete-by-query
    

    If you have no proxy credentials or no outbound access at all, you can also perform an offline installation. Just download the zip file from the given url and install it with:

    vinh@cinhtau:/opt/elasticsearch> bin/plugin install file:///tmp/delete-by-query-2.2.0.zip
    -> Installing from file:/tmp/delete-by-query-2.2.0.zip...
    Trying file:/tmp/delete-by-query-2.2.0.zip ...
    Downloading .DONE
    Verifying file:/tmp/delete-by-query-2.2.0.zip checksums if available ...
    NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
    Installed delete-by-query into /var/opt/RiskShield/elasticsearch/plugins/delete-by-query
    
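
    The zip itself can be fetched beforehand on any machine with (proxied) internet access, for instance with curl, which honours the https_proxy environment variable (proxy host reused from the Windows example above):

    https_proxy=http://proxy.cinhtau.net:8080 curl -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin/delete-by-query/2.2.0/delete-by-query-2.2.0.zip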

    Remember that the CLI API is slightly different in elasticsearch v1.x.

  64. 2015-10-16 - List and check alias in elasticsearch; Tags: List and check alias in elasticsearch

    List and check alias in elasticsearch

    This post gives an example of how to check an alias or aliases in elasticsearch with the REST API.

    As an example we create an alias:

    curl -XPOST 'localhost:9200/_aliases' -d '
    {
        "actions" : [
            { "add" : { "index" : "metrics*", "alias" : "stats" } }
        ]
    }'
    

    This checks which aliases are set for the index metrics:

    curl -XGET 'localhost:9200/metrics/_alias/*'
    

    Elasticsearch allows you to use asterisks if you want to check all indices:

    dev@cinhtau:~> curl -XGET 'localhost:9200/*/_alias/*?pretty'
    {
        "metrics-2015.10.01": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.12": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.02": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.13": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.09": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.05": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.11": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.15": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.04": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.10": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.14": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.06": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2025.10.02": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.08": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.16": {
            "aliases": {
                "stats": {}
            }
        },
        "metrics-2015.10.07": {
            "aliases": {
                "stats": {}
            }
        }
    }
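
    The lookup also works the other way around, i.e. asking which indices carry a given alias:

    curl -XGET 'localhost:9200/_alias/stats?pretty'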