This role installs Elasticsearch from the official repository.
You must define the Elastic repository major version in the form "N.x", where N is the major version, for example:
elastic_major_version: "7.x"
The 3 currently supported versions are "7.x", "6.x" and "5.x"; for any other version, you need to create a new jvm.options template file.
This variable is also used for other elastic.co tools such as Logstash.
You must also define the Elasticsearch cluster name, respecting the naming convention (mode is either "prod" or "test"):
elasticsearch_clustername: "center-stats-mode-client"
The variable you are most likely to change is the Java heap size, which defaults to:
elasticsearch_heap_size: "10g"
As a rule of thumb, Elastic recommends giving the heap at most half of the available RAM, and keeping it below ~32g so the JVM can use compressed object pointers.
Other variables have sane defaults and should not be changed unless you know what you are doing:
elasticsearch_node_name: "${HOSTNAME}"
elasticsearch_node_name_path_data: "/var/lib/elasticsearch"
elasticsearch_node_name_path_logs: "/var/log/elasticsearch"
elasticsearch_node_name_network_host: "_site_"
elasticsearch_cluster_routing_allocation:
  cluster_concurrent_rebalance: 4
  node_concurrent_recoveries: 4
  node_initial_primaries_recoveries: 8
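Putting the variables above together, a group_vars file for this role could look like this (the values shown are illustrative, not prescriptive):

```yaml
elastic_major_version: "7.x"
elasticsearch_clustername: "center-stats-prod-client"
elasticsearch_heap_size: "10g"
elasticsearch_cluster_routing_allocation:
  cluster_concurrent_rebalance: 4
  node_concurrent_recoveries: 4
  node_initial_primaries_recoveries: 8
```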
And finally, you can define any arbitrary settings using the "elasticsearch_additional_config" object, like this for version 6.x:
elasticsearch_additional_config:
  action.destructive_requires_name: "true"
  script.painless.regex.enabled: "true"
  script.max_compilations_rate: "120/1m" # this is specific to 6.x
or another example for 5.x:
elasticsearch_additional_config:
  action.destructive_requires_name: "true"
  script.painless.regex.enabled: "true"
  script.max_compilations_per_minute: "1000" # this is specific to 5.x
To perform an update, add this to the command line: --extra-vars '{ "elasticsearch_update_now" : true }'.
You still have to double-check the settings that differ between major versions if you are doing a major update. For minor updates, the process should be painless.
To modify the systemd service for elasticsearch, the official documentation (at https://www.elastic.co/guide/en/elasticsearch/reference/master/setting-system-settings.html) explains that a systemd override file must be used.
This role uses an override file to change the following default values:
LimitNOFILE: "655360" # same as ulimit -n
LimitNPROC: "4096" # same as ulimit -u
LimitMEMLOCK: "infinity" # same as ulimit -l
You can override any of these 3 settings with this variable (settings left undefined keep the defaults above):
elasticsearch_systemd_override:
  LimitNOFILE: "655360"
  LimitNPROC: "4096"
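For reference, the resulting systemd drop-in (the conventional location is /etc/systemd/system/elasticsearch.service.d/override.conf; the exact path used by the role may differ) would look like:

```
[Service]
LimitNOFILE=655360
LimitNPROC=4096
LimitMEMLOCK=infinity
```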
Your cluster must have an odd number of master-eligible nodes with a quorum of (N/2)+1 nodes (minimum of 3 masters, giving a quorum of 2); this is necessary to avoid data loss through split brain. Look at the official documentation for more details.
To define a master-only node, you must specify this:
(!) set expected_data_nodes to the number of data nodes that must be up before the cluster starts; without replication, this means all of your data nodes.
elasticsearch_node:
  master: "true"
  data: "false"
  ingest: "false"
elasticsearch_gateway:
  expected_data_nodes: "3"
For a data node, use this instead:
elasticsearch_node:
  master: "false"
You also need to define the cluster topology on every node, with the DNS name of every node and the minimum number of masters required to form the cluster (= the quorum). This is an example:
elasticsearch_additional_config:
  discovery.zen.ping.unicast.hosts: '[ "center-stats-prod-o2k-1.cosium.com", "center-stats-prod-o2k-2.cosium.com", "center-stats-prod-o2k-3.cosium.com", "center-stats-prod-o2k-4.cosium.com", "center-stats-prod-o2k-5.cosium.com", "center-stats-prod-o2k-6.cosium.com" ]'
  discovery.zen.minimum_master_nodes: "2"
If you already defined elasticsearch_additional_config, just add those settings to the already defined variables.
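As a sanity check for discovery.zen.minimum_master_nodes, the quorum for N master-eligible nodes is floor(N/2) + 1. A small shell helper (illustrative only) to compute it:

```shell
#!/bin/sh
# Quorum for N master-eligible nodes: floor(N/2) + 1.
# With 3 masters the quorum is 2, with 5 it is 3.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 3   # prints 2
quorum 5   # prints 3
```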
By default, there is absolutely no security restricting access to the Elasticsearch instance; the only protection is the network.
To protect the instance, use iptables rules from the firewall role.
By default, JMX monitoring is active and listens on port 8301. You must also protect this port with a firewall rule because it is not protected by a login/password. I tried to use the classic JMX login/password mechanism but, for an unknown reason, it doesn't work.
You can deactivate the JMX monitoring by setting this variable to False:
elasticsearch_jvm_monitoring: False
General information:
# curl http://localhost:9200/
{
  "name" : "infra-log-elasticsearch-1",
  "cluster_name" : "infra-prod",
  "cluster_uuid" : "kbEf8yXQT1amAZrKhGZbTg",
  "version" : {
    "number" : "7.5.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "e9ccaed468e2fac2275a3761849cbee64b39519f",
    "build_date" : "2019-11-26T01:06:52.518245Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
Show indices:
# curl http://localhost:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open filebeat-7.5.0-2019.12.09-000001 2rCN7-qPQrS-HKG1tGPwvQ 1 1 0 0 460b 230b
green open .kibana_task_manager_1 GFzoyVwfQvOaolx46qlCaw 1 1 2 1 32.5kb 16.2kb
green open .apm-agent-configuration zVcE8tJWT_63J-tX1zcx-A 1 1 0 0 566b 283b
green open .kibana_1 LxaUmUqpR6ibZOXlbrNmhw 1 1 1058 44 1mb 514kb
Show mappings:
curl http://localhost:9200/_mapping
curl http://localhost:9200/filebeat-7.5.0-2019.12.05-000001/_mapping | jq .
Delete one or several indices:
curl -X DELETE "localhost:9200/filebeat-7.5.0?pretty"
curl -XDELETE 'http://localhost:9200/filebeat-*'
Import a template:
filebeat export template > filebeat.template.json
curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/_template/filebeat-7.5.0 -d@filebeat.template.json
See the ILM status if it exists:
curl -s http://localhost:9200/filebeat-7.5.0-2019.12.09-000001/_ilm/explain | jq .
Show shards status:
curl -s 'http://localhost:9200/_cat/shards'
Explain shards allocation issues:
curl -s "http://localhost:9200/_cluster/allocation/explain" | jq .
Retry failed shards allocation:
curl -X POST -s 'http://localhost:9200/_cluster/reroute?retry_failed=true'
A rolling upgrade keeps the cluster fully available while updating, but is very time-consuming; the method is described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/rolling-upgrades.html
A full cluster restart upgrade is faster, but it means shutting down all nodes in the cluster; the method is described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/restart-upgrade.html
Summary for full cluster restart upgrade:
1/ disable allocation via
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
'
For a cluster without too many indices, you can also change the configuration so that indices are not reallocated until a node has been down for more than 10 minutes. Be careful: this can take a while to apply because it has to be applied to every index:
curl -X PUT -u elastic:xxx "localhost:9200/_all/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "10m"
  }
}
'
2/ stop all nodes
3/ apt update && apt dist-upgrade && apt autoremove -y
4/ start all nodes
5/ wait for the status to turn yellow by checking curl -s http://localhost:9200/_cluster/health | jq, and for curl -X GET "localhost:9200/_cat/recovery?pretty" to return existing_store on every line
6/ re-enable shard allocation via:
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
'
If you changed the delayed_timeout value, reset it too:
curl -X PUT -u elastic:xxx "localhost:9200/_all/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": null
  }
}
'
7/ Since Elasticsearch 7.x, the cluster should come back reasonably quickly.
By default, Elasticsearch is not secured with a login/password; only the firewall protects it.
Securing Elasticsearch with a login/password also allows configuring access rights in Kibana.
(!) Currently this step is not handled automatically by Ansible.
This is mandatory: you must first add certificate security for internode communication.
Generate the CA for the internode communication:
/usr/share/elasticsearch/bin/elasticsearch-certutil ca
This will generate the CA at this location: /usr/share/elasticsearch/elastic-stack-ca.p12.
Then generate the certificate for the internode communication:
/usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca /usr/share/elasticsearch/elastic-stack-ca.p12
The certificate is generated at: /usr/share/elasticsearch/elastic-certificates.p12
Copy the file /usr/share/elasticsearch/elastic-certificates.p12 to /etc/elasticsearch/elastic-certificates.p12 on all nodes.
To enable X-Pack security, set the following:
xpack.security.enabled: "true"
xpack.security.transport.ssl.enabled: "true"
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
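Assuming these settings are applied through the same elasticsearch_additional_config mechanism described above (an assumption; check how the role templates elasticsearch.yml), this could be expressed as:

```yaml
elasticsearch_additional_config:
  xpack.security.enabled: "true"
  xpack.security.transport.ssl.enabled: "true"
  xpack.security.transport.ssl.verification_mode: certificate
  xpack.security.transport.ssl.client_authentication: required
  xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
  xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
```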
At this point, the cluster is NOT USABLE anymore. You must set up login and pass.
Use this command to generate the default login/password for elasticsearch:
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto
The admin user is elastic, use this user/pass for all super-admin actions.
To allow the monitoring to work, you need to set these variables:
elasticsearch_xpack_login: "elastic" # this is the default value, you can omit it
elasticsearch_xpack_password: "{{ lookup('hashi_vault', 'secret=cosium-kv/data/group_vars/name_of_group')['elastic'] }}"
If you are using Kibana to access the cluster, add the following to its configuration so that it can authenticate with a login/password:
kibana_extra_config:
  elasticsearch.username: "kibana_system"
  elasticsearch.password: "{{ lookup('hashi_vault', 'secret=cosium-kv/xxxxxxxxxxxxxxxxxxx')['kibana_system'] }}"