Cleanup Elasticsearch indices
When you manage several Elasticsearch clusters, one question is likely to come up sooner or later: how do I clean up old or unused indices? A well-established way to do this is Curator, a Python tool for managing Elasticsearch indices and snapshots.
Installation
python3 -m venv .venv3
source .venv3/bin/activate
pip install elasticsearch-curator==5.8.4
Configuration
Curator needs two files to perform operations: a client configuration file and an action file.
1/ config.yml
client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  username:
  password:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
This file mainly defines which Elasticsearch instances Curator should target, along with the connection and logging settings.
2/ action.yml
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days
      and indices starting with test-,
      and exclude alias called logs.
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
    - filtertype: pattern
      kind: prefix
      value: test-
    - filtertype: alias
      aliases: [ logs ]
      exclude: True
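In this example, we define one action (delete_indices) with three filters (see the description). The filters are applied in sequence, so an index is deleted only if it passes all of them: the date in its name is older than 30 days, its name starts with test-, and it is not associated with the logs alias.
Excluding the alias protects the indices behind it from removal.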
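To make the filter chain concrete, here is a minimal Python sketch of the selection logic described above. It is not Curator's implementation: the function names (age_from_name, select_for_deletion) and the sample index names are hypothetical, and the alias filter is modeled as a plain set of index names to keep. It assumes that an index must pass every filter, and that with source: name the age is read from a date embedded in the index name.

```python
import re
from datetime import datetime, timedelta

# Illustrative only: a simplified model of the three filters in action.yml.
# Function names and sample data are hypothetical, not part of Curator.

DATE_RE = re.compile(r"\d{4}\.\d{2}\.\d{2}")  # shape of the '%Y.%m.%d' timestring


def age_from_name(index_name, now):
    """Return the index age derived from a date in its name, or None."""
    match = DATE_RE.search(index_name)
    if match is None:
        return None  # no date in the name: this sketch treats it as not matching
    try:
        stamp = datetime.strptime(match.group(0), "%Y.%m.%d")
    except ValueError:
        return None  # digits in the right shape but not a valid date
    return now - stamp


def select_for_deletion(indices, protected, now, prefix="test-", days=30):
    """Keep only the indices that pass every filter (logical AND)."""
    selected = []
    for name in indices:
        age = age_from_name(name, now)
        if age is None or age <= timedelta(days=days):
            continue  # filtertype: age, direction: older
        if not name.startswith(prefix):
            continue  # filtertype: pattern, kind: prefix
        if name in protected:
            continue  # filtertype: alias, exclude: True
        selected.append(name)
    return selected


if __name__ == "__main__":
    now = datetime(2021, 6, 30)
    indices = [
        "test-2021.05.01",  # old, prefixed, not protected
        "test-2021.06.25",  # too recent
        "prod-2021.01.01",  # wrong prefix
        "test-2021.04.01",  # old, but behind the logs alias
    ]
    print(select_for_deletion(indices, {"test-2021.04.01"}, now))
```

With the sample data, only test-2021.05.01 survives all three checks. The real selection should always be confirmed with a dry run, since Curator reads aliases and index metadata from the cluster itself.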
Execution
$ curator --dry-run --config config.yml action.yml
Preparing Action ID: 1, "delete_indices"
...
...
DRY-RUN MODE. No changes will be made.
...
DRY-RUN: delete_indices: test-001 with arguments: {}
DRY-RUN: delete_indices: test-002 with arguments: {}
DRY-RUN: delete_indices: test-003 with arguments: {}
DRY-RUN: delete_indices: test-004 with arguments: {}
Action ID: 1, "delete_indices" completed.
Job completed.
Once we are satisfied with the output, we can remove the --dry-run flag and schedule the job (for example with cron).
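For scheduling, one option is a small Python wrapper called from cron or a similar scheduler. The sketch below uses only the curator command and the --dry-run and --config flags shown above. The function names and file paths are placeholders, and it assumes curator is on the PATH of the environment that runs it (for instance the virtualenv created during installation).

```python
import subprocess


def build_curator_command(config, action_file, dry_run=True):
    """Build the argument list for the curator call shown above."""
    cmd = ["curator"]
    if dry_run:
        cmd.append("--dry-run")  # keep this on until the output looks right
    cmd += ["--config", config, action_file]
    return cmd


def run_curator(config, action_file, dry_run=True):
    """Run curator and return its exit code (assumes curator is on PATH)."""
    result = subprocess.run(build_curator_command(config, action_file, dry_run))
    return result.returncode


if __name__ == "__main__":
    # Hypothetical paths: adjust to where the two files actually live.
    print(" ".join(build_curator_command("config.yml", "action.yml")))
```

Keeping dry_run=True as the default means a real deletion has to be requested explicitly, which is a cautious choice for a job that runs unattended.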
Note: the same pip package also provides curator_cli, which accepts all parameters on the command line.
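As a rough illustration, the filters from action.yml can be serialized to JSON for use on the command line. This sketch assumes curator_cli takes the filters as a JSON array through a --filter_list option and the target through --host; both should be confirmed with curator_cli --help before relying on them. The helper name filter_list_argument is hypothetical, and show_indices is used here because it only lists the matching indices.

```python
import json
import shlex

# The same three filters as in action.yml, expressed as Python dicts.
FILTERS = [
    {"filtertype": "age", "source": "name", "direction": "older",
     "timestring": "%Y.%m.%d", "unit": "days", "unit_count": 30},
    {"filtertype": "pattern", "kind": "prefix", "value": "test-"},
    {"filtertype": "alias", "aliases": ["logs"], "exclude": True},
]


def filter_list_argument(filters):
    """Serialize filters to compact JSON and quote it for a POSIX shell."""
    return shlex.quote(json.dumps(filters, separators=(",", ":")))


if __name__ == "__main__":
    # Assumed invocation shape: verify the option names with --help first.
    print("curator_cli --host 127.0.0.1 show_indices --filter_list "
          + filter_list_argument(FILTERS))
```

Generating the JSON from one Python structure avoids hand-quoting mistakes and keeps the command-line filters in step with the action file.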
Final thoughts
Curator is a solid tool for managing indices in Elasticsearch.
It can also take snapshots; combined with repository plugins, it provides a complete backup/restore solution.