Utility & Helper Methods

class curator.utils.TimestringSearch(timestring)

An object to allow repetitive search against a string, searchme, without having to repeatedly recreate the regex.

Parameters:timestring – An strftime pattern
get_epoch(searchme)

Return the epoch timestamp extracted from the timestring appearing in searchme.

Parameters:searchme – A string to be searched for a date pattern that matches timestring
Return type:int
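The compile-once idea can be sketched with the standard library alone. This is a minimal illustration, not Curator's implementation: the class name, the DIRECTIVE_REGEX table, and the choice to treat the parsed date as UTC are all assumptions of the sketch.

```python
import re
from calendar import timegm
from datetime import datetime

# Hypothetical mapping from strftime directives to regex fragments.
DIRECTIVE_REGEX = {
    "Y": r"\d{4}", "y": r"\d{2}", "m": r"\d{2}", "d": r"\d{2}",
    "H": r"\d{2}", "M": r"\d{2}", "S": r"\d{2}", "j": r"\d{3}",
}


class TimestringSearchSketch:
    """Compile the date regex once, then reuse it for every search."""

    def __init__(self, timestring):
        self.timestring = timestring
        parts = []
        i = 0
        while i < len(timestring):
            char = timestring[i]
            if char == "%" and i + 1 < len(timestring):
                # Translate a directive such as %Y into \d{4}.
                # An unsupported directive raises KeyError in this sketch.
                parts.append(DIRECTIVE_REGEX[timestring[i + 1]])
                i += 2
            else:
                parts.append(re.escape(char))
                i += 1
        self.pattern = re.compile("(?P<date>" + "".join(parts) + ")")

    def get_epoch(self, searchme):
        match = self.pattern.search(searchme)
        if match is None:
            return None
        parsed = datetime.strptime(match.group("date"), self.timestring)
        # Assumption: interpret the parsed date as UTC, in whole seconds.
        return timegm(parsed.timetuple())
```

A single instance can then be applied to many index names without rebuilding the pattern.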
curator.utils.byte_size(num, suffix='B')

Return a formatted string indicating the size in bytes, with the proper unit, e.g. KB, MB, GB, TB, etc.

Parameters:
  • num – The number of bytes
  • suffix – An arbitrary suffix, like Bytes
Return type:

str
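One common way to produce this kind of human-readable size is to divide by 1024 until the value fits the current unit. The function below is a sketch under that assumption, with a hypothetical name; it is not presented as Curator's code.

```python
def byte_size_sketch(num, suffix="B"):
    # Walk up the units until the value drops below 1024.
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= 1024.0
    # Anything larger is reported in yotta-units.
    return "%.1f%s%s" % (num, "Y", suffix)
```

Note that the result is a formatted string such as a size followed by its unit, not a number.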

curator.utils.check_csv(value)

Some of the curator methods should not operate against multiple indices at once. This method can be used to check if a list or csv has been sent.

Parameters:value – The value to test, if list or csv string
Return type:bool
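A test of this kind might look like the sketch below. The function name and the decision to raise TypeError for other input types are assumptions made for illustration.

```python
def check_csv_sketch(value):
    # Any list counts as "multiple values" in this sketch.
    if isinstance(value, list):
        return True
    if isinstance(value, str):
        # A string is a csv if splitting on commas yields several items.
        return len(value.split(",")) > 1
    raise TypeError("expected a list or a string, got %r" % type(value))
```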
curator.utils.check_master(client, master_only=False)

Check if connected client is the elected master node of the cluster. If not, cleanly exit with a log message.

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:None
curator.utils.check_version(client)

Verify version is within acceptable range. Raise an exception if it is not.

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:None
curator.utils.chunk_index_list(indices)

This utility chunks very large index lists into 3KB chunks. It measures the size as a csv string, then converts back into a list for the return value.

Parameters:indices – A list of indices to act on.
Return type:list
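The chunking described above can be sketched as follows: grow a csv string until adding the next name would exceed the limit, then split it back into a list. The function name, the max_chars parameter, and the exact limit of 3072 characters are assumptions of this sketch.

```python
def chunk_index_list_sketch(indices, max_chars=3072):
    chunks = []
    current = ""  # the chunk being built, measured as a csv string
    for index in indices:
        if not current:
            current = index
        elif len(current) + 1 + len(index) < max_chars:
            current += "," + index
        else:
            # The chunk is full: convert it back into a list and start anew.
            chunks.append(current.split(","))
            current = index
    if current:
        chunks.append(current.split(","))
    return chunks
```

Each returned chunk can then be passed to an API call whose URL length would otherwise be exceeded.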
curator.utils.create_repo_body(repo_type=None, compress=True, chunk_size=None, max_restore_bytes_per_sec=None, max_snapshot_bytes_per_sec=None, location=None, bucket=None, region=None, base_path=None, access_key=None, secret_key=None, **kwargs)

Build the ‘body’ portion for use in creating a repository.

Parameters:
  • repo_type – The type of repository (presently only fs and s3)
  • compress – Turn on compression of the snapshot files. Compression is applied only to metadata files (index mapping and settings). Data files are not compressed. (Default: True)
  • chunk_size – The chunk size can be specified in bytes or by using size value notation, i.e. 1g, 10m, 5k. Defaults to null (unlimited chunk size).
  • max_restore_bytes_per_sec – Throttles per node restore rate. Defaults to 20mb per second.
  • max_snapshot_bytes_per_sec – Throttles per node snapshot rate. Defaults to 20mb per second.
  • location – Location of the snapshots. Required.
  • bucket – S3 only. The name of the bucket to be used for snapshots. Required.
  • region – S3 only. The region where bucket is located. Defaults to US Standard
  • base_path – S3 only. Specifies the path within bucket to repository data. Defaults to value of repositories.s3.base_path or to root directory if not set.
  • access_key – S3 only. The access key to use for authentication. Defaults to value of cloud.aws.access_key.
  • secret_key – S3 only. The secret key to use for authentication. Defaults to value of cloud.aws.secret_key.
Returns:

A dictionary suitable for creating a repository from the provided arguments.

Return type:

dict
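As a rough illustration, a repository body in the shape Elasticsearch expects (a type plus a settings object) could be assembled like this. The function name is hypothetical, and dropping unset options so that Elasticsearch applies its own defaults is an assumption of the sketch.

```python
def create_repo_body_sketch(repo_type=None, compress=True, location=None,
                            bucket=None, **kwargs):
    if not repo_type:
        raise ValueError("repo_type is required")
    settings = {"compress": compress, "location": location, "bucket": bucket}
    # Extra options (chunk_size, region, base_path, ...) pass straight through.
    settings.update(kwargs)
    # Assumption: omit anything left as None rather than sending nulls.
    settings = {key: val for key, val in settings.items() if val is not None}
    return {"type": repo_type, "settings": settings}
```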

curator.utils.create_repository(client, **kwargs)

Create repository with repository and body settings

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use
  • repo_type – The type of repository (presently only fs and s3)
  • compress – Turn on compression of the snapshot files. Compression is applied only to metadata files (index mapping and settings). Data files are not compressed. (Default: True)
  • chunk_size – The chunk size can be specified in bytes or by using size value notation, i.e. 1g, 10m, 5k. Defaults to null (unlimited chunk size).
  • max_restore_bytes_per_sec – Throttles per node restore rate. Defaults to 20mb per second.
  • max_snapshot_bytes_per_sec – Throttles per node snapshot rate. Defaults to 20mb per second.
  • location – Location of the snapshots. Required.
  • bucket – S3 only. The name of the bucket to be used for snapshots. Required.
  • region – S3 only. The region where bucket is located. Defaults to US Standard
  • base_path – S3 only. Specifies the path within bucket to repository data. Defaults to value of repositories.s3.base_path or to root directory if not set.
  • access_key – S3 only. The access key to use for authentication. Defaults to value of cloud.aws.access_key.
  • secret_key – S3 only. The secret key to use for authentication. Defaults to value of cloud.aws.secret_key.
Returns:

A boolean value indicating success or failure.

Return type:

bool

curator.utils.create_snapshot_body(indices, ignore_unavailable=False, include_global_state=True, partial=False)

Create the request body for creating a snapshot from the provided arguments.

Parameters:
  • indices – A single index, or list of indices to snapshot.
  • ignore_unavailable (bool) – Ignore unavailable shards/indices. (default: False)
  • include_global_state (bool) – Store cluster global state with snapshot. (default: True)
  • partial (bool) – Do not fail if primary shard is unavailable. (default: False)
Return type:

dict
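A snapshot request body built from these arguments might resemble the following sketch. The function name is hypothetical, and joining the indices into a comma-separated string is an assumption about how the list is serialized.

```python
def create_snapshot_body_sketch(indices, ignore_unavailable=False,
                                include_global_state=True, partial=False):
    # Accept a single index name as well as a list of names.
    if isinstance(indices, str):
        indices = [indices]
    return {
        "indices": ",".join(indices),
        "ignore_unavailable": ignore_unavailable,
        "include_global_state": include_global_state,
        "partial": partial,
    }
```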

curator.utils.date_range(unit, range_from, range_to, epoch=None, week_starts_on='sunday')

Get the epoch start time and end time of a range of units, reckoning the start of the week (if that's the selected unit) based on week_starts_on, which can be either sunday or monday.

Parameters:
  • unit – One of hours, days, weeks, months, or years.
  • range_from – How many unit(s) in the past/future is the origin?
  • range_to – How many unit(s) in the past/future is the end point?
  • epoch – An epoch timestamp used to establish a point of reference for calculations.
  • week_starts_on – Either sunday or monday. Default is sunday
Return type:

tuple
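To make the range arithmetic concrete, here is a sketch limited to the days and weeks units. It assumes UTC, assumes the end of the range is the last whole second of the final unit, and uses a hypothetical function name; the real function supports more units and may differ in detail.

```python
import time
from datetime import datetime, timedelta, timezone


def date_range_sketch(unit, range_from, range_to, epoch=None,
                      week_starts_on="sunday"):
    if epoch is None:
        epoch = time.time()
    now = datetime.fromtimestamp(epoch, tz=timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "days":
        start_of_unit, step = midnight, timedelta(days=1)
    elif unit == "weeks":
        # datetime.weekday(): Monday == 0 ... Sunday == 6
        if week_starts_on == "monday":
            days_since_start = midnight.weekday()
        else:
            days_since_start = (midnight.weekday() + 1) % 7
        start_of_unit = midnight - timedelta(days=days_since_start)
        step = timedelta(weeks=1)
    else:
        raise ValueError("this sketch only handles days and weeks")
    start = start_of_unit + range_from * step
    # Assumption: the range ends one second before the next unit begins.
    end = start_of_unit + (range_to + 1) * step - timedelta(seconds=1)
    return int(start.timestamp()), int(end.timestamp())
```

With range_from and range_to both set to -1 and unit set to days, the sketch yields the start and end of the previous day relative to epoch.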

curator.utils.ensure_list(indices)

Return a list, even if indices is a single value

Parameters:indices – A list of indices to act upon
Return type:list
curator.utils.find_snapshot_tasks(client)

Check if there is snapshot activity in the Tasks API. Return True if activity is found, or False otherwise.

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:bool
curator.utils.fix_epoch(epoch)

Normalize epoch to an epoch timestamp in seconds, which should be 10 or fewer digits long.

Parameters:epoch – An epoch timestamp in seconds, milliseconds, microseconds, or even nanoseconds.
Return type:int
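One way to reduce a longer timestamp to seconds is to count digits and divide accordingly. The sketch below assumes that values of 11 to 13 digits are milliseconds and that anything longer should be cut back to 10 digits; the function name is hypothetical.

```python
def fix_epoch_sketch(epoch):
    epoch = int(epoch)
    digits = len(str(abs(epoch)))
    if digits <= 10:
        return epoch  # already in seconds
    if digits <= 13:
        return epoch // 1000  # assumed to be milliseconds
    # Microseconds, nanoseconds, and so on: keep the 10 leading digits.
    return epoch // 10 ** (digits - 10)
```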
curator.utils.get_client(**kwargs)

NOTE: AWS IAM parameters aws_key, aws_secret_key, and aws_region are provided for future compatibility, should AWS ES support the /_cluster/state/metadata endpoint. So long as this endpoint does not function in AWS ES, the client will not be able to use curator.indexlist.IndexList, which is the backbone of Curator 4.

Return an elasticsearch.Elasticsearch client object using the provided parameters. Any of the keyword arguments the elasticsearch.Elasticsearch client object can receive are valid, such as:

Parameters:
  • hosts (list) – A list of one or more Elasticsearch client hostnames or IP addresses to connect to. Can send a single host.
  • port (int) – The Elasticsearch client port to connect to.
  • url_prefix (str) – Optional url prefix, if needed to reach the Elasticsearch API (i.e., it’s not at the root level)
  • use_ssl (bool) – Whether to connect to the client via SSL/TLS
  • certificate – Path to SSL/TLS certificate
  • client_cert – Path to SSL/TLS client certificate (public key)
  • client_key – Path to SSL/TLS private key
  • aws_key – AWS IAM Access Key (Only used if the requests-aws4auth python module is installed)
  • aws_secret_key – AWS IAM Secret Access Key (Only used if the requests-aws4auth python module is installed)
  • aws_region – AWS Region (Only used if the requests-aws4auth python module is installed)
  • ssl_no_validate (bool) – If True, do not validate the certificate chain. This is an insecure option and you will see warnings in the log output.
  • http_auth (str) – Authentication credentials in user:pass format.
  • timeout (int) – Number of seconds before the client will timeout.
  • master_only (bool) – If True, the client will only connect if the endpoint is the elected master node of the cluster. This option does not work if `hosts` has more than one value. It will raise an Exception in that case.
  • skip_version_test – If True, skip the version check as part of the client connection.
Return type:

elasticsearch.Elasticsearch

curator.utils.get_date_regex(timestring)

Return a regex string based on a provided strftime timestring.

Parameters:timestring – An strftime pattern
Return type:str
curator.utils.get_datetime(index_timestamp, timestring)

Return the datetime extracted from the index name, which is the index creation time.

Parameters:
  • index_timestamp – The timestamp extracted from an index name
  • timestring – An strftime pattern
Return type:

datetime.datetime

curator.utils.get_indices(client)

Get the current list of indices from the cluster.

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:list
curator.utils.get_point_of_reference(unit, count, epoch=None)

Get a point-of-reference timestamp in epoch + milliseconds by deriving from a unit and a count, and an optional reference timestamp, epoch

Parameters:
  • unit – One of seconds, minutes, hours, days, weeks, months, or years.
  • count – The number of units. count * unit will be calculated out to the relative number of seconds.
  • epoch – An epoch timestamp used in conjunction with unit and count to establish a point of reference for calculations.
Return type:

int

curator.utils.get_repository(client, repository='')

Return configuration information for the indicated repository.

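Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use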
Return type:

dict

curator.utils.get_snapshot(client, repository=None, snapshot='')

Return information about a snapshot (or a comma-separated list of snapshots). If no snapshot is specified, it will return all snapshots. If none exist, an empty dictionary will be returned.

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use
  • snapshot – The snapshot name, or a comma-separated list of snapshots
Return type:

dict

curator.utils.get_snapshot_data(client, repository=None)

Get _all snapshots from repository and return a list.

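Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use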
Return type:

list

curator.utils.get_version(client)

Return the ES version number as a tuple. Omits trailing tags like -dev, or Beta

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:tuple
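Stripping a trailing tag and converting the remainder to a tuple can be sketched as below. The helper works on a plain version string; where that string comes from (for example the version number reported by the cluster) is outside the sketch, and the function name is hypothetical.

```python
def parse_version_sketch(version_string):
    # Drop a trailing tag such as -dev or -beta1.
    base = version_string.split("-")[0]
    # Keep major, minor, and patch as integers so tuples compare numerically.
    return tuple(int(part) for part in base.split(".")[:3])
```

Returning a tuple of integers allows range checks such as comparing against a minimum and maximum supported version.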
curator.utils.get_yaml(path)

Read the file identified by path and import its YAML contents.

Parameters:path – The path to a YAML configuration file.
Return type:dict
curator.utils.health_check(client, **kwargs)

This function calls client.cluster.health and, based on the args provided, will return True or False depending on whether that particular keyword appears in the output, and has the expected value. If multiple keys are provided, all must match for a True response.

Parameters:client – An elasticsearch.Elasticsearch client object
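The all-keys-must-match rule can be illustrated on a plain dictionary standing in for the cluster health response. The function name is hypothetical, and raising ValueError when no keyword is given is an assumption of the sketch.

```python
def health_check_sketch(health, **kwargs):
    # `health` stands in for the dictionary returned by the cluster health call.
    if not kwargs:
        raise ValueError("at least one keyword argument is required")
    # Every requested key must be present and hold the expected value.
    return all(key in health and health[key] == expected
               for key, expected in kwargs.items())
```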
curator.utils.is_master_node(client)

Return True if the connected client node is the elected master node in the Elasticsearch cluster, otherwise return False.

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:bool
curator.utils.name_to_node_id(client, name)

Return the node_id of the node identified by name

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:str
curator.utils.node_id_to_name(client, node_id)

Return the name of the node identified by node_id

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:str
curator.utils.node_roles(client, node_id)

Return the list of roles assigned to the node identified by node_id

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:list
curator.utils.parse_date_pattern(name)

Scan and parse name for time.strftime() strings, replacing them with the associated value when found, but otherwise returning lowercase values, as uppercase snapshot names are not allowed. It will detect if the first character is a <, which would indicate name is going to be using Elasticsearch date math syntax, and skip accordingly.

The time.strftime() identifiers that Curator currently recognizes as acceptable include:

  • Y: A 4 digit year
  • y: A 2 digit year
  • m: The 2 digit month
  • W: The 2 digit week of the year
  • d: The 2 digit day of the month
  • H: The 2 digit hour of the day, in 24 hour notation
  • M: The 2 digit minute of the hour
  • S: The 2 digit second of the minute
  • j: The 3 digit day of the year
Parameters:name – A name, which can contain time.strftime() strings
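The scanning behavior described above could look roughly like this. The function name and the now parameter (added so the result is reproducible) are assumptions, as is the choice to leave unrecognized identifiers untouched.

```python
from datetime import datetime, timezone

RECOGNIZED = "YymWdHMSj"  # the identifiers listed above


def parse_date_pattern_sketch(name, now=None):
    if name.startswith("<"):
        return name  # Elasticsearch date math: leave it alone
    if now is None:
        now = datetime.now(timezone.utc)
    rendered = []
    chars = iter(name)
    for char in chars:
        if char == "%":
            directive = next(chars, "")
            if directive and directive in RECOGNIZED:
                rendered.append(now.strftime("%" + directive))
            else:
                # Sketch choice: pass unrecognized identifiers through as-is.
                rendered.append("%" + directive)
        else:
            # Uppercase characters are not allowed, so lowercase everything else.
            rendered.append(char.lower())
    return "".join(rendered)
```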
curator.utils.prune_nones(mydict)

Remove keys from mydict whose values are None

Parameters:mydict – The dictionary to act on
Return type:dict
curator.utils.read_file(myfile)

Read a file and return the resulting data.

Parameters:myfile – A file to read.
Return type:str
curator.utils.report_failure(exception)

Raise a FailedExecution exception and include the original error message.

Parameters:exception – The upstream exception.
Return type:None
curator.utils.repository_exists(client, repository=None)

Verify the existence of a repository

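Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use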
Return type:

bool

curator.utils.restore_check(client, index_list)

This function calls client.indices.recovery with the list of indices to check for complete recovery. It will return True if recovery of those indices is complete, and False otherwise. It is designed to fail fast: if a single shard is encountered that is still recovering (not in DONE stage), it will immediately return False, rather than complete iterating over the rest of the response.

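Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • index_list – The list of indices to check for complete recovery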
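The fail-fast check can be sketched against a dictionary shaped like an index recovery response (index name, then a list of shards, each with a stage). The function name is hypothetical, and treating an empty response as "not yet recovered" is an assumption of the sketch.

```python
def restore_check_sketch(recovery_response):
    # Assumption: an empty response means there is nothing to confirm yet.
    if not recovery_response:
        return False
    for index_name, data in recovery_response.items():
        for shard in data.get("shards", []):
            if shard.get("stage") != "DONE":
                # Fail fast: one unfinished shard is enough to answer False.
                return False
    return True
```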
curator.utils.rollable_alias(client, alias)

Ensure that alias is an alias, and points to an index that can use the _rollover API.

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • alias – An Elasticsearch alias
curator.utils.safe_to_snap(client, repository=None, retry_interval=120, retry_count=3)

Ensure there are no snapshots in progress. Pause and retry accordingly

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use
  • retry_interval – Number of seconds to delay between retries. Default: 120 (seconds)
  • retry_count – Number of attempts to make. Default: 3
Return type:

bool

curator.utils.show_dry_run(ilo, action, **kwargs)

Log dry run output with the action which would have been executed.

Parameters:
  • ilo – A curator.indexlist.IndexList object
  • action – The action which would have been executed
curator.utils.single_data_path(client, node_id)

In order for a shrink to work, it should be on a single filesystem, as shards cannot span filesystems. Return True if the node has a single filesystem, and False otherwise.

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:bool
curator.utils.snapshot_check(client, snapshot=None, repository=None)

This function calls client.snapshot.get and tests to see whether the snapshot is complete, and if so, with what status. It will log errors according to the result. If the snapshot is still IN_PROGRESS, it will return False. SUCCESS will be an INFO level message, PARTIAL nets a WARNING message, FAILED is an ERROR message, and all others will be a WARNING level message.

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • snapshot – The name of the snapshot.
  • repository – The Elasticsearch snapshot repository to use
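The mapping from snapshot state to log level described above can be sketched with the standard logging module. The function names are hypothetical, the sketch takes the state string directly instead of calling the API, and returning True for every finished state is an assumption.

```python
import logging

STATE_LEVELS = {
    "SUCCESS": logging.INFO,
    "PARTIAL": logging.WARNING,
    "FAILED": logging.ERROR,
}


def level_for_state(state):
    # Any state not listed above is reported at WARNING level.
    return STATE_LEVELS.get(state, logging.WARNING)


def snapshot_check_sketch(state, logger=None):
    if state == "IN_PROGRESS":
        return False  # not finished yet, nothing to report
    logger = logger or logging.getLogger(__name__)
    logger.log(level_for_state(state), "Snapshot finished with state %s", state)
    return True  # assumption: any finished state counts as complete
```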
curator.utils.snapshot_in_progress(client, repository=None, snapshot=None)

Determine whether the provided snapshot in repository is IN_PROGRESS. If no value is provided for snapshot, then check all of them. Return snapshot if it is found to be in progress, or False otherwise.

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use
  • snapshot – The snapshot name
curator.utils.snapshot_running(client)

Return True if a snapshot is in progress, and False if not

Parameters:client – An elasticsearch.Elasticsearch client object
Return type:bool
curator.utils.task_check(client, task_id=None)

This function calls client.tasks.get with the provided task_id. If the task data contains 'completed': True, then it will return True. If the task is not completed, it will log some information about the task and return False.

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • task_id – A task_id which ostensibly matches a task searchable in the tasks API.
curator.utils.test_client_options(config)

Test whether the SSL/TLS files exist. Will raise an exception if the files cannot be read.

Parameters:config – A client configuration file data dictionary
Return type:None
curator.utils.test_repo_fs(client, repository=None)

Test whether all nodes have write access to the repository

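Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • repository – The Elasticsearch snapshot repository to use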
curator.utils.to_csv(indices)

Return a csv string from a list of indices, or a single value if only one value is present

Parameters:indices – A list of indices to act on, or a single value, which could be in the format of a csv string already.
Return type:str
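The list-to-csv conversion, together with the list coercion it relies on, might be sketched as follows. Both function names are hypothetical, and preserving the input order (rather than sorting) is an assumption.

```python
def ensure_list_sketch(indices):
    # Wrap a single value so callers can always iterate.
    return indices if isinstance(indices, list) else [indices]


def to_csv_sketch(indices):
    indices = ensure_list_sketch(indices)
    if len(indices) == 1:
        return indices[0]  # may already be a csv string
    return ",".join(indices)
```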
curator.utils.validate_actions(data)

Validate an Action configuration dictionary, as imported from actions.yml, for example.

The method returns a validated and sanitized configuration dictionary.

Parameters:data – The configuration dictionary
Return type:dict
curator.utils.validate_filters(action, filters)

Validate that the filters are appropriate for the action type, e.g. no index filters applied to a snapshot list.

Parameters:
  • action – An action name
  • filters – A list of filters to test.
curator.utils.verify_client_object(test)

Test if test is a proper elasticsearch.Elasticsearch client object and raise an exception if it is not.

Parameters:test – The variable or object to test
Return type:None
curator.utils.verify_index_list(test)

Test if test is a proper curator.indexlist.IndexList object and raise an exception if it is not.

Parameters:test – The variable or object to test
Return type:None
curator.utils.verify_snapshot_list(test)

Test if test is a proper curator.snapshotlist.SnapshotList object and raise an exception if it is not.

Parameters:test – The variable or object to test
Return type:None
curator.utils.wait_for_it(client, action, task_id=None, snapshot=None, repository=None, index_list=None, wait_interval=9, max_wait=-1)

This function becomes one place to do all wait_for_completion type behaviors

Parameters:
  • client – An elasticsearch.Elasticsearch client object
  • action – The action name that will identify how to wait
  • task_id – If the action provided a task_id, this is where it must be declared.
  • snapshot – The name of the snapshot.
  • repository – The Elasticsearch snapshot repository to use
  • wait_interval – How frequently the specified “wait” behavior will be polled to check for completion.
  • max_wait – Number of seconds the “wait” behavior will persist before giving up and raising an Exception. The default is -1, meaning it will try forever.
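The interplay of wait_interval and max_wait can be sketched as a generic polling loop. Here check is a hypothetical zero-argument callable standing in for whichever completion test the action requires, sleep is injectable for testing, and raising TimeoutError is an assumption about the exception type.

```python
import time


def wait_for_it_sketch(check, wait_interval=9, max_wait=-1, sleep=time.sleep):
    elapsed = 0
    while True:
        if check():
            return True
        # A max_wait of -1 means keep polling forever.
        if max_wait != -1 and elapsed >= max_wait:
            raise TimeoutError("gave up after %s seconds" % elapsed)
        sleep(wait_interval)
        elapsed += wait_interval
```

A caller would pass a small function that wraps the relevant completion check, for example one that reports whether a snapshot or task has finished.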
class curator.SchemaCheck(config, schema, test_what, location)

Validate config with the provided voluptuous schema. test_what and location are for reporting the results, in case of failure. If validation is successful, the method returns config as valid.

Parameters:
  • config (dict) – A configuration dictionary.
  • schema (voluptuous.Schema) – A voluptuous schema definition
  • test_what (str) – which configuration block is being validated
  • location (str) – A string to report which configuration sub-block is being tested.