candle.ckpt_keras_utils.CandleCkptKeras

class candle.ckpt_keras_utils.CandleCkptKeras(gParameters, logger='DEFAULT', verbose=True)

Keras Callback for CANDLE-compliant Benchmarks to use for checkpointing. Creates a JSON file alongside the weights and optimizer checkpoints that includes important metadata, particularly for restarting and tracking complex workflows.

__init__(gParameters, logger='DEFAULT', verbose=True)
Parameters:
  • logger (Logger) – The logger to use. May be None to disable or “DEFAULT” to use the default.

  • verbose (boolean) – If True, enable more verbose logging. Passed to helper_utils.set_up_logger(verbose) for this logger.
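For orientation, a minimal usage sketch. The ckpt_* parameter names and the restart return value shown here are illustrative assumptions; consult your benchmark's defaults for the actual checkpoint settings.

```python
# Hypothetical sketch: wiring CandleCkptKeras into a Keras training run.
# The ckpt_* keys below are illustrative; check your benchmark's parameters.
g_parameters = {
    "ckpt_save_best": True,               # keep the best-scoring epoch
    "ckpt_save_best_metric": "val_loss",  # metric used to judge "best"
    "ckpt_save_interval": 1,              # save every epoch
    "ckpt_keep_limit": 3,                 # retain at most 3 old epoch dirs
    "ckpt_directory": "./save",           # root of the ckpts tree
}

# ckpt = candle.CandleCkptKeras(g_parameters, verbose=True)
# ckpt.set_model(model)
# J = ckpt.restart(model)   # checkpoint metadata if resuming, else None
# initial_epoch = J["epoch"] if J is not None else 0
# model.fit(x, y, epochs=10, initial_epoch=initial_epoch, callbacks=[ckpt])
```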

Methods

__init__(gParameters[, logger, verbose])

logger (Logger): The logger to use.

build_model(model_file)

checksum(dir_work)

Simple checksum dispatch. dir_work: a pathlib.Path.

checksum_file(filename)

Read file, compute checksum, return it as a string.

ckpt_epoch(epoch, direction, metric_value)

Note: We immediately increment epoch from index-from-0 to index-from-1 to match the TensorFlow output. Normally, ckpts/best is the best saved state, and ckpts/last is the last saved state.

Procedure:
  1. Write current state to ckpts/work
  2. Rename ckpts/work to ckpts/epoch/NNN
  3. If best, link ckpts/best to ckpts/epoch/NNN
  4. Link ckpts/last to ckpts/epoch/NNN
  5. Clean up old ckpts according to keep policy

clean(epoch_now)

Clean old epoch directories

debug(message)

delete(epoch)

disabled(key)

Is this parameter set to False?

enabled(key)

Is this parameter set to True?

info(message)

keep(epoch, epoch_now, kept)

kept: Number of epochs already kept. Returns True if we are keeping this epoch, else False.
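A hypothetical sketch of such a keep policy, assuming a simple retention limit (the actual policy is governed by the checkpoint keep settings in gParameters and may differ):

```python
def keep(epoch: int, epoch_now: int, kept: int, limit: int = 5) -> bool:
    """Return True if this epoch should be retained (illustrative policy)."""
    if epoch == epoch_now:   # never delete the epoch that was just written
        return True
    return kept < limit      # retain older epochs only up to the limit
```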

on_batch_begin(batch[, logs])

A backwards compatibility alias for on_train_batch_begin.

on_batch_end(batch[, logs])

A backwards compatibility alias for on_train_batch_end.

on_epoch_begin(epoch[, logs])

Called at the start of an epoch.

on_epoch_end(epoch[, logs])

Called at the end of an epoch.

on_predict_batch_begin(batch[, logs])

Called at the beginning of a batch in predict methods.

on_predict_batch_end(batch[, logs])

Called at the end of a batch in predict methods.

on_predict_begin([logs])

Called at the beginning of prediction.

on_predict_end([logs])

Called at the end of prediction.

on_test_batch_begin(batch[, logs])

Called at the beginning of a batch in evaluate methods.

on_test_batch_end(batch[, logs])

Called at the end of a batch in evaluate methods.

on_test_begin([logs])

Called at the beginning of evaluation or validation.

on_test_end([logs])

Called at the end of evaluation or validation.

on_train_batch_begin(batch[, logs])

Called at the beginning of a training batch in fit methods.

on_train_batch_end(batch[, logs])

Called at the end of a training batch in fit methods.

on_train_begin([logs])

Called at the beginning of training.

on_train_end([logs])

Called at the end of training.

param(key, dflt[, type_, allowed])

Pull key from parameters with type checks and conversions

param_allowed(key, value, allowed)

Check that the value is in the list of allowed values. If allowed is None, there is no check and the call simply succeeds.

param_type_check(key, value, type_)

Check that value is convertible to the given type.

param_type_check_bool(key, value)

param_type_check_float(key, value, type_)

param_type_check_int(key, value, type_)

relpath(p)

If Path p is relative to CWD, relativize it and return it.

report_final()

report_initial()

Simply report that we are ready to run

restart(model[, verbose])

Possibly restarts the model from a saved checkpoint, according to the given settings and ckpt-info.json.

restart_json(directory)

save_check(epoch, direction, metric_value)

Make sure we want to save this epoch based on the model metrics in the given logs. Also updates epoch_best if appropriate. epoch: the current epoch (just completed). direction: either "+" (metric_value should increase) or "-" (should decrease). metric_value: the current checkpoint metric value.
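The direction test described above can be sketched as a small predicate ("+" means a larger metric is an improvement, "-" means a smaller one is); the function name here is illustrative, not part of the API:

```python
def improved(direction: str, metric_value: float, best_value: float) -> bool:
    """Did metric_value improve on best_value in the given direction?"""
    if direction == "+":
        return metric_value > best_value   # e.g. accuracy should increase
    if direction == "-":
        return metric_value < best_value   # e.g. val_loss should decrease
    raise ValueError(f"direction must be '+' or '-', got {direction!r}")
```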

save_check_best(epoch, direction, metric_value)

scan_params(gParams)

Simply translate gParameters into instance fields

set_model(model)

model: The Keras model

set_params(params)

symlink(src, dst)

Like os.symlink(), but overwrites dst and logs the operation.

write_json(jsonfile, epoch)

write_model(dir_work, epoch)

Do the I/O and report stats. dir_work: a pathlib.Path.

write_model_backend(model, epoch)