candle.ckpt_keras_utils.CandleCkptKeras
- class candle.ckpt_keras_utils.CandleCkptKeras(gParameters, logger='DEFAULT', verbose=True)
Keras Callback for CANDLE-compliant Benchmarks to use for checkpointing. Creates a JSON file alongside the weights and optimizer checkpoints that includes important metadata, particularly for restarting and tracking complex workflows.
- __init__(gParameters, logger='DEFAULT', verbose=True)
- Parameters:
logger (Logger) – The logger to use. May be None to disable or “DEFAULT” to use the default.
verbose (boolean) – If True, use more verbose logging. Passed to helper_utils.set_up_logger(verbose) for this logger.
Methods
__init__
(gParameters[, logger, verbose])logger (Logger): The logger to use.
build_model
(model_file)
checksum
(dir_work)Simple checksum dispatch. dir_work: A pathlib.Path.
checksum_file
(filename)Read file, compute checksum, return it as a string.
ckpt_epoch
(epoch, direction, metric_value)Note: We immediately increment epoch from index-from-0 to index-from-1 to match the TensorFlow output. Normally, ckpts/best is the best saved state, and ckpts/last is the last saved state. Procedure:
1. Write current state to ckpts/work.
2. Rename ckpts/work to ckpts/epoch/NNN.
3. If best, link ckpts/best to ckpts/epoch/NNN.
4. Link ckpts/last to ckpts/epoch/NNN.
5. Clean up old ckpts according to keep policy.
clean
(epoch_now)Clean old epoch directories
debug
(message)
delete
(epoch)
disabled
(key)Is this parameter set to False?
enabled
(key)Is this parameter set to True?
info
(message)
keep
(epoch, epoch_now, kept)kept: Number of epochs already kept. Returns True if we are keeping this epoch, else False.
on_batch_begin
(batch[, logs])A backwards compatibility alias for on_train_batch_begin.
on_batch_end
(batch[, logs])A backwards compatibility alias for on_train_batch_end.
on_epoch_begin
(epoch[, logs])Called at the start of an epoch.
on_epoch_end
(epoch[, logs])Called at the end of an epoch.
on_predict_batch_begin
(batch[, logs])Called at the beginning of a batch in predict methods.
on_predict_batch_end
(batch[, logs])Called at the end of a batch in predict methods.
on_predict_begin
([logs])Called at the beginning of prediction.
on_predict_end
([logs])Called at the end of prediction.
on_test_batch_begin
(batch[, logs])Called at the beginning of a batch in evaluate methods.
on_test_batch_end
(batch[, logs])Called at the end of a batch in evaluate methods.
on_test_begin
([logs])Called at the beginning of evaluation or validation.
on_test_end
([logs])Called at the end of evaluation or validation.
on_train_batch_begin
(batch[, logs])Called at the beginning of a training batch in fit methods.
on_train_batch_end
(batch[, logs])Called at the end of a training batch in fit methods.
on_train_begin
([logs])Called at the beginning of training.
on_train_end
([logs])Called at the end of training.
param
(key, dflt[, type_, allowed])Pull key from parameters with type checks and conversions
param_allowed
(key, value, allowed)Check that the value is in the list of allowed values. If allowed is None, there is no check and the call simply succeeds.
param_type_check
(key, value, type_)Check that value is convertible to the given type.
param_type_check_bool
(key, value)
param_type_check_float
(key, value, type_)
param_type_check_int
(key, value, type_)
relpath
(p)If Path p is relative to CWD, relativize it and return it.
report_final
()
report_initial
()Simply report that we are ready to run.
restart
(model[, verbose])Possibly restarts model from CheckpointCallback according to given settings and the ckpt-info.json.
restart_json
(directory)
save_check
(epoch, direction, metric_value)Make sure we want to save this epoch based on the model metrics in given logs. Also updates epoch_best if appropriate. epoch: The current epoch (just completed). direction: Either "+" (metric_value should increase) or "-" (should decrease). metric_value: The current ckpt metric value.
save_check_best
(epoch, direction, metric_value)
scan_params
(gParams)Simply translate gParameters into instance fields.
set_model
(model)model: The Keras model
set_params
(params)
symlink
(src, dst)Like os.symlink, but overwrites dst and logs.
write_json
(jsonfile, epoch)
write_model
(dir_work, epoch)Do the I/O, report stats. dir_work: A pathlib.Path.
write_model_backend
(model, epoch)
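
The five-step procedure listed under ckpt_epoch can be illustrated with a small standalone sketch. This is not the CANDLE implementation: the names replace_symlink and ckpt_epoch_sketch are hypothetical, a text file stands in for the saved model, and the keep policy shown (keep the two newest epoch directories plus anything ckpts/best or ckpts/last points to) is an assumption, since the actual policy is not described here. It uses only the Python standard library and assumes a platform where symlinks can be created.

```python
import os
import shutil
import tempfile
from pathlib import Path


def replace_symlink(src: Path, dst: Path) -> None:
    """Hypothetical helper: point dst at src, replacing an existing link."""
    if dst.is_symlink():
        dst.unlink()
    # A relative target keeps the links valid if the ckpts tree is moved.
    os.symlink(os.path.relpath(src, dst.parent), dst)


def ckpt_epoch_sketch(ckpts: Path, epoch: int, is_best: bool,
                      keep_limit: int = 2) -> Path:
    """Walk through the five documented steps for a 1-based epoch number."""
    # 1. Write current state to ckpts/work (placeholder file, not a real model).
    work = ckpts / "work"
    work.mkdir(parents=True, exist_ok=True)
    (work / "model.txt").write_text(f"state after epoch {epoch}\n")

    # 2. Rename ckpts/work to ckpts/epoch/NNN.
    dir_epoch = ckpts / "epoch" / f"{epoch:03d}"
    dir_epoch.parent.mkdir(parents=True, exist_ok=True)
    work.rename(dir_epoch)

    # 3. If best, link ckpts/best to ckpts/epoch/NNN.
    if is_best:
        replace_symlink(dir_epoch, ckpts / "best")

    # 4. Link ckpts/last to ckpts/epoch/NNN.
    replace_symlink(dir_epoch, ckpts / "last")

    # 5. Clean up old checkpoints (assumed policy: keep the newest
    #    keep_limit directories and whatever best/last point to).
    links = (ckpts / "best", ckpts / "last")
    protected = {p.resolve() for p in links if p.is_symlink()}
    old = sorted((ckpts / "epoch").iterdir())
    for d in old[:-keep_limit]:
        if d.resolve() not in protected:
            shutil.rmtree(d)
    return dir_epoch


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpts = Path(tmp) / "ckpts"
        for epoch, is_best in [(1, True), (2, False), (3, False), (4, True)]:
            ckpt_epoch_sketch(ckpts, epoch, is_best)
        print(sorted(p.name for p in (ckpts / "epoch").iterdir()))
        print((ckpts / "best").resolve().name, (ckpts / "last").resolve().name)
```

Writing into ckpts/work first and renaming afterwards means a partially written checkpoint never appears under ckpts/epoch, which is presumably why the documented procedure is ordered this way.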