candle.ckpt_pytorch_utils.CandleCkptPyTorch
- class candle.ckpt_pytorch_utils.CandleCkptPyTorch(gParams, logger='DEFAULT', verbose=True)
PyTorch callback for CANDLE-compliant benchmarks to use for checkpointing. Creates a JSON file alongside the weights and optimizer checkpoints that includes important metadata, particularly for restarting and tracking complex workflows.
- __init__(gParams, logger='DEFAULT', verbose=True)
- Parameters:
logger (Logger) – The logger to use. May be None to disable logging, or "DEFAULT" to use the default logger.
verbose (boolean) – If True, use more verbose logging. Passed to helper_utils.set_up_logger(verbose) for this logger.
Methods
- __init__(gParams[, logger, verbose]): Constructor. logger (Logger): the logger to use.
- build_model(model_file)
- checksum(dir_work): Simple checksum dispatch. dir_work: a pathlib.Path.
- checksum_file(filename): Read the file, compute its checksum, and return it as a string.
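The behavior described for checksum_file can be sketched in a few lines. This is a minimal illustration, not the CANDLE implementation; in particular, the choice of MD5 here is an assumption, as the documentation does not name the hash algorithm.

```python
import hashlib

def checksum_file(filename: str) -> str:
    """Read a file in chunks, compute its checksum, return it as a hex string."""
    h = hashlib.md5()  # hash algorithm is an assumption for illustration
    with open(filename, "rb") as f:
        # Read in 64 KiB chunks so large model files are not loaded at once
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```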
- ckpt_epoch(epoch, metric_value): The PyTorch training loop should call this each epoch.
- clean(epoch_now): Clean old epoch directories.
- debug(message)
- delete(epoch)
- disabled(key): Is this parameter set to False?
- enabled(key): Is this parameter set to True?
- info(message)
- keep(epoch, epoch_now, kept): kept: the number of epochs already kept. Returns True if we are keeping this epoch, else False.
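The kind of retention decision keep makes can be sketched as below. The keep_limit and epoch_best knobs are hypothetical stand-ins for the instance's configured checkpoint settings, not actual CANDLE parameter names, and the exact policy is an assumption.

```python
def keep(epoch: int, epoch_now: int, kept: int,
         keep_limit: int = 5, epoch_best: int = 0) -> bool:
    """Return True if this epoch's checkpoint directory should be kept.

    keep_limit and epoch_best are hypothetical stand-ins for
    instance settings; the real policy may differ.
    """
    if epoch == epoch_now:    # always keep the checkpoint just written
        return True
    if epoch == epoch_best:   # always keep the best epoch seen so far
        return True
    return kept < keep_limit  # otherwise keep only up to keep_limit older epochs
```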
- on_train_end([logs])
- param(key, dflt[, type_, allowed]): Pull key from the parameters, with type checks and conversions.
- param_allowed(key, value, allowed): Check that the value is in the list of allowed values. If allowed is None, there is no check; simply succeed.
- param_type_check(key, value, type_): Check that the value is convertible to the given type.
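The param/param_allowed/param_type_check pattern above can be illustrated with a standalone helper. This is a sketch of the pattern only: the real methods operate on the instance's scanned parameters, and the error handling shown here is an assumption.

```python
def param(params: dict, key: str, dflt, type_=None, allowed=None):
    """Pull key from a parameter dict, with a type conversion and an
    allowed-values check (a sketch of the pattern, not the CANDLE code)."""
    value = params.get(key, dflt)
    if type_ is not None:
        try:
            value = type_(value)  # e.g. int("5") -> 5
        except (TypeError, ValueError):
            raise ValueError(f"parameter '{key}': not a valid {type_.__name__}: {value!r}")
    if allowed is not None and value not in allowed:
        raise ValueError(f"parameter '{key}': {value!r} not in allowed values {allowed}")
    return value
```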
- param_type_check_bool(key, value)
- param_type_check_float(key, value, type_)
- param_type_check_int(key, value, type_)
- relpath(p): If Path p is relative to the CWD, relativize it and return it.
- report_final()
- report_initial(): Simply report that we are ready to run.
- restart(model[, verbose]): Possibly restart the model from a CheckpointCallback according to the given settings and ckpt-info.json.
- restart_json(directory)
- save_check(epoch, direction, metric_value): Decide whether to save this epoch based on the model metrics in the given logs; also updates epoch_best if appropriate. epoch: the current epoch (just completed). direction: either "+" (metric_value should increase) or "-" (it should decrease). metric_value: the current ckpt metric value.
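The "+"/"-" direction convention used by save_check can be shown with a minimal comparison helper. This only illustrates the convention; the real method also consults and updates the instance's saved state.

```python
def improves(metric_value: float, best_value: float, direction: str) -> bool:
    """Return True if metric_value improves on best_value.

    direction: "+" means the metric should increase (e.g. accuracy),
               "-" means it should decrease (e.g. loss).
    """
    if direction == "+":
        return metric_value > best_value
    if direction == "-":
        return metric_value < best_value
    raise ValueError(f"direction must be '+' or '-', got: {direction!r}")
```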
- save_check_best(epoch, direction, metric_value)
- scan_params(gParams): Simply translate gParameters into instance fields.
- set_model(model): model: a dict holding the model and optimizer: {'model': model, 'optimizer': optimizer}.
- symlink(src, dst): Like os.symlink, but overwrites dst and logs.
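The overwrite behavior of symlink (minus the logging) amounts to the following sketch:

```python
import os

def symlink(src, dst):
    """Like os.symlink, but overwrites dst if it already exists."""
    if os.path.lexists(dst):  # lexists is True even for a dangling symlink
        os.remove(dst)
    os.symlink(src, dst)
```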
- write_json(jsonfile, epoch)
- write_model(dir_work, epoch): Do the I/O and report stats. dir_work: a pathlib.Path.
- write_model_backend(model, epoch)
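The restart path reads the metadata back: restart consults ckpt-info.json, and restart_json takes a directory. A plausible sketch of that lookup follows; the return convention (None when no checkpoint metadata exists) is an assumption, since the method is documented only by name.

```python
import json
from pathlib import Path

def restart_json(directory):
    """Load checkpoint metadata from directory/ckpt-info.json, if present.

    Returning None when the file is absent is an assumed convention."""
    path = Path(directory) / "ckpt-info.json"
    if not path.exists():
        return None
    return json.loads(path.read_text())
```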