candle.uq_utils.compute_statistics_homoscedastic_summary


candle.uq_utils.compute_statistics_homoscedastic_summary(df_data, col_true=0, col_pred=6, col_std_pred=7)

Extracts the ground truth, mean prediction, error, and standard deviation of the prediction from an inference data frame. The data frame holds summary statistics computed over all the inference realizations.

Parameters:
  • df_data (pandas dataframe) – Data frame generated by current CANDLE inference experiments. Indices are hard coded to agree with current CANDLE version. (The inference file usually has the name: <model>_pred.tsv).

  • col_true (int) – Index of the column in the data frame where the true value is stored (Default: 0, index in current CANDLE format).

  • col_pred (int) – Index of the column in the data frame where the predicted value is stored (Default: 6, index in current CANDLE format).

  • col_std_pred (int) – Index of the column in the data frame where the standard deviation of the predicted values is stored (Default: 7, index in current CANDLE format).
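Because the column indices are positional, it can help to confirm that the defaults point at the intended columns before calling the function. The following is a minimal sketch, assuming a tab-separated inference file; the column names and values are made up for illustration and are not taken from a real CANDLE output.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of a <model>_pred.tsv file.
# The column names below are invented for this example.
tsv_text = (
    "AUC\tSample\tDrug1\tf1\tf2\tf3\tPred_mean\tPred_std\n"
    "0.50\tS1\tD1\t0\t0\t0\t0.40\t0.05\n"
    "0.80\tS2\tD2\t0\t0\t0\t0.90\t0.10\n"
)
df_data = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Default positional indices documented above.
col_true, col_pred, col_std_pred = 0, 6, 7

# Look up which column names those positions select.
selected = list(df_data.columns[[col_true, col_pred, col_std_pred]])
print(selected)
```

If the printed names are not the true value, the mean prediction, and the prediction standard deviation, pass different indices explicitly.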

Returns:

Tuple of five numpy arrays and one string

  • Ytrue (numpy array): Array with true (observed) values

  • Ypred_mean (numpy array): Array with predicted values (mean from summary).

  • yerror (numpy array): Array with errors computed (observed - predicted).

  • sigma (numpy array): Array with standard deviations learned with deep learning model. For homoscedastic inference this corresponds to the std value computed from prediction (and is equal to Ypred_std, the next returned value).

  • Ypred_std (numpy array): Array with standard deviations computed from regular (homoscedastic) inference.

  • pred_name (string): Name of data column or quantity predicted (as extracted from the data frame using the col_true index).
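The relationship between the inputs and the six returned values can be sketched as follows. This is an illustrative stand-in written from the description above, not the CANDLE source: the function name `summary_stats_sketch`, the toy data frame, and its column names are all hypothetical.

```python
import numpy as np
import pandas as pd


def summary_stats_sketch(df_data, col_true=0, col_pred=6, col_std_pred=7):
    """Hypothetical stand-in that mirrors the documented return values."""
    ytrue = df_data.iloc[:, col_true].to_numpy()
    ypred_mean = df_data.iloc[:, col_pred].to_numpy()
    ypred_std = df_data.iloc[:, col_std_pred].to_numpy()
    yerror = ytrue - ypred_mean          # observed - predicted
    sigma = ypred_std                    # homoscedastic case: same as ypred_std
    pred_name = df_data.columns[col_true]
    return ytrue, ypred_mean, yerror, sigma, ypred_std, pred_name


# Toy frame with eight columns so the default indices (0, 6, 7) apply.
# Column names and values are invented for this example.
df = pd.DataFrame(
    {
        "AUC": [0.5, 0.8, 0.3],
        "c1": [0, 0, 0],
        "c2": [0, 0, 0],
        "c3": [0, 0, 0],
        "c4": [0, 0, 0],
        "c5": [0, 0, 0],
        "Pred_mean": [0.4, 0.9, 0.3],
        "Pred_std": [0.05, 0.10, 0.02],
    }
)

ytrue, ypred_mean, yerror, sigma, ypred_std, pred_name = summary_stats_sketch(df)
print(pred_name)
print(yerror)
```

Here `sigma` and `ypred_std` hold the same values, matching the note above that the two coincide for homoscedastic inference.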