candle.uq_utils.compute_statistics_quantile

candle.uq_utils.compute_statistics_quantile(df_data, sigma_divisor=2.56, col_true=4, col_pred_start=6)

Extracts the following from an inference data frame: the ground truth, the mean prediction at the 50th percentile, the mean predictions at a low and a high percentile (usually the 1st and 9th deciles, respectively), the error (using the 5th decile, i.e. the 50th percentile), the standard deviation of the prediction (also using the 5th decile), and the predicted (learned) standard deviation derived from the interdecile range. The data frame includes all the individual inference realizations.

Parameters:
  • df_data (pandas dataframe) – Data frame generated by current quantile inference experiments. Indices are hard coded to agree with current version. (The inference file usually has the name: <model>.predicted_INFER_QTL.tsv).

  • sigma_divisor (float) – Divisor to convert from the interdecile range to the corresponding standard deviation for a Gaussian distribution. (Default: 2.56, consistent with an interdecile range computed from the difference between the 9th and 1st deciles).

  • col_true (int) – Index of the column in the data frame where the true value is stored (Default: 4, index in current QTL format).

  • col_pred_start (int) – Index of the column in the data frame where the first predicted value is stored. All the values predicted during inference are stored, interleaved with the other percentile predictions (Default: 6, with a step of 3 between consecutive predictions, in current QTL format).
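
The default sigma_divisor of 2.56 can be checked against the quantiles of a standard normal distribution. The snippet below is a minimal sketch using only the Python standard library; it is not part of candle, and the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Interdecile range: distance between the 9th decile (90th percentile)
# and the 1st decile (10th percentile).
interdecile_range = std_normal.inv_cdf(0.9) - std_normal.inv_cdf(0.1)

# For a Gaussian with standard deviation s, the interdecile range is
# interdecile_range * s, so dividing an observed range by this value
# recovers s.
print(round(interdecile_range, 2))
```

Rounded to two decimals, the value is 2.56, which matches the documented default.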

Returns:

Tuple of numpy arrays

  • Ytrue (numpy array): Array with true (observed) values

  • Ypred (numpy array): Array with predicted values (based on the 50th percentile).

  • yerror (numpy array): Array with errors computed (observed - predicted).

  • sigma (numpy array): Array with standard deviations learned with deep learning model. This corresponds to the interdecile range divided by the sigma divisor.

  • Ypred_std (numpy array): Array with standard deviations computed from regular (homoscedastic) inference.

  • pred_name (string): Name of data column or quantity predicted (as extracted from the data frame using the col_true index).

  • Ypred_Lp_mean (numpy array): Array with predicted values of the lower percentile (usually the 1st decile).

  • Ypred_Hp_mean (numpy array): Array with predicted values of the higher percentile (usually the 9th decile).
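
The relationship between the parameters and the returned values can be illustrated with a small sketch. This is not the candle implementation: the function name sketch_quantile_statistics, the column names, and the synthetic data are hypothetical. It also assumes that each realization occupies three consecutive columns in the order 50th percentile, low percentile, high percentile, and that the standard deviation is the population standard deviation (NumPy's default). Neither assumption is confirmed by the description above.

```python
import numpy as np
import pandas as pd


def sketch_quantile_statistics(df_data, sigma_divisor=2.56, col_true=4, col_pred_start=6):
    """Illustrative re-derivation of the documented outputs (not the library code)."""
    Ytrue = df_data.iloc[:, col_true].to_numpy()
    pred_name = df_data.columns[col_true]

    # ASSUMPTION: columns repeat as (50th percentile, low, high) with step 3.
    median_preds = df_data.iloc[:, col_pred_start::3].to_numpy()
    low_preds = df_data.iloc[:, col_pred_start + 1::3].to_numpy()
    high_preds = df_data.iloc[:, col_pred_start + 2::3].to_numpy()

    # Average each percentile over the inference realizations (row-wise).
    Ypred = median_preds.mean(axis=1)
    Ypred_Lp_mean = low_preds.mean(axis=1)
    Ypred_Hp_mean = high_preds.mean(axis=1)

    # Spread of the 50th percentile predictions across realizations
    # (ASSUMPTION: population standard deviation, ddof=0).
    Ypred_std = median_preds.std(axis=1)

    # Learned standard deviation: interdecile range over the divisor.
    sigma = (Ypred_Hp_mean - Ypred_Lp_mean) / sigma_divisor
    yerror = Ytrue - Ypred  # observed - predicted
    return Ytrue, Ypred, yerror, sigma, Ypred_std, pred_name, Ypred_Lp_mean, Ypred_Hp_mean


# Synthetic frame with hypothetical column names: two samples, two realizations.
df = pd.DataFrame({
    "c0": [0, 1], "c1": [0, 0], "c2": [0, 0], "c3": [0, 0],
    "target": [1.0, 2.0],   # column 4: true value
    "c5": [0.0, 0.0],
    "p50_r0": [1.0, 2.5], "low_r0": [0.0, 1.0], "high_r0": [2.56, 3.56],
    "p50_r1": [1.2, 1.5], "low_r1": [0.0, 1.0], "high_r1": [2.56, 3.56],
})

Ytrue, Ypred, yerror, sigma, Ypred_std, pred_name, Lp, Hp = sketch_quantile_statistics(df)
```

With this synthetic frame, each row has an interdecile range of 2.56, so sigma is 1.0 for both samples, and pred_name is the label of column 4.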