candle.viz_utils.plot_contamination

candle.viz_utils.plot_contamination(y_true, y_pred, sigma, T=None, thresC=0.1, pred_name=None, figprefix=None)

Plots results for the contamination model, including the latent variables T when they are given (i.e. when the results provided come from training). The global parameters of the normal distribution are used to shade the 80% confidence interval. For training results (i.e. T is available), samples determined to be outliers are highlighted; these are the samples whose probability of membership in the heavy-tailed (Cauchy) distribution is greater than the given threshold. The generated plot(s) are stored in a PNG file.
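As a rough illustration of the shaded 80% interval, the sketch below computes a band from sigma. It assumes the band is centered on the predictions with half-width z·sigma, which is the usual construction for a normal distribution; the function's internal computation may differ. The values of `sigma` and `y_pred` are made up for the example.

```python
from statistics import NormalDist

import numpy as np

# Made-up inputs, for illustration only.
sigma = 0.5
y_pred = np.array([1.0, 2.0, 3.0])

# A two-sided 80% interval leaves 10% in each tail,
# so the multiplier is the 0.90 quantile of the standard normal.
z80 = NormalDist().inv_cdf(0.90)  # about 1.2816

# Assumption: the band is centered on the predicted values.
lower = y_pred - z80 * sigma
upper = y_pred + z80 * sigma
```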

Parameters:
  • y_true (numpy array) – Array with observed values.

  • y_pred (numpy array) – Array with predicted values.

  • sigma (float) – Standard deviation of the normal distribution.

  • T (numpy array) – Array with latent variables (i.e. membership to normal and heavy-tailed distributions). Leave as None (the default) when T is not available, as in testing.

  • thresC (float) – Threshold used to label outliers. A sample is labeled an outlier when its probability of membership in the heavy-tailed distribution exceeds the threshold, i.e. T[:,1] > thresC.

  • pred_name (string) – Name of the data column or quantity predicted (e.g. growth, AUC, etc.).

  • figprefix (string) – Prefix for the filename in which the generated figures are stored. The string ‘_contamination.png’ is appended to the given figprefix.
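The outlier rule described for thresC can be sketched with plain NumPy. The array `T` below is hypothetical data, and treating column 0 as the normal-distribution membership is an assumption; the documentation only states that column 1 holds the heavy-tailed membership.

```python
import numpy as np

# Hypothetical latent variables, one row per sample.
# Assumed layout: column 0 = normal membership, column 1 = heavy-tailed (Cauchy) membership.
T = np.array([
    [0.98, 0.02],
    [0.95, 0.05],
    [0.40, 0.60],
    [0.85, 0.15],
])

thresC = 0.1  # default threshold from the signature

# A sample is labeled an outlier when T[:, 1] > thresC.
is_outlier = T[:, 1] > thresC
outlier_idx = np.flatnonzero(is_outlier)
```

With matching `y_true` and `y_pred` arrays and a `sigma` value, and assuming candle is importable, the same `T` could be passed as `candle.viz_utils.plot_contamination(y_true, y_pred, sigma, T=T, thresC=0.1, pred_name="growth", figprefix="run1")`. The `"growth"` and `"run1"` values are placeholders.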