candle.P1_utils.generalization_feature_selection


candle.P1_utils.generalization_feature_selection(data1, data2, measure, cutoff)

This function selects the features that are generalizable between data1 and data2, using either the Pearson correlation coefficient or the concordance correlation coefficient, as chosen by measure.

Parameters:

data1: 2D numpy array of the first dataset with a size of (n_samples_1, n_features)

data2: 2D numpy array of the second dataset with a size of (n_samples_2, n_features)

measure: string. 'pearson' indicates the Pearson correlation coefficient; 'ccc' indicates the concordance correlation coefficient. Default is 'pearson'.

cutoff: a positive number for selecting generalizable features. If cutoff < 1, this function selects the features with a correlation coefficient >= cutoff. If cutoff >= 1, it must be an integer indicating the number of features to be selected based on correlation coefficient.
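The two modes of cutoff can be illustrated with a minimal sketch. The helper name select_by_cutoff and the scores array are hypothetical and not part of candle; the sketch assumes one correlation score per feature is already available, and the order of the returned indices is also an assumption.

```python
import numpy as np

def select_by_cutoff(scores, cutoff):
    """Hypothetical helper (not part of candle) mirroring the cutoff rule."""
    scores = np.asarray(scores, dtype=float)
    if cutoff < 1:
        # Threshold mode: keep every feature whose score is >= cutoff.
        return np.where(scores >= cutoff)[0]
    # Count mode: keep the int(cutoff) features with the highest scores.
    # Returning them in descending score order is an assumption of this sketch.
    order = np.argsort(-scores, kind="stable")
    return order[: int(cutoff)]

scores = np.array([0.9, 0.2, 0.75, 0.5])   # made-up per-feature scores
by_threshold = select_by_cutoff(scores, 0.6)
by_count = select_by_cutoff(scores, 3)
```

With these made-up scores, threshold mode keeps the features scoring at least 0.6, while count mode keeps the three highest-scoring features regardless of their absolute values.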

Returns:

fid: 1-D numpy array containing the indices of selected features.
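The page does not spell out how the per-feature correlation is computed, so the following is only a sketch under a stated assumption: a feature is scored by comparing its correlation profile with all other features in data1 against the same profile in data2. The names ccc and generalization_scores are hypothetical and are not the candle implementation; the synthetic data is for illustration only.

```python
import numpy as np

def ccc(x, y):
    """Concordance correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2.0 * cov_xy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

def generalization_scores(data1, data2, measure="pearson"):
    """Hypothetical helper: one score per feature (ASSUMED definition)."""
    # Feature-by-feature correlation matrix within each dataset.
    cor1 = np.corrcoef(data1, rowvar=False)
    cor2 = np.corrcoef(data2, rowvar=False)
    n_features = cor1.shape[0]
    scores = np.empty(n_features)
    for i in range(n_features):
        # Correlation profile of feature i with every other feature.
        others = np.arange(n_features) != i
        p1, p2 = cor1[others, i], cor2[others, i]
        if measure == "pearson":
            scores[i] = np.corrcoef(p1, p2)[0, 1]
        elif measure == "ccc":
            scores[i] = ccc(p1, p2)
        else:
            raise ValueError("measure must be 'pearson' or 'ccc'")
    return scores

rng = np.random.default_rng(0)
data1 = rng.normal(size=(50, 6))   # (n_samples_1, n_features)
data2 = rng.normal(size=(40, 6))   # (n_samples_2, n_features)

scores = generalization_scores(data1, data2, measure="ccc")
fid = np.argsort(-scores)[:3]      # indices of the 3 highest-scoring features
```

Unlike the Pearson coefficient, the concordance correlation coefficient also penalizes shifts in mean and scale, so two profiles that are perfectly linearly related but offset from each other score below 1.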