candle.data_preprocessing_utils.generate_cross_validation_partition

candle.data_preprocessing_utils.generate_cross_validation_partition(group_label, n_folds=5, n_repeats=1, portions=None, random_seed=None)

This function generates partition indices of samples for cross-validation analysis.

Parameters:
  • group_label – 1-D array or list of group labels, one per sample. If the samples are not grouped, supply a list of sample indices instead, so that partitions are generated from individual samples rather than from sample groups.

  • n_folds (int) – integer greater than 1, giving the number of folds for cross-validation. Default is 5.

  • n_repeats (int) – positive integer, indicating how many times the n_folds cross-validation is repeated. The total number of cross-validation trials is therefore n_folds * n_repeats. Default is 1.

  • portions – 1-D array or list of positive integers, indicating the number of data folds in each set (e.g. training set, testing set, or validation set) after partitioning. The elements of portions must sum to n_folds. For example, with n_folds=5, portions=[1, 1, 3] yields three sets made of one, one, and three folds. Default is [1, n_folds - 1].

  • random_seed (int) – positive integer, the seed for the random number generator. Default is None.

Returns:

A list of n_folds * n_repeats lists, one per cross-validation trial. Each trial list contains len(portions) lists of sample indices, one for each set defined by portions.
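
The shape of the return value and the role of portions may be easier to see in a small, self-contained sketch. The code below is not CANDLE's implementation: sketch_cv_partition is a hypothetical stand-in written only from the parameter descriptions above, and the way it shuffles groups, deals them into folds, and rotates the folds between trials is an assumption. It uses only the Python standard library.

```python
import random


def sketch_cv_partition(group_label, n_folds=5, n_repeats=1,
                        portions=None, random_seed=None):
    """Illustrative stand-in (hypothetical, not the CANDLE source)."""
    if portions is None:
        portions = [1, n_folds - 1]
    if n_folds < 2 or n_repeats < 1 or sum(portions) != n_folds:
        raise ValueError("need n_folds > 1, n_repeats > 0, sum(portions) == n_folds")

    rng = random.Random(random_seed)
    groups = sorted(set(group_label))
    trials = []
    for _ in range(n_repeats):
        shuffled = groups[:]
        rng.shuffle(shuffled)
        # Assumption: groups are dealt round-robin into n_folds folds.
        folds = [shuffled[i::n_folds] for i in range(n_folds)]
        for start in range(n_folds):
            # Assumption: rotating the fold order gives one trial per fold.
            order = folds[start:] + folds[:start]
            trial, pos = [], 0
            for size in portions:
                members = {g for fold in order[pos:pos + size] for g in fold}
                # Collect the indices of all samples whose group is in this set.
                trial.append([i for i, g in enumerate(group_label) if g in members])
                pos += size
            trials.append(trial)
    return trials


# Ten samples in five groups of two; three sets per trial (1, 1, and 3 folds).
labels = ["a", "a", "b", "b", "c", "c", "d", "d", "e", "e"]
parts = sketch_cv_partition(labels, n_folds=5, n_repeats=2,
                            portions=[1, 1, 3], random_seed=0)
print(len(parts))                   # 10, i.e. n_folds * n_repeats trials
print([len(s) for s in parts[0]])   # [2, 2, 6]
```

Because partitioning happens at the group level in this sketch, samples sharing a label always land in the same set within a trial, which is the point of passing group labels rather than plain sample indices.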