DATA_UTILS

data_utils.to_categorical(y[, num_classes])

Converts a class vector (integers) to a binary class matrix, e.g. for use with categorical_crossentropy.

:param y: class vector to be converted into a matrix (integers from 0 to num_classes - 1).
:type y: numpy array
:param num_classes: total number of classes.
:type num_classes: int
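The conversion can be illustrated with a minimal NumPy sketch. The function name `to_categorical_sketch` is hypothetical and this is not the `data_utils` implementation. It assumes labels are non-negative integers and, when `num_classes` is omitted, infers it from the largest label.

```python
import numpy as np

def to_categorical_sketch(y, num_classes=None):
    """Illustrative stand-in for an integer-to-one-hot conversion."""
    y = np.asarray(y, dtype=int).ravel()
    if num_classes is None:
        # Assumption: labels start at 0, so the count is max label + 1.
        num_classes = int(y.max()) + 1
    one_hot = np.zeros((y.size, num_classes), dtype=np.float32)
    # Set a single 1 per row, in the column given by the label.
    one_hot[np.arange(y.size), y] = 1.0
    return one_hot

labels = np.array([0, 2, 1, 2])
encoded = to_categorical_sketch(labels, num_classes=3)
print(encoded)
```

Each row of `encoded` has exactly one non-zero entry, which is the form a categorical cross-entropy loss typically expects.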

data_utils.convert_to_class(y_one_hot[, dtype])

Converts a one-hot class encoding or a softmax class encoding to an integer class encoding. A one-hot encoding is an array with one position per class, holding 1 in the position of the assigned class and 0 elsewhere. A softmax encoding is an array with one position per class, where the position with the largest value is taken as the class membership.
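Both encodings reduce to an integer class by taking the index of the largest entry in each row. The sketch below shows that idea with `np.argmax`. The name `convert_to_class_sketch` is hypothetical, and treating `dtype` as the output integer type is an assumption, not confirmed behavior of `data_utils.convert_to_class`.

```python
import numpy as np

def convert_to_class_sketch(y_one_hot, dtype=int):
    """Illustrative stand-in: index of the largest value in each row."""
    return np.argmax(np.asarray(y_one_hot), axis=1).astype(dtype)

one_hot = np.array([[0, 1, 0],
                    [1, 0, 0]])
soft = np.array([[0.1, 0.2, 0.7],
                 [0.6, 0.3, 0.1]])

classes_from_one_hot = convert_to_class_sketch(one_hot)
classes_from_soft = convert_to_class_sketch(soft)
print(classes_from_one_hot, classes_from_soft)
```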

data_utils.scale_array(mat[, scaling])

Scale data included in numpy array.

data_utils.impute_and_scale_array(mat[, scaling])

Impute missing values with mean and scale data included in numpy array.
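As a rough illustration of mean imputation followed by scaling, the sketch below fills each NaN with its column mean and then standardizes every column. The function name is hypothetical, and standardization is only one possible scaling. The options actually accepted by the `scaling` argument are not listed here, so none are assumed.

```python
import numpy as np

def impute_and_standardize_sketch(mat):
    """Illustrative stand-in: column-mean imputation, then zero mean / unit variance."""
    mat = np.array(mat, dtype=float)  # copy so the caller's array is untouched
    col_means = np.nanmean(mat, axis=0)  # per-column mean, ignoring NaNs
    rows, cols = np.where(np.isnan(mat))
    mat[rows, cols] = col_means[cols]  # fill each gap with its column mean
    std = mat.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (mat - mat.mean(axis=0)) / std

data = np.array([[1.0, 10.0],
                 [np.nan, 20.0],
                 [3.0, 30.0]])
scaled = impute_and_standardize_sketch(data)
print(scaled)
```

In this sketch the imputed entry lands exactly at the column mean, so it becomes 0 after standardization.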

data_utils.drop_impute_and_scale_dataframe(df)

Impute missing values with mean and scale data included in pandas dataframe.

data_utils.discretize_dataframe(df, col[, ...])

Discretize values of given column in pandas dataframe.

data_utils.discretize_array(y[, bins])

Discretize values of given array.
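One common way to discretize a continuous array is to cut it at evenly spaced percentiles so that each bin receives roughly the same number of values. The sketch below takes that approach with `np.percentile` and `np.digitize`. The percentile-based binning and the name `discretize_sketch` are assumptions made for illustration and may differ from how `data_utils.discretize_array` chooses its bins.

```python
import numpy as np

def discretize_sketch(y, bins=4):
    """Illustrative stand-in: equal-frequency binning into integer classes."""
    y = np.asarray(y, dtype=float)
    # Interior cut points at evenly spaced percentiles (e.g. 25, 50, 75 for 4 bins).
    percentiles = np.linspace(0, 100, bins + 1)[1:-1]
    edges = np.percentile(y, percentiles)
    # np.digitize returns, for each value, the index of the bin it falls into.
    return np.digitize(y, edges)

values = np.array([0.1, 0.4, 0.35, 0.8, 0.95, 0.6, 0.2, 0.7])
classes = discretize_sketch(values, bins=4)
print(classes)
```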

data_utils.lookup(df, query, ret, keys[, match])

Dataframe lookup.

data_utils.load_X_data(train_file, test_file)

Load training and testing unlabeled data from the files specified and construct corresponding training and testing pandas DataFrames.

data_utils.load_X_data2(train_file, test_file)

Load training and testing unlabeled data from the files specified.

data_utils.load_Xy_one_hot_data(train_file, ...)

Load training and testing data from the files specified, with a column indicated to use as label.

data_utils.load_Xy_one_hot_data2(train_file, ...)

Load training and testing data from the files specified, with a column indicated to use as label.

data_utils.load_Xy_data2(train_file, test_file)

Load training and testing data from the files specified, with a column indicated to use as label.

data_utils.load_Xy_data_noheader(train_file, ...)

Load training and testing data from the files specified, with the first column to use as label.
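The "first column as label" convention can be sketched as follows for a headerless, comma-separated file. The helper name `split_label_first_column` and the use of `np.loadtxt` are assumptions for illustration. The actual loader may read files differently and apply further processing such as scaling.

```python
import io
import numpy as np

def split_label_first_column(source, delimiter=","):
    """Illustrative stand-in: read headerless numeric data, split label from features."""
    data = np.loadtxt(source, delimiter=delimiter, ndmin=2)
    y = data[:, 0].astype(int)  # first column holds the label
    X = data[:, 1:]             # remaining columns are the features
    return X, y

# In-memory text stands in for a training file on disk.
train_text = "0,1.5,2.5\n1,3.5,4.5\n"
X_train, y_train = split_label_first_column(io.StringIO(train_text))
print(X_train.shape, y_train)
```

The same helper would be applied to the test file, giving matching `X` and `y` pairs for both splits.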

data_utils.load_csv_data(train_path[, ...])

Load data from the files specified.