Cross-Validation Objectives

The cross-validation module provides convenient wrapper classes to perform K-fold cross-validation for various types of machine learning models. These objectives define the function that FCVOpt optimizes during hyperparameter tuning.

Overview

Cross-validation objectives wrap machine learning models and define how to:

  • Set up K-fold cross-validation splits

  • Train models with given hyperparameters

  • Evaluate performance on validation folds

  • Handle model-specific requirements (early stopping, etc.)

Models Supported

  • Scikit-learn models: Any sklearn estimator (RandomForest, SVM, etc.)

  • Neural Networks: Multi-layer perceptrons and Tabular ResNet architectures

  • Custom models: Extend the base CVObjective class

CVObjective Base Class

class fcvopt.crossvalidation.cvobjective.CVObjective(X, y, task, loss_metric, n_splits=5, n_repeats=1, stratified=True, num_jobs=1, rng_seed=None)[source]

Bases: object

Base class for cross-validation objective functions.

This class provides a framework for evaluating models via K-fold or stratified K-fold cross-validation, with support for:

  • Regression and classification tasks (classification labels encoded internally)

  • Optional holdout evaluation (evaluate only the first fold)

  • Optional per-fold output scaling (for regression)

  • Optional per-fold input preprocessing (fitted only on training split)

  • Parallel execution of folds

Intended usage: Subclasses must override:

  • construct_model(): Build and return an unfitted model for a hyperparameter configuration

  • fit_and_test(): Fit the model on one train/test split and return a loss

See SklearnCVObj for an example.

The callable interface (__call__()) runs cross-validation for a given set of hyperparameters and returns either:

  • A NumPy array of per-fold losses (if return_all=True), or

  • An aggregate (mean) loss over the selected folds (if return_all=False).

The folds used for evaluation can be:

  • All folds from the generated CV splits (default)

  • Only the first fold (if holdout=True)

  • An explicit subset via the fold_idxs argument to __call__()

Parameters:
  • X (ArrayLike) – Feature data of shape (n_samples, n_features); NumPy arrays, pandas DataFrames, and other array-likes are accepted.

  • y (ArrayLike) – Target data of shape (n_samples,) or compatible; NumPy arrays, pandas Series, and other array-likes are accepted. For classification, labels are encoded internally with sklearn.preprocessing.LabelEncoder.

  • task (str) – One of 'regression' or 'classification'.

  • loss_metric (Callable) – Callable that computes a loss given (y_true, y_pred).

  • n_splits (int) – Number of folds per CV repeat. Defaults to 5.

  • n_repeats (int) – Number of CV repeats. Defaults to 1.

  • stratified (bool) – If True and the task is classification, use stratified CV splits. Defaults to True.

  • num_jobs (int) – Number of parallel jobs for fold evaluations. Defaults to 1.

  • rng_seed (Optional[int]) – Random seed for the model random state, if applicable.

Notes

  • The meaning of “loss” is determined entirely by the loss_metric you provide. If you want to optimize a score where higher is better, wrap it into a loss (e.g., lambda y, yhat: -roc_auc_score(y, yhat)).

  • Subclasses may aggregate per-fold results differently or compute additional statistics.
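For instance, an ROC-AUC score (higher is better) can be wrapped into a loss as follows; this is a standalone sketch using scikit-learn, independent of fcvopt:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_loss(y_true, y_prob):
    """Convert a 'higher is better' score into a 'lower is better' loss."""
    return 1.0 - roc_auc_score(y_true, y_prob)

y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])
print(auc_loss(y_true, y_prob))  # 0.25 (the AUC for this toy example is 0.75)
```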

__call__(params, fold_idxs=None, return_all=False)[source]

Evaluate a hyperparameter configuration on selected CV folds.

By default, all generated folds are used. You can override this by providing fold_idxs (indices into the internally stored fold list).

Computation is parallelized across folds according to num_jobs.

Parameters:
  • params (Dict) – Hyperparameters to pass to construct_model().

  • fold_idxs (Optional[List[int]]) – Optional list of fold indices to evaluate (e.g., [0, 3, 4]). If omitted, uses all folds.

  • return_all (bool) – If True, return the per‑fold loss array; if False, return the aggregate (mean) loss over the selected folds.

Returns:

Mean loss across the selected folds (if return_all=False), otherwise an array of per‑fold losses.

Return type:

float or ndarray

Notes

  • fold_idxs refer to the order produced by the internal splitter (RepeatedKFold or RepeatedStratifiedKFold).
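The enumeration order can be inspected directly with scikit-learn's splitter; in the sketch below (n_splits=5, n_repeats=2), folds 0–4 belong to the first repeat and folds 5–9 to the second:

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.arange(20).reshape(10, 2)
rkf = RepeatedKFold(n_splits=5, n_repeats=2, random_state=0)
folds = list(rkf.split(X))  # folds in splitter order; fold_idxs index into this list

print(len(folds))  # 10 folds: 5 splits x 2 repeats
# e.g. fold_idxs=[0, 5] would evaluate the first fold of each repeat
```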

construct_model(params, **kwargs)[source]

Build and return an unfitted model for a given hyperparameter configuration.

Must be implemented by subclasses.

Parameters:
  • params (Dict) – Mapping from hyperparameter name to value.

  • **kwargs – Optional extras a subclass may accept for construction.

Returns:

An unfitted model instance compatible with this objective.

Raises:

NotImplementedError – If the subclass does not override this method.

cvloss(params, fold_idxs=None, return_all=False)[source]

Compute cross-validation loss for given hyperparameters (deprecated alias).

Deprecated since version 0.3.0: Use the callable interface of this object instead, e.g. losses = obj(params, fold_idxs=fold_idxs, return_all=return_all).

Parameters:
  • params (Dict) – Dictionary of hyperparameters.

  • fold_idxs (Optional[List[int]]) – Indices of folds to evaluate. Defaults to all folds if not provided.

  • return_all (bool) – If True, return array of losses per fold; otherwise return mean loss. Defaults to False.

Returns:

Mean loss (if return_all=False) or array of per-fold losses.

Return type:

float or np.ndarray

fit_and_test(params, train_index, test_index)[source]

Fit/evaluate on a single CV split and return the loss for that split.

Must be implemented by subclasses. A typical implementation should:

  1. Slice X/y by train_index and test_index.

  2. Fit and apply input_preprocessor on the training slice only, then transform the test slice (to avoid leakage).

  3. If scale_output and regression task: standardize targets using training statistics, then fit.

  4. Train the model built by construct_model().

  5. Compute and return the scalar loss via loss_metric.
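Steps 2–3 can be sketched with NumPy and scikit-learn; the data and index values below are purely illustrative:

```python
import numpy as np
from sklearn.base import clone
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = rng.normal(size=(10, 3)), rng.normal(size=10)
train_index, test_index = np.arange(8), np.arange(8, 10)

# Step 2: fit the preprocessor on the training slice only (avoids leakage)
prep = clone(StandardScaler())
X_train = prep.fit_transform(X[train_index])
X_test = prep.transform(X[test_index])  # transformed with training statistics

# Step 3: standardize regression targets using training statistics only
mu, sigma = y[train_index].mean(), y[train_index].std()
y_train = (y[train_index] - mu) / sigma
```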

Parameters:
  • params (Dict) – Hyperparameter mapping for model construction.

  • train_index (List[int]) – Row indices for the training portion of this split.

  • test_index (List[int]) – Row indices for the testing portion of this split.

Return type:

float

Returns:

Scalar loss for this split (lower is better).

Raises:

NotImplementedError – If the subclass does not override this method.

Scikit-learn Wrappers

class fcvopt.crossvalidation.sklearn_cvobj.SklearnCVObj(estimator, X, y, task, loss_metric, needs_proba=False, n_splits=10, n_repeats=1, scale_output=False, input_preprocessor=None, stratified=False, num_jobs=1, rng_seed=None)[source]

Bases: CVObjective

Cross‑validation objective for general scikit‑learn estimators.

Wraps an unfitted scikit‑learn estimator and evaluates it using the fold infrastructure provided by CVObjective. The estimator must implement fit and predict; if you set needs_proba=True and your loss uses probabilities, it should also implement predict_proba.

See CVObjective for fold selection, aggregation behavior, and leakage safeguards.

Parameters:
  • estimator (BaseEstimator) – Unfitted model object conforming to scikit-learn estimator API. The estimator should implement fit and predict, and optionally predict_proba for classification tasks if the needs_proba flag is set to True and the loss metric requires probabilities.

  • X (Union[ndarray, DataFrame]) – Feature data of shape (n_samples, n_features).

  • y (Union[ndarray, Series]) – Target data of shape (n_samples,).

  • task (str) – One of ‘regression’ or ‘classification’.

  • loss_metric (Callable) – Function that computes a loss given (y_true, y_pred).

  • needs_proba (bool) – Whether the metric requires class probabilities. If True, the estimator’s predict_proba method is used instead of predict. Applicable only for classification tasks. Defaults to False.

  • n_splits (int) – Number of folds for cross-validation. Defaults to 10.

  • n_repeats (int) – Number of CV repeats. Defaults to 1.

  • scale_output (bool) – If True and task=’regression’, target values are standardized per training fold. Defaults to False.

  • input_preprocessor (Optional[TransformerMixin]) – Optional scikit-learn input transformer fit and applied per split. Defaults to None.

  • stratified (bool) – If True and task is ‘classification’, use stratified K-fold splits. Defaults to False.

  • num_jobs (int) – Number of parallel jobs for fold evaluations. Defaults to 1.

  • rng_seed (Optional[int]) – Random seed for the estimator random state, if applicable.

Example

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from fcvopt.crossvalidation import SklearnCVObj

# loss metric: misclassification rate
def misclass_rate(y_true, y_pred):
    return 1 - accuracy_score(y_true, y_pred)


X, y = load_breast_cancer(return_X_y=True)
estimator = RandomForestClassifier()

# Create the cross-validation objective
# 1 repeat, 10 folds
cv_obj = SklearnCVObj(
    estimator, X, y,
    task='classification',
    loss_metric=misclass_rate,
    n_splits=10,
    rng_seed=42
)

# 10-fold cv loss for a set of hyperparameters
params = {'n_estimators': 100, 'max_depth': 5}
mcr = cv_obj(params)
print(f'Misclassification rate for hyperparameters {params}: {mcr:.4f}')

# per-fold misclassification rates
fold_losses = cv_obj(params, return_all=True)
print(f'Per-fold misclassification rates for hyperparameters {params}: {fold_losses}')

construct_model(params)[source]

Clone the base estimator, set provided hyperparameters, and (if supported) assign a deterministic random_state derived from rng_seed.

Cloning ensures no state leaks across folds. A distinct, reproducible seed is generated for each fit when the estimator exposes a random_state parameter.
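This clone-and-configure pattern can be reproduced with scikit-learn directly; the fixed seed below is illustrative, not the class's actual derivation from rng_seed:

```python
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier

base = RandomForestClassifier()
params = {"n_estimators": 50, "max_depth": 3}

# Clone so no fitted state leaks across folds, then apply the configuration
model = clone(base).set_params(**params)
if "random_state" in model.get_params():
    model.set_params(random_state=42)  # deterministic, per-fit seed

print(model.get_params()["n_estimators"])  # 50; `base` itself is untouched
```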

Parameters:

params (Dict) – Hyperparameter name → value mapping.

Return type:

BaseEstimator

Returns:

A fresh, unfitted estimator configured with params (and possibly random_state).

fit_and_test(params, train_index, test_index)[source]

Fit on the training split and return the loss on the test split.

Steps performed:

  1. Slice X/y by the provided indices.

  2. If input_preprocessor is set, clone + fit on train only, then transform both train and test.

  3. If scale_output and regression, standardize targets using train statistics.

  4. Build a scorer via sklearn.metrics.make_scorer(), using probabilities when needs_proba=True.

  5. Fit the estimator and compute loss on the test slice.
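Steps 4–5 amount to scoring on probabilities when needs_proba=True. A simplified sketch that computes a probability-based loss directly (rather than going through make_scorer(), as the implementation does):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=200, random_state=0)
train_index, test_index = np.arange(150), np.arange(150, 200)

clf = LogisticRegression(max_iter=1000).fit(X[train_index], y[train_index])

# needs_proba=True: the loss consumes class probabilities, not hard labels
proba = clf.predict_proba(X[test_index])
loss = log_loss(y[test_index], proba)  # scalar loss for this split (lower is better)
```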

Parameters:
  • params (Dict) – Hyperparameters forwarded to construct_model().

  • train_index (List[int]) – Row indices for the training portion of this split.

  • test_index (List[int]) – Row indices for the testing portion of this split.

Return type:

float

Returns:

Scalar loss for this split (lower is better).

Neural Network Wrappers

class fcvopt.crossvalidation.resnet_cvobj.ResNetCVObj(X, y, task, loss_metric, needs_proba=False, n_splits=10, n_repeats=1, stratified=True, scale_output=False, input_preprocessor=None, num_jobs=1, rng_seed=None, max_epochs=100, patience=10, optimizer='AdamW', batch_size=256, device='cpu')[source]

Bases: CVObjective

Cross-validation objective for tabular ResNet models (Gorishniy et al., 2021).

Implements fit_and_test() with a self-contained PyTorch training loop that includes early stopping, gradient clipping, and learning-rate scheduling. No external training library is required.

Training details per fold:

  • A 10 % internal validation split is held out from the training data to monitor the validation loss.

  • Early stopping (configurable patience, default 10) restores the best checkpoint.

  • ReduceLROnPlateau (factor 0.1, patience 5, min_lr 1e-5) adjusts the learning rate during training.

  • Gradient norms are clipped at 5.0 to improve stability.

  • Singleton mini-batches are dropped to avoid BatchNorm1d failures.
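The early-stopping bookkeeping in that loop amounts to tracking the best validation loss and a patience counter; a minimal, framework-free sketch (the helper name is ours, not the library's):

```python
def best_epoch_with_early_stopping(val_losses, patience=10):
    """Return the index of the best checkpoint, halting after `patience`
    consecutive epochs without improvement."""
    best, best_epoch, epochs_without_improvement = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, epochs_without_improvement = loss, epoch, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training; the best checkpoint is restored afterwards
    return best_epoch

print(best_epoch_with_early_stopping(
    [1.0, 0.8, 0.9, 0.95, 0.7, 0.9, 0.91, 0.92], patience=3))  # 4
```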

Parameters:
  • X – Feature matrix of shape (n_samples, n_features). All features must be numeric; encode categoricals beforehand.

  • y – Target array of shape (n_samples,).

  • task (str) – One of 'regression', 'binary_classification', or 'classification'.

  • loss_metric – Callable (y_true, y_pred) -> float (lower is better).

  • needs_proba (bool) – If True, probabilities (sigmoid for binary, softmax for multiclass) are passed to loss_metric instead of hard labels. Ignored for regression. Defaults to False.

  • n_splits (int) – Number of CV folds. Defaults to 10.

  • n_repeats (int) – Number of CV repeats. Defaults to 1.

  • stratified (bool) – Use stratified splits for classification. Defaults to True.

  • scale_output (bool) – Standardize regression targets per fold (using training statistics only). Defaults to False.

  • input_preprocessor – Optional sklearn-compatible transformer fitted on each training fold and applied to both train and test. Defaults to None.

  • num_jobs (int) – Parallel fold evaluations. Defaults to 1.

  • rng_seed (Optional[int]) – Seed for reproducibility. Defaults to None.

  • max_epochs (int) – Maximum training epochs per fold. Defaults to 100.

  • patience (int) – Early-stopping patience (number of epochs without val-loss improvement before training halts). Defaults to 10.

  • optimizer (str) – Name of a torch.optim optimizer class (e.g. 'AdamW', 'Adam', 'SGD'). Defaults to 'AdamW'.

  • batch_size (int) – Mini-batch size. Pass it here to fix the batch size; include it in the config space instead only if you want to tune it. Defaults to 256.

  • device (str) – PyTorch device string, e.g. 'cpu' or 'cuda'. Defaults to 'cpu'.

Expected keys in the params dict (passed to fit_and_test() via the optimizer):

  • n_hidden – Number of residual blocks

  • layer_size – Hidden width

  • normalization – 'batchnorm' or 'layernorm'

  • hidden_factor – Expansion factor inside each block

  • hidden_dropout – Dropout rate inside blocks

  • residual_dropout – Dropout rate on residual output

  • lr – Learning rate

  • weight_decay – L2 regularization strength

  • momentum – Momentum (only when optimizer='SGD')

Example

from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from fcvopt.crossvalidation import ResNetCVObj
from fcvopt.optimizers import FCVOpt

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

cv_obj = ResNetCVObj(
    X=X, y=y,
    task='binary_classification',
    loss_metric=lambda yt, yp: 1 - roc_auc_score(yt, yp),
    needs_proba=True,
    n_splits=5,
    input_preprocessor=StandardScaler(),
    max_epochs=100,
)

config = cv_obj.get_recommended_configspace()
optimizer = FCVOpt(obj=cv_obj, n_folds=5, config=config, acq_function='LCB')
best = optimizer.optimize(n_trials=30)

construct_model(params)[source]

Build and return an untrained TabularResNet from params.

Parameters:

params (Dict) – Hyperparameter mapping; must contain n_hidden, layer_size, normalization, hidden_factor, hidden_dropout, residual_dropout.

Return type:

TabularResNet

Returns:

A freshly initialized, untrained TabularResNet placed on self.device.

evaluate(model, X)[source]

Run a trained model on a feature array and return predictions as a CPU tensor.

Task-specific output transformations are applied so the result is ready to pass directly to loss_metric:

  • Regression: raw output, shape (N,)

  • Binary classification: sigmoid of the logit, shape (N,)

  • Multiclass classification: class probabilities via softmax, shape (N, n_classes)

Parameters:
  • model (TabularResNet) – A trained TabularResNet instance.

  • X (ndarray) – Feature array of shape (N, n_features); will be cast to float32.

Return type:

Tensor

Returns:

Prediction tensor on CPU.

fit_and_test(params, train_index, test_index)[source]

Train a TabularResNet on one CV fold and return the test loss.

Steps:

  1. Slice X/y by train_index / test_index.

  2. Apply input_preprocessor (fit on train only) if provided.

  3. Standardize regression targets using train statistics if scale_output=True.

  4. Hold out 10 % of the training data as an internal validation set.

  5. Train via a DataLoader (early stopping + gradient clipping + ReduceLROnPlateau).

  6. Restore the best checkpoint; compute the test metric via evaluate().

Parameters:
  • params (Dict) – Hyperparameter configuration.

  • train_index – Row indices for the training portion of this split.

  • test_index – Row indices for the testing portion of this split.

Return type:

float

Returns:

Scalar test loss for this fold (lower is better).

get_recommended_configspace()[source]

Return a recommended hyperparameter search space for Tabular ResNet.

Hyperparameters:
  • n_hidden: Integer, log-uniform in [1, 6]

  • layer_size: Integer, log-uniform in [8, 512]

  • normalization: Categorical in {‘batchnorm’, ‘layernorm’}

  • hidden_factor: Float in [1.0, 4.0]

  • hidden_dropout: Float in [0.0, 0.5]

  • residual_dropout: Float in [0.0, 0.5]

  • lr: Float, log-uniform in [1e-5, 1e-1]

  • weight_decay: Float, log-uniform in [1e-8, 1e-2]

Returns:

A config space ready to plug into your optimizer.

Return type:

ConfigurationSpace

Optuna Wrapper

fcvopt.crossvalidation.optuna_obj.get_optuna_objective(cvobj, config, start_fold_idxs=None, rng_seed=None)[source]

Utility function that wraps the cross-validation objective for use with Optuna.

Note

In each trial, a holdout loss for a single fold is returned. By default, a random fold is chosen from the folds available in the cross-validation object. If start_fold_idxs is provided, the first len(start_fold_idxs) trials will use the specified fold indices, and the remaining trials will choose a random fold from the available folds.

Parameters:
  • cvobj (CVObjective) – The cross-validation object that implements the __call__ method to compute the loss for a given hyperparameter configuration.

  • config (ConfigurationSpace) – The hyperparameter search space

  • start_fold_idxs (Optional[List]) – Fold indices to use for the first len(start_fold_idxs) trials, in order. All subsequent trials (or all trials, if None) evaluate a randomly chosen fold from those available.

  • rng_seed (Optional[int]) – An optional random seed for reproducibility.

Return type:

Callable

Returns:

A function that takes in a trial object from optuna and returns the validation loss at a randomly chosen fold for the given hyperparameter configuration.
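The fold-selection policy can be sketched in plain Python (the helper name is hypothetical; the real wrapper additionally samples hyperparameters from config and calls cvobj):

```python
import random

def fold_schedule(n_folds, start_fold_idxs=None, rng_seed=None):
    """Yield one fold index per trial: first the user-supplied indices,
    then uniformly random folds."""
    rng = random.Random(rng_seed)
    start = list(start_fold_idxs or [])
    trial = 0
    while True:
        yield start[trial] if trial < len(start) else rng.randrange(n_folds)
        trial += 1

sched = fold_schedule(5, start_fold_idxs=[0, 2, 4], rng_seed=1)
first_six = [next(sched) for _ in range(6)]
print(first_six[:3])  # [0, 2, 4] -- the specified folds come first
```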

Utility classes

class fcvopt.crossvalidation.resnet_cvobj.TabularResNet(input_dim, output_dim, n_hidden=2, layer_size=64, normalization='batchnorm', hidden_factor=2.0, hidden_dropout=0.1, residual_dropout=0.05)[source]

Bases: Module

Tabular ResNet model.

A shallow fully connected stem followed by several residual blocks and a prediction head (Norm → ReLU → Linear).

Note

This implementation expects all features to be numeric. Preprocess categorical columns (e.g., one-hot or target encoding) beforehand.

See Gorishniy et al. (2021) for more details.

Parameters:
  • input_dim (int) – Input feature dimension.

  • output_dim (int) – Output dimension (1 for regression/binary classification, or number of classes for multiclass).

  • n_hidden (int) – Number of residual blocks (default: 2).

  • layer_size (int) – Width of the hidden representation (default: 64).

  • normalization (str) – 'batchnorm' or 'layernorm'.

  • hidden_factor (float) – Expansion factor inside each residual block (hidden width = floor(hidden_factor * layer_size)).

  • hidden_dropout (float) – Dropout rate inside residual blocks.

  • residual_dropout (float) – Dropout rate on the residual output.

Shape:
  • Input: (N, input_dim)

  • Output: (N, output_dim)

ff

Input stem (Linear(input_dim, layer_size)) followed by residual blocks.

prediction

Norm → ReLU → Linear head to output_dim.

forward(x)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fcvopt.crossvalidation.resnet_cvobj.ResNetBlock(input_dim, normalization, hidden_factor=2.0, hidden_dropout=0.1, residual_dropout=0.05)[source]

Bases: Module

Residual block for a feed-forward network with dropout (tabular data).

The block computes:

x + Dropout( Linear( Dropout( ReLU( Linear( Norm(x) ) ) ) ) )

where Norm is either batch normalization or layer normalization.

See Gorishniy et al. (2021) for details.

Parameters:
  • input_dim (int) – Last dimension of the input tensor.

  • normalization (str) – 'batchnorm' or 'layernorm'.

  • hidden_factor (float) – Hidden width inside the block is floor(hidden_factor * input_dim).

  • hidden_dropout (float) – Dropout rate inside the hidden path.

  • residual_dropout (float) – Dropout rate applied to the residual output.
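At inference time (dropout disabled), the block reduces to a pure function; a NumPy sketch with LayerNorm as the normalization, with weights chosen arbitrarily for illustration:

```python
import numpy as np

def resnet_block_forward(x, W1, b1, W2, b2, eps=1e-5):
    """x + Linear(ReLU(Linear(LayerNorm(x)))); dropout is the identity here."""
    h = (x - x.mean(axis=-1, keepdims=True)) / np.sqrt(
        x.var(axis=-1, keepdims=True) + eps)  # LayerNorm over the feature axis
    h = np.maximum(h @ W1 + b1, 0.0)          # expand to hidden width + ReLU
    h = h @ W2 + b2                           # project back to input_dim
    return x + h                              # residual connection

rng = np.random.default_rng(0)
input_dim, hidden = 4, 8  # hidden = floor(hidden_factor * input_dim), factor 2.0
x = rng.normal(size=(3, input_dim))
W1, b1 = rng.normal(size=(input_dim, hidden)), np.zeros(hidden)
W2, b2 = rng.normal(size=(hidden, input_dim)), np.zeros(input_dim)
print(resnet_block_forward(x, W1, b1, W2, b2).shape)  # (3, 4)
```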

forward(x)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type:

Tensor

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.