Cross-Validation Objectives
The cross-validation module provides convenient wrapper classes to perform K-fold cross-validation for various types of machine learning models. These objectives define the function that FCVOpt optimizes during hyperparameter tuning.
Overview
Cross-validation objectives wrap machine learning models and define how to:
Set up K-fold cross-validation splits
Train models with given hyperparameters
Evaluate performance on validation folds
Handle model-specific requirements (early stopping, etc.)
Models Supported
Scikit-learn models: Any sklearn estimator (RandomForest, SVM, etc.)
Neural Networks: Multi-layer perceptrons and Tabular ResNet architectures
Custom models: Extend the base CVObjective class
CVObjective Base Class
- class fcvopt.crossvalidation.cvobjective.CVObjective(X, y, task, loss_metric, n_splits=5, n_repeats=1, stratified=True, num_jobs=1, rng_seed=None)[source]
Bases:
object
Base class for cross-validation objective functions.
This class provides a framework for evaluating models via K-fold or stratified K-fold cross-validation, with support for:
Regression and classification tasks (classification labels encoded internally)
Optional holdout evaluation (evaluate only the first fold)
Optional per-fold output scaling (for regression)
Optional per-fold input preprocessing (fitted only on training split)
Parallel execution of folds
Intended usage: Subclasses must override:
fit_and_test(): Fit the model on one train/test split and return a loss
See SklearnCVObj for an example.
The callable interface (__call__()) runs cross-validation for a given set of hyperparameters and returns either:
A NumPy array of per-fold losses (if return_all=True), or
An aggregate (mean) loss over the selected folds (if return_all=False).
The folds used for evaluation can be:
All folds from the generated CV splits (default)
Only the first fold (if holdout=True)
An explicit subset via the fold_idxs argument to __call__()
- Parameters:
X (Union[ndarray, DataFrame] or array-like) – Feature data of shape (n_samples, n_features).
y (Union[ndarray, Series] or array-like) – Target data of shape (n_samples,) or compatible. For classification, labels are encoded internally with sklearn.preprocessing.LabelEncoder.
task (str) – One of 'regression' or 'classification'.
loss_metric (Callable) – Callable that computes a loss given (y_true, y_pred).
n_splits (int) – Number of folds per CV repeat. Defaults to 5.
n_repeats (int) – Number of CV repeats. Defaults to 1.
stratified (bool) – If True and task is a classification type, use stratified CV splits. Defaults to True.
num_jobs (int) – Number of parallel jobs for fold evaluations. Defaults to 1.
rng_seed (Optional[int]) – Random seed for the model random state, if applicable.
Notes
The meaning of “loss” is determined entirely by the loss_metric you provide. If you want to optimize a score where higher is better, wrap it into a loss (e.g., lambda y, yhat: -roc_auc_score(y, yhat)).
Subclasses may aggregate per-fold results differently or compute additional statistics.
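For instance, a higher-is-better score such as ROC AUC can be turned into a loss (a minimal sketch using scikit-learn's roc_auc_score; for a score bounded by 1, subtracting from 1 is equivalent to negation up to a constant):

```python
from sklearn.metrics import roc_auc_score

# Wrap a higher-is-better score into a lower-is-better loss.
def auc_loss(y_true, y_prob):
    return 1.0 - roc_auc_score(y_true, y_prob)
```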
- __call__(params, fold_idxs=None, return_all=False)[source]
Evaluate a hyperparameter configuration on selected CV folds.
By default, all generated folds are used. You can override the default by providing fold_idxs (indices into the internally stored fold list). Computation is parallelized across folds according to num_jobs.
- Parameters:
params (Dict) – Hyperparameters to pass to construct_model().
fold_idxs (Optional[List[int]]) – Optional list of fold indices to evaluate (e.g., [0, 3, 4]). If omitted, uses all folds.
return_all (bool) – If True, return the per-fold loss array; if False, return the aggregate (mean) loss over the selected folds.
- Returns:
Mean loss across the selected folds (if return_all=False), otherwise an array of per-fold losses.
- Return type:
float or ndarray
Notes
fold_idxs refer to the order produced by the internal splitter (RepeatedKFold or RepeatedStratifiedKFold).
- construct_model(params, **kwargs)[source]
Build and return an unfitted model for a given hyperparameter configuration.
Must be implemented by subclasses.
- Parameters:
params (Dict) – Mapping from hyperparameter name to value.
**kwargs – Optional extras a subclass may accept for construction.
- Returns:
An unfitted model instance compatible with this objective.
- Raises:
NotImplementedError – If the subclass does not override this method.
- cvloss(params, fold_idxs=None, return_all=False)[source]
Compute cross-validation loss for given hyperparameters (deprecated alias).
Deprecated since version 0.3.0: Use the callable interface of this object instead, e.g.
losses = obj(params, fold_idxs=fold_idxs, return_all=return_all).
- Parameters:
params (Dict) – Dictionary of hyperparameters.
fold_idxs (Optional[List[int]]) – Indices of folds to evaluate. Defaults to all folds if not provided.
return_all (bool) – If True, return array of losses per fold; otherwise return mean loss. Defaults to False.
- Returns:
Mean loss (if return_all=False) or array of per-fold losses.
- Return type:
float or np.ndarray
- fit_and_test(params, train_index, test_index)[source]
Fit/evaluate on a single CV split and return the loss for that split.
Must be implemented by subclasses. A typical implementation should:
Slice X/y by train_index and test_index.
Fit and apply input_preprocessor on the training slice only, then transform the test slice (to avoid leakage).
If scale_output and regression task: standardize targets using training statistics, then fit.
Train the model built by construct_model().
Compute and return the scalar loss via loss_metric.
- Parameters:
params (Dict) – Hyperparameter mapping for model construction.
train_index (List[int]) – Row indices for the training portion of this split.
test_index (List[int]) – Row indices for the testing portion of this split.
- Return type:
float
- Returns:
Scalar loss for this split (lower is better).
- Raises:
NotImplementedError – If the subclass does not override this method.
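The steps above can be sketched as a standalone function (a hypothetical illustration, not the fcvopt source; Ridge and mean squared error stand in for the subclass's model and loss_metric):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler

def fit_and_test_sketch(params, X, y, train_index, test_index):
    # 1. Slice X/y by train_index and test_index.
    X_tr, X_te = X[train_index], X[test_index]
    y_tr, y_te = y[train_index], y[test_index]
    # 2. Fit the preprocessor on the training slice only, then
    #    transform both slices (avoids leakage).
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
    # 3. Train the model built for this hyperparameter configuration.
    model = Ridge(**params).fit(X_tr, y_tr)
    # 4. Compute and return the scalar loss on the test slice.
    return mean_squared_error(y_te, model.predict(X_te))
```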
Scikit-learn Wrappers
- class fcvopt.crossvalidation.sklearn_cvobj.SklearnCVObj(estimator, X, y, task, loss_metric, needs_proba=False, n_splits=10, n_repeats=1, scale_output=False, input_preprocessor=None, stratified=False, num_jobs=1, rng_seed=None)[source]
Bases:
CVObjective
Cross-validation objective for general scikit-learn estimators.
Wraps an unfitted scikit-learn estimator and evaluates it using the fold infrastructure provided by CVObjective. The estimator must implement fit and predict; if you set needs_proba=True and your loss uses probabilities, it should also implement predict_proba.
See CVObjective for fold selection, aggregation behavior, and leakage safeguards.
- Parameters:
estimator (BaseEstimator) – Unfitted model object conforming to the scikit-learn estimator API. The estimator should implement fit and predict, and optionally predict_proba for classification tasks if needs_proba is True and the loss metric requires probabilities.
X (Union[ndarray, DataFrame]) – Feature data of shape (n_samples, n_features).
y (Union[ndarray, Series]) – Target data of shape (n_samples,).
task (str) – One of 'regression' or 'classification'.
loss_metric (Callable) – Function that computes a loss given (y_true, y_pred).
needs_proba (bool) – Whether the metric requires class probabilities. If True, the estimator's predict_proba method is used instead of predict. Applicable only for classification tasks. Defaults to False.
n_splits (int) – Number of folds for cross-validation. Defaults to 10.
n_repeats (int) – Number of CV repeats. Defaults to 1.
scale_output (bool) – If True and task='regression', target values are standardized per training fold. Defaults to False.
input_preprocessor (Optional[TransformerMixin]) – Optional scikit-learn input transformer fit and applied per split. Defaults to None.
stratified (bool) – If True and task is 'classification', use stratified K-fold splits. Defaults to False.
num_jobs (int) – Number of parallel jobs for fold evaluations. Defaults to 1.
rng_seed (Optional[int]) – Random seed for the estimator random state, if applicable.
Example
```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from fcvopt.crossvalidation import SklearnCVObj

# loss metric: misclassification rate
def misclass_rate(y_true, y_pred):
    return 1 - accuracy_score(y_true, y_pred)

X, y = load_breast_cancer(return_X_y=True)
estimator = RandomForestClassifier()

# Create the cross-validation objective (1 repeat, 10 folds)
cv_obj = SklearnCVObj(
    estimator, X, y,
    task='classification',
    loss_metric=misclass_rate,
    n_splits=10,
    rng_seed=42,
)

# 10-fold CV loss for a set of hyperparameters
params = {'n_estimators': 100, 'max_depth': 5}
mcr = cv_obj(params)
print(f'Misclassification rate for hyperparameters {params}: {mcr:.4f}')

# per-fold misclassification rates
fold_losses = cv_obj(params, return_all=True)
print(f'Per-fold misclassification rates for hyperparameters {params}: {fold_losses}')
```
- construct_model(params)[source]
Clone the base estimator, set provided hyperparameters, and (if supported) assign a deterministic random_state derived from rng_seed.
Cloning ensures no state leaks across folds. A distinct, reproducible seed is generated for each fit when the estimator exposes a random_state parameter.
- Parameters:
params (Dict) – Hyperparameter name → value mapping.
- Return type:
BaseEstimator
- Returns:
A fresh, unfitted estimator configured with params (and possibly random_state).
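The clone-and-seed pattern can be sketched as follows (an assumed illustration of the behavior described above, not the fcvopt source):

```python
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier

def construct_model_sketch(estimator, params, rng_seed=None):
    # Clone so no fitted state leaks across folds.
    model = clone(estimator)
    model.set_params(**params)
    # Assign a reproducible random_state when the estimator supports one.
    if rng_seed is not None and "random_state" in model.get_params():
        model.set_params(random_state=rng_seed)
    return model

model = construct_model_sketch(RandomForestClassifier(),
                               {"n_estimators": 50, "max_depth": 3},
                               rng_seed=42)
```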
- fit_and_test(params, train_index, test_index)[source]
Fit on the training split and return the loss on the test split.
Steps performed:
Slice X/y by the provided indices.
If input_preprocessor is set, clone + fit on train only, then transform both train and test.
If scale_output and regression, standardize targets using train statistics.
Build a scorer via sklearn.metrics.make_scorer(), using probabilities when needs_proba=True.
Fit the estimator and compute loss on the test slice.
- Parameters:
params (Dict) – Hyperparameters forwarded to construct_model().
train_index (List[int]) – Row indices for the training portion of this split.
test_index (List[int]) – Row indices for the testing portion of this split.
- Return type:
float
- Returns:
Scalar loss for this split (lower is better).
Neural Network Wrappers
- class fcvopt.crossvalidation.resnet_cvobj.ResNetCVObj(X, y, task, loss_metric, needs_proba=False, n_splits=10, n_repeats=1, stratified=True, scale_output=False, input_preprocessor=None, num_jobs=1, rng_seed=None, max_epochs=100, patience=10, optimizer='AdamW', batch_size=256, device='cpu')[source]
Bases:
CVObjective
Cross-validation objective for tabular ResNet models (Gorishniy et al., 2021).
Implements fit_and_test() with a self-contained PyTorch training loop that includes early stopping, gradient clipping, and learning-rate scheduling. No external training library is required.
Training details per fold:
A 10 % internal validation split is held out from the training data to monitor the validation loss.
Early stopping (configurable patience, default 10) restores the best checkpoint.
ReduceLROnPlateau (factor 0.1, patience 5, min_lr 1e-5) adjusts the learning rate during training.
Gradient norms are clipped at 5.0 to improve stability.
Singleton mini-batches are dropped to avoid BatchNorm1d failures.
- Parameters:
X – Feature matrix of shape (n_samples, n_features). All features must be numeric; encode categoricals beforehand.
y – Target array of shape (n_samples,).
task (str) – One of 'regression', 'binary_classification', or 'classification'.
loss_metric – Callable (y_true, y_pred) -> float (lower is better).
needs_proba (bool) – If True, probabilities (sigmoid for binary, softmax for multiclass) are passed to loss_metric instead of hard labels. Ignored for regression. Defaults to False.
n_splits (int) – Number of CV folds. Defaults to 10.
n_repeats (int) – Number of CV repeats. Defaults to 1.
stratified (bool) – Use stratified splits for classification. Defaults to True.
scale_output (bool) – Standardize regression targets per fold (using training statistics only). Defaults to False.
input_preprocessor – Optional sklearn-compatible transformer fitted on each training fold and applied to both train and test. Defaults to None.
num_jobs (int) – Number of parallel fold evaluations. Defaults to 1.
rng_seed (Optional[int]) – Seed for reproducibility. Defaults to None.
max_epochs (int) – Maximum training epochs per fold. Defaults to 100.
patience (int) – Early-stopping patience (number of epochs without val-loss improvement before training halts). Defaults to 10.
optimizer (str) – Name of a torch.optim optimizer class (e.g. 'AdamW', 'Adam', 'SGD'). Defaults to 'AdamW'.
batch_size (int) – Mini-batch size. Pass this argument if you want a fixed batch size; include it in the config space only if you want to tune it. Defaults to 256.
device (str) – PyTorch device string, e.g. 'cpu' or 'cuda'. Defaults to 'cpu'.
Expected keys in the params dict (passed to fit_and_test() via the optimizer):
n_hidden – Number of residual blocks
layer_size – Hidden width
normalization – 'batchnorm' or 'layernorm'
hidden_factor – Expansion factor inside each block
hidden_dropout – Dropout rate inside blocks
residual_dropout – Dropout rate on residual output
lr – Learning rate
weight_decay – L2 regularization strength
momentum – Momentum (only when optimizer='SGD')
Example
```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from fcvopt.crossvalidation import ResNetCVObj
from fcvopt.optimizers import FCVOpt

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

cv_obj = ResNetCVObj(
    X=X, y=y,
    task='binary_classification',
    loss_metric=lambda yt, yp: 1 - roc_auc_score(yt, yp),
    needs_proba=True,
    n_splits=5,
    input_preprocessor=StandardScaler(),
    max_epochs=100,
)

config = cv_obj.get_recommended_configspace()
optimizer = FCVOpt(obj=cv_obj, n_folds=5, config=config, acq_function='LCB')
best = optimizer.optimize(n_trials=30)
```
- construct_model(params)[source]
Build and return an unfitted TabularResNet from params.
- Parameters:
params (Dict) – Hyperparameter mapping; must contain n_hidden, layer_size, normalization, hidden_factor, hidden_dropout, residual_dropout.
- Return type:
TabularResNet
- Returns:
An initialized TabularResNet placed on self.device.
- evaluate(model, X)[source]
Run a trained model on a feature array and return predictions as a CPU tensor.
Task-specific output transformations are applied so the result is ready to pass directly to loss_metric:
Regression: raw output, shape (N,)
Binary classification: sigmoid of the logit, shape (N,)
Multiclass classification: class probabilities via softmax, shape (N, n_classes)
- Parameters:
model (TabularResNet) – A trained TabularResNet instance.
X (ndarray) – Feature array of shape (N, n_features); will be cast to float32.
- Return type:
Tensor
- Returns:
Prediction tensor on CPU.
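The task-specific transforms can be illustrated in NumPy (a sketch of the behavior described above, not the actual torch implementation):

```python
import numpy as np

def transform_outputs(raw, task):
    # Regression: return raw outputs unchanged, shape (N,).
    if task == "regression":
        return raw
    # Binary classification: sigmoid of the logits, shape (N,).
    if task == "binary_classification":
        return 1.0 / (1.0 + np.exp(-raw))
    # Multiclass: softmax over the class axis, shape (N, n_classes).
    e = np.exp(raw - raw.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```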
- fit_and_test(params, train_index, test_index)[source]
Train a TabularResNet on one CV fold and return the test loss.
Steps:
Slice X/y by train_index/test_index.
Apply input_preprocessor (fit on train only) if provided.
Standardize regression targets using train statistics if scale_output=True.
Hold out 10% of the training data as an internal validation set.
Train via a DataLoader (early stopping + gradient clipping + ReduceLROnPlateau).
Restore the best checkpoint; compute the test metric via evaluate().
- Parameters:
params (Dict) – Hyperparameter configuration.
train_index – Row indices for the training portion of this split.
test_index – Row indices for the testing portion of this split.
- Return type:
float
- Returns:
Scalar test loss for this fold (lower is better).
- get_recommended_configspace()[source]
Recommended hyperparameter search space for Tabular ResNet.
- Hyperparameters:
n_hidden: Integer, log-uniform in [1, 6]
layer_size: Integer, log-uniform in [8, 512]
normalization: Categorical in {‘batchnorm’, ‘layernorm’}
hidden_factor: Float in [1.0, 4.0]
hidden_dropout: Float in [0.0, 0.5]
residual_dropout: Float in [0.0, 0.5]
lr: Float, log-uniform in [1e-5, 1e-1]
weight_decay: Float, log-uniform in [1e-8, 1e-2]
- Returns:
A config space ready to plug into your optimizer.
- Return type:
ConfigurationSpace
Optuna Wrapper
- fcvopt.crossvalidation.optuna_obj.get_optuna_objective(cvobj, config, start_fold_idxs=None, rng_seed=None)[source]
Utility function that wraps the cross-validation objective for use with Optuna.
Note
In each trial, a holdout loss for a single fold is returned. By default, a random fold is chosen from the folds available in the cross-validation object. If start_fold_idxs is provided, the first len(start_fold_idxs) trials will use the specified fold indices, and the remaining trials will choose a random fold from the available folds.
- Parameters:
cvobj (CVObjective) – The cross-validation object that implements the __call__ method to compute the loss for a given hyperparameter configuration.
config (ConfigurationSpace) – The hyperparameter search space.
start_fold_idxs (Optional[List]) – A list of integers specifying the fold index used for each of the first len(start_fold_idxs) trials; subsequent trials choose a random fold from the available folds. If None, a random fold is chosen for every trial.
rng_seed (Optional[int]) – An optional random seed for reproducibility.
- Return type:
Callable
- Returns:
A function that takes an Optuna trial object and returns the validation loss at a single fold for the given hyperparameter configuration.
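The fold-per-trial pattern can be sketched as follows (a hypothetical illustration, not the fcvopt source; `alpha` is a stand-in hyperparameter, and any object exposing suggest_float, such as an optuna.trial.Trial, works as `trial`):

```python
import random

def make_fold_objective(cvobj, n_folds, start_fold_idxs=None, rng_seed=None):
    # Queue of pre-specified folds for the first trials; later trials
    # fall back to a randomly chosen fold.
    rng = random.Random(rng_seed)
    queue = list(start_fold_idxs or [])

    def objective(trial):
        # A real wrapper would sample every hyperparameter in the
        # provided config space; one float suffices for the sketch.
        params = {"alpha": trial.suggest_float("alpha", 1e-3, 1.0, log=True)}
        fold = queue.pop(0) if queue else rng.randrange(n_folds)
        # Holdout loss for a single fold.
        return cvobj(params, fold_idxs=[fold])

    return objective
```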
Utility classes
- class fcvopt.crossvalidation.resnet_cvobj.TabularResNet(input_dim, output_dim, n_hidden=2, layer_size=64, normalization='batchnorm', hidden_factor=2.0, hidden_dropout=0.1, residual_dropout=0.05)[source]
Bases:
Module
Tabular ResNet model.
A shallow fully connected stem followed by several residual blocks and a prediction head (Norm → ReLU → Linear).
Note
This implementation expects all features to be numeric. Preprocess categorical columns (e.g., one-hot or target encoding) beforehand.
See Gorishniy et al. (2021) for more details.
- Parameters:
input_dim (int) – Input feature dimension.
output_dim (int) – Output dimension (1 for regression/binary classification, or the number of classes for multiclass).
n_hidden (int) – Number of residual blocks (default: 2).
layer_size (int) – Width of the hidden representation (default: 64).
normalization (str) – 'batchnorm' or 'layernorm'.
hidden_factor (float) – Expansion factor inside each residual block (hidden width = floor(hidden_factor * layer_size)).
hidden_dropout (float) – Dropout rate inside residual blocks.
residual_dropout (float) – Dropout rate on the residual output.
- Shape:
Input: (N, input_dim)
Output: (N, output_dim)
- ff
Input stem (Linear(input_dim, layer_size)) followed by residual blocks.
- prediction
Norm → ReLU → Linear head to output_dim.
- forward(x)[source]
Run the forward pass of the network.
- Return type:
Tensor
Note
Call the Module instance itself rather than forward() directly; the instance call runs registered hooks, while a direct call silently ignores them.
- class fcvopt.crossvalidation.resnet_cvobj.ResNetBlock(input_dim, normalization, hidden_factor=2.0, hidden_dropout=0.1, residual_dropout=0.05)[source]
Bases:
Module
Residual block for a feed-forward network with dropout (tabular data).
The block computes:
x + Dropout( Linear( Dropout( ReLU( Linear( Norm(x) ) ) ) ) )
where Norm is either batch normalization or layer normalization.
See Gorishniy et al. (2021) for details.
- Parameters:
input_dim (int) – Last dimension of the input tensor.
normalization (str) – 'batchnorm' or 'layernorm'.
hidden_factor (float) – Hidden width inside the block is floor(hidden_factor * input_dim).
hidden_dropout (float) – Dropout rate inside the hidden path.
residual_dropout (float) – Dropout rate applied to the residual output.
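In eval mode (both dropouts inactive) with layer normalization, the block's computation can be sketched in NumPy (hypothetical weight names; the real module uses torch layers):

```python
import numpy as np

def resnet_block_eval(x, W1, b1, W2, b2, eps=1e-5):
    # Norm(x): layer normalization over the feature axis.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    z = (x - mu) / np.sqrt(var + eps)
    # Hidden path: Linear -> ReLU -> Linear (dropouts are identity in eval).
    h = np.maximum(z @ W1 + b1, 0.0)
    out = h @ W2 + b2
    # Residual connection.
    return x + out
```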
- forward(x)[source]
Run the forward pass of the block.
- Return type:
Tensor
Note
Call the Module instance itself rather than forward() directly; the instance call runs registered hooks, while a direct call silently ignores them.