Welcome to FCVOpt’s Documentation!

FCVOpt is a Python package for Fractional Cross-Validation in hyperparameter optimization. It implements the methodology from “Fractional cross-validation for optimizing hyperparameters of supervised learning algorithms” using hierarchical Gaussian processes to efficiently optimize ML models by evaluating only a fraction of CV folds.

Key Innovation: While K-fold cross-validation is more robust than holdout validation, it is computationally expensive since models must be fit K times at each hyperparameter configuration. FCVOpt addresses this by exploiting the correlation structure between folds across the hyperparameter space, requiring evaluation of only a single fold for many configurations.
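As a toy illustration of this correlation structure (this is not FCVOpt code; all names below are illustrative), the per-fold CV losses of a model tend to move together as the hyperparameters change, which is what lets a hierarchical GP predict the remaining folds from a single evaluated one. The sketch below estimates the fold-to-fold correlation across a few configurations using plain scikit-learn:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Toy sketch (not part of FCVOpt): compute per-fold CV losses for several
# hyperparameter configurations and measure how correlated two folds are
# across those configurations.
X, y = make_classification(n_samples=500, random_state=0)

depths = [2, 4, 8, 16]  # a few illustrative configurations
fold_losses = np.array([
    1 - cross_val_score(  # zero-one loss = 1 - accuracy
        RandomForestClassifier(max_depth=d, n_estimators=50, random_state=0),
        X, y, cv=5,
    )
    for d in depths
])  # shape: (n_configs, n_folds)

# Correlation of fold-0 vs fold-1 losses across configurations; a high
# value is the structure a hierarchical GP can exploit.
corr = np.corrcoef(fold_losses[:, 0], fold_losses[:, 1])[0, 1]
print(f"fold-0 vs fold-1 loss correlation across configs: {corr:.2f}")
```

The closer this correlation is to 1, the more information a single fold carries about the full K-fold average at that configuration.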

🚀 Key Features

  • Efficient Optimization: Evaluate hyperparameters using only a subset of CV folds via hierarchical Gaussian processes

  • Standard Bayesian Optimization: Also available for hyperparameter optimization with holdout validation loss, and for general-purpose black-box optimization

  • Intelligent Fold Selection: Variance reduction strategy that selects which CV folds to evaluate at each configuration

  • MLflow Integration: Automatic experiment tracking and model versioning

  • Multiple Acquisition Functions: Knowledge Gradient, Lower Confidence Bound

  • Framework Support: Scikit-learn, XGBoost, Neural Networks, and more
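The fold-selection feature above is based on a variance-reduction criterion computed from the hierarchical GP posterior. As a hedged toy sketch of the general idea (not FCVOpt's actual criterion; the variances here are made up), one can greedily evaluate the not-yet-observed fold whose loss estimate is currently most uncertain:

```python
import numpy as np

# Toy sketch of greedy uncertainty-based fold selection (illustrative only,
# not FCVOpt's implementation): pick the unevaluated fold with the largest
# posterior variance of its loss at the candidate configuration.
rng = np.random.default_rng(0)
posterior_var = rng.uniform(0.01, 0.2, size=5)  # hypothetical per-fold variances
evaluated = {2}  # folds already evaluated at this configuration

candidates = [k for k in range(5) if k not in evaluated]
next_fold = max(candidates, key=lambda k: posterior_var[k])
print(f"evaluate fold {next_fold} next")
```

Observing the most uncertain fold shrinks the predictive variance of the full K-fold loss the most under this simplified greedy rule.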

📖 Quick Start

Install FCVOpt:

git clone https://github.com/syerramilli/fcvopt.git
cd fcvopt
pip install .

Basic usage:

from fcvopt.optimizers import FCVOpt
from fcvopt.crossvalidation import SklearnCVObj
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import zero_one_loss
from fcvopt.configspace import ConfigurationSpace
from ConfigSpace import Integer, Float


# Create the cross-validation objective
# (X and y are your feature matrix and labels, loaded beforehand)
cv_obj = SklearnCVObj(
    estimator=RandomForestClassifier(),
    X=X, y=y,
    loss_metric=zero_one_loss,     # metric to minimize
    task='binary-classification',
    n_splits=5,                    # 5-fold cross-validation
    rng_seed=42
)

# Define the hyperparameter configuration space
config = ConfigurationSpace()
config.add([
    Integer('n_estimators', bounds=(10, 1000), log=True),
    Integer('max_depth', bounds=(1, 12), log=True),
    Float('max_features', bounds=(0.1, 1), log=True),
])
config.generate_indices()

# Initialize the optimizer
optimizer = FCVOpt(
    obj=cv_obj.cvloss,
    n_folds=cv_obj.cv.get_n_splits(),
    config=config,
    acq_function='LCB',             # 'KG' gives better results but is slower
    tracking_dir='./hp_opt_runs/',  # directory for MLflow tracking
    experiment_name='rf_hpt'
)

# Run optimization with a budget of 50 trials
best_conf = optimizer.optimize(n_trials=50)

🔬 Research Background

FCVOpt implements the algorithm described in:

“Fractional cross-validation for optimizing hyperparameters of supervised learning algorithms”
Suraj Yerramilli and Daniel W. Apley
Technometrics (2025). DOI: 10.1080/00401706.2025.2515926

📚 Documentation Contents
