
Hyperparameter Tuning with HYPEROPT

A Simple Guide to Get Started with Hyperopt

Sameer
Oct 23, 2021 · 3 min read

What is a Hyperparameter?

In machine learning, a hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters (typically node weights) are derived via training (Ref. 1). For example, the learning rate of a gradient-descent optimiser is set before training begins, while the model's weights are learned during it.

Why tune them?

Motivation

Several recent advances to the state of the art in image classification benchmarks have come from better configurations of existing techniques rather than novel approaches to feature learning (Ref. 2).

How to tune them?

HYPEROPT

Hyperopt is a Python library for serial and parallel optimisation over awkward search spaces, which may include real-valued, discrete, and conditional dimensions (Ref. 3).

Building Blocks

There are 3 building blocks to using hyperopt. Let's look at them one by one.

  1. Define an Objective Function
  2. Define a Hyperspace
  3. Choose a Hyperopt algorithm

Demonstration

1. Defining an Objective Function

An objective function computes the loss (or score) that Hyperopt will minimise. I am using an XGBRegressor model inside a scikit-learn pipeline to build the objective function.

from sklearn.model_selection import cross_val_score

def objective(space):
    pipeline = get_pipeline(space)

    # 5-fold cross-validation on the validation set; cross_val_score
    # clones and refits the pipeline on each fold, so no separate
    # fit() call is needed here. neg_mean_absolute_error is negated
    # back into a positive MAE.
    score = (-1 * cross_val_score(pipeline, X_valid, y_valid,
                                  cv=5,
                                  scoring='neg_mean_absolute_error')).mean()

    print("SCORE:", score)

    # Hyperopt minimises whatever we report as 'loss'.
    return {'loss': score, 'status': STATUS_OK}

Notice how all the hyperparameters come in via a dictionary called space.

def get_pipeline(space):
    # hp.quniform samples floats, so integer hyperparameters must be
    # cast back to int before reaching XGBoost.
    model = XGBRegressor(
        n_estimators = int(space['n_estimators']), 
        max_depth = int(space['max_depth']),
        learning_rate = space['learning_rate'], 
        colsample_bytree = space['colsample_bytree'],
        subsample = space['subsample'],
        reg_alpha = space['reg_alpha'],
        random_state = 1,
        tree_method = 'gpu_hist')  # requires a GPU; use 'hist' on CPU

    # 'preprocessor' is defined earlier in the notebook (Ref. 5).
    pipeline = Pipeline(steps=[
        ('Preprocessing', preprocessor),
        ('Model', model)
    ])

    return pipeline
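
For completeness, here is a minimal stand-in for the preprocessor the pipeline expects; the real one lives in the notebook (Ref. 5), and the column lists numeric_cols and categorical_cols are hypothetical names for your dataset's columns.

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Minimal sketch of a preprocessor; numeric_cols and categorical_cols
# are assumed column lists, not names from the original notebook.
preprocessor = ColumnTransformer(transformers=[
    ('num', SimpleImputer(strategy='median'), numeric_cols),
    ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_cols),
])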

2. Define a Hyperspace

A hyperspace is the range (the search space) within which each hyperparameter can fall.

Instead of guessing the best values of these XGBRegressor hyperparameters for our unique dataset, we provide a range of values based on the typical ranges recommended for the model.

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

space = {
    'n_estimators': hp.quniform('n_estimators', 10, 1500, 1), 
    'max_depth': hp.quniform('max_depth', 3, 18, 1),
    'learning_rate': hp.uniform('learning_rate', 0.14, 1), 
    'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1),
    'subsample': hp.uniform('subsample', 0.99, 1),
    'reg_alpha': hp.uniform('reg_alpha', 1, 30),
}

Hyperopt provides various range creators, like the quniform and uniform used above. For more, see Ref. 4.
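
A few of the other range creators are sketched below; the parameter names here (booster, gamma, min_child_weight) are illustrative and not part of the search space above.

import numpy as np
from hyperopt import hp

# Illustrative examples of other hyperopt range creators:
extra_space = {
    'booster': hp.choice('booster', ['gbtree', 'dart']),             # pick one of N options
    'gamma': hp.loguniform('gamma', np.log(1e-3), np.log(10.0)),     # uniform on a log scale
    'min_child_weight': hp.quniform('min_child_weight', 1, 10, 1),   # quantized uniform
}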

3. Choose a Hyperopt algorithm

Once we have the objective function and the hyperspace ready, we let Hyperopt find the best combination of all the hyperparameters. We need to specify which algorithm it should use to zero in on them; we are using tpe.suggest (Tree-structured Parzen Estimator) here.

trials = Trials()

# fmin runs the objective up to max_evals times, letting TPE propose
# the next point to try, and returns the best values found.
best_hyperparams = fmin(fn = objective,
                        space = space,
                        algo = tpe.suggest,
                        max_evals = 100,
                        trials = trials)

print(best_hyperparams)
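
The Trials object keeps a record of every evaluation, which is handy for inspecting the search afterwards; a quick sketch using its losses() and best_trial members:

# Inspect the completed search.
losses = trials.losses()                     # one 'loss' value per evaluation, in order
print("Best MAE found:", min(losses))
print("Best trial's raw values:", trials.best_trial['misc']['vals'])

If you want a baseline to compare TPE against, you can also swap tpe.suggest for hyperopt's rand.suggest to run a plain random search over the same space.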

Result

The best hyperparameter values will look something like the ones below. Note that fmin returns quniform-sampled values as floats (e.g. max_depth: 3.0), which is why get_pipeline casts them back to int. Once we have the best hyperparameters, we can build the final model with them.

best_hyperparams = {'colsample_bytree': 0.894979665982678,
                    'learning_rate': 0.23740393006670762,
                    'max_depth': 3.0,
                    'n_estimators': 240.0,
                    'reg_alpha': 16.040384642014846,
                    'subsample': 0.9968732023279113}

# Refit the best pipeline on the combined train + validation data,
# then predict on the held-out test set.
best_pipeline = get_pipeline(best_hyperparams)
best_pipeline.fit(pd.concat([X_train, X_valid]), pd.concat([y_train, y_valid]))
predictions_test = best_pipeline.predict(X_test)

That's all folks!

References

  1. Hyperparameter (machine learning) on Wikipedia
  2. Algorithms for Hyper-Parameter Optimization
  3. Hyperopt on GitHub
  4. Range creators supported by Hyperopt
  5. My Kaggle notebook using Hyperopt