Hyperparameter Tuning with HYPEROPT
A Simple Guide to Get Started with Hyperopt
What is a Hyperparameter?
In machine learning, a hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters (typically node weights) are derived via training (Ref. 1). For example, XGBoost's learning_rate is a hyperparameter you set before training, while the split thresholds inside each tree are parameters learned from the data.
Why tune them?
Motivation
Several recent advances to the state of the art in image classification benchmarks have come from better configurations of existing techniques rather than novel approaches to feature learning (Ref. 2).
How to tune them?
HYPEROPT
Hyperopt is a Python library for serial and parallel optimisation over awkward search spaces, which may include real-valued, discrete, and conditional dimensions (Ref. 3).
Building Blocks
There are three building blocks to using Hyperopt. Let's look at them one by one (a minimal end-to-end sketch follows the list):
- Define an Objective Function
- Define a Hyperspace
- Choose a Hyperopt algorithm
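The sketch below wires the three blocks together on a toy problem, minimising f(x) = (x - 3)^2. The function and variable names here are illustrative only, not part of the demonstration that follows.
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

# 1. Objective function: returns the loss for one candidate value.
def toy_objective(x):
    return {'loss': (x - 3) ** 2, 'status': STATUS_OK}

# 2. Hyperspace: x is drawn uniformly from [-10, 10].
toy_space = hp.uniform('x', -10, 10)

# 3. Algorithm: TPE (tpe.suggest) proposes each new candidate.
best = fmin(fn=toy_objective, space=toy_space,
            algo=tpe.suggest, max_evals=50, trials=Trials())
print(best)  # should be close to {'x': 3.0}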
Demonstration
1. Defining an Objective Function
An objective function calculates the loss (or accuracy) for the model. I am using an XGBRegressor model to build the objective function.
from sklearn.model_selection import cross_val_score

def objective(space):
    pipeline = get_pipeline(space)
    # cross_val_score clones and refits the pipeline on each fold,
    # so no separate fit/predict step is needed here
    score = (-1 * cross_val_score(pipeline, X_valid, y_valid,
                                  cv=5,
                                  scoring='neg_mean_absolute_error')).mean()
    print("SCORE:", score)
    return {'loss': score, 'status': STATUS_OK}
Notice how all the hyperparameters arrive via a dictionary called space.
from xgboost import XGBRegressor
from sklearn.pipeline import Pipeline

def get_pipeline(space):
    # hyperopt samples floats, so integer-valued parameters are cast with int()
    model = XGBRegressor(
        n_estimators=int(space['n_estimators']),
        max_depth=int(space['max_depth']),
        learning_rate=space['learning_rate'],
        colsample_bytree=space['colsample_bytree'],
        subsample=space['subsample'],
        reg_alpha=space['reg_alpha'],
        random_state=1,
        tree_method='gpu_hist')
    pipeline = Pipeline(steps=[
        ('Preprocessing', preprocessor),  # preprocessor is defined elsewhere
        ('Model', model)
    ])
    return pipeline
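The pipeline above refers to a preprocessor defined elsewhere in the original code. As a hypothetical stand-in (assuming purely numeric features with missing values), something like this would make the snippet self-contained:
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for the article's preprocessor: impute
# missing numeric values with the column median.
preprocessor = SimpleImputer(strategy='median')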
2. Define a Hyperspace
A hyperspace is the range (space) within which each hyperparameter can fall.
Instead of guessing the best values of these XGBRegressor hyperparameters for our unique dataset, we provide ranges based on the values typically recommended for the model.
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

space = {
    'n_estimators': hp.quniform('n_estimators', 10, 1500, 1),
    'max_depth': hp.quniform('max_depth', 3, 18, 1),
    'learning_rate': hp.uniform('learning_rate', 0.14, 1),
    'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1),
    'subsample': hp.uniform('subsample', 0.99, 1),
    'reg_alpha': hp.uniform('reg_alpha', 1, 30),
}
Hyperopt provides various range creators like the quniform and uniform used above; see Ref. 4 for the full list. Note that quniform quantises the range but still returns floats (e.g. 7.0 rather than 7), which is why get_pipeline casts n_estimators and max_depth with int().
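To see what Hyperopt will actually pass into the objective function, you can draw a random sample from the space with hyperopt's pyll sampler:
import hyperopt.pyll.stochastic

# Draw one random configuration from the space; note that the
# quniform dimensions come back as floats such as 7.0.
print(hyperopt.pyll.stochastic.sample(space))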
3. Choose a Hyperopt algorithm
Once we have the objective function and hyperspace ready, we let Hyperopt find the best combination of all the hyperparameters. We need to specify which algorithm to use for zeroing in on the best hyperparameters; we are using tpe.suggest (the Tree-structured Parzen Estimator) here. Hyperopt also ships rand.suggest for plain random search.
trials = Trials()
best_hyperparams = fmin(fn=objective,
                        space=space,
                        algo=tpe.suggest,
                        max_evals=100,
                        trials=trials)
print(best_hyperparams)
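The Trials object records every evaluation, which makes it easy to inspect the search afterwards using its standard accessors:
# Inspect the recorded search: best trial result and loss history.
print(trials.best_trial['result'])  # e.g. {'loss': ..., 'status': 'ok'}
print(trials.losses()[:5])          # losses of the first five evaluations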
Result
The best hyperparameter values will look like the ones below; note that fmin returns the quniform dimensions as floats (e.g. max_depth: 3.0), which get_pipeline casts back to integers. Once we have the best hyperparameters, we can create the best model using them.
import pandas as pd

best_hyperparams = {'colsample_bytree': 0.894979665982678,
                    'learning_rate': 0.23740393006670762,
                    'max_depth': 3.0,
                    'n_estimators': 240.0,
                    'reg_alpha': 16.040384642014846,
                    'subsample': 0.9968732023279113}

# retrain on the combined train + validation data with the best hyperparameters
best_pipeline = get_pipeline(best_hyperparams)
best_pipeline.fit(pd.concat([X_train, X_valid]), pd.concat([y_train, y_valid]))
predictions_test = best_pipeline.predict(X_test)
That's all folks!