
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/hpo_ngb.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_hpo_ngb.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_hpo_ngb.py:


=================
2.3 hpo ngboost
=================
This file shows how to optimize the hyperparameters of an NGBoost model.

.. GENERATED FROM PYTHON SOURCE LINES 7-23

.. code-block:: Python


    from typing import Union

    import os
    import numpy as np

    from ngboost import NGBRegressor
    from ngboost.distns import Exponential, Normal, LogNormal

    from ai4water import Model
    from ai4water.utils import TrainTestSplit
    from ai4water.utils.utils import get_version_info
    from ai4water.utils.utils import dateandtime_now, jsonize
    from ai4water.hyperopt import HyperOpt, Categorical, Real, Integer

    from utils import read_data


.. GENERATED FROM PYTHON SOURCE LINES 24-27

.. code-block:: Python


    for lib, ver in get_version_info().items():
        print(lib, ver)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    python 3.9.20 (main, Nov 5 2024, 16:07:55)
    [GCC 11.4.0]
    os posix
    ai4water 1.07
    easy_mpl 0.21.4
    SeqMetrics 2.0.0
    tensorflow 2.10.1
    keras.api._v2.keras 2.10.0
    numpy 1.21.6
    pandas 1.5.3
    matplotlib 3.7.1
    h5py 3.13.0
    sklearn 1.3.1
    seaborn 0.13.2


.. GENERATED FROM PYTHON SOURCE LINES 28-36

.. code-block:: Python


    data = read_data(target='Area (ABD) Mean')

    # all columns except the last are inputs; the last column is the target
    input_features = data.columns.tolist()[0:-1]
    output_features = data.columns.tolist()[-1:]

    print(input_features)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['Time (min)', 'Ini. CC', 'Sonic. PD', 'h20 Conc.', 'Volume (mL)', 'Solution pH']


.. GENERATED FROM PYTHON SOURCE LINES 37-39

.. code-block:: Python

    print(output_features)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['Area (ABD) Mean']


.. GENERATED FROM PYTHON SOURCE LINES 40-41

Split the data into training and test sets. The **test data will not be used during hpo**.

.. GENERATED FROM PYTHON SOURCE LINES 41-49

.. code-block:: Python


    TrainX, TestX, TrainY, TestY = TrainTestSplit(seed=313).split_by_random(
        data[input_features],
        data[output_features]
    )

    print(TrainX.shape, TestX.shape, TrainY.shape, TestY.shape)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (219, 6) (95, 6) (219, 1) (95, 1)


.. GENERATED FROM PYTHON SOURCE LINES 50-65

.. code-block:: Python


    # map the distribution names used in the search space to ngboost classes
    DISTS = {
        "Normal": Normal,
        "LogNormal": LogNormal,
        "Exponential": Exponential
    }

    ITER = 0
    VAL_SCORES = []
    SUGGESTIONS = []
    num_iterations = 150  # number of hyperparameter iterations
    SEP = os.sep
    PREFIX = f"hpo_{dateandtime_now()}"  # folder name where to save the results
    algorithm = "bayes"


.. GENERATED FROM PYTHON SOURCE LINES 66-67

Define the parameter space.

.. GENERATED FROM PYTHON SOURCE LINES 67-75

.. code-block:: Python

    param_space = [
        Categorical(["Normal", "LogNormal", "Exponential"], name="Dist"),
        Integer(100, 1000, name="n_estimators"),
        Real(0.001, 0.5, name="learning_rate"),
        #Real(0.4, 1.0, name="minibatch_frac"),
        #Real(0.4, 1.0, name="col_sample")
    ]


.. GENERATED FROM PYTHON SOURCE LINES 76-77

Set the initial values of the hyperparameters.

.. GENERATED FROM PYTHON SOURCE LINES 77-81

.. code-block:: Python

    x0 = ["Normal", 100, 0.01,
          #1.0, 1.0
          ]


.. GENERATED FROM PYTHON SOURCE LINES 82-83

Define the objective function.

.. GENERATED FROM PYTHON SOURCE LINES 83-140

.. code-block:: Python


    def objective_fn(
            return_model:bool = False,
            **suggestions
    )->Union[float, Model]:
        """
        The output of this function will be minimized
        :param return_model: whether to return the trained model or the validation
            score. This will be set to True, after we have optimized the hyperparameters
        :param suggestions: contains values of hyperparameters at each iteration
        :return: the scalar value which we want to minimize. If return_model is True
            then it returns the trained model
        """
        global ITER

        suggestions = jsonize(suggestions)
        # store a copy, because "Dist" is popped from suggestions below
        SUGGESTIONS.append(dict(suggestions))

        dist = suggestions.pop("Dist")

        # build the model
        ngb = NGBRegressor(Dist=DISTS[dist], verbose=False, **suggestions)

        model = Model(
            model=ngb,
            mode="regression",
            category="ML",
            cross_validator={"KFold": {"n_splits": 5}},
            input_features=input_features,
            output_features=output_features,
            verbosity=-1
        )

        if return_model:
            model.fit(TrainX.values, TrainY.values, validation_data=(TestX, TestY.values))
            model.evaluate(TestX, TestY, metrics=["r2", "r2_score"])
            return model

        # get the cross validation score which we will minimize
        val_score_ = model.cross_val_score(TrainX.values, TrainY.values)[0]

        # since cross val score is r2_score, we need to subtract it from 1. Because
        # we are interested in increasing r2_score, and HyperOpt algorithm always
        # minimizes the objective function
        val_score = 1 - val_score_

        VAL_SCORES.append(val_score)
        best_score = round(np.nanmin(VAL_SCORES).item(), 2)
        bst_iter = np.argmin(VAL_SCORES)

        ITER += 1

        print(f"{ITER} {round(val_score, 2)} {round(val_score_, 2)}. Best was {best_score} at {bst_iter}")

        return val_score


.. GENERATED FROM PYTHON SOURCE LINES 141-142

Initialize the hpo.

.. GENERATED FROM PYTHON SOURCE LINES 142-152

.. code-block:: Python

    optimizer = HyperOpt(
        algorithm=algorithm,
        objective_fn=objective_fn,
        param_space=param_space,
        x0=x0,
        num_iterations=num_iterations,
        process_results=False,  # set it to True if we want post-processing of results
        opt_path=f"results{SEP}{PREFIX}"
    )


.. GENERATED FROM PYTHON SOURCE LINES 153-154

Run the hpo.

.. GENERATED FROM PYTHON SOURCE LINES 154-157

.. code-block:: Python


    # res = optimizer.fit()


.. GENERATED FROM PYTHON SOURCE LINES 158-159

Print the optimized hyperparameters.

.. GENERATED FROM PYTHON SOURCE LINES 159-162

.. code-block:: Python


    # print(optimizer.best_paras())


.. GENERATED FROM PYTHON SOURCE LINES 163-164

Plot the convergence.

.. GENERATED FROM PYTHON SOURCE LINES 164-167

.. code-block:: Python


    # optimizer.plot_convergence(show=True)


.. GENERATED FROM PYTHON SOURCE LINES 168-171

.. code-block:: Python


    # optimizer.plot_convergence(original=True, show=True)


.. GENERATED FROM PYTHON SOURCE LINES 172-173

Plot the hyperparameters explored during hpo.

.. GENERATED FROM PYTHON SOURCE LINES 173-176

.. code-block:: Python


    # optimizer.plot_parallel_coords(show=True)


.. GENERATED FROM PYTHON SOURCE LINES 177-178

Build and train the model with the optimized hyperparameters.

.. GENERATED FROM PYTHON SOURCE LINES 178-180

.. code-block:: Python

    # best_model = objective_fn(return_model=True, **optimizer.best_paras())


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.014 seconds)


.. _sphx_glr_download_auto_examples_hpo_ngb.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: hpo_ngb.ipynb <hpo_ngb.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: hpo_ngb.py <hpo_ngb.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: hpo_ngb.zip <hpo_ngb.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
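
The objective function in this example returns ``1 - R2`` from 5-fold cross validation, because the optimizer minimizes whatever it receives. The block below is a minimal, standalone sketch of that idea using only numpy. It does not use ai4water or ngboost: an ordinary least-squares fit stands in for ``NGBRegressor``, the data is synthetic, and the names ``r2`` and ``cv_objective`` are hypothetical helpers introduced here for illustration. It may differ in detail from what ``Model.cross_val_score`` does internally.

.. code-block:: Python

    import numpy as np


    def r2(y_true, y_pred):
        """Coefficient of determination."""
        ss_res = np.sum((y_true - y_pred) ** 2)
        ss_tot = np.sum((y_true - y_true.mean()) ** 2)
        return 1.0 - ss_res / ss_tot


    def cv_objective(x, y, n_splits=5, seed=313):
        """Return 1 - mean R2 over K folds, so that lower is better."""
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(x)), n_splits)
        scores = []
        for k in range(n_splits):
            val_idx = folds[k]
            train_idx = np.concatenate(
                [f for i, f in enumerate(folds) if i != k]
            )
            # stand-in model (assumption): least squares with an intercept
            a_train = np.column_stack([x[train_idx], np.ones(len(train_idx))])
            coef, *_ = np.linalg.lstsq(a_train, y[train_idx], rcond=None)
            a_val = np.column_stack([x[val_idx], np.ones(len(val_idx))])
            scores.append(r2(y[val_idx], a_val @ coef))
        # a higher R2 is better, so return 1 - R2 for a minimizer
        return 1.0 - float(np.mean(scores))


    # synthetic data, only for illustration
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 3))
    y = x @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

    loss = cv_objective(x, y)
    print(f"objective (1 - R2): {loss:.4f}")

A perfect fit gives an objective of 0, and values above 1 mean the model does worse than predicting the mean, so a minimizer moving toward 0 is moving toward a higher cross-validated R2.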