
CrabNet benchmarking

ritalyu17 opened this issue 1 year ago • 7 comments

Work-in-progress CrabNet hyperparameters benchmarking task.

To do:

CrabNet has fairly strict constraint requirements, so we may need to replace simulate_scenarios with the code below to enforce the constraints:

# imports (searchspace, objective, WRAPPED_FUNCTION, N_DOE_ITERATIONS, and
# BATCH_SIZE are assumed to be defined elsewhere in the benchmark script)
from copy import deepcopy

import numpy as np
import pandas as pd

from baybe import Campaign
# NOTE: set_random_seed is assumed to live in baybe.utils.random; adjust if your version differs
from baybe.utils.random import set_random_seed

# create the campaign
campaign = Campaign(searchspace=searchspace, objective=objective)

# set up the optimization loop: one deep-copied campaign per random seed
random_seed_list = [23, 42, 87, 131, 518]

results = pd.DataFrame()
for seed in random_seed_list:
    set_random_seed(seed)

    # copy the campaign
    campaign_i = deepcopy(campaign)

    for k in range(N_DOE_ITERATIONS): 
        recommendation = campaign_i.recommend(batch_size=BATCH_SIZE)
        # select the numerical columns
        numerical_cols = recommendation.select_dtypes(include='number')
        # replace values less than 1e-6 with 0 in numerical columns
        numerical_cols = numerical_cols.map(lambda x: 0 if x < 1e-6 else x)
        # update the original DataFrame
        recommendation.update(numerical_cols)
        
        # if x6+x15 >1.0, round x6 and x15 to 4 decimal places
        if recommendation['x6'].item() + recommendation['x15'].item() > 1.0: 
            recommendation['x6'] = np.round(recommendation['x6'].item(), 4)
            recommendation['x15'] = np.round(recommendation['x15'].item(), 4)

        # if x19 >= x20, subtract 1e-6 from x19 and add 1e-6 to x20
        if recommendation['x19'].item() >= recommendation['x20'].item():
            recommendation['x19'] = recommendation['x19'].item() - 1e-6
            # if recommendation['x19'] < 0, assign 0 to x19
            if recommendation['x19'].item() < 0:
                recommendation['x19'] = 0
            recommendation['x20'] = recommendation['x20'].item() + 1e-6
            # clamp x20 at its upper bound of 1
            if recommendation['x20'].item() > 1:
                recommendation['x20'] = 1

        # target values are looked up via the botorch wrapper
        target_values = []
        for index, row in recommendation.iterrows():
            target_values.append(WRAPPED_FUNCTION(**row.to_dict()))

        recommendation["Target"] = target_values
        campaign_i.add_measurements(recommendation)   
    results = pd.concat([results, campaign_i.measurements])
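
For reference, the two constraints enforced manually above (x6 + x15 <= 1 and x19 < x20) could presumably be expressed natively once continuous constraints are supported by the recommender. A minimal sketch, assuming BayBE's ContinuousLinearConstraint API (not taken from the actual benchmark code):

# Sketch only: hypothetical native constraint definitions, assuming
# ContinuousLinearConstraint is available in the installed BayBE version.
from baybe.constraints import ContinuousLinearConstraint

constraints = [
    # x6 + x15 <= 1.0
    ContinuousLinearConstraint(
        parameters=["x6", "x15"], operator="<=", coefficients=[1.0, 1.0], rhs=1.0
    ),
    # x19 - x20 <= 0, i.e. x19 <= x20 (the strict inequality is approximated)
    ContinuousLinearConstraint(
        parameters=["x19", "x20"], operator="<=", coefficients=[1.0, -1.0], rhs=0.0
    ),
]

These would be supplied when constructing the search space, instead of post-processing each recommendation.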

ritalyu17 avatar Dec 03 '24 06:12 ritalyu17

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

:white_check_mark: ritalyu17
:x: pre-commit-ci[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant avatar Dec 03 '24 06:12 CLAassistant

CrabNet hyperparameters is a benchmark with 20 continuous and 3 categorical inputs. To avoid the constraint error, the 20 continuous parameters are treated as discrete values. This resolves the error raised when the constraints are not met and decreases the run time significantly.
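
For illustration, a minimal sketch of what this discretization might look like for two of the fraction parameters (the value grids, the tolerance, and the use of DiscreteSumConstraint are assumptions for illustration, not the actual benchmark definitions):

# Sketch only: two hypothetical discretized parameters with a sum constraint.
import numpy as np

from baybe.constraints import DiscreteSumConstraint, ThresholdCondition
from baybe.parameters import NumericalDiscreteParameter
from baybe.searchspace import SearchSpace

# treat the (formerly continuous) fraction parameters as discrete grids
x6 = NumericalDiscreteParameter(name="x6", values=np.linspace(0.0, 1.0, 21), tolerance=1e-3)
x15 = NumericalDiscreteParameter(name="x15", values=np.linspace(0.0, 1.0, 21), tolerance=1e-3)

# enforce x6 + x15 <= 1.0 directly on the discrete search space
constraint = DiscreteSumConstraint(
    parameters=["x6", "x15"],
    condition=ThresholdCondition(threshold=1.0, operator="<="),
)

searchspace = SearchSpace.from_product(parameters=[x6, x15], constraints=[constraint])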

Ready for review; thanks for providing the feedback.

ritalyu17 avatar Dec 17 '24 06:12 ritalyu17

Just FYI: I figured out the issue with using continuous variables, see #454. I will review this next year / mid-January :)

AVHopp avatar Dec 19 '24 16:12 AVHopp

@ritalyu17 Just wanted to mention that the issue we discussed should now be fixed on main since #441 has been merged (see #454). Hence, please rebase onto main, adjust the example, and check whether it now works without the workarounds; then I'll give it a full review :)

AVHopp avatar Jan 08 '25 14:01 AVHopp

Hi @AVHopp, I have rebased onto main. However, I now get an error at simulate_scenarios. For example, when I run the small example you constructed in #454, I receive a TypeError here:

# searchspace, objective, and adv_opt are defined as in the small example from #454
from baybe import Campaign
from baybe.simulation import simulate_scenarios

df_result = simulate_scenarios(
    {"Default Recommender": Campaign(searchspace=searchspace, objective=objective)},
    adv_opt,
    batch_size=1,
    n_doe_iterations=3,
    n_mc_iterations=2,
)

TypeError: adv_opt() missing 6 required positional arguments: 'c2', 'c3', 'x6', 'x15', 'x19', and 'x20'

For this small example, adv_opt takes 7 arguments, yet the TypeError says 6 are missing. For the full CrabNet adv_opt, which takes 23 arguments, the TypeError states that 22 are missing:

TypeError: adv_opt() missing 22 required positional arguments: 'c2', 'c3', 'x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9', 'x10', 'x11', 'x12', 'x13', 'x14', 'x15', 'x16', 'x17', 'x18', 'x19', and 'x20'
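
One workaround I could imagine, assuming (I have not confirmed this) that the lookup callable is now expected to receive the whole recommendation DataFrame and return it with the target column attached, rather than being called with individual keyword arguments, would be a small row-wise adapter around adv_opt:

# Sketch only: hypothetical adapter, assuming a DataFrame-in/DataFrame-out lookup interface.
import pandas as pd


def adv_opt_lookup(df: pd.DataFrame) -> pd.DataFrame:
    """Evaluate adv_opt on each recommended parameter configuration."""
    out = df.copy()
    out["Target"] = [adv_opt(**row.to_dict()) for _, row in df.iterrows()]
    return out

and then pass adv_opt_lookup instead of adv_opt to simulate_scenarios.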

Any thoughts?

ritalyu17 avatar Jan 23 '25 09:01 ritalyu17

Hi @AdrianSosic, the coding-convention updates and some minor changes have been made. Similarly, transfer learning with different initial data sizes needs attention; see lines 333-349 in the CrabNet benchmark.

ritalyu17 avatar Mar 03 '25 04:03 ritalyu17

Hi @AdrianSosic, I have incorporated the changes and rebased onto main, similar to the other benchmarks.

However, when running transfer learning for CrabNet, I get a baybe.exceptions.NothingToSimulateError. The setup for CrabNet is similar to Hardness, and the Hardness transfer learning works. Do you have any idea?

ritalyu17 avatar Mar 05 '25 02:03 ritalyu17

@ritalyu17 @AdrianSosic @sgbaird @Scienfitz what is the status here?

AVHopp avatar Aug 05 '25 08:08 AVHopp

@AVHopp we need an assessment of whether this code is even integratable and of what still needs to be done, OR we simply abandon it in the absence of the OP.

Scienfitz avatar Aug 15 '25 10:08 Scienfitz

inactive

Scienfitz avatar Sep 01 '25 10:09 Scienfitz

Thanks for following up on this. The code was ready for review in March, and I’ve been keeping it on hold as I understood priorities were elsewhere. Happy to pick this back up and adjust as needed.

ritalyu17 avatar Sep 07 '25 19:09 ritalyu17