
Mergekit-Evolve with vLLM enabled causes error if merge_method is linear

Open umiyuki opened this issue 1 year ago • 2 comments

As the title says, if merge_method in the merge config is a method that does not require base_model, such as linear, running Mergekit-Evolve with vLLM enabled raises an error. Judging from the error message, the code tries to read a base_model value that does not exist in the merge config. For now I am using a crude workaround: I specify base_model in the merge config even though merge_method is linear. That trips an assertion, but with the assertion commented out the run completes without error.

Traceback (most recent call last):
  File "/home/umiyuki/anaconda3/envs/mergekit/bin/mergekit-evolve", line 8, in <module>
    sys.exit(main())
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/umiyuki/mergekit-evolve-elyzatask100/mergekit/mergekit/scripts/evolve.py", line 305, in main
    xbest, es = cma.fmin2(
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/cma/evolution_strategy.py", line 4392, in fmin2
    res = fmin(objective_function, x0, sigma0,
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/cma/evolution_strategy.py", line 4818, in fmin
    X, fit = es.ask_and_eval(parallel_objective or objective_function,
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/cma/evolution_strategy.py", line 2479, in ask_and_eval
    fit_first = func(X_first, *args)
  File "/home/umiyuki/mergekit-evolve-elyzatask100/mergekit/mergekit/scripts/evolve.py", line 264, in parallel_evaluate
    res = strat.evaluate_genotypes(x)
  File "/home/umiyuki/mergekit-evolve-elyzatask100/mergekit/mergekit/evo/strategy.py", line 100, in evaluate_genotypes
    return list(
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/ray/util/actor_pool.py", line 113, in get_generator
    yield self.get_next()
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/ray/util/actor_pool.py", line 309, in get_next
    return ray.get(future)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/ray/_private/worker.py", line 2623, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
  File "/home/umiyuki/anaconda3/envs/mergekit/lib/python3.10/site-packages/ray/_private/worker.py", line 861, in get_objects
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::InMemoryMergeEvaluator.evaluate_genotype() (pid=179137, ip=172.19.199.244, actor_id=7e3ae39061e77bc716bca6a401000000, repr=<mergekit.evo.actors.InMemoryMergeEvaluator object at 0x7f327e414d00>)
  File "/home/umiyuki/mergekit-evolve-elyzatask100/mergekit/mergekit/evo/actors.py", line 303, in evaluate_genotype
    return self.evaluate(genotype)
  File "/home/umiyuki/mergekit-evolve-elyzatask100/mergekit/mergekit/evo/actors.py", line 236, in evaluate
    self._maybe_init_model(config)
  File "/home/umiyuki/mergekit-evolve-elyzatask100/mergekit/mergekit/evo/actors.py", line 202, in _maybe_init_model
    self.genome.definition.base_model.model.path, use_fast=True
AttributeError: 'NoneType' object has no attribute 'model'

umiyuki avatar May 07 '24 04:05 umiyuki

+1

hammoudhasan avatar May 12 '24 12:05 hammoudhasan

An error can be raised whether or not base_model is defined. I changed the code at line 202 of evo/actors.py from

tok = transformers.AutoTokenizer.from_pretrained(
    self.genome.definition.base_model.model.path, use_fast=True
)

to

if self.genome.definition.base_model is not None:
    tok = transformers.AutoTokenizer.from_pretrained(
        self.genome.definition.base_model.model.path, use_fast=True
    )
else:
    tok = transformers.AutoTokenizer.from_pretrained(
        self.genome.definition.models[0].model.path, use_fast=True
    )

sangmandu avatar May 23 '24 09:05 sangmandu
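
The fallback in the comment above can also be written as a small helper, which makes the path-selection logic easy to check in isolation. This is only a sketch under stated assumptions: `ModelRef`, `ModelEntry`, `MergeDefinition`, and `tokenizer_source_path` are hypothetical stand-ins that mimic the attribute chain seen in the traceback (`definition.base_model.model.path`, `definition.models[0].model.path`); they are not mergekit's actual classes. Using the first listed model as the tokenizer source is the commenter's choice, and it may not be appropriate if the merged models use different tokenizers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical stand-ins that only mirror the attribute chain from the traceback.
@dataclass
class ModelRef:
    path: str


@dataclass
class ModelEntry:
    model: ModelRef


@dataclass
class MergeDefinition:
    merge_method: str
    models: List[ModelEntry] = field(default_factory=list)
    base_model: Optional[ModelEntry] = None  # None for methods such as linear


def tokenizer_source_path(definition: MergeDefinition) -> str:
    """Return the model path to load the tokenizer from.

    Prefer base_model when it is set; otherwise fall back to the first
    model in the list, as in the patch above.
    """
    if definition.base_model is not None:
        return definition.base_model.model.path
    if not definition.models:
        raise ValueError("merge definition has neither base_model nor models")
    return definition.models[0].model.path


# Example: a linear merge with no base_model falls back to the first model.
linear = MergeDefinition(
    merge_method="linear",
    models=[ModelEntry(ModelRef("org/model-a")), ModelEntry(ModelRef("org/model-b"))],
)
print(tokenizer_source_path(linear))
```

With a helper like this, the call at line 202 would reduce to a single `transformers.AutoTokenizer.from_pretrained(path, use_fast=True)` call on the returned path.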