
Results: 10 issues by jiafu zhang

With the latest commit of mosaicml/mpt-7b, https://huggingface.co/mosaicml/mpt-7b/commit/67cf22a4e6809edb7308dd0a2ae2c1ffb86f4984, BigDL throws the error below when generating text.

INFO 2024-02-20 06:41:05,962 proxy 172.17.0.2 0dfd2310-daba-4d40-8b27-6ccbbd608fd2 /mpt-7b-bigdl router.py:959 - Using router .
INFO 2024-02-20 06:41:05,978 proxy 172.17.0.2 router.py:496 -...

user issue
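A minimal reproduction sketch for the MPT-7b issue above, assuming the bigdl-llm transformers-style API and pinning the model to the commit linked in the issue; the prompt and generation settings are placeholders, not taken from the report:

```python
# Hypothetical reproduction sketch; assumes the bigdl-llm
# transformers-compatible API (bigdl.llm.transformers).
from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM

model_name = "mosaicml/mpt-7b"
revision = "67cf22a4e6809edb7308dd0a2ae2c1ffb86f4984"  # commit linked in the issue

tokenizer = AutoTokenizer.from_pretrained(
    model_name, revision=revision, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    revision=revision,
    trust_remote_code=True,
    load_in_4bit=True,  # BigDL-LLM INT4 optimization (assumed repro setting)
)

inputs = tokenizer("Hello", return_tensors="pt").input_ids
# Generation is where the issue reports the error being thrown.
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```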

In bigdl-llm 2.4, the transformers version is 4.31, but the codellama example needs 4.34.1. In bigdl-llm 2.5.0b20240130, the transformers version is still 4.31. @gc-fu told me bigdl-llm 2.5.0b20240130 is compatible with transformers 4.36....

user issue
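Not from the issue, but a quick way to confirm which versions are actually installed when debugging a pin conflict like the one above (standard library only):

```python
# Print the installed versions of the two packages discussed above
# (standard library; Python >= 3.8).
from importlib.metadata import PackageNotFoundError, version

for pkg in ("bigdl-llm", "transformers"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```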

Here are the options provided by llama.cpp:

--prompt-cache FNAME    file to cache prompt state for faster startup (default: none)
--prompt-cache-all      if specified, saves user input and generations to cache as well....

user issue
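A sketch of how the two cache flags above might be exercised; the binary name, model path, and prompt below are placeholders, and only the --prompt-cache flags come from the issue:

```python
# Drive the llama.cpp CLI with the prompt-cache flags quoted above.
# Binary and model paths are placeholders, not from the issue.
import subprocess

cmd = [
    "./main",                          # llama.cpp example binary (assumed path)
    "-m", "models/7B/ggml-model.gguf", # placeholder model file
    "-p", "Once upon a time",          # placeholder prompt
    "--prompt-cache", "prompt.cache",  # cache prompt state for faster startup
    "--prompt-cache-all",              # also save user input and generations
]
subprocess.run(cmd, check=True)
```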

We need to prepend the job ID to the hardcoded log file prefix, java-worker-*.log, instead of getting the prefix from the "ray.logging.file-prefix" system property, since Ray hasn't applied the fix (https://github.com/ray-project/ray/pull/33665) yet.
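Illustration only, under one plausible reading of the naming scheme; the actual change lives in RayDP's Java worker startup, and the values below are placeholders:

```python
# One plausible resulting file name once the job ID is prepended to the
# hardcoded "java-worker" prefix; values are placeholders.
job_id = "raysubmit_12345"
worker_index = 0
log_file = f"{job_id}-java-worker-{worker_index}.log"
print(log_file)  # raysubmit_12345-java-worker-0.log
```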

There is a security issue report, https://github.com/oap-project/raydp/security/dependabot/6.

> Package: protobuf
>
> Affected versions: >= 3.19.0, < 3.19.5
>
> Patched version: 3.19.5
>
> protobuf-cpp and protobuf-python have potential...
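The usual remediation is to require the patched range; a hedged sketch of what that could look like in a setup.py, where only the protobuf lower bound comes from the advisory and the surrounding entries are illustrative:

```python
# Illustrative setup.py fragment; only the protobuf lower bound comes
# from the advisory above.
from setuptools import setup

setup(
    name="example-package",   # placeholder
    version="0.0.1",          # placeholder
    install_requires=[
        "protobuf>=3.19.5",   # patched version per the advisory
    ],
)
```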

As stated in https://github.com/ray-project/ray/pull/33797, it's not necessary for Ray to monitor RayDP's JVM logs. This ticket is to make the corresponding changes in RayDP by setting the JVM log file prefix to 'raydp-java-worker'....

Take sendBuf as an example: the buffer parameter is not put into fi_context2. Since fi_send is asynchronous, the buffer must stay valid until the corresponding completion arrives; as it stands, after the sendBuf method returns, the buffer may get released even though its pointer was passed to fi_send...
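The pattern the issue asks for, sketched in Python purely for illustration (the real fix belongs in the native sendBuf path, storing the buffer in fi_context2): keep the buffer referenced by the in-flight operation's context until its completion arrives.

```python
# Language-agnostic illustration: tie the buffer's lifetime to the
# in-flight operation's context so it cannot be released mid-send.
# All names here are hypothetical.
pending = {}  # context id -> buffer kept alive until completion

def send_buf(ctx_id, buffer):
    pending[ctx_id] = buffer  # retain the buffer for the async send
    # ... hand (ctx_id, buffer) to the native fi_send here ...

def on_completion(ctx_id):
    pending.pop(ctx_id, None)  # only now is it safe to release the buffer
```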

```python
# Likely imports (assumption: this snippet uses the neural-speed Model API;
# the original issue omits the import lines).
from transformers import AutoTokenizer
from neural_speed import Model

model_name = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = Model()
model.init(model_name, use_quant=True, weight_dtype="int4", compute_dtype="int8")
tokens = tokenizer("What's your favorite animal?", return_tensors='pt').input_ids
outputs = model.generate(tokens, num_beams=2, do_sample=False, max_new_tokens=10)
text = ...  # snippet truncated in the original issue
```