Add GenAI packages
Describe the solution you'd like
The GenAI packages will provide TorchSharp implementations of a series of popular GenAI models. The goal is to be able to load the same weights as the corresponding Python models.
- [x] Add design doc (#7170)
- [x] Add Microsoft.ML.GenAI.Core (#7177)

The following models will be added in the first wave:
- [x] Phi-3 (Microsoft.ML.GenAI.Phi) #7184
  - [x] Add README to Microsoft.ML.GenAI.Phi project #7206
- [x] LLaMA (Microsoft.ML.GenAI.LLaMA) #7220
  - [x] Add README to Microsoft.ML.GenAI.LLaMA project
- [ ] Mistral (Microsoft.ML.GenAI.Mistral)
  - [x] Mistral-7b-instruct v3
  - [ ] Mistral-nemo
- [x] Generate Embedding from CausalLMModel #7227
- [ ] Stable Diffusion (Microsoft.ML.GenAI.StableDiffusion)
MEAI integration
- [ ] Add CausalLMPipelineChatClient #7270
Along with the benchmark:
- [ ] Benchmark for Phi-3
- [ ] Flash Attention support #7238
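
To give a sense of the intended API shape, here is a minimal end-to-end sketch based on the LLaMA sample discussed later in this thread. The weight folder path is a placeholder, the config name and the `Generate` call are assumptions about the pipeline surface, so treat it as illustrative rather than definitive:

```csharp
using System;
using Microsoft.ML.GenAI.Core;
using Microsoft.ML.GenAI.LLaMA;
using Microsoft.ML.Tokenizers;

// Placeholder: point this at a local Meta-Llama-3.1-8B-Instruct download.
var weightFolder = @"C:\models\Meta-Llama-3.1-8B-Instruct";

// Load the tokenizer and model, then wrap both in a causal-LM pipeline (CPU here).
var tokenizer = LlamaTokenizerHelper.FromPretrained(weightFolder, "tokenizer.model");
var model = LlamaForCausalLM.FromPretrained(weightFolder, "config.json", // config name assumed
    layersOnTargetDevice: -1, targetDevice: "cpu");
var pipeline = new CausalLMPipeline<TiktokenTokenizer, LlamaForCausalLM>(tokenizer, model, "cpu");

// Assumed generation call; the exact method name and overload may differ.
var reply = pipeline.Generate("What is ML.NET?", maxLen: 256);
Console.WriteLine(reply);
```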
Can you guys publish a preview of the Microsoft.ML.GenAI.LLaMA package?
@lostmsu You should be able to consume it from the daily build below
- https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-libraries/nuget/v3/index.json
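
For anyone else trying the daily builds, a NuGet.config along these lines (standard NuGet.config schema; the feed URL is the one above) makes the feed available to a project:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!-- dotnet-libraries daily feed that hosts the GenAI preview packages -->
    <add key="dotnet-libraries"
         value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-libraries/nuget/v3/index.json" />
  </packageSources>
</configuration>
```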
Oh, I just noticed that the GenAI packages don't have IsPackable set to true, so they're not available on the daily build. I'll publish a PR to enable the package flag.
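
For reference, IsPackable is the standard MSBuild property that controls whether `dotnet pack` produces a package for a project, so the fix presumably amounts to something like this in each GenAI .csproj (exact placement assumed):

```xml
<PropertyGroup>
  <!-- Allow this project to be packed and published to the daily feed -->
  <IsPackable>true</IsPackable>
</PropertyGroup>
```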
Can you please publish a preview of the Microsoft.ML.GenAI.Core package? It is not available on
https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-libraries/nuget/v3/index.json
The sample Microsoft.ML.GenAI.Samples/Llama/LLaMA3_1.cs is broken without it.
Furthermore, the sample has a hard-coded weight folder:
```csharp
var weightFolder = @"C:\Users\xiaoyuz\source\repos\Meta-Llama-3.1-8B-Instruct";
```
I have downloaded the model and config from the Meta site. Maybe a few comments would be helpful.
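
Something like the following comments would go a long way; the expected folder contents here are inferred from later posts in this thread, so treat them as assumptions:

```csharp
// Point this at your local copy of the Meta-Llama-3.1-8B-Instruct download.
// The folder should contain the sharded .safetensors weights plus
// model.safetensors.index.json; the tokenizer.model file ships in the
// "original" subfolder of the Meta download.
var weightFolder = @"C:\path\to\Meta-Llama-3.1-8B-Instruct";
```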
Oh, sorry, I'll make the fix
I am getting a System.IO.FileNotFoundException (couldn't find model.safetensors.index.json) when calling Microsoft.ML.GenAI.LLaMA.LlamaForCausalLM.FromPretrained(String modelFolder, String configName, String checkPointName, ScalarType torchDtype, String device). I can't get the example working; please explain where/what this file is.
@aforoughi1 Which Llama? I suppose you are running Llama 3.2 1B?
Llama3.1-8B
@aforoughi1
The error basically says it can't find {ModelFolder}/model.safetensors.index.json. Could you share the full code that calls the model, the stack trace, and a screenshot of the Llama 3.1 8B model folder?
```csharp
// issue 7169
// Meta-Llama-3.1-8B-Instruct/original
string weightFolder = @"C:\Users\abbas.llama\checkpoints\Llama3.1-8B";
string configName = "params.json";
string modelFile = "tokenizer.model";

TiktokenTokenizer tokenizer = LlamaTokenizerHelper.FromPretrained(weightFolder, modelFile);
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained(weightFolder, configName, layersOnTargetDevice: -1, targetDevice: "cpu");
Console.WriteLine("Loading Llama from model weight folder");
var pipeline = new CausalLMPipeline<TiktokenTokenizer, LlamaForCausalLM>(tokenizer, model, "cpu");
```
```
System.IO.FileNotFoundException
  HResult=0x80070002
  Message=Could not find file 'C:\Users\abbas.llama\checkpoints\Llama3.1-8B\model.safetensors.index.json'.
  Source=System.Private.CoreLib
  StackTrace:
   at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategy(FileStream fileStream, String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, Int64 preallocationSize)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.File.InternalReadAllText(String path, Encoding encoding)
   at System.IO.File.ReadAllText(String path)
   at TorchSharp.PyBridge.PyBridgeModuleExtensions.load_checkpoint(Module module, String path, String checkpointName, Boolean strict, IList`1 skip, Dictionary`2 loadedParameters, Boolean useTqdm)
   at Microsoft.ML.GenAI.LLaMA.LlamaForCausalLM.FromPretrained(String modelFolder, String configName, String checkPointName, ScalarType torchDtype, String device)
   at Test.GenAITest.LLaMATest1() in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\GenAITest.cs:line 35
   at Test.Program.GenAI() in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\Program.cs:line 425
   at Test.Program.Main(String[] args) in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\Program.cs:line 54
```
@aforoughi1
LlamaForCausalLM loads the .safetensors model weights, while your code is targeting the original .pth model weight folder.
The .safetensors model weights should be located in Meta-Llama-3.1-8B-Instruct; maybe update the weight folder to that path when loading LlamaForCausalLM?
```csharp
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained("Meta-Llama-3.1-8B-Instruct", configName, layersOnTargetDevice: -1, targetDevice: "cpu");
```
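
For reference, the safetensors weight folder should end up looking roughly like this (file names taken from the error message and the shard list in this thread):

```
Meta-Llama-3.1-8B-Instruct/
├── model.safetensors.index.json
├── model-00001-of-00004.safetensors
├── model-00002-of-00004.safetensors
├── model-00003-of-00004.safetensors
└── model-00004-of-00004.safetensors
```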
I sorted out the missing files and the directory structure:
- model.safetensors.index.json
- model-00001-of-00004.safetensors
- model-00002-of-00004.safetensors
- model-00003-of-00004.safetensors
- model-00004-of-00004.safetensors

The model loads successfully ONLY if I use the defaults (layersOnTargetDevice: -1, quantizeToInt8: false, quantizeToInt4: false).
Setting layersOnTargetDevice: 26, quantizeToInt8: true causes a memory-corruption exception.
The example is also missing stopWatch.Stop();
I also don't see RegisterPrintMessage() print any messages to the console.
@aforoughi1 Are you using the nightly build or trying the example from the main branch?
Nightly build
@aforoughi1 And your GPU device/platform?
The device is set with:
```csharp
torch.InitializeDeviceType(DeviceType.CPU);
```
Packages:
- microsoft.ml.genai.llama 0.22.0-preview.24477.3
- microsoft.ml.torchsharp 0.21.1
- torchsharp-cpu 0.103.0

System:
- Processor: 12th Gen Intel(R) Core(TM) i5-1235U 2.50 GHz
- Installed RAM: 16.0 GB (15.8 GB usable)
- System type: 64-bit operating system, x64-based processor
- Edition: Windows 11 Home, Version 23H2, OS build 22631.4249, Windows Feature Experience Pack 1000.22700.1041.0
The layersOnTargetDevice option is for GPU offloading, so I haven't tested values other than -1 in the CPU scenario. As for quantizeToInt8 and quantizeToInt4, you probably won't gain any benefit in CPU scenarios either, so maybe just keep them false.
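
To illustrate, here is a hedged sketch of the GPU-oriented call; the parameter names appear earlier in this thread, but the layer count and the "cuda" device string are assumptions:

```csharp
// Illustrative only: keep 26 transformer layers on the GPU and quantize the
// weights to int8 to reduce VRAM usage. Both options are meant for GPU runs;
// on CPU, leave layersOnTargetDevice at -1 and the quantize flags false.
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained(
    weightFolder,
    configName,
    layersOnTargetDevice: 26,   // assumed value: layers kept on the target device
    quantizeToInt8: true,
    targetDevice: "cuda");      // assumed device string for a CUDA GPU
```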