System.Runtime.InteropServices.MarshalDirectiveException: 'Method's type signature is not PInvoke compatible.'
I use LLamaSharp 0.9.1 and it does not run well. The line var model = LLamaWeights.LoadFromFile(parameters); results in a runtime error: System.Runtime.InteropServices.MarshalDirectiveException: 'Method's type signature is not PInvoke compatible.' I found issue https://github.com/SciSharp/LLamaSharp/issues/14, which looks like the same problem, but it has no solution. My environment is: VS2019, .NET Framework 4.8, LLamaSharp 0.9.1, LLamaSharp.Backend.Cpu 0.9.1.
2024-Feb-03: I've encountered the same issue: System.Runtime.InteropServices.MarshalDirectiveException: 'Method's type signature is not PInvoke compatible.' The exception occurs in LoadFromFile(...), so the exe stops dead in the first steps. I hope this is fixable. Thank you.
Variations leading to the same error:
- built the solution as Debug
- built the solution as Release
- Project menu -> [project] Properties -> Build ->
- checked/unchecked: "Prefer 32-bit", "Allow unsafe code";
- selected from Platform target: "Any CPU", "x86" (useless), "x64"
System: x64 Intel CPU, Visual Studio 2022 CE, .NET Framework 4.8, LLamaSharp 0.9.1, LLamaSharp.Backend.Cpu 0.9.1
I think that means one of the native llama.cpp methods LLamaSharp calls is not quite compatible with the .NET Framework P/Invoke system. That can probably be worked around; if either of you can track down exactly which method it is, I can take a look at it.
To track it down, just put a breakpoint on the LoadFromFile method and step into it; eventually you'll get to a method which crashes when called. That method is the one that will need to be modified to work around this.
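If it helps, here is a minimal harness to hang that breakpoint on. It only uses types already quoted elsewhere in this thread (ModelParams and LLamaWeights); the model path is a placeholder, and you will want "Just My Code" disabled so F11 keeps stepping into library code.

using System;
using LLama;
using LLama.Common;

class BreakpointHarness
{
    static void Main()
    {
        // Placeholder path: point it at any local .gguf file.
        var parameters = new ModelParams("model.gguf");

        try
        {
            // Put the breakpoint on the next line, then F11 (Step Into)
            // until a native call throws.
            using (var weights = LLamaWeights.LoadFromFile(parameters))
            {
                Console.WriteLine("Model loaded.");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.GetType().Name + ": " + ex.Message);
        }
    }
}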
Thank you for responding.
Will do.
Could it be that certain processors handle that method fine, while others don't? This could explain why only some users report this issue. Coulda, woulda, shoulda.
I think anyone running LLamaSharp on .NET Framework would have this issue. But framework is pretty rare to use these days.
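For context, my rough mental model of the failure (a guess at the general mechanism, not a verified diagnosis of LLamaSharp's actual declarations): the .NET Framework marshaller refuses a P/Invoke that returns a non-blittable struct by value, for example one containing bool or delegate fields, and reports it with exactly this MarshalDirectiveException, whereas modern .NET accepts more of these signatures. A sketch with entirely made-up names:

using System;
using System.Runtime.InteropServices;

// Hypothetical callback type; a delegate field is one of the things that
// makes a struct non-blittable.
public delegate void FakeProgressCallback(float progress);

[StructLayout(LayoutKind.Sequential)]
public struct FakeModelParams
{
    public int n_gpu_layers;
    public FakeProgressCallback progress_callback; // non-blittable field
    public bool use_mmap;                          // bool is non-blittable too
}

public static class FakeNative
{
    // "fakelib" is a placeholder. Returning a non-blittable struct by value
    // is the kind of signature the .NET Framework marshaller rejects.
    [DllImport("fakelib")]
    public static extern FakeModelParams fake_default_params();
}

public static class FrameworkMarshalDemo
{
    public static void Main()
    {
        try
        {
            var p = FakeNative.fake_default_params();
            Console.WriteLine(p.n_gpu_layers);
        }
        catch (MarshalDirectiveException ex)
        {
            // What I'd expect on .NET Framework for this signature shape.
            Console.WriteLine("Marshaller rejected the signature: " + ex.Message);
        }
        catch (DllNotFoundException)
        {
            // Reached if the runtime accepts the signature; the fake library
            // obviously does not exist.
            Console.WriteLine("Signature accepted; native library not found.");
        }
    }
}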
I found the culprit, so it seems; by way of F11 Step Into. Hope this helps. A bit.
The way to crash'n burn was:
1 of 2)
the line:
using (@params.ToLlamaModelParams(out result))
in method:
public static LLamaWeights LoadFromFile(IModelParams @params)
{
LLamaModelParams result;
using (@params.ToLlamaModelParams(out result)) // <-- the call that leads to the crash
{
SafeLlamaModelHandle safeLlamaModelHandle = SafeLlamaModelHandle.LoadFromFile(@params.ModelPath, result);
foreach (LoraAdapter loraAdapter in @params.LoraAdapters)
{
if (!string.IsNullOrEmpty(loraAdapter.Path) && !(loraAdapter.Scale <= 0f))
{
safeLlamaModelHandle.ApplyLoraFromFile(loraAdapter.Path, loraAdapter.Scale, @params.LoraBase);
}
}
return new LLamaWeights(safeLlamaModelHandle);
}
}
2 of 2)
That line shouts to
public static class IModelParamsExtensions
where something happened in the line:
result = NativeApi.llama_model_default_params();
- which is where F11 Stepped Into and remained
in this method (well, here's the whole litany):
public unsafe static IDisposable ToLlamaModelParams(this IModelParams @params, out LLamaModelParams result)
{
if (@params.UseMemoryLock && !NativeApi.llama_mlock_supported())
{
throw new NotSupportedException("'UseMemoryLock' is not supported (llama_mlock_supported() == false)");
}
if (@params.UseMemorymap && !NativeApi.llama_mmap_supported())
{
throw new NotSupportedException("'UseMemorymap' is not supported (llama_mmap_supported() == false)");
}
GroupDisposable groupDisposable = new GroupDisposable();
result = NativeApi.llama_model_default_params(); // <-- crashes here with the MarshalDirectiveException
result.main_gpu = @params.MainGpu;
result.n_gpu_layers = @params.GpuLayerCount;
result.use_mlock = @params.UseMemoryLock;
result.use_mmap = @params.UseMemorymap;
result.vocab_only = @params.VocabOnly;
MemoryHandle val = groupDisposable.Add(@params.TensorSplits.Pin());
result.tensor_split = (float*)((MemoryHandle)(ref val)).Pointer;
if (@params.MetadataOverrides.Count == 0)
{
result.kv_overrides = (LLamaModelMetadataOverride*)(void*)IntPtr.Zero;
}
else
{
LLamaModelMetadataOverride[] array = new LLamaModelMetadataOverride[@params.MetadataOverrides.Count + 1];
val = groupDisposable.Add(MemoryExtensions.AsMemory<LLamaModelMetadataOverride>(array).Pin());
result.kv_overrides = (LLamaModelMetadataOverride*)((MemoryHandle)(ref val)).Pointer;
for (int i = 0; i < @params.MetadataOverrides.Count; i++)
{
MetadataOverride metadataOverride = @params.MetadataOverrides[i];
LLamaModelMetadataOverride lLamaModelMetadataOverride = default(LLamaModelMetadataOverride);
lLamaModelMetadataOverride.Tag = metadataOverride.Type;
LLamaModelMetadataOverride dest = lLamaModelMetadataOverride;
metadataOverride.WriteValue(ref dest);
System.ReadOnlySpan<char> chars = MemoryExtensions.AsSpan(metadataOverride.Key);
EncodingExtensions.GetBytes(output: new System.Span<byte>((void*)dest.key, 128), encoding: Encoding.UTF8, chars: chars);
array[i] = dest;
}
}
return groupDisposable;
}
So you think it's probably the llama_model_default_params method that's the issue?
Yes, it looks like it, and I stop at this conclusion, since I do not know how to change anything myself.
Thanks for putting in the effort to narrow it down :)
No problem :) Happy to help. My knowledge of C++ stops at VS6. Anyway, I may be wrong, but as I see it, result = NativeApi.llama_model_default_params(); provides a default value, to be used before the parameter micromanagement below it. Edit: could it be prepared in another way, so that it can still receive the parameters below? Perhaps a usable value should be produced in a catch, to be used if the micromanagement goes wrong in the try? A shot in the dark...
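For what it's worth, a workaround pattern I have seen for this class of marshalling problem (a sketch only, with hypothetical names, not LLamaSharp's actual API) is to keep the interop struct fully blittable and have a tiny native shim fill it through a pointer, so nothing is returned by value across the interop boundary:

using System;
using System.Runtime.InteropServices;

// Hypothetical blittable mirror of the native params struct: native bools
// stored as sbyte, pointers and the callback kept as IntPtr, so every field
// is blittable.
[StructLayout(LayoutKind.Sequential)]
public struct BlittableModelParams
{
    public int n_gpu_layers;
    public int main_gpu;
    public IntPtr tensor_split;        // const float*
    public IntPtr progress_callback;   // function pointer as IntPtr
    public IntPtr kv_overrides;
    public sbyte vocab_only;           // 1-byte native bool
    public sbyte use_mmap;
    public sbyte use_mlock;
}

public static class HypotheticalShim
{
    // Hypothetical export: the native side writes the defaults through a
    // pointer instead of returning the struct by value.
    [DllImport("llama_shim")]
    public static extern void fill_default_model_params(out BlittableModelParams p);

    public static BlittableModelParams GetDefaults()
    {
        fill_default_model_params(out var p);
        return p;
    }
}

public static class ShimDemo
{
    public static void Main()
    {
        // Just shows the struct layout is blittable-sized; calling
        // GetDefaults() would of course require the real shim library.
        Console.WriteLine(Marshal.SizeOf<BlittableModelParams>());
    }
}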
Hi there, I don't know why this issue does not get the attention that it seemingly deserves, since I, speaking for myself, cannot even start a program by loading a model, using any of these, in all possible combinations:
- .NET Framework 2.0, 4.0, 4.5, 4.7, 4.8, and .NET 6.0, 8.0
- LLamaSharp 0.9 and 0.10 with the CPU backend
- Console Apps and WinForms Apps
The code simply stops at the line where a model is (not) being loaded with the forwarded parameters, in the method NativeApi.llama_model_default_params(), and as such nothing can be done from that point onwards. Please, someone provide some info about this critical issue. It's been almost one month since signaling it. Thank you.
I've been trying with 0.10 in the hope that it works; alas, it is not the case.
VS2022 Community, version 17.9.1.
1 of 2) Console App
tried with Target: .NET 8.0, .NET 7.0, .NET 6.0
The result is the same for all versions above:
The code is simple, made even more simple with a single parameter, yet the problem is the same.
var parameters = new ModelParams(modelPath)
{
ContextSize = 4096,
Seed = 1337,
GpuLayerCount = 1
};
this^ hails to ModelParams.cs:
[JsonConstructor]
public ModelParams(string modelPath)
{
ModelPath = modelPath;
}
which in turn calls TensorSplitsCollection.cs:
public TensorSplitsCollection()
{
}
which FWIW ends the .exe process with code -1073741795.
2 of 2) WinForms App
tried with Target: .NET 4.8
The code:
string modelPath = "D:\\whatever.gguf";
// Load a model
var parameters = new ModelParams(modelPath)
{
ContextSize = 4096,
Seed = 1337,
GpuLayerCount = 1
};
var model = LLamaWeights.LoadFromFile(parameters);
-> 1 first pass (1st parameter, I guess, "ContextSize"):
this^ hails to ModelParams.cs:
[JsonConstructor]
public ModelParams(string modelPath)
{
ModelPath = modelPath;
}
which calls TensorSplitsCollection.cs:
public TensorSplitsCollection()
{
}
-> OK -> 2 second pass (2nd parameter, I guess, "Seed"):
goes to Nullable.cs, in:
/// <summary>Initializes a new instance of the <see cref="T:System.Nullable`1" /> structure to the specified value.</summary>
/// <param name="value">A value type.</param>
[NonVersionable]
[__DynamicallyInvokable]
public Nullable(T value)
{
this.value = value;
hasValue = true;
}
-> OK -> 3 third pass (3rd parameter, I guess, "GpuLayerCount"):
exits the declaration above of var parameters = new ModelParams(modelPath) {... ...} and goes to the next line:
var model = LLamaWeights.LoadFromFile(parameters);
which ofc is in LLamaWeights.cs:
public static LLamaWeights LoadFromFile(IModelParams @params)
{
LLamaModelParams result;
using (@params.ToLlamaModelParams(out result)) //here it starts
{
SafeLlamaModelHandle safeLlamaModelHandle = SafeLlamaModelHandle.LoadFromFile(@params.ModelPath, result);
foreach (LoraAdapter loraAdapter in @params.LoraAdapters)
{
if (!string.IsNullOrEmpty(loraAdapter.Path) && !(loraAdapter.Scale <= 0f))
{
safeLlamaModelHandle.ApplyLoraFromFile(loraAdapter.Path, loraAdapter.Scale, @params.LoraBase);
}
}
return new LLamaWeights(safeLlamaModelHandle);
}
}
and from here to: IModelParamsExtensions.cs, in method:
public unsafe static IDisposable ToLlamaModelParams
(this IModelParams @params, out LLamaModelParams result)
where it crashes on the line:
result = NativeApi.llama_model_default_params();
with the PInvoke error message in the title here.
Looks like the exit point depends on the type of Application and the version of .NET.
I don't know why this issue does not get the attention that it seemingly deserves
Any issue only gets attention because a motivated developer contributes their spare time and expertise to work on it! No one is being paid to work on LLamaSharp.
As I understood it, this issue was about .NET Framework. However, from what you've said in your most recent message it looks like it's also not working for you with .NET 6/7/8. Is that correct?
Yes indeed - it does not work with any of these: 2.0, 4.8, 6.0, 7.0, 8.0.
The fact that others do not encounter this problem makes it even more frustrating: it stops at step 1, when loading a model, before anything else, yet not on every machine.
Are you absolutely certain that you correctly configured the dotnet version? I've had a look through the docs and I can't see how this error could happen for you on NET8, but not for others.
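In case it helps, here is a quick sanity check (plain BCL calls, nothing LLamaSharp-specific) that prints which runtime and bitness the process actually ends up on; RuntimeInformation is built into .NET Framework 4.7.1+ and all modern .NET versions:

using System;
using System.Runtime.InteropServices;

class RuntimeCheck
{
    static void Main()
    {
        // Which runtime the process is really on (e.g. ".NET 8.0.x" or
        // ".NET Framework 4.8.xxxx"), regardless of what the csproj says.
        Console.WriteLine(RuntimeInformation.FrameworkDescription);

        // The bundled native library is win-x64, so the process must be 64-bit.
        Console.WriteLine("ProcessArchitecture: " + RuntimeInformation.ProcessArchitecture);
        Console.WriteLine("Is64BitProcess:      " + Environment.Is64BitProcess);
    }
}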
I think so, given that I've followed the instructions for installing the 0.10.0 packages of llamasharp and backend.cpu - anyway, I have no idea how/what to do otherwise.
(interesting how in a Console App on .NET 8.0 it doesn't even get to the PInvoke error like it does in a WinForms App on lower .NETs)
Models tested:
- downloaded from: https://gpt4all.io/index.html
- mistral-7b-instruct-v0.1.Q4_0.gguf
- gpt4all-falcon-newbpe-q4_0.gguf
These files were placed in a directory (D:\llms\filename.gguf), in a subdirectory (D:\llms\llms_files\filename.gguf), in the top directory (D:\filename.gguf), and also in the directory where the .exe is. In the paths, I deleted the underscores, dashes and dots, to no avail.
The entire Console App / .NET 8.0 is the example below, taken from https://blog.chuckbeasley.com/321/, modified with ContextSize=4096 instead of 32768, GpuLayerCount=4 instead of 35, Temperature=0.1f instead of 0.6f, and the session-save line commented out.
The program starts, displays a log:
[LLamaSharp Native] [Info] NativeLibraryConfig Description:
- Path:
- PreferCuda: True
- PreferredAvxLevel: NoAVX
- AllowFallback: True
- SkipCheck: False
- Logging: True
- SearchDirectories and Priorities: { ./ }
[LLamaSharp Native] [Info] Detected OS Platform: WINDOWS
[LLamaSharp Native] [Info] ./runtimes/win-x64/native/llama.dll is selected and loaded successfully.
D:\Programs\Programs NET 2022\programs_net_2022_desktop\ConsoleApp1\bin\Debug\net8.0\ConsoleApp1.exe (process 7516) exited with code -1073741795.
To automatically close the console when debugging stops, enable Tools->Options->Debugging->Automatically close the console when debugging stops.
Press any key to close this window . . .
then it crashes in the Tensor area when entering the block of parameters, or right after receiving the model's path:
[JsonConstructor]
public ModelParams(string modelPath)
{
ModelPath = modelPath;
}
->
public TensorSplitsCollection()
{ //here
}
using LLama.Common;
using LLama;
using LLama.Native;
namespace ConsoleApp1
{
internal class Program
{
static async Task Main(string[] args)
{
Program p = new Program();
string smodelpath = "mistral-7b-instruct-v0.1.Q4_0.gguf";
var prompt = "Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\r\n\r\nUser: Hello, Bob.\r\nBob: Hello. How may I help you today?\r\nUser: Please tell me the largest city in Europe.\r\nBob: Sure. The largest city in Europe is Moscow, the capital of Russia.\r\nUser:"; // use the "chat-with-bob" prompt here.
NativeLibraryConfig.Instance.WithLogs();
// Load a model
var parameters = new ModelParams(smodelpath)
{ //here it all starts
ContextSize = 4096,
GpuLayerCount = 4,
UseMemoryLock = true,
UseMemorymap = true
}; //and it doesn't get farther than that
using var model = LLamaWeights.LoadFromFile(parameters);
// Initialize a chat session
using var context = model.CreateContext(parameters);
var ex = new InteractiveExecutor(context);
ChatSession session = new ChatSession(ex);
// show the prompt
Console.WriteLine();
Console.Write(prompt);
// run the inference in a loop to chat with LLM
while (prompt != "stop")
{
await foreach (
var text in session.ChatAsync(
new ChatHistory.Message(AuthorRole.User, prompt),
new InferenceParams
{
Temperature = 0.1f,
AntiPrompts = new List<string> { "User:" }
}))
{
Console.Write(text);
}
prompt = Console.ReadLine() ?? "";
}
//
//
//// save the session
//session.SaveSession("SavedSessionPath");
}
} //end of class Program
} //end of namespace ConsoleApp1
OK now...
It turns out that it was a question of RAM, if not of processor; I suspect that the RAM is the reason:
- the problem starts with the model path
- which means the model file
- which must be loaded into RAM
- where it should fit
so:
I've tried the Apps and the code above on these low-end systems:
- Intel E7500, 4 GB (yes) RAM: the Apps crash at the instructions signaled in this thread, PInvoke for WinForms Apps and TensorSplits for Console Apps. This should have rung alarm bells if linked to the fact that GPT4All doesn't load said models either, but nope. Log says: NO AVX.
- Intel Core i5-8265U, 8 GB RAM: the Console App on .NET 8.0 runs OK, like GPT4All does. Log says: AVX2.
The next one to test is an i5 with 12 GB RAM, but I presume it's clear... I haven't tested a WinForms App on the i5, but likewise:
Not enough RAM to load a model in = App crash.
That explains why not many users experience this, since such low-end systems are indeed lo-tech nowadays.
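One supporting detail plus a small check (my reading of the numbers, not something stated above): exit code -1073741795 is 0xC000001D, which is STATUS_ILLEGAL_INSTRUCTION on Windows; that is what you would expect if an AVX/AVX2 build of the native backend runs on a CPU like the E7500 that has no AVX at all, so the processor may share the blame with the RAM. On .NET Core 3.0+ / .NET 5+ (not .NET Framework) the CPU can be checked up front:

using System;
using System.Runtime.Intrinsics.X86; // .NET Core 3.0+ / .NET 5+ only

class CpuCheck
{
    static void Main()
    {
        // If both are False (as on the E7500), an AVX/AVX2 build of the
        // native library dies with STATUS_ILLEGAL_INSTRUCTION (0xC000001D).
        Console.WriteLine("AVX  supported: " + Avx.IsSupported);
        Console.WriteLine("AVX2 supported: " + Avx2.IsSupported);
    }
}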
I guess some technical writing about this simple situation is in order... like, basic requirements... my apologies if such info does exist.
I for myself got mindlessly caught in this all too visible trap, and have dragged you there too, so please accept my sincere apologies! I cannot give you back the time spent on this. Time is the fire in which we burn...
So this should be closed and possibly documented somewhere (including within the code itself): sort of a table with Model | RAM needed... and the error X being caused by this and not by that... until that is cleared up too, because the errors are still here... or a message: Thine System doth not suffer this, for it doth not possess enough memory, so pay heed and thou shalt not try it at thine abode/lair.
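As a rough illustration of the kind of pre-flight message suggested here (a sketch under the assumption that a GGUF model needs at least roughly its file size in free memory; GC.GetGCMemoryInfo requires .NET Core 3.0+ / .NET 5+, and the threshold ignores the extra room that context buffers need):

using System;
using System.IO;

static class Preflight
{
    // Warns before attempting LLamaWeights.LoadFromFile instead of crashing
    // halfway through the load.
    public static bool LooksLoadable(string modelPath)
    {
        long modelBytes = new FileInfo(modelPath).Length;
        long availableBytes = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes;

        if (modelBytes > availableBytes)
        {
            Console.WriteLine(
                $"Model is {modelBytes / (1024 * 1024)} MB, but only " +
                $"{availableBytes / (1024 * 1024)} MB of memory is available. " +
                "Thine system doth not possess enough memory for this model.");
            return false;
        }
        return true;
    }

    static void Main(string[] args)
    {
        string path = args.Length > 0 ? args[0] : "model.gguf"; // placeholder
        Console.WriteLine(LooksLoadable(path) ? "Looks loadable." : "Probably will not fit.");
    }
}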
Thank you! for your Time and interest in this issue.
unvoluntary_tester_orsomesuch signing_off.
Aha, good job narrowing that down!