GELU does not appear to support approximate tanh
GELU has an optional approximation mode that uses tanh internally.
See more here: https://pytorch.org/docs/stable/generated/torch.nn.GELU.html#torch.nn.GELU
I was expecting this to just work:
var gelu = nn.GELU(approximate: "tanh");
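When the approximate argument is 'tanh', GELU is computed with a tanh-based approximation. The default ('none') uses the exact formulation based on the Gaussian cumulative distribution function, so the two modes give slightly different outputs.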
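For reference, these are the two formulations as I read them in the PyTorch documentation linked above (here $\Phi$ is the cumulative distribution function of the standard Gaussian):

$$
\mathrm{GELU}(x) = x \cdot \Phi(x)
$$

$$
\mathrm{GELU}_{\tanh}(x) = 0.5 \, x \left( 1 + \tanh\!\left( \sqrt{\tfrac{2}{\pi}} \left( x + 0.044715 \, x^{3} \right) \right) \right)
$$

The first is the default (approximate = 'none'); the second is what approximate = 'tanh' selects.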
Is it possible, since this is supported natively, to include the "approximate" property for TorchSharp's GELU?
Is there a way for me to do this without the difficulty of publishing a new version of the library?
I'm guessing something like this could be an option:
[DllImport("LibTorchSharp")]
internal static extern IntPtr THSNN_GELU_ctor(string approximate, out IntPtr pBoxedModule);
and then perhaps replace the current GELU calling function, or add an overload (either way seems similar)
public static GELU GELU(string approximate = "none")
{
    IntPtr boxedHandle;
    IntPtr intPtr = NativeMethods.THSNN_GELU_ctor(approximate, out boxedHandle);
    if (intPtr == IntPtr.Zero)
    {
        torch.CheckForErrors();
    }
    return new GELU(intPtr, boxedHandle);
}
Two options:
- Fix the code and send us a much-appreciated PR. The approximate argument should be an enumeration instead of a string.
- Implement your own GELU module using the available mathematical primitives in TorchSharp.
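As a rough illustration of the second option, here is a minimal sketch of a custom module that computes the tanh approximation from tensor primitives. The class name TanhGELU is hypothetical, and the sketch assumes a TorchSharp version that provides the generic nn.Module<Tensor, Tensor> base class (older releases use a non-generic nn.Module), so treat it as a starting point rather than a drop-in replacement.

```csharp
using System;
using TorchSharp;
using static TorchSharp.torch;

// Hypothetical module name; not part of TorchSharp itself.
// Assumes the generic nn.Module<Tensor, Tensor> base class is available.
public sealed class TanhGELU : nn.Module<Tensor, Tensor>
{
    public TanhGELU() : base(nameof(TanhGELU)) { }

    public override Tensor forward(Tensor x)
    {
        // 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
        var inner = Math.Sqrt(2.0 / Math.PI) * (x + 0.044715 * x.pow(3));
        return 0.5 * x * (1 + inner.tanh());
    }
}
```

It can then be used like any other module, for example `var y = new TanhGELU().forward(torch.randn(4));`. The intermediate tensors are not explicitly disposed here, which keeps the sketch short but may matter in a training loop.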
Sorry if this seems obvious, just trying to make sure it's right.
I'm definitely willing to try the PR approach for this (and anything else I could help with).
- I am unsure what naming conventions for enums are used within TorchSharp, and what the appropriate namespace or scope would be.
- I forked the repo, but I am new to submitting PRs, so any guidance would be appreciated (or, if the CONTRIBUTING.md explanation fully applies, I will just follow that approach).
Would the enum then reside within the same GELU.cs file? Perhaps the changes could look like:
PInvoke change:
[DllImport("LibTorchSharp")]
internal static extern IntPtr THSNN_GELU_ctor(TorchSharp.Modules.ApproxType approximate, out IntPtr pBoxedModule);
GELU.cs change:
(within the Modules namespace)
public enum ApproxType
{
    none,
    tanh
}
the updated constructor:
public static GELU GELU(ApproxType approximate = ApproxType.none)
{
    var handle = THSNN_GELU_ctor(approximate, out var boxedHandle);
    if (handle == IntPtr.Zero) { torch.CheckForErrors(); }
    return new GELU(handle, boxedHandle);
}
I tried the previous code, but it throws an exception when calling the ctor. If I use a string instead of the enum it works, so perhaps the enum being passed as its integer value (ApproxType.tanh as 1) is causing the problem. I'm unsure how or where the enum would be converted back to a string to satisfy the approximate parameter.
Perhaps a blend of the two?
[DllImport("LibTorchSharp")]
internal static extern IntPtr THSNN_GELU_ctor(string approximate, out IntPtr pBoxedModule);
public enum ApproxType
{
    none,
    tanh
}
public static GELU GELU(ApproxType approximate = ApproxType.none)
{
    // Pass the enum member's name ("none" or "tanh") as the string argument.
    var handle = THSNN_GELU_ctor(approximate.ToString(), out var boxedHandle);
    if (handle == IntPtr.Zero) { torch.CheckForErrors(); }
    return new GELU(handle, boxedHandle);
}