Nathan Goldbaum
> see https://github.com/huggingface/tokenizers/pull/1809

We ended up closing this. I opened https://github.com/huggingface/tokenizers/pull/1864 with a more minimal approach and I'm still waiting on review. hf-xet is making some movement to support the...
I think we'd need a struct like this:

```rust
// A borrowed slice in a struct field needs an explicit lifetime.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct PyVariableWidthString<'a>(pub &'a [u8]);
```

`T_NA` would parameterize over the Python type...
Also, in the future there might be encodings other than UTF-8, so maybe `&[u8]` isn't the right type. But UTF-8 encoded bytes are all NumPy supports right now, so...
> How would the const generic work in practice?

Oh wait, you're totally right. That only makes sense for the fixed-width DTypes. Sorry... I edited the posts above.
Actually, maybe this is better, since we know it's valid UTF-8:

```rust
// A borrowed `&str` in a struct field needs an explicit lifetime.
pub struct PyVariableWidthString<'a>(pub &'a str);
```

I'm not sure if there are subtleties around a &str which is actually...
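To make the `&str` idea concrete, here is a minimal sketch of how such a wrapper could be built from raw bytes. It assumes the string data is owned elsewhere and only borrowed by the wrapper. The `from_utf8` constructor and `as_bytes` accessor are hypothetical names for illustration, not an existing API in any crate.

```rust
use std::str::Utf8Error;

/// Hypothetical wrapper around borrowed, UTF-8 encoded string data.
/// The lifetime `'a` ties the wrapper to whatever owns the bytes.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct PyVariableWidthString<'a>(pub &'a str);

impl<'a> PyVariableWidthString<'a> {
    /// Checks that `bytes` is valid UTF-8 before wrapping it, so the
    /// `&str` invariant holds without any `unsafe` code.
    pub fn from_utf8(bytes: &'a [u8]) -> Result<Self, Utf8Error> {
        std::str::from_utf8(bytes).map(PyVariableWidthString)
    }

    /// Returns the underlying UTF-8 bytes without copying.
    pub fn as_bytes(&self) -> &'a [u8] {
        self.0.as_bytes()
    }
}
```

Validating on construction costs one pass over the bytes. If the data is already guaranteed to be valid UTF-8, that check could in principle be skipped with `std::str::from_utf8_unchecked`, at the price of an `unsafe` block.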
I guess you could add e.g. a wrapper for the `npy_static_string` struct, which right now is just 16 opaque bytes (on 64-bit architectures) and then make the `npy_static_string` wrapper...
> Anyone with cpython development experience seeing this: It'd be awesome if someone could help/take over working on this.

I think because you're adding a new dunder method, this probably...
> This should not be merged this late in a release cycle, so please make it 2.5.0 not 2.4.0

Fair enough - I cleared the 2.4.0 milestone. @hellerve - if...
> Looks like that has the same issue, since now the assertion error is: AssertionError: ignore filters should only be used in tests; found in /numpy/venv/lib/python3.11/site-packages/numpy/typing/tests/data/pass/ufunclike.py on line 37. So...
> As part of this PR or a different one (since it’s more or less unconcerned with this particular deprecation)?

Doing it in this one is fine. I doubt it...