Show an example of generic numeric expressions
This could be more of a polars-rs issue, but is there a generic way to get any integer type Series?
let left: &Int64Chunked = inputs[0].i64()?;
is shown in most of the examples. How do I get a ChunkedArray that implements the PolarsIntegerType trait? Do I have to write a big match statement?
Apologies if I'm not using the proper Rust terminology. I'm learning the language, this tutorial made it seem within reach... Great work
Do I have to write a big match statement?
yup. or make a macro
I'm learning the language, this tutorial made it seem within reach... Great work
thanks!
I was able to get this far successfully:
fn hash_i64_chunked(cb: &Int64Chunked) -> u64 {
    let mut hasher = XxHash64::with_seed(SEED);
    for val in cb.iter() {
        match val {
            Some(val) => {hasher.write(&val.to_le_bytes())}
            _ => {hasher.write(b" ")}
        }
    }
    hasher.finish()
}
fn hash_u64_chunked(cb: &UInt64Chunked) -> u64 {
    let mut hasher = XxHash64::with_seed(SEED);
    for val in cb.iter() {
        match val {
            Some(val) => {hasher.write(&val.to_le_bytes())}
            _ => {hasher.write(b" ")}
        }
    }
    hasher.finish()
}
#[polars_expr(output_type=UInt64)]
fn hash_series(inputs: &[Series]) -> PolarsResult<Series> {
    let chunks = &inputs[0];
    if let Ok(ichunks) = chunks.i64() {
        let hash = hash_i64_chunked(ichunks);
        return Ok(Series::new("hash".into(), vec![hash]));
    }
    if let Ok(ichunks) = chunks.u64() {
        let hash = hash_u64_chunked(ichunks);
        return Ok(Series::new("hash".into(), vec![hash]));
    }
    return Err(PolarsError::ComputeError("couldn't compute for type".into()));
}
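For reference, the "big match statement" route can be sketched without polars at all. The block below is a minimal, std-only illustration: `Column` is a hypothetical stand-in for a `Series`, and `DefaultHasher` stands in for `XxHash64`. In a real plugin the match would presumably be on `inputs[0].dtype()` with arms such as `DataType::Int64`, but that is not what this sketch compiles against.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Hypothetical stand-in for a polars Series: one variant per dtype.
enum Column {
    I64(Vec<Option<i64>>),
    U64(Vec<Option<u64>>),
    Str(Vec<Option<String>>),
}

fn hash_i64(vals: &[Option<i64>]) -> u64 {
    // DefaultHasher is an assumption here; the thread uses XxHash64 with a seed.
    let mut hasher = DefaultHasher::new();
    for val in vals {
        match val {
            Some(v) => hasher.write(&v.to_le_bytes()),
            None => hasher.write(b" "),
        }
    }
    hasher.finish()
}

fn hash_u64(vals: &[Option<u64>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for val in vals {
        match val {
            Some(v) => hasher.write(&v.to_le_bytes()),
            None => hasher.write(b" "),
        }
    }
    hasher.finish()
}

// A single match replaces the chain of `if let Ok(..)` probes; dtypes
// without a concrete implementation fall through to an error.
fn hash_column(col: &Column) -> Result<u64, String> {
    match col {
        Column::I64(v) => Ok(hash_i64(v)),
        Column::U64(v) => Ok(hash_u64(v)),
        Column::Str(_) => Err("couldn't compute for type".to_string()),
    }
}
```

Note that in this sketch an `I64` column and a `U64` column holding the same small values hash identically, because only the little-endian bytes are fed to the hasher.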
I'm having a lot of trouble writing a hash_generic_chunked function.
So far I am this close:
fn hash_generic_chunked<T>(cb: &ChunkedArray<T>) -> u64
where
    T: PolarsNumericType
{
    let mut hasher = XxHash64::with_seed(SEED);
    for val in cb.iter() {
        match val {
            Some(val) => {hasher.write(&val.to_le_bytes())}
            _ => {hasher.write(b" ")}
        }
    }
    hasher.finish()
}
This fails, though, with the following error message:
error[E0308]: mismatched types
  --> src/expressions.rs:69:40
   |
69 |             Some(val) => {hasher.write(&val.to_le_bytes())}
   |                                  ----- ^^^^^^^^^^^^^^^^^^ expected `&[u8]`, found `&<... as NativeType>::Bytes`
   |                                  |
   |                                  arguments to this method are incorrect
   |
   = note: expected reference `&[u8]`
              found reference `&<<T as polars::prelude::PolarsNumericType>::Native as NativeType>::Bytes`
   = help: consider constraining the associated type `<<T as polars::prelude::PolarsNumericType>::Native as NativeType>::Bytes` to `[u8]`
   = note: for more information, visit https://doc.rust-lang.org/book/ch19-03-advanced-traits.html
note: method defined here
  --> /Users/paddy/.rustup/toolchains/nightly-2025-05-21-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/hash/mod.rs:358:8
    |
358 |     fn write(&mut self, bytes: &[u8]);
    |        ^^^^^
It looks like some of the types around PolarsNumericType subtly changed between 0.49 and 0.51.
This is looking like some very complex type gymnastics; I wonder if I'm better off writing concrete implementations for all of the base types.
I can see how to_le_bytes is a rarely used method with specific restrictions that make it poorly suited to generics. However, many ChunkedArray computations are generic over at least Int, UInt, and Float, and a tutorial example on writing these could be helpful.
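The compiler's hint ("consider constraining the associated type") can be illustrated outside polars. The sketch below is std-only and makes several assumptions: `LeBytes` is a hypothetical helper trait, not a polars or arrow trait, the input is a plain slice of `Option<T>` rather than a `ChunkedArray<T>`, and `DefaultHasher` replaces `XxHash64`. The point is only that once the associated `Bytes` type is bounded by `AsRef<[u8]>`, the generic body type-checks. It may also be worth checking whether `NativeType::Bytes` in your polars version already carries an `AsRef<[u8]>` bound, in which case `val.to_le_bytes().as_ref()` might be enough; that is not verified here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Hypothetical helper trait. The `AsRef<[u8]>` bound on `Bytes` is what
// lets generic code hand the bytes to `Hasher::write`.
trait LeBytes: Copy {
    type Bytes: AsRef<[u8]>;
    fn le_bytes(self) -> Self::Bytes;
}

// Implement the trait for each primitive by delegating to the inherent
// `to_le_bytes` method, which returns a fixed-size byte array.
macro_rules! impl_le_bytes {
    ($($t:ty),*) => {
        $(
            impl LeBytes for $t {
                type Bytes = [u8; std::mem::size_of::<$t>()];
                fn le_bytes(self) -> Self::Bytes {
                    self.to_le_bytes()
                }
            }
        )*
    };
}

impl_le_bytes!(i32, i64, u32, u64, f32, f64);

// Generic over every type that implements the helper trait.
fn hash_generic<T: LeBytes>(vals: &[Option<T>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for val in vals {
        match val {
            Some(v) => hasher.write(v.le_bytes().as_ref()),
            None => hasher.write(b" "),
        }
    }
    hasher.finish()
}
```

A call such as `hash_generic(&[Some(1i64), None])` or `hash_generic(&[Some(1.5f32)])` then goes through the same function body.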
Well, I wrote my first Rust macro and got the tests to pass.
macro_rules! hash_func {
    ($a:ident, $b:ty, $type_num:expr) => {
        fn $a(cb: $b) -> u64 {
            let mut hasher = XxHash64::with_seed(SEED);
            hasher.write(&hardcode_bytes($type_num));
            let mut count: u64 = 0;
            for val in cb.iter() {
                count += 1;
                match val {
                    Some(val) => {hasher.write(&val.to_le_bytes())}
                    _ => {hasher.write(NAN_SEPERATOR);}
                }
                hasher.write(&count.to_le_bytes());
            }
            hasher.finish()
        }
    };
}
hash_func!(hash_i64_chunked, &Int64Chunked, 1);
hash_func!(hash_i32_chunked, &Int32Chunked, 2);
This expands to:
// non macro implementation for reference
fn hash_i64_chunked(cb: &Int64Chunked) -> u64 {
    let mut hasher = XxHash64::with_seed(SEED);
    hasher.write(&hardcode_bytes(1));
    let mut count: u64 = 0;
    for val in cb.iter() {
        count += 1;
        match val {
            Some(val) => { hasher.write(&val.to_le_bytes()) }
            _ => { hasher.write(NAN_SEPERATOR); }
        }
        hasher.write(&count.to_le_bytes());
    }
    hasher.finish()
}
Macro expansion was checked with:
cargo rustc --profile=check -- -Zunpretty=expanded
Just leaving this here for reference in case other people are trying to figure out a solution.
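For anyone who wants to try the macro pattern without a polars build, here is a std-only variation. It is a sketch under stated assumptions: slices of `Option<T>` stand in for the chunked arrays, `DefaultHasher` stands in for `XxHash64`, `NULL_MARKER` is a hypothetical replacement for the null separator constant, and the type tag is written as a plain `u64` rather than through `hardcode_bytes`. It also uses a repetition (`$( ... );*`) so one invocation generates every function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Hypothetical marker written in place of a missing value.
const NULL_MARKER: &[u8] = b" ";

macro_rules! hash_funcs {
    ($( $name:ident, $ty:ty, $type_num:expr );* $(;)?) => {
        $(
            fn $name(vals: &[Option<$ty>]) -> u64 {
                let mut hasher = DefaultHasher::new();
                // Write a per-type tag first, so identical value bytes
                // from different types start from different states.
                hasher.write(&($type_num as u64).to_le_bytes());
                let mut count: u64 = 0;
                for val in vals {
                    count += 1;
                    match val {
                        Some(v) => hasher.write(&v.to_le_bytes()),
                        None => hasher.write(NULL_MARKER),
                    }
                    // Mix in the running position after each element.
                    hasher.write(&count.to_le_bytes());
                }
                hasher.finish()
            }
        )*
    };
}

// One invocation, one generated function per line.
hash_funcs! {
    hash_i64, i64, 1;
    hash_i32, i32, 2;
    hash_u64, u64, 3;
}
```

With the tag in place, `hash_i64(&[Some(1)])` and `hash_u64(&[Some(1)])` feed different byte streams to the hasher even though the value bytes are the same.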
As far as I'm concerned, you can close this issue and use it as a reference. I will probably write a blog post about my experience writing a polars plugin, and it will include this part about the macros. Would a section like this be appropriate for the tutorial? I might be able to submit a PR if you're interested. The tutorial section would probably just genericize the sum function.
yeah a section on macros may be useful
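As a rough idea of what a genericized sum could look like, here is a std-only sketch. It assumes a plain slice of `Option<T>` in place of a `ChunkedArray<T>`, and the name `sum_generic` is hypothetical. Unlike the `to_le_bytes` case, a sum only needs bounds that the standard library already provides, so no macro is required; a polars version would presumably bound `T` by `PolarsNumericType` and work on `T::Native` instead.

```rust
use std::ops::Add;

// Sums the non-null values; nulls are skipped. `Default` supplies the
// zero value for the built-in integer and float types.
fn sum_generic<T>(vals: &[Option<T>]) -> T
where
    T: Copy + Default + Add<Output = T>,
{
    vals.iter()
        .flatten() // iterate over the `Some` values only
        .fold(T::default(), |acc, v| acc + *v)
}
```

The same function then covers calls like `sum_generic(&[Some(1i64), None, Some(3)])` and `sum_generic(&[Some(1.5f64), Some(2.0)])`.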