Expressions as parameters in StringNamespace functions
Hi and sorry again, I think I found a global issue in StringNamespace implementation. Comparing with this example of slice implementation between string an list
let df = pl.DataFrame(
{
A: ["apple"],
fruits: [["apple", "banana", "cherry", "dragon fruit"]],
}
)
df = df.select(
"A",
"fruits",
// "cars",
pl.col("fruits").lst.slice(0, 2).alias("lst_slice_js"),
pl.col("fruits").lst.slice(pl.lit(0), pl.lit(2)).alias("lst_slice_expr"),
pl.col("A").str.slice(0,2).alias("str_slice_js"),
pl.col("A").str.slice(pl.list(0),pl.lit(2)).alias("str_slice_expr"),
)
This raises the following error for the last expression .str.slice(pl.list(0),pl.lit(2)) :
<...>\node_modules\nodejs-polars\bin\lazy\expr\string.js:10
return (0, expr_1._Expr)(_expr[method](...args));
^
Error: Failed to recover `JsExpr` type from napi value
at wrap (<...>\node_modules\nodejs-polars\bin\lazy\expr\string.js:10:47)
at Object.slice (<...>\node_modules\nodejs-polars\bin\lazy\expr\string.js:88:20)
Then I compared the code on rust side and js side for both slice implementations
#[napi(catch_unwind)]
pub fn list_slice(&self, offset: &JsExpr, length: Option<&JsExpr>) -> JsExpr {
let length = match length {
Some(i) => i.inner.clone(),
None => dsl::lit(i64::MAX),
};
self.inner
.clone()
.list()
.slice(offset.inner.clone(), length)
.into()
...
#[napi(catch_unwind)]
pub fn str_slice(&self, offset: &JsExpr, length: &JsExpr) -> JsExpr {
self.inner
.clone()
.str()
.slice(offset.inner.clone(), length.inner.clone())
.into()
}
Except the difference for optional length (which is not an issue, nothing to highlight) Then in JS:
list:
slice(offset, length) {
return wrap(
"listSlice",
exprToLitOrExpr(offset)._expr,
exprToLitOrExpr(length)._expr,
);
},
string:
slice(start, length?) {
if (!Expr.isExpr(start)) {
start = lit(start)._expr;
}
if (!Expr.isExpr(length)) {
length = lit(length)._expr;
}
return wrap("strSlice", start, length);
},
The processing of the parameter seems to differ between both implementations.
Sorry in advance if I am wrong...
OK fixed this locally on my computer for str.slice directly inside bin/.lazy/expr/string.js
slice(start, length) {
// if (!expr_1.Expr.isExpr(start)) {
// start = (0, functions_1.lit)(start)._expr;
// }
// if (!expr_1.Expr.isExpr(length)) {
// length = (0, functions_1.lit)(length)._expr;
// }
// return wrap("strSlice", start, length);
return wrap("strSlice", (0, expr_1.exprToLitOrExpr)(start)._expr, (0, expr_1.exprToLitOrExpr)(length)._expr);
},
And looking at the code, I think there are many places where we have the same type of "error/confusion".
For some of them, the rust code should be also reviewed. Exemple with str_replace function where pat and val should be &JsExpr instead of String and then calling pat.inner.clone() and val.inner.clone() and then calling ca.replace(pat.onner.clone(), val.inner.clone().
#[napi(catch_unwind)]
pub fn str_replace(&self, pat: String, val: String) -> JsExpr {
let function = move |s: Column| {
let ca = s.str()?;
match ca.replace(&pat, &val) {
Ok(ca) => Ok(Some(ca.into_column())),
Err(e) => Err(PolarsError::ComputeError(format!("{:?}", e).into())),
}
};
self.clone()
.inner
.map(function, GetOutput::same_type())
.with_fmt("str.replace")
.into()
}
Only a "guess", shall be confirmed by the maintainers and eventually changed inside all functions accepting parameters (TBC), as I am not yet able to locally test modifications of rust code for now.
Thanks
Error: Failed to recover JsExpr type from napi value happens when we try to pass a string instead of a expression.
Some functions expect a string, others can handle string, regex or expression.
I am sure we have cases when thing fail apart due to so many options. We just need to improve our test coverage.
Feel free to submit a PR to add more test cases.
Thanks a lot, OK, I will do my best to list and detail all missing test cases I encounter. Currently trying to rebuild everything on my computer, but a bit hard as you are referring git versions in your toml and not "standard" packages, my personal network configuration is a bit (let's say) restrictive with that but I will manage. Also... the nightly toolchain to be installed apparently... Any news on 0.46 polars support that brings a bunch of new functions startsWith, endsWith... ?
- Git reference is needed b/c one of the json packages was not been converted to a crate by Ritche
- Nightly toolchain is used by the core Polars team, hence needed
- 0.46 PR is kind of blocked ATM b/c the core team has remove utf8 serialization, not sure how to fix it ATM
- Be sure to lint everything before submitting a PR and test with both yarn and bun
@fbx31 Polars rs-0.46 has been merged and released as v0.19.