narwhals fix: unify `ColumnNotFound` for `duckdb` and `pyspark`

What type of PR is this? (check all applicable)

[ ] 💾 Refactor
[ ] ✨ Feature
[x] 🐛 Bug Fix
[ ] 🔧 Optimization
[ ] 📝 Documentation
[ ] ✅ Test
[ ] 🐳 Other

Related issues

Closes #2472

Checklist

[x] Code follows style guide (ruff)
[x] Tests added
[x] Documented the changes

If you have comments or can explain your changes, please do so below

May 04 '25 20:05 EdAbati

I think I can make some other clean-up of repetitive code. I'll try tomorrow morning

May 04 '25 21:05 EdAbati

I made a followup PR #2495 with the cleanup :)

May 05 '25 07:05 EdAbati

I finally had a few minutes to go back to this 🥲

Few notes:

catch_{}_exception made it cleaner thanks for the tip
sqlframe is tricky because it raises at collect time. Also the error will be different based on the backend. Do we have a way to know which type of sqlframe we are dealing with. For duckdb we could use the catch_duckdb_exception but we need to make sure duckdb is available. Any ideas? Could we think about it in a follow up?
regarding simple_select, it should already be covered by the below. We are already testing it too: https://github.com/narwhals-dev/narwhals/blob/b5f72dd36926d8be9506754cd9e92f8a855a42b8/narwhals/dataframe.py#L165-L181 Would you like to do something else for lazy backends?

May 23 '25 07:05 EdAbati

sqlframe is tricky because it raises at collect time. Also the error will be different based on the backend. Do we have a way to know which type of sqlframe we are dealing with?

@EdAbati IIRC there's some import-related functions in _spark_like.utils that may be helpful for you?

I'm assuming this would work in a similar way, where you either need to use some Base* class or use the string in the concrete type in an import path

May 24 '25 14:05 dangotbanned

Found some time to update this. (sorry for the late reply)

@EdAbati IIRC there's some import-related functions in _spark_like.utils that may be helpful for you?

The problem IMO is that since sqlframe lets backend raise their errors we can only intercepts the ones of the backends we also support (i.e. pyspark and duckdb) Not sure if the best solution would be to let sqlframe do their error handling or to intercept the errors just for the backend we support. Maybe it should be discussed in an issue/follow-up?

@MarcoGorelli is there anything that you think is missing now? :)

Jun 12 '25 06:06 EdAbati

thanks! i think the logic looks right, the tests look a little complex but maybe that's ok. will get back to this shortly

Jun 24 '25 10:06 MarcoGorelli

The problem IMO is that since sqlframe lets backend raise their errors we can only intercepts the ones of the backends we also support (i.e. pyspark and duckdb) Not sure if the best solution would be to let sqlframe do their error handling or to intercept the errors just for the backend we support. Maybe it should be discussed in an issue/follow-up?

Yeah that's fine - I think in general it's ok to aim for "we try to unify what we can, but there may be some differences that we have no control over"

Jun 27 '25 16:06 MarcoGorelli

Thank you @MarcoGorelli for the review and for the patience with this one 🥲❤️

Jul 16 '25 17:07 EdAbati

@EdAbati @MarcoGorelli does this error seem related to this PR?

https://github.com/narwhals-dev/narwhals/actions/runs/16328848063/job/46125919696

I haven't touched anything I swear 😂

Jul 16 '25 20:07 dangotbanned

does this error seem related to this PR?

https://github.com/narwhals-dev/narwhals/actions/runs/16328848063/job/46125919696

yeah It looks like the "stricter" regex (that start with ^) are not ok for polars 0.20.22 :(

I have some time to update this tonight, I'll make a PR if no one does it before me :)

Jul 17 '25 09:07 EdAbati