fastnumbers
Proposal: Do not raise an exception on None
Hi @SethMMorton,
Would it be possible for fastnumbers to handle None instead of raising an exception?
import fastnumbers
fastnumbers.fast_float(None)
TypeError: float() argument must be a string or a number, not 'NoneType'
Maybe return None?
Can you explain your use case?
I am using fastnumbers to count the floats in a Dask series. The series can have millions of elements, and any number of them can be None. Using try-except to handle the exception can have a big impact on performance.
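As a minimal sketch of the workaround available without any change to the library, a small wrapper can check for None before calling the converter, so no exception is ever raised for those elements. The names none_safe and safe_float are hypothetical, and the built-in float stands in for fastnumbers.fast_float so the sketch runs without third-party packages.

```python
from typing import Any, Callable


def none_safe(convert: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap a converter so that None passes through unchanged.

    `none_safe` is a hypothetical helper, not part of fastnumbers.
    """
    def wrapper(value: Any) -> Any:
        # An identity check avoids raising and catching a TypeError
        # for every None element.
        if value is None:
            return None
        return convert(value)
    return wrapper


# Assumption: built-in float is used as a stand-in converter here;
# fastnumbers.fast_float could be passed instead.
safe_float = none_safe(float)

values = ["1.5", None, "42"]
converted = [safe_float(v) for v in values]

# Count how many elements ended up as floats.
float_count = sum(isinstance(v, float) for v in converted)
```

The wrapper adds one Python-level function call per element, so whether it beats try-except in practice depends on how many None values the series contains.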
If you are just trying to detect floats, wouldn't isfloat be a better choice? But I'm guessing there is more to it than you describe.
Since it sounds like you are using a Pandas-like structure, it would probably be better to do something roughly equivalent to s[s == None] = float("nan"). Even if this functionality were supported, a vectorized approach should outperform fastnumbers.
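One way to read the vectorized suggestion, sketched here with plain pandas: build a boolean mask of the missing entries in a single pass and coerce the whole series at once. The sample data is made up for illustration, and isna() is used in place of the == None comparison as an assumption about what was intended.

```python
import pandas as pd

# Hypothetical sample data: numeric strings mixed with None and a non-number.
s = pd.Series(["1.5", None, "abc", "42"], dtype=object)

# Vectorized detection of missing entries, with no per-element try/except.
none_mask = s.isna()

# Coerce the whole series in one call; None and unparseable strings become NaN.
numbers = pd.to_numeric(s, errors="coerce")

# Number of elements that converted to a float.
float_count = int(numbers.notna().sum())
```

Note that after coercion a None and an unparseable string both appear as NaN, so the mask has to be computed beforehand if the two cases need to be told apart.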
I'm not saying this functionality would not be useful. I am just wondering whether the fact that fastnumbers is not vectorized will actually be a hindrance to the types of manipulations you are trying to optimize.
Before you ask, I would not be against having fastnumbers be vectorized, but I simply do not have the time to implement that myself - I would accept a pull request enabling it, though. But that's another issue that would need to be filed, separate from this one.
Yes, I am trying to convert a pandas series of strings to numbers.
For the record, in pandas 1.0.1 I tested to_numeric to convert strings to floats, and fastnumbers seems to be almost 2x faster.
import pandas as pd
from fastnumbers import fast_float
pdf = pd.read_csv("https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/crime.csv")
%timeit pdf["OFFENSE_CODE"].map(fast_float)
86.2 ms ± 2.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit pd.to_numeric(pdf["OFFENSE_CODE"], errors="coerce")
162 ms ± 7.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
BTW, it would be amazing if we could vectorize fastnumbers. I would be more than happy to contribute.
@argenisleon I realize so much time has passed that you may no longer need or want this, but fastnumbers is now vectorized.