Randy Olson
Hello, what project is this from? I don’t recognize it.

> On Thu, Aug 12, 2021 at 6:35 AM edithwangu ***@***.***> wrote:
>
> KeyError Traceback (most recent call...
What project is this question for?
Oh, I see. I used tree-based algorithms there because they're good default algorithms: they need little data preprocessing, perform well, and are easy to tune.
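One reason tree-based models need little preprocessing is that a threshold split depends only on the ordering of a feature's values, so rescaling or log-transforming the feature leaves the split unchanged. Here is a minimal sketch of that idea with a single-feature decision stump. The function `best_split_partition` and the toy income data are hypothetical and made up for illustration; they are not taken from the notebook.

```python
import math


def best_split_partition(values, labels):
    """Find the single-threshold split with the fewest misclassifications.

    Returns the set of sample indices that fall on the left of the split.
    Assumes binary 0/1 labels and distinct feature values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_errors, best_left = None, None
    for cut in range(1, len(order)):
        left, right = order[:cut], order[cut:]
        errors = 0
        # Predict the majority class on each side and count the mistakes.
        for side in (left, right):
            ones = sum(labels[i] for i in side)
            errors += min(ones, len(side) - ones)
        if best_errors is None or errors < best_errors:
            best_errors, best_left = errors, frozenset(left)
    return best_left


# Hypothetical toy data: one heavily skewed feature and a binary label.
incomes = [20_000, 35_000, 50_000, 120_000, 900_000]
labels = [0, 0, 0, 1, 1]

raw_partition = best_split_partition(incomes, labels)
# A log transform changes the scale but not the ordering of the values.
log_partition = best_split_partition([math.log(v) for v in incomes], labels)

print(raw_partition == log_partition)
```

The same samples end up on each side of the split with or without the transform, which is why feature scaling is usually optional for tree-based models while it matters for distance-based or gradient-based ones.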
The example data science notebook: https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects/blob/master/example-data-science-notebook/Example%20Machine%20Learning%20Notebook.ipynb
Looking forward to this in TPOT. I think it will clean up the interface quite a bit.
@teaearlgraycold, taking your example, it could work like this:

```python
my_text = "...".split("\n")
class_labels = [1, 0, 0, 1, 1, ..., 0, 0]
# Assuming len(my_text) == len(class_labels)
my_tpot =...
```
Is text classification such a fundamentally different problem type that it requires a new TPOT class? Once the text is converted to a bag-of-[words, ngrams, etc.] representation, we're working with...
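To make the bag-of-words step concrete, here is a minimal standard-library sketch of how raw text turns into a plain numeric count matrix. It assumes lowercase whitespace tokenization; `bag_of_words` and the toy corpus are hypothetical names and data for illustration, not part of TPOT or scikit-learn.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of text documents.

    Each row of the matrix holds one document's word counts, with columns
    ordered by the sorted vocabulary.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical toy corpus.
docs = ["good movie", "bad movie", "good good plot"]
vocabulary, features = bag_of_words(docs)

print(vocabulary)
for row in features:
    print(row)
```

After this step each document is a fixed-length row of integers, so it has the same shape as any other tabular feature matrix passed to a classifier.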
This sklearn issue seems relevant to our conversations here: https://github.com/scikit-learn/scikit-learn/pull/9012 Maybe a better way to accomplish what we want in the sklearn Pipeline architecture.
@weixuanfu2016, can you please confirm that this issue is addressed in the 0.7 release? IIRC models that failed to finish evaluating due to timeouts have their "timeout score" recorded in...
I like this idea. I'd like to explore it after we have regression integrated into TPOT. We should explore additional metrics for scoring unsupervised results as well.