Maciej Skorski

Results: 25 issues by Maciej Skorski

**Setup** Python=3.8 + azureml-core=1.36.0 + azureml-dataprep=2.26.0 + pyarrow=7.0.0 + pandas=1.4
**Summary** `to_pandas_dataframe` **reads certain Parquet datasets incorrectly**. Data in some columns appears to be internally shuffled. This was already [reported...

### Pandas version checks
- [X] I have checked that the issue still exists on the latest versions of the docs on `main` [here](https://pandas.pydata.org/docs/dev/)
### Location of the documentation
[dev...

Docs
Algos
Needs Triage

# Pull Request Check List
Resolves: #3444 #4959 #4958 #4965
- [x] Added **tests** for changed code.
- [ ] Updated **documentation** for changed code.

# Pull Request Check List
Resolves: #4952 #4670 #6118 #5121
- [x] Added **tests** for changed code.
- [x] Updated **documentation** for changed code.

I see recurring problems when trying covariance types other than "full", that is, diag, spherical, etc. The code to reproduce: `from smm import SMM` `from sklearn import datasets` `data = datasets.load_iris()`...

# Environment
**Delta-rs version**: 0.13.0
**Binding**: Python
**Environment**:
- **Cloud provider**:
- **OS**:
- **Other**:
***
# Bug
**What happened**: It's unclear how to update datatypes, and if this is...

bug
binding/python
binding/rust

This PR upgrades the multi-core implementation of `LDA` to use callbacks 💪. Callbacks are critical for model evaluation in general, and [have been requested in the past](https://gensim.narkive.com/CGUSGGU5/11838-how-to-display-callbacks-training-progress-using-ldamulticore) for Gensim's model in...

Often `partial_fit` is invoked multiple times to process a large dataset. It seems that the attribute `representative_docs_` is not populated then. Is there an easy way to get representative docs...

With error logs: ``` Waiting for build to start... Picked Git content provider. Cloning into '/tmp/repo2docker4odte8kr'... HEAD is now at 7af2a0b Bump pyyaml from 5.1 to 5.4 (#24) Python version...

**Describe the bug** The GPU-accelerated implementation from cuml can give **much worse results than the CPU alternative from the package [umap](https://umap-learn.readthedocs.io/en/latest/index.html) on a simple dataset**. By visual inspection, we see...

bug
? - Needs Triage