Han Wang

Results: 43 issues by Han Wang

**Describe the bug** `mmlspark.lightgbm._LightGBMClassifier` does not exist. **To Reproduce** I cloned the repo with git and appended the mmlspark Python path to `sys.path`. `import mmlspark` works without issue, but the classifier inside can't...

awaiting response
by design
area/lightgbm
area/documentation
installation

**Is your feature request related to a problem? Please describe.** See [this code](https://github.com/fugue-project/fugue/blob/838fdaa794c62e8bdc7f1474818d9491d5d39ed7/fugue_spark/execution_engine.py#L513). If we take just one row under our sorting, we may use `GROUP BY` and `FIRST` to solve...

enhancement
good first issue
spark
core feature
low priority
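The idea in the feature request above, keeping one row per group under a given ordering, can be sketched in plain Python. This is not Fugue or Spark code, and it is not the change the issue proposes: the name `first_per_group` and the dictionary row layout are assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, List

Row = Dict[str, Any]


def first_per_group(
    rows: Iterable[Row],
    key: Callable[[Row], Hashable],
    order: Callable[[Row], Any],
) -> List[Row]:
    """Return, for each group, the row that sorts first under `order`.

    Hypothetical helper: a single pass that keeps one candidate per group,
    so no full sort of the input is needed.
    """
    best: Dict[Hashable, Row] = {}
    for row in rows:
        k = key(row)
        # Replace the stored row only if the new one sorts strictly before it,
        # so the earliest-seen row wins on ties.
        if k not in best or order(row) < order(best[k]):
            best[k] = row
    return list(best.values())
```

The sketch treats "take the first row after sorting" as a per-group aggregation, which is the general shape a `GROUP BY` with a first-value aggregate has.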

Triad already has a standard solution for plugin mode (https://github.com/fugue-project/triad/pull/85). We need to migrate all Fugue plugins to this standard approach. We also need to keep the old way working....

behavior change
refactoring

_Originally posted by @keiranmraine in https://github.com/fugue-project/fugue/issues/331#issuecomment-1160126031_ ```console File c:\Users\XXX\.venv\lib\site-packages\fugue\workflow\workflow.py:1518, in FugueWorkflow.run(self, *args, **kwargs) 1516 if ctb is None: # pragma: no cover 1517 raise -> 1518 raise ex.with_traceback(ctb) 1519 self._computed...

bug

Many CSV files contain column names with special characters. Fugue raises exceptions on such files because it has stricter [rules](https://github.com/fugue-project/triad/blob/00e395a33fb09b4bb5b1ce9dbf168c3a14b8b474/triad/utils/string.py#L13) for column names. So we should have an option when reading...

enhancement
behavior change
core feature
IO
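One way to picture the column-name issue above is a rename step applied when a file is read. The following is a minimal sketch in plain Python, not Fugue or Triad code: the helper `sanitize_columns` is hypothetical, and the validity rule it enforces (letters, digits, and underscores, not starting with a digit) is an assumption that may differ from the Triad rule linked in the issue.

```python
import re
from typing import Dict, List


def sanitize_columns(names: List[str]) -> Dict[str, str]:
    """Map each original column name to a unique identifier-like name.

    Hypothetical helper for illustration; the naming rule is an assumption.
    """
    mapping: Dict[str, str] = {}
    used = set()
    for i, name in enumerate(names):
        # Replace each run of disallowed characters with one underscore,
        # then trim underscores at both ends.
        candidate = re.sub(r"[^0-9A-Za-z_]+", "_", name).strip("_")
        # Fall back to a positional name if nothing usable is left.
        if not candidate:
            candidate = f"col_{i}"
        # Identifier-like names may not start with a digit.
        if candidate[0].isdigit():
            candidate = "_" + candidate
        # Resolve collisions by appending a numeric suffix.
        unique, suffix = candidate, 1
        while unique in used:
            unique = f"{candidate}_{suffix}"
            suffix += 1
        used.add(unique)
        mapping[name] = unique
    return mapping
```

Returning a mapping rather than a renamed list keeps the original names available, so a caller could restore them when writing the data back out.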

Currently, each execution engine has its own separate implementation for loading files. This is messy and involves a lot of duplication. We should create a unified IO engine just like SQLEngine...

enhancement
refactoring
core feature
IO

**Describe the solution you'd like**

```python
import pandas as pd

#schema *,a:int
def t(df: pd.DataFrame) -> pd.DataFrame:
    ...  # do something and return
```

Currently this can only be used on transformer, but it should be...

enhancement
programming interface
core feature

These two SQLs have different behavior on HAVING ```sql CREATE [[1, 2], [NULL, 2], [NULL, 1], [3, 4], [NULL, 4]] SCHEMA a:double,b:int SELECT a, SUM(b) AS b GROUP BY a...

Fugue SQL
notice

**Describe the solution you'd like** Switch the documentation to the Furo theme.

documentation
enhancement

**Describe the solution you'd like** Currently, for Avro IO, Dask still uses the local implementation. We should use [this](https://docs.dask.org/en/latest/_modules/dask/bag/avro.html) instead to take advantage of the distributed system.

enhancement
dask