Jimmy Stammers issues

Results: 9 issues by Jimmy Stammers

Enable a dashboard interface for model performance evaluation

The CLI for this app is incredibly useful and allows the user to carry out a lot of analysis. Is there any planned work to develop a dashboard to visualise...

feat M

Unable to specify save format for SparkHiveDataSet

## Description The implementation for `SparkHiveDataSet` allows the user to specify additional save arguments. This should enable a delta table to be saved which is done using the following pyspark...

Issue: Bug Report 🐞

Community

[BUG] SarimaxModel fails to fit with exogenous data

**Describe the bug** I am failing to fit a SarimaxModel because the model reports that the exogenous and endogenous dataframes do not have the same index despite coming from the...

bug

Fix issue with specifying format for SparkHiveDataSet

## Description The current implementation of `SparkHiveDataSet` contains a bug that prevents a user from specifying a save format. This issue is discussed in #1528 . ## Development notes The...

[BUG] CROSS Join fails for SparkExecutionEngine

**Minimal Code To Reproduce**

```python
df1_s = spark.createDataFrame([[1,2]], schema=StructType([StructField("a", IntegerType()), StructField("b",IntegerType())]))
df1_s = spark.createDataFrame([[1,2]], schema=StructType([StructField("c", IntegerType()), StructField("d",IntegerType())]))
dag= FugueWorkflow()
df1 = dag.df(df1_s)
df2 = dag.df(df1_s)
df2 = df1.join(df2, how='cross')
dag.run(engine='spark')
...
```

better error

[QUESTION] How to use a CoTransformer on data frames with shared non-key columns

I have a function that aims to implement an SCD2 merge on two dataframes. In my example, I am attempting to merge two dataframes together, using a single column as...

[BUG] fugue_sql intermittently throwing segmentation fault errors

**Minimal Code To Reproduce** **Describe the bug** I have a set of unit tests that check the functionality of code that uses the `fugue_sql` API with a DuckDB backend. When...

feat: support pyarrow UDFs for pyspark backend

### Is your feature request related to a problem?

Pyspark now supports [Arrow UDFs](https://spark.apache.org/docs/latest/api/python/user_guide/sql/arrow_pandas.html#arrow-python-udfs) that facilitate efficient row-by-row executions using Arrow as a backend e.g.

```python
import pandas as pd
...
```

feature

udf

pyspark

bug: Unable to cache table after creating geometry column

### What happened? I'm trying to create a table with a `geometry` column using given lat/long coordinates. I can successfully create this column, but when calling `.cache()`, I get a...

bug

geospatial