
DataFrame in Pharo - tabular data structures for data analysis

Results 50 DataFrame issues

At the moment, we have three methods in DataFrame's `sorting` protocol:

```Smalltalk
DataFrame >> sortBy: aColumnName.
DataFrame >> sortBy: aColumnName using: aBlock.
DataFrame >> sortDescendingBy: aColumnName.
```

We need to...

api
Difficulty: Easy
to be discussed

```Smalltalk
a := DataSeries withValues: #(1 2 2 3 5).
a makeCategorical.
a isCategorical. "true"

b := a select: [ :each | each = 2 ].
b isCategorical. "false"
```

...

bug
Difficulty: Easy
Novi Sad

Currently, DataFrame only accepts a FileReference as the parameter when creating an instance from a CSV file:

```smalltalk
DataFrame readFromCsv: '/dir1/dir2/myfile.csv' asFileReference
```

It could be a nice time saver to let the DataFrame actually...
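As a rough illustration of the kind of convenience this issue hints at, here is a minimal sketch that assumes the request is to accept a plain path string as well as a FileReference. The selector `readFromCsvPath:` is hypothetical and not part of DataFrame; the sketch relies only on `readFromCsv:` from the snippet above and on Pharo's `asFileReference`, which converts a String and returns a FileReference unchanged.

```smalltalk
"Hypothetical class-side extension; readFromCsvPath: is an illustrative name, not DataFrame API."
DataFrame class >> readFromCsvPath: aStringOrFileReference
	"Normalize the argument to a FileReference, then delegate to the existing reader."
	^ self readFromCsv: aStringOrFileReference asFileReference
```

With such a helper, `DataFrame readFromCsvPath: '/dir1/dir2/myfile.csv'` would behave like the call above. Whether the actual fix adds a new selector or relaxes `readFromCsv:` itself is left to the issue discussion.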

new feature
api
Difficulty: Easy

The "Very simple Example" section of the README has an example of how to create a simple DataFrame, but no simple example of manipulating or querying it. Adding such an example would make...

documentation
Difficulty: Easy

It would be nice to have groups defined in the BaselineOfDataFrame, to avoid loading tests.
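A minimal sketch of what such groups might look like, using Metacello's standard `package:` and `group:with:` declarations. The package names `DataFrame` and `DataFrame-Tests` and the group names `Core` and `Tests` are assumptions for illustration; the real baseline may split its packages differently.

```smalltalk
"Sketch only: package and group names below are assumed, not taken from the actual baseline."
BaselineOfDataFrame >> baseline: spec
	<baseline>
	spec for: #common do: [
		"Declare the packages and their dependencies."
		spec
			package: 'DataFrame';
			package: 'DataFrame-Tests' with: [ spec requires: #('DataFrame') ].
		"Expose groups so that users can load the core without the tests."
		spec
			group: 'Core' with: #('DataFrame');
			group: 'Tests' with: #('DataFrame-Tests');
			group: 'default' with: #('Core' 'Tests') ]
```

With groups like these, a Metacello load expression could end in `load: 'Core'` to skip the tests, while a plain `load` would still bring in everything through the `default` group.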

baseline
Difficulty: Medium
to be discussed

I think that DataFrame is now mature enough to have its own logo :) Check out the logos of:

- PolyMath: https://github.com/PolyMathOrg/PolyMath
- Cormas: https://github.com/cormas/cormas
- Pandas: https://pandas.pydata.org/
- Multiple...

documentation
to be discussed
idea

To inspect a DataFrame, we create a Spec table:

```st
inspectionItems: aBuilder
	| table |
	table := aBuilder newTable.
	table addColumn: (SpIndexTableColumn new
		title: '#';
		sortFunction: #yourself ascending;
		beNotExpandable;
		yourself).
	...
```

ui
Difficulty: Medium
to be discussed
fun and creative
idea

It is possible to perform operations such as division between two DataSeries. When a DataSeries contains nils, we consider that the result of the operation should be...

new feature
api
Difficulty: Easy

Currently, data types are stored in a collection on the DataFrame. IMO it would make more sense to store them in each DataSeries.

Difficulty: Easy
Novi Sad