Deedle
Improve docs, particularly for C#
One especially central operation that’s easy to forget and hard to discover is how to dice a frame on one of its indices, e.g., in C#:
frame.Rows[ … ]
frame.Columns[ … ]
While the formulation is elegant, I think the mental hurdle is that the operation is on a property of the frame, and not on the frame itself.
I think it's fine to keep, but we should promote this use in the docs.
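As a rough illustration of the indexing described above, here is a minimal sketch of what such a doc snippet might look like. The frame contents, the column names "s1" and "s2", and the class name SliceExample are made up for illustration. The calls themselves (SeriesBuilder, Frame.FromColumns, the Rows and Columns indexers, As<double>(), Print()) are the ones I believe Deedle exposes to C#, but treat the exact shapes as assumptions to check against the API reference.

```csharp
using System;
using System.Collections.Generic;
using Deedle;

class SliceExample // hypothetical name, for illustration only
{
    static void Main()
    {
        var today = DateTime.Today;

        // Two small series keyed by date (illustrative values).
        var s1 = new SeriesBuilder<DateTime, double>() {
            { today.AddDays(-1), 1.0 },
            { today, 2.0 }
        }.Series;
        var s2 = new SeriesBuilder<DateTime, double>() {
            { today.AddDays(-1), 3.0 },
            { today, 4.0 }
        }.Series;

        // Combine the series into a frame with columns "s1" and "s2".
        var frame = Frame.FromColumns(new KeyValuePair<string, Series<DateTime, double>>[] {
            KeyValue.Create("s1", s1),
            KeyValue.Create("s2", s2)
        });

        // Index on the Rows property: one row, as a series keyed by column name.
        var row = frame.Rows[today];
        row.Print();

        // Index on the Columns property: one column, converted to a typed series.
        var column = frame.Columns["s1"].As<double>();
        column.Print();
    }
}
```

Both indexers live on a property of the frame rather than on the frame itself, which is exactly the part that is easy to miss.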
I'm also thinking of adding a C# and F# cookbook (combined? separate?) that serves as a reference that's a bit more use-case oriented than the API reference, but not as didactic as the tutorials.
A problem-solving-oriented cookbook (like the one for pandas) is a great idea! I feel the current form of the Deedle documentation and function signatures is a little intimidating to people like me who are not experienced.
I completely agree with the comments on row-based and column-based calculation. It's very hard to form a clear mental picture of how it works just by reading the documentation. It's a lot easier to poke around with a simple cookbook-style example.
The following is text copied from a ticket I raised with suggestions on improving the documentation.
In areas like economics, time series are often combined into new time series (set-based combination). When I read the Deedle C# API signatures and documentation pages, there was little mention of such applications. However, after some poking around, it turned out to be surprisingly easy to do such calculations. I feel it is even easier to do this in C# than in F# (sorry to say that).
The following is an example I wish I could have found on the documentation page:
CalculateDeedleFrameRowAverageWithMissingValues
Things to pay attention to are:
- The Rows function doesn't remove a row even when both columns' values are missing.
- The Mean function is smart enough to calculate the mean from the available data points.
Thanks for open-sourcing such a great library.
CalculateDeedleFrameRowAverageWithMissingValues
using System;
using System.Collections.Generic;
using Deedle;

namespace CalculateDeedleFrameRowAverageWithMissingValues
{
    class Program
    {
        static void Main(string[] args)
        {
            var today = DateTime.Today;

            // Daily series with a gap: double.NaN shows up as <missing>.
            var s1 = new SeriesBuilder<DateTime, double>() {
                { today.AddDays(-5), 10.0 },
                { today.AddDays(-4), 9.0 },
                { today.AddDays(-3), 8.0 },
                { today.AddDays(-2), double.NaN },
                { today.AddDays(-1), 6.0 },
                { today, 5.0 }
            }.Series;

            // Second series: two gaps and no entry at all for today.
            var s2 = new SeriesBuilder<DateTime, double>() {
                { today.AddDays(-5), 10.0 },
                { today.AddDays(-4), double.NaN },
                { today.AddDays(-3), 8.0 },
                { today.AddDays(-2), double.NaN },
                { today.AddDays(-1), 6.0 }
            }.Series;

            // Combine both series into a frame with columns "s1" and "s2".
            var f = Frame.FromColumns(new KeyValuePair<string, Series<DateTime, double>>[] {
                KeyValue.Create("s1", s1),
                KeyValue.Create("s2", s2)
            });

            s1.Print();
            f.Print();

            // Each row as a series keyed by column name; rows that are
            // entirely missing are still present.
            // (Sample output from a run on 3/06/2015.)
            f.Rows.Select(kvp => kvp.Value).Print();
            // 29/05/2015 12:00:00 AM -> series [ s1 => 10; s2 => 10]
            // 30/05/2015 12:00:00 AM -> series [ s1 => 9; s2 => <missing>]
            // 31/05/2015 12:00:00 AM -> series [ s1 => 8; s2 => 8]
            // 1/06/2015 12:00:00 AM -> series [ s1 => <missing>; s2 => <missing>]
            // 2/06/2015 12:00:00 AM -> series [ s1 => 6; s2 => 6]
            // 3/06/2015 12:00:00 AM -> series [ s1 => 5; s2 => <missing>]

            // Row-wise mean over whatever values are available in each row.
            f.Rows.Select(kvp => kvp.Value.As<double>().Mean()).Print();
            // 29/05/2015 12:00:00 AM -> 10
            // 30/05/2015 12:00:00 AM -> 9
            // 31/05/2015 12:00:00 AM -> 8
            // 1/06/2015 12:00:00 AM -> <missing>
            // 2/06/2015 12:00:00 AM -> 6
            // 3/06/2015 12:00:00 AM -> 5

            //Console.ReadLine();
        }
    }
}
Perhaps the Titanic example could also be written in C#? It's not exactly clear from the F# example. I wanted to do it as a quick way to test Deedle out, but it's not going to be the "quick test" I expected.
If I succeed at creating it, I'll share it here.