Enhance data sharing from SQL Kernels to C#
Is your feature request related to a problem? Please describe. One of the selling points of Polyglot Notebooks is the ability to code in whatever language I want and share data between kernels when t makes sense to use another language for the next step of a processing task. A key example of this would be querying data from a SQL kernel and then sharing the data to a C# kernel.
The SQL query and storing results are straightforward:
#!sql-dad-jokes --name JokesTable
SELECT
j.Id,
j.Joke,
j.Punchline
FROM
Jokes j
Right now the stored results of a SQL query are in an array of TabularDataResource objects registered to the SQL kernel. Given that resource, I want to share it with the .NET kernel, so I try a #!set with value and name to share the data to the C# kernel.
#!set --value @sql-dad-jokes:JokesTable --name JokesTable
Now C# has that result, but it's suddenly a JsonDocument with an unwiieldy schema and the data I want hiding deeper in the structure.
I can get this data using the following C# code:
using System.Text.Json;
JsonElement data = JokesTable.RootElement[0].GetProperty("data");
If I add a using statement I can then use the ToTabularDataResource extension method you offer:
using Microsoft.DotNet.Interactive.Formatting;
TabularDataResource res = data.ToTabularDataResource();
That's an hour or so of troubleshooting and a few lines of code to get back to a TabularDataResource, but wait, we're not done because what I actually wanted was a Microsoft.Data.Analysis.DataFrame to work effectively with the data.
The best way I've found to get this is to install the Extension lab NuGet package:
#r "nuget:Microsoft.DotNet.Interactive.ExtensionLab,1.0.0-beta.24164.1"
This then gives me my DataFrame through the ToDataFrame extension method:
using Microsoft.DotNet.Interactive.ExtensionLab;
using Microsoft.Data.Analysis;
DataFrame dfJokes = res.ToDataFrame();
That's a lot of work just to get a SQL query in one kernel into a DataFrame in another. Also, best I can tell, nobody has done this and documented it, so this issue stands as documentation that this is even possible.
Describe the solution you'd like Give me a better way of getting SQL data into the C# kernel. The JsonDocument format isn't enough. If you start with a TabularDataResource and that's a .NET type, then that's great; I'd like to keep that and work with it. This may be a different magic command or different syntax when executing #!set
Alternatively, give me some extension methods that make it easier to get a TabularDataResource back out of a JsonDocument.
I also really like the ToDataFrame extension method and hope it doesn't go away. I get nervous over the permanence of anything in a "Lab" proejct.
Describe alternatives you've considered See description. This is something I think people will want to do, but it didn't seem terribly documented that it was even possible anywhere. I'm including it in my book as a supported workflow, but I wish the experience was better on the developer.
There's been a proposal to automatically deserialize tabular data resource JSON into a DataFrame.
Related: #3538