Etl.Net icon indicating copy to clipboard operation
Etl.Net copied to clipboard

Running a stored procedure only once after all records in a CSV file have been saved to the database

Open mickvikt opened this issue 1 year ago • 2 comments

Let's say we have a data pipeline like this:

        var stream = contextStream
            .ToSqlCommand("Create tables", Queries.CreateTables)
            .CrossApplyFolderFiles("Get data file", "data.csv", true)
            .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
                i => new
                {
                    code = i.ToColumn(0),
                    name = i.ToColumn(1)
                }).IsColumnSeparated(','))
            .SqlServerSave("Populate data table", o => o
                .SeekOn(i => code)
                .ToTable("[dbo].[some_table]")
            )
	   .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);

A CSV is parsed, its records are saved to a table on a SQL Server database and then I want to run a stored procedure which depends on that data.

If I try to run it and the CSV file contains 5 rows, stored procedure is being run 5 times, and I only need to run it once.

I tried to create a second data pipeline to run the stored procedure like this:

        var stream = contextStream
            .ToSqlCommand("Create tables", Queries.CreateTables)
            .CrossApplyFolderFiles("Get data file", "data.csv", true)
            .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
                i => new
                {
                    code = i.ToColumn(0),
                    name = i.ToColumn(1)
                }).IsColumnSeparated(','))
            .SqlServerSave("Populate data table", o => o
                .SeekOn(i => code)
                .ToTable("[dbo].[some_table]")
            );

       contextStream
	   .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);         

but when I try to execute it, stored procedure is not being run after whole CSV import is complete and rows are stored in the table, it's being run somewhere in between and the stored procedure can't get the data it needs since it has not been saved yet.

How can I signal to the pipeline, that I need to run the stored procedure just once? I tried to create a separate DefineProcess method for running SP, but that seems really clumsy.

What would you advice? Thank you.

mickvikt avatar Jun 08 '23 08:06 mickvikt

I suggest you to use WaitWhenDone operator the following way:

        var stream = contextStream
            .ToSqlCommand("Create tables", Queries.CreateTables)
            .CrossApplyFolderFiles("Get data file", "data.csv", true)
            .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
                i => new
                {
                    code = i.ToColumn(0),
                    name = i.ToColumn(1)
                }).IsColumnSeparated(','))
            .SqlServerSave("Populate data table", o => o
                .SeekOn(i => code)
                .ToTable("[dbo].[some_table]")
            );

       contextStream
	   .WaitWhenDone("wait everything is done", stream)
	   .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);

paillave avatar Jun 12 '23 20:06 paillave

Thanks, Stéphane.

On Mon, Jun 12, 2023, 23:04 Stéphane Royer @.***> wrote:

I suggest you to use WaitWhenDone operator the following way:

    var stream = contextStream
        .ToSqlCommand("Create tables", Queries.CreateTables)
        .CrossApplyFolderFiles("Get data file", "data.csv", true)
        .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
            i => new                {                    code = i.ToColumn(0),                    name = i.ToColumn(1)                }).IsColumnSeparated(','))
        .SqlServerSave("Populate data table", o => o                .SeekOn(i => code)                .ToTable("[dbo].[some_table]")
        );

   contextStream
 .WaitWhenDone("wait everything is done", stream)
 .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);

— Reply to this email directly, view it on GitHub https://github.com/paillave/Etl.Net/issues/445#issuecomment-1588007134, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGGQZ2B4P33ITNUCEK6ZYCTXK5Y6FANCNFSM6AAAAAAY657HGY . You are receiving this because you authored the thread.Message ID: @.***>

mickvikt avatar Jun 12 '23 20:06 mickvikt