
Error during bulk data import: 42601 syntax error at or near ")"

Open kirill-gerasimenko opened this issue 4 years ago • 15 comments

Hi,

I'm evaluating bulk import functionality. My database is Postgres 12.3 (running via docker).

I've created a version folder using vnext command, and put there two files:

  1. test_data.csv (I tried including the id column in the CSV file; the result is the same)
"first_name","last_name"
"Bob","Green"
"Robert","Red"
  2. script.sql
create table test_data(
  id SERIAL PRIMARY KEY,
  first_name varchar(20),
  last_name varchar(20)
)

When I run the run command against the database, I get this error:

....
INF   2020-06-19T16:41:56.2163740Z   PostgreSqlBulkImportService: Started copying data into destination table public.test_data
INF   2020-06-19T16:41:56.2254700Z   PostgreSqlBulkImportService: COPY public.test_data () FROM STDIN (FORMAT BINARY)
ERR   2020-06-19T16:41:56.2408861Z   Failed to execute run function. 
Npgsql.PostgresException (0x80004005): 42601: syntax error at or near ")"
   at Npgsql.NpgsqlConnector.<>c__DisplayClass160_0.<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Npgsql.NpgsqlConnector.<>c__DisplayClass160_0.<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Npgsql.NpgsqlConnector.ReadMessage(DataRowLoadingMode dataRowLoadingMode)
   at Npgsql.NpgsqlBinaryImporter..ctor(NpgsqlConnector connector, String copyFromCommand)
   at Npgsql.NpgsqlConnection.BeginBinaryImport(String copyFromCommand)
   at Yuniql.PostgreSql.PostgreSqlBulkImportService.BulkCopyWithDataTable(IDbConnection connection, IDbTransaction transaction, String schemaName, String tableName, DataTable dataTable) in C:\projects\yuniql\yuniql-platforms\postgresql\PostgreSqlBulkImportService.cs:line 113
   at Yuniql.PostgreSql.PostgreSqlBulkImportService.Run(IDbConnection connection, IDbTransaction transaction, String fileFullPath, String bulkSeparator, Nullable`1 bulkBatchSize, Nullable`1 commandTimeout) in C:\projects\yuniql\yuniql-platforms\postgresql\PostgreSqlBulkImportService.cs:line 56
   at Yuniql.Core.MigrationServiceBase.<>c__DisplayClass16_0.<RunBulkImport>b__1(String csvFile) in C:\projects\yuniql\yuniql-core\MigrationServiceBase.cs:line 181
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at Yuniql.Core.MigrationServiceBase.RunBulkImport(IDbConnection connection, IDbTransaction transaction, String workingPath, String scriptDirectory, String bulkSeparator, Nullable`1 bulkBatchSize, Nullable`1 commandTimeout, String environmentCode) in C:\projects\yuniql\yuniql-core\MigrationServiceBase.cs:line 183
   at Yuniql.Core.MigrationService.<>c__DisplayClass10_0.<RunVersionScripts>b__2(String versionDirectory) in C:\projects\yuniql\yuniql-core\MigrationService.cs:line 300
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at Yuniql.Core.MigrationService.RunVersionScripts(IDbConnection connection, IDbTransaction transaction, List`1 dbVersions, String workingPath, String targetVersion, NonTransactionalContext nonTransactionalContext, List`1 tokenKeyPairs, String bulkSeparator, String metaSchemaName, String metaTableName, Nullable`1 commandTimeout, Nullable`1 bulkBatchSize, String appliedByTool, String appliedByToolVersion, String environmentCode) in C:\projects\yuniql\yuniql-core\MigrationService.cs:line 315
   at Yuniql.Core.MigrationService.<Run>g__RunAllInternal|9_2(IDbConnection connection, IDbTransaction transaction, <>c__DisplayClass9_0& ) in C:\projects\yuniql\yuniql-core\MigrationService.cs:line 212
   at Yuniql.Core.MigrationService.Run(String workingPath, String targetVersion, Nullable`1 autoCreateDatabase, List`1 tokenKeyPairs, Nullable`1 verifyOnly, String bulkSeparator, String metaSchemaName, String metaTableName, Nullable`1 commandTimeout, Nullable`1 bulkBatchSize, String appliedByTool, String appliedByToolVersion, String environmentCode, Nullable`1 resumeFromFailure) in C:\projects\yuniql\yuniql-core\MigrationService.cs:line 149
   at Yuniql.CLI.CommandLineService.RunMigration(RunOption opts) in C:\projects\yuniql\yuniql-cli\CommandLineService.cs:line 146
  Exception data:
    Severity: ERROR
    SqlState: 42601
    MessageText: syntax error at or near ")"
    Position: 24
    File: scan.l
    Line: 1149
    Routine: scanner_yyerror

Please advise.

kirill-gerasimenko avatar Jun 19 '20 16:06 kirill-gerasimenko

Hi @kirill-gerasimenko, thanks again for reaching out. The tool can only get better with valuable feedback like this.

I was able to reproduce your case. This happens because the schema definition of the destination table is not yet available when the table creation and the CSV import share the same transaction. This means putting the .sql and .csv files in the same version is not supported in this build.
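
To make the failure concrete, here is a minimal sketch of what the logs suggest (my reading of the stack trace, assuming the importer builds the COPY column list from information_schema; the actual internals may differ):

-- assumed metadata lookup used to build the COPY column list
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'test_data';

-- if the CREATE TABLE from the same version is not yet visible here,
-- this returns zero rows and the generated command becomes:
COPY public.test_data () FROM STDIN (FORMAT BINARY);

-- which PostgreSQL rejects with 42601: syntax error at or near ")"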

Here is a possible workaround for now. I will investigate further how we can support DDL and bulk import in the same transaction.

Deploy local database container

docker run -d --name yuniql-postgresql -e POSTGRES_USER=sa -e POSTGRES_PASSWORD=P@ssw0rd! -e POSTGRES_DB=helloyuniql -p 5432:5432 postgres

Set environment variable

SETX YUNIQL_CONNECTION_STRING "Host=localhost;Port=5432;Username=sa;Password=P@ssw0rd!;Database=yuniqldb"

Prepare and run the baseline first

md c:\temp\issue-129
cd c:\temp\issue-129

yuniql init
code c:\temp\issue-129\v0.00\sample.sql

create table test_data(
  first_name varchar(20),
  last_name varchar(20)
)

yuniql run -a -t v0.00 --platform postgresql --debug

Prepare and run the csv import

yuniql vnext -f test_data.csv
code c:\temp\issue-129\v0.01\test_data.csv

"first_name","last_name"
"Bob","Green"
"Robert","Red"

yuniql run --platform postgresql --debug

rdagumampan avatar Jun 20 '20 06:06 rdagumampan

Thanks for the update! So basically, at this stage, I create the schema in one version and import the data in the next one. That makes sense given what you've described.

kirill-gerasimenko avatar Jun 20 '20 21:06 kirill-gerasimenko

@kirill-gerasimenko, yes, unfortunately you would have to call yuniql run twice to split the transaction. Each migration is one atomic transaction so far. I think I will create a new feature, --transaction-mode per-version, to simplify this so that each version is one transaction in itself even when it's executed in a single migration run.
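
For illustration, usage of the proposed flag might look like this (hypothetical; --transaction-mode per-version is only a proposal in this thread at this point, not a shipped option):

# run the whole migration, but commit each version separately
yuniql run --platform postgresql --transaction-mode per-version --debug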

rdagumampan avatar Jun 20 '20 21:06 rdagumampan

run twice to split the transaction

Do you mean I have to run the yuniql run command twice? Just having two separate versions (v0.01 with the create script and v0.02 with the csv file) won't be sufficient?

kirill-gerasimenko avatar Jun 20 '20 21:06 kirill-gerasimenko

Each migration is one atomic transaction so far.

What you are saying is that if, for example, I have to update a database that is at version v0.1 up to version v0.5, all the versions in between (v0.2, v0.3, v0.4) along with the final version are executed in a single transaction? Am I correct here?

kirill-gerasimenko avatar Jun 20 '20 21:06 kirill-gerasimenko

Yes. This is not pretty, and we'll prioritize the per-version transaction support. I think it's more pressing to implement this feature now after your feedback.

In the sample above I had to run per version to make sure each version is an independent transaction.

# runs only up to v0.00
yuniql run -a -t v0.00 --platform postgresql --debug

# runs only up to the latest available version
yuniql run --platform postgresql --debug

rdagumampan avatar Jun 20 '20 21:06 rdagumampan

Each migration is one atomic transaction so far.

What you are saying is that if, for example, I have to update a database that is at version v0.1 up to version v0.5, all the versions in between (v0.2, v0.3, v0.4) along with the final version are executed in a single transaction? Am I correct here?

Yes, they will run in a single transaction when you call yuniql run. It's an all-or-nothing situation, which means they share the transaction, and that is what causes the CSV import to fail.

The only way we can split them so far is to make targeted runs where we specify the target version, such as yuniql run -t v0.01 followed by yuniql run -t v0.05.
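
Reusing the flags from earlier in this thread, that split looks like this:

# each targeted run commits before the next one starts
yuniql run -t v0.01 --platform postgresql --debug
yuniql run -t v0.05 --platform postgresql --debug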

rdagumampan avatar Jun 20 '20 21:06 rdagumampan

Got it, thanks for the explanation!

I also wonder how well this plays with CI/CD.

The current approach kind of makes sense, since you don't want to "partially" update up to the last version that doesn't fail (say you've applied v1, v2, v3 but v4 failed, so you revert everything back to v0). It's all or nothing.

With the change you've mentioned, each version is executed in a separate transaction when applied, which means the database might not end up updated to the latest version (in my example above, it would be applied only up to v3).

kirill-gerasimenko avatar Jun 20 '20 21:06 kirill-gerasimenko

Did you think about having some compensation scripts, where you kind of undo a version? With the per-version transaction, this might help to clean up the versions that have been applied up to the failed one (in my example).

kirill-gerasimenko avatar Jun 20 '20 21:06 kirill-gerasimenko

Yes, I foresee different transaction modes: full (default), per-version, and none. The full mode is the default for all platforms supporting transactional DDL; per-version best fits your scenario, where some scripts really depend on a committed state; the third, none, is for versions that require optimal execution without a transaction.

rdagumampan avatar Jun 20 '20 21:06 rdagumampan

Did you think about having some compensation scripts, where you kind of undo a version? With the per-version transaction, this might help to clean up the versions that have been applied up to the failed one (in my example).

Yes, I have thoughts on a yuniql undo where we create a folder _undo in each version, and any script in it would be executed to revert the version, but this idea is still very rough...
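
A rough sketch of that convention (a hypothetical layout; _undo is only an idea here, and revert_test_data.sql is an invented file name for illustration):

v0.01\
  test_data.csv
  _undo\
    revert_test_data.sql   # e.g. contains: drop table test_data;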

I think it's OK if a version fails, as long as all earlier versions succeeded. What's not acceptable is skipping versions, because that would put the schema in an unreliable state. If a version fails, the user must fix that version and re-run until the database reaches the latest version.

rdagumampan avatar Jun 20 '20 22:06 rdagumampan

Yes, I foresee different transaction modes: full (default), per-version, and none. The full mode is the default for all platforms supporting transactional DDL; per-version best fits your scenario, where some scripts really depend on a committed state; the third, none, is for versions that require optimal execution without a transaction.

What if I want full mode for everything, but for some particular version I'd like to run it in a separate transaction? (Sorry for all the questions; I'm trying to think of the use cases that would be relevant for me.)

In a CI/CD scenario I'd assume that I just choose one transaction mode... Could this transaction mode be set per version?

Or for example:

  • there is a default setting full as you mentioned
  • it can be overridden in the CLI
  • if the version has some way of overriding this (like a .overrides config file inside the version's folder), where I can set the transaction mode to per-version
    • in this case, everything up to this version is executed in full mode
    • the version itself uses the overridden value
    • after the version is applied, it goes back to the default full

This is a raw idea. I bet it's missing something or is too complex :man_shrugging:

kirill-gerasimenko avatar Jun 20 '20 22:06 kirill-gerasimenko

I think the choice of transaction mode largely depends on the nature of the pending versions to be applied. It is also limited to cases where you have mixed .sql and .csv files, where the .csv uses a table created in the same version. If it's a pure SQL migration, it should always run smoothly with full.

In your case, I think the default is per-version.

rdagumampan avatar Jun 20 '20 22:06 rdagumampan

Everything is based on CLI parameters, so it's always customizable. The Azure Pipelines tasks of yuniql also use the CLI to run the tasks, and we can expose a variable that we can change in the pipeline to customize the transaction mode. It's just like you described.

I like your idea of .overrides, as it fits yuniql's convention-based philosophy; let me sleep on it... It seems nice to have something like .transaction-per-version.overrides or .transaction-full.overrides.
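
For example, the marker file could sit next to the version's scripts (a hypothetical layout; none of these file names exist yet):

v0.02\
  bulk_data.csv
  .transaction-per-version.overrides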

rdagumampan avatar Jun 20 '20 22:06 rdagumampan

Hi! I have the same error, but with two different transactions:

  1. In /v0.00/crm/_crm_dev/ I have:
-- v0.00
CREATE TABLE "FilterOperators" (
	"Id" uuid NOT NULL,
	"DisplayValue" text NULL,
	"Operation" text NULL,
	"SupportsList" bool NOT NULL,
	"Order" int4 NOT NULL DEFAULT 0,
	CONSTRAINT "PK_FilterOperators" PRIMARY KEY ("Id")
);

so I run: yuniql.exe run --path "." --platform "postgresql" --connection-string "User ID=postgres;Password=changeme;Host=localhost;Port=5432;Database=crmdev;Pooling=true;" --auto-create-db --environment "crm_dev" --meta-table "_schema_version" --debug --bulk-separator "," --target-version "v0.00"

  2. Then in v0.01/crm/_crm_dev/ I have:

"Id","DisplayValue","Operation","SupportsList","Order"
"e520a9e3-c5c3-491d-bd95-ddebe1ff2237","Или","OR","false","0"
"0a03801e-710b-42e8-b65f-cf8ac7c81cd9","Не Выставлено","IS NULL","false","0"
"bb7650bd-56cd-4709-8a07-2a7573385ba2","Выставлено","IS NOT NULL","false","0"
"4d763591-09c2-4607-baac-deb5ce8eddc4","Выставлено","IS NOT NULL","false","0"
"8637fe30-270b-4641-a4f5-2c62ebe43b97","И","AND","false","0"
"11a5b14c-2c34-43e1-9ebe-5509823346af","Равно","IS NOT DISTINCT FROM","true","2"
"20bd53c2-3549-477a-b51e-fd7f13846e98","Не Содержит","NOT ILIKE","false","9"
"47d2062f-7066-46ff-aa04-cdcb896a39d0","Меньше","<","false","6"
"4d6f232e-2aef-40f1-9cfb-1366f925018d","Содержит","ILIKE","false","8"
"57cf19ed-8c16-4d56-be2a-36951c5bb75f","Больше",">","false","4"
"73b54e59-15f0-4b00-bd53-2ef20a82ec58","Меньше Или Равно","<=","false","7"
"d9d6b47c-0a65-407e-a551-6da355507a73","Не Равно","IS DISTINCT FROM","true","3"
"e8fc97a4-fd4f-4c00-ba81-3c800eaca899","Больше Или Равно",">=","false","5"
"f41d12d7-969b-404d-9cf4-5f844707cfba","Не Выставлено","IS NULL","false","1"

The CSV file is in UTF-8 encoding.

so I run: yuniql.exe run --path "." --platform "postgresql" --connection-string "User ID=postgres;Password=changeme;Host=localhost;Port=5432;Database=crmdev;Pooling=true;" --auto-create-db --environment "crm_dev" --meta-table "_schema_version" --debug --bulk-separator "," --target-version "v0.01"

But it fails with this error:

...
INF   2021-05-12 14:14:01Z   Found 1 script files on .\v0.01\crm\_crm_debug_dev
  + FilterOperators.csv
INF   2021-05-12 14:14:01Z   PostgreSqlBulkImportService: Started copying data into destination table public.filteroperators
INF   2021-05-12 14:14:01Z   PostgreSqlBulkImportService: COPY public.filteroperators () FROM STDIN (FORMAT BINARY)
ERR   2021-05-12 14:14:01Z   Failed to execute run function. 42601: syntax error at or near ")"
Diagnostics stack trace captured a Npgsql.PostgresException (0x80004005):
   at Npgsql.NpgsqlConnector.<>c__DisplayClass160_0.<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Npgsql.NpgsqlConnector.<>c__DisplayClass160_0.<<DoReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Npgsql.NpgsqlConnector.ReadMessage(DataRowLoadingMode dataRowLoadingMode)
   at Npgsql.NpgsqlBinaryImporter..ctor(NpgsqlConnector connector, String copyFromCommand)
   at Npgsql.NpgsqlConnection.BeginBinaryImport(String copyFromCommand)
   at Yuniql.PostgreSql.PostgreSqlBulkImportService.BulkCopyWithDataTable(IDbConnection connection, IDbTransaction transaction, String schemaName, String tableName, DataTable dataTable)
   at Yuniql.PostgreSql.PostgreSqlBulkImportService.Run(IDbConnection connection, IDbTransaction transaction, String fileFullPath, String bulkSeparator, Nullable`1 bulkBatchSize, Nullable`1 commandTimeout)
   at Yuniql.Core.MigrationService.<>c__DisplayClass17_0.<RunBulkImportScripts>b__1(String csvFile)
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at Yuniql.Core.MigrationService.RunBulkImportScripts(IDbConnection connection, IDbTransaction transaction, String workspace, String scriptDirectory, String bulkSeparator, Nullable`1 bulkBatchSize, Nullable`1 commandTimeout, String environment)
   at Yuniql.Core.MigrationService.<>c__DisplayClass12_1.<RunVersionDirectories>b__4(String scriptSubDirectory)
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at Yuniql.Core.MigrationService.<>c__DisplayClass12_0.<RunVersionDirectories>g__RunVersionDirectoriesInternal|3(IDbConnection connection, IDbTransaction transaction, List`1 scriptSubDirectories, String scriptDirectory, String versionDirectory, Stopwatch stopwatch)
   at Yuniql.Core.MigrationService.<>c__DisplayClass12_0.<RunVersionDirectories>b__2(String versionDirectory)
   at System.Collections.Generic.List`1.ForEach(Action`1 action)
   at Yuniql.Core.MigrationService.RunVersionDirectories(IDbConnection connection, IDbTransaction transaction, List`1 appliedVersions, String workspace, String targetVersion, TransactionContext transactionContext, List`1 tokens, String bulkSeparator, String metaSchemaName, String metaTableName, Nullable`1 commandTimeout, Nullable`1 bulkBatchSize, String appliedByTool, String appliedByToolVersion, String environment, String transactionMode)
   at Yuniql.Core.MigrationService.<Run>g__RunAllInternal|11_3(IDbConnection connection, IDbTransaction transaction, Boolean isRequiredClearedDraft, <>c__DisplayClass11_0& )
   at Yuniql.Core.MigrationService.Run(String workspace, String targetVersion, Nullable`1 isAutoCreateDatabase, List`1 tokens, Nullable`1 isVerifyOnly, String bulkSeparator, String metaSchemaName, String metaTableName, Nullable`1 commandTimeout, Nullable`1 bulkBatchSize, String appliedByTool, String appliedByToolVersion, String environment, Nullable`1 isContinueAfterFailure, String transactionMode, Boolean isRequiredClearedDraft)
   at Yuniql.Core.MigrationService.Run()
   at Yuniql.CLI.CommandLineService.RunRunOption(RunOption opts)
  Exception data:
    Severity: ERROR
    SqlState: 42601
    MessageText: syntax error at or near ")"
    Position: 30
    File: scan.l
    Line: 1127
    Routine: scanner_yyerror
If you think this is a bug, please report an issue here https://github.com/rdagumampan/yuniql/issues.

I tested on windows 10 with:

 > yuniql.exe --version
yuniql 1.1.55

If I use SQL instead of CSV in v0.01, it works:

INSERT INTO public."FilterOperators" ("Id","DisplayValue","Operation","SupportsList","Order") VALUES
	 ('e520a9e3-c5c3-491d-bd95-ddebe1ff2237'::uuid,'Или','OR',false,0),
	 ('0a03801e-710b-42e8-b65f-cf8ac7c81cd9'::uuid,'Не Выставлено','IS NULL',false,0),
	 ('bb7650bd-56cd-4709-8a07-2a7573385ba2'::uuid,'Выставлено','IS NOT NULL',false,0),
	 ('4d763591-09c2-4607-baac-deb5ce8eddc4'::uuid,'Выставлено','IS NOT NULL',false,0),
	 ('8637fe30-270b-4641-a4f5-2c62ebe43b97'::uuid,'И','AND',false,0),
	 ('11a5b14c-2c34-43e1-9ebe-5509823346af'::uuid,'Равно','IS NOT DISTINCT FROM',true,2),
	 ('20bd53c2-3549-477a-b51e-fd7f13846e98'::uuid,'Не Содержит','NOT ILIKE',false,9),
	 ('47d2062f-7066-46ff-aa04-cdcb896a39d0'::uuid,'Меньше','<',false,6),
	 ('4d6f232e-2aef-40f1-9cfb-1366f925018d'::uuid,'Содержит','ILIKE',false,8),
	 ('57cf19ed-8c16-4d56-be2a-36951c5bb75f'::uuid,'Больше','>',false,4),
	 ('73b54e59-15f0-4b00-bd53-2ef20a82ec58'::uuid,'Меньше Или Равно','<=',false,7),
	 ('d9d6b47c-0a65-407e-a551-6da355507a73'::uuid,'Не Равно','IS DISTINCT FROM',true,3),
	 ('e8fc97a4-fd4f-4c00-ba81-3c800eaca899'::uuid,'Больше Или Равно','>=',false,5),
	 ('f41d12d7-969b-404d-9cf4-5f844707cfba'::uuid,'Не Выставлено','IS NULL',false,1);

I think I have a different error, but with more or less the same stack trace.

zverev-iv avatar May 12 '21 15:05 zverev-iv