Azurite Multiple fixes for Table and Refactoring

Tests, fixes and refactoring for table API.

Fixes issues using and querying GUID types
- Fixes #1504
Removes odata Timestamp type from entities when accept is set to minimalmetadata.
- Fixes #1162
Ensures no entities are returned when queries use $top=0
- Fixes #1428
Fixes issues querying for binary values
- Fixes #1367

May 31 '22 19:05 edwin-huber

/azp run

Jun 15 '22 07:06 blueww

Azure Pipelines successfully started running 1 pipeline(s).

Jun 15 '22 07:06 azure-pipelines[bot]

/azp run

Jun 17 '22 09:06 blueww

Azure Pipelines successfully started running 1 pipeline(s).

Jun 17 '22 09:06 azure-pipelines[bot]

@XiaoningLiu : Regarding potential breaking change for GUID type representation in the persistence layer

Having reviewed options here, IMHO it would be best to provide a script to update databases for those cases where users do not want to re-create the tables containing GUID values. By including checks for a legacy version of the data representation, we would slow performance, while at the same time reverting the change which we are making to correct the behavior of the data type.

If we consider the Edm type's representation, it will take the format below:

"guidprop":"5d62a508-f0f7-45bc-be10-4b192d7fed2d","guidprop@odata.type":"Edm.Guid"

This would be simple to find and replace by encoding the Guid string.
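As a rough illustration of what such an update could look like, here is a minimal sketch. It assumes the new representation is simply the base64 encoding of the GUID string (base64 is mentioned later in this thread), and that entity properties are available as a plain object. `upgradeGuidProperties` and `EntityProperties` are hypothetical names for illustration, not part of Azurite.

```typescript
// Hypothetical sketch: upgrade the Edm.Guid properties of one persisted entity.
// Assumption: the new format stores GUID values as base64 of the GUID string.
type EntityProperties = Record<string, unknown>;

const GUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

const TYPE_SUFFIX = "@odata.type";

function upgradeGuidProperties(properties: EntityProperties): EntityProperties {
  // Work on a shallow copy so the caller's object is left untouched.
  const upgraded: EntityProperties = { ...properties };
  for (const [key, value] of Object.entries(properties)) {
    // Look for type annotations such as "guidprop@odata.type": "Edm.Guid".
    if (!key.endsWith(TYPE_SUFFIX) || value !== "Edm.Guid") continue;
    const propName = key.slice(0, -TYPE_SUFFIX.length);
    const current = properties[propName];
    // Only rewrite values that still look like plain legacy GUID strings,
    // so running the upgrade a second time leaves the data unchanged.
    if (typeof current === "string" && GUID_PATTERN.test(current)) {
      upgraded[propName] = btoa(current);
    }
  }
  return upgraded;
}
```

A real script would also need to load and save the persisted Loki file, which is left out here.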

Would you concur that this option is preferable? If so, I would submit the script to update the Guids as part of the PR, and flag the breaking change.

Jun 28 '22 06:06 edwin-huber

Thanks Edwin for the evaluation! I can see there are several options:

1. Provide a way for customers to upgrade the legacy Loki persisted file to the new format. The new Azurite version only works with the new schema.
2. The new Azurite version only works with the new schema. (Breaking change)
3. The new Azurite version works with both the new schema and the old.
   3.1 Azurite logic detects whether the data is under the new or old schema, then chooses the behavior to handle it.
   3.2 Azurite logic always views the data as the new format, and falls back to the old schema only when an exception or error happens.

About option 1.

It's an applicable approach, used in many products, to provide a way to upgrade or migrate an old data schema to a new one. For Azurite, instead of providing a script, I'd like the migration or backward compatibility to happen without customers being aware of it and without asking them to do anything additional. For example, some customers get Azurite from the Visual Studio Code extension or the Docker image. These customers may not have the proper environment to execute the script. We could also integrate the script into Azurite and run the migration when Azurite starts the Table service, though integrating an additional migration task into Azurite startup is not easy.

I would recommend option 3, specifically, option 3.2 instead of 3.1.

Azurite logic can always view the data schema as the new format and fall back only when an exception or error happens. This way, there will be no performance impact for data in the new schema. Compared with option 1, option 3.2 should require fewer code changes.
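To make option 3.2 concrete, here is a minimal sketch of the "new format first, fall back on error" pattern. It assumes the new format is base64 of the GUID string and the legacy format is the plain GUID string. `decodeNewFormatGuid` and `readGuid` are hypothetical helpers, not existing Azurite functions.

```typescript
// Hypothetical sketch of option 3.2: treat stored data as the new format,
// and only fall back to the legacy format when decoding fails.
const GUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Assumed new format: base64 of the GUID string. Throws if the value does
// not decode to something that looks like a GUID.
function decodeNewFormatGuid(stored: string): string {
  // atob may throw on input that is not valid base64; the pattern check
  // below rejects anything that decodes to a non-GUID.
  const decoded = atob(stored);
  if (!GUID_PATTERN.test(decoded)) {
    throw new Error("value is not a base64-encoded GUID");
  }
  return decoded;
}

function readGuid(stored: string): string {
  try {
    // Fast path: data already in the new schema pays no extra cost.
    return decodeNewFormatGuid(stored);
  } catch {
    // Fallback path: accept a legacy plain-string GUID.
    if (GUID_PATTERN.test(stored)) return stored;
    throw new Error(`Unrecognized GUID representation: ${stored}`);
  }
}
```

With this shape, only legacy values take the slower exception path, which matches the performance argument above.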

What do you think?

Jul 07 '22 09:07 XiaoningLiu

I accept that for some code paths we can use the fallback logic, specifically when we expect one data type in an entity and find another.

For others, such as searching, there will not be any error: entities will simply not be found after the schema change, because data will be represented as base64 rather than as a standard string. For such cases, a "DB schema upgrade" would be the only option.
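The silent miss can be illustrated with a small sketch. It assumes an equality filter that encodes the queried GUID to base64 before comparing it with the stored value. `StoredEntity` and `findByGuid` are hypothetical and much simpler than the real query path.

```typescript
// Hypothetical illustration: an equality search that assumes the new format
// raises no error on legacy rows, it just returns nothing.
interface StoredEntity {
  rowKey: string;
  guidprop: string;
}

function findByGuid(entities: StoredEntity[], guid: string): StoredEntity[] {
  // The query value is encoded the way new-format data is assumed to be stored.
  const encoded = btoa(guid);
  return entities.filter((entity) => entity.guidprop === encoded);
}

const guid = "5d62a508-f0f7-45bc-be10-4b192d7fed2d";
const legacyRows: StoredEntity[] = [{ rowKey: "1", guidprop: guid }];
const upgradedRows: StoredEntity[] = [{ rowKey: "1", guidprop: btoa(guid) }];

console.log(findByGuid(legacyRows, guid).length); // 0: no error, just no match
console.log(findByGuid(upgradedRows, guid).length); // 1
```

Because nothing throws in the legacy case, there is no point at which a fallback could be triggered.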

Jul 07 '22 10:07 edwin-huber

I shall break out the changes to GUID behavior and storage into a separate PR, which we can target for vNext.

Jul 22 '22 11:07 edwin-huber

This PR is pending re-writing the query parser to support the backwards compatible queries.

Sep 13 '22 11:09 edwin-huber
