Types Metadata

Open norberttech opened this issue 11 months ago • 3 comments

Sometimes, when converting from one data format to another, some metadata is missing. For example, when converting a PHP int into a Parquet int, we can't say for sure whether it should be INT32 or INT96.

The same applies to databases: we never know the maximum length of a string column.

The solution to that problem would be to let each Adapter provide and handle custom schema metadata that can be used to improve transformation precision.

For example, the Doctrine Adapter could expose metadata like:


class DoctrineMetadata
{
    public const STRING_LENGTH = 'string_length';
}

$schema = schema(
    str_schema('name', nullable: true, metadata: Metadata::fromArray([DoctrineMetadata::STRING_LENGTH => 255])),
);

Of course, we would first need to build a mechanism that converts schemas from Flow into Doctrine DBAL to support that metadata fully.

But it's a generic concept that could also be applied elsewhere, for example in Parquet or in #1353.
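To make the idea concrete, here is a minimal sketch of how an adapter might consume a `string_length` hint when it builds a column definition. It is plain PHP and does not use Flow: the `ColumnMetadata` class and the `sqlStringType()` helper are hypothetical names invented for this illustration, and the `VARCHAR`/`TEXT` fallback is an assumption about what a database adapter would want.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a schema metadata bag; not Flow's actual Metadata class.
final class ColumnMetadata
{
    /** @param array<string, mixed> $values */
    public function __construct(private array $values = [])
    {
    }

    public function get(string $key, mixed $default = null): mixed
    {
        return $this->values[$key] ?? $default;
    }
}

// Hypothetical helper: choose a SQL type for a string column.
// Without a usable length hint the adapter falls back to an unbounded type.
function sqlStringType(ColumnMetadata $metadata): string
{
    $length = $metadata->get('string_length');

    return \is_int($length) && $length > 0
        ? \sprintf('VARCHAR(%d)', $length)
        : 'TEXT';
}

echo sqlStringType(new ColumnMetadata(['string_length' => 255])), PHP_EOL; // VARCHAR(255)
echo sqlStringType(new ColumnMetadata()), PHP_EOL;                         // TEXT
```

The point of the sketch is the fallback: metadata stays optional, so schemas that carry no hint still convert, only less precisely.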

norberttech • Jan 17 '25 17:01

On top of Adapter-specific metadata, we could also try to define some generic metadata provided by Core, like:

  • STRING_LENGTH - int<1,max>
  • INT_SIZE - enum (32/64)
  • DESCRIPTION - text - could be used by adapters to add extra description to columns

and then more specific ones, like those provided by the Doctrine Adapter:

  • DB_PRIMARY_KEY - boolean
  • DB_INDEX - boolean
  • DB_UNIQUE_INDEX - boolean
  • DB_TYPE - string
  • DB_TYPE_PLATFORM_OPTIONS - array
  • DB_COLUMN_DEFINITION - string

norberttech • Jan 18 '25 01:01

Other generic metadata could be focused on data purpose, like:

  • PII - bool - defines whether the column carries PII data
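As an illustration of what a purpose-oriented flag could enable, the sketch below masks every column marked as PII before a row is written anywhere. The `maskPii()` helper, the `pii` key name, and the `'***'` placeholder are all assumptions made up for this example; it uses plain arrays rather than Flow rows or schemas.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: replace values of columns flagged as PII.
 *
 * @param array<string, mixed> $row column name => value
 * @param array<string, array<string, mixed>> $metadata column name => metadata entries
 * @return array<string, mixed>
 */
function maskPii(array $row, array $metadata): array
{
    foreach ($row as $column => $value) {
        // Columns without metadata, or without the flag, are left untouched.
        if (($metadata[$column]['pii'] ?? false) === true) {
            $row[$column] = '***';
        }
    }

    return $row;
}

$masked = maskPii(
    ['id' => 1, 'email' => 'user@example.com'],
    ['email' => ['pii' => true]],
);
```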

norberttech • Jan 18 '25 01:01

Partially implemented in #1429, just for the Doctrine Adapter.

norberttech • Feb 01 '25 16:02