ballerina-lang icon indicating copy to clipboard operation
ballerina-lang copied to clipboard

[Task]: Support `key-constraint` in table semtype

Open lochana-chathura opened this issue 1 year ago • 7 comments

Description

$subject.

In the current nBallerina table semtype implementation key-constraint is not supported. Ref: https://github.com/ballerina-platform/ballerina-lang/pull/42639#discussion_r1582795300

lochana-chathura avatar Apr 29 '24 18:04 lochana-chathura

  • KS - Key Specifier
  • KC - Key Constraint
  • KST - KeySemtype

In general, table<C1> key(KS1)/key<KC1> <: table<C2> key(KS2)/key<KC2> iff table<C1> <: table<C2> & KeySemType1 <: KeySemType2

Keyless table table<C> can be modeled as C[]. Thefore the keyed-table can be modeled using a tuple [C[], KST]

How do we model KST?

Basically, we encounter 4 scenarios when performing subtype checking.

Sincetable<C> key(KS)/key<KC> <: table<C>, in all cases, KS = undef and KC=never/undef, KST has to be modeled as ANYDATA(key's top type) or ANY(simpler).

Case 1: key(f'1, f'2, ...) <: key(f"1, f"2, ...) iff KST1 <: KST2

where KST is either record {|T1 f1; T2 f2; ...; any...;|} or [[T1, "f1"], [T2, "f2"], ..., [any, string]...]

Case 2: key<KC1> <: key<KC2> iff KST1 <: KST2

where KST = KC. This is straightforward.

Case 3: key(f1, f2, ...) <: key<KC> iff KST1 <: KST2

KST1 is either record {|T1 f1; T2 f2; ... any...;|} or [[T1, "f1"], [T2, "f2"], ..., [any, string]...]

KST2? To hold the subtyping KC has to be a tuple where [T1, T2, ...,] is a subtype. So, if KC is a tuple. KST2 can be modeled as [[T'1, any], [T'2, any], ..., [any, any]...], else KST = KC

Case 4: key<KC> <: key(f1, f2, ..) iff KST1 <: KST2

KST2 is either record {|T1 f1; T2 f2; ... any...;|} or [[T1, "f1"], [T2, "f2"], ..., [any, string]...]

KST1? If KC is a tuple it is modeled as [[T1, "f1"], [T2, "f2"], ..., [any, string]...], else KST1=KC

lochana-chathura avatar Jun 27 '24 07:06 lochana-chathura

// CASE 1: 

type Employee record {|
    readonly string name;
    readonly string department;
    int salary;
|};

type Employee2 record {|
    readonly [string, string] name;
    int salary;
|};

public function case1() {
    table<Employee> key<[string]> t1 = table key(name)[
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "",salary: 200}
    ]; // ALLOWED

    table<Employee2> key<[string, string]> t3 = table key(name)[
        {name: ["John", ""], salary: 100},
        {name: ["Jane", ""],salary: 200}
    ]; // NOT ALLOWED.

    table<Employee2> key<[[string, string]]> t2 = table key(name)[
        {name: ["John", ""], salary: 100},
        {name: ["Jane", ""],salary: 200}
    ]; // ALLOWED.
}

// Conclusion: KC being a tuple has special meaning. Tuple fixed member size = no of items in composite key

lochana-chathura avatar Jul 02 '24 08:07 lochana-chathura

// CASE 2:

type Employee record {|
    readonly string name;
    readonly string department;
    int salary;
|};

public function case2() {
    table<Employee> key<[string]> a = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "",salary: 200}
    ];

    table<Employee> key<string> a1 = a; // NOT ALLOWED. BUG?

    table<Employee> key<string> b = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "",salary: 200}
    ];

    table<Employee> key<[string]> b1 = b; // NOT ALLOWED. BUG?
}

lochana-chathura avatar Jul 02 '24 08:07 lochana-chathura

// CASE 3:

type Employee record {|
    readonly string name;
    readonly string department;
    int salary;
|};

public function case3() {
    table<Employee> key(name, department) a = table key(name, department) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key(name) a1 = a; // NOT ALLOWED. BUG?

    table<Employee> key<[string, string]> b = table key(name, department) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key<string> b1 = b; // NOT ALLOWED. BUG? 
    table<Employee> key<[string]> b2 = b; // NOT ALLOWED. BUG? 

    table<Employee> key(name) c = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key(name, department) c1 = c; // NOT ALLOWED. OK.

    table<Employee> key<[string]> d = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key<[string, string]> d1 = d; // NOT ALLOWED. OK.
}

lochana-chathura avatar Jul 02 '24 08:07 lochana-chathura

CASE III is not a BUG. e.g.

table<Employee> key(name, department) a = table key(name, department) [
    {name: "John", department: "IT", salary: 100},
    {name: "John", department: "finance", salary: 200}
];

This being assignable to table<Employee> key(name) would not make the name a unique key.

lochana-chathura avatar Jul 02 '24 11:07 lochana-chathura

CASE III is not a BUG. e.g.

Based on the above, the new model would be,

  • KS - Key Specifier
  • KC - Key Constraint
  • KST - KeySemtype

In general, table<C1> key(KS1)/key<KC1> <: table<C2> key(KS2)/key<KC2> iff table<C1> <: table<C2> & KeySemType1 <: KeySemType2

Keyless table table<C> can be modeled as C[]. Thefore the keyed-table can be modeled using a tuple [C[], KST]

How do we model KST?

Basically, we encounter 4 scenarios when performing subtype checking.

Sincetable<C> key(KS)/key<KC> <: table<C>, in all cases, KS = undef and KC=never/undef, KST has to be modeled as ANYDATA(key's top type) or ANY(simpler).

Case 1: key(f'1, f'2, ...) <: key(f"1, f"2, ...) iff KST1 <: KST2

where KST is either record {|T1 f1; T2 f2; ...;|} or [[T1, "f1"], [T2, "f2"], ...]

Case 2: key<KC1> <: key<KC2> iff KST1 <: KST2

where KST = KC. This is straightforward.

Case 3: key(f1, f2, ...) <: key<KC> iff KST1 <: KST2

KST1 is either record {|T1 f1; T2 f2; ...; |} or [[T1, "f1"], [T2, "f2"], ...]

KST2?. If KC is a tuple. KST2 can be modeled as [[T'1, any], [T'2, any], ...], else KST = KC

Case 4: key<KC> <: key(f1, f2, ..) iff KST1 <: KST2

KST2 is either record {|T1 f1; T2 f2; ...;|} or [[T1, "f1"], [T2, "f2"], ...]

KST1? If KC is a tuple it is modeled as [[T'1, "f1"], [T'2, "f2"], ....], else KST1=KC

lochana-chathura avatar Jul 02 '24 12:07 lochana-chathura

Documented summary can be found here.

lochana-chathura avatar Jul 02 '24 14:07 lochana-chathura

Based on ballerina-platform/ballerina-spec#1310, revised table semtype model can be found here.

lochana-chathura avatar Jul 16 '24 08:07 lochana-chathura

Closing with #43092

lochana-chathura avatar Aug 07 '24 10:08 lochana-chathura