haxe icon indicating copy to clipboard operation
haxe copied to clipboard

Array<T> variance on static targets

Open Simn opened this issue 9 years ago • 66 comments

To avoid fragmentation I'm consolidating multiple issues here that likely have the same cause and solution: #2103 #2579 #3429 #4187 #4340 #4424 #4783

Edit NC, added #5799

Simn avatar Feb 23 '16 11:02 Simn

I have studied quite in detail the problem of variance with Arrays, and the particular case of Array<Dynamic>, as part of my work on HL.

I came up with the following design that I would like to be generalized on all "static" targets that have type-specific arrays (usually for basic types such as Array<Float>), so that would concern C++,C#,Java, and HL.

I define several Array classes implementations:

  • ArrayAccess (not the haxe one) is the base class for all arrays. It only has two methods : getDyn() and setDyn() operating on dynamics. Not even length (we will come to that afterwards). It used when doing array accesses on a Dynamic value.
  • ArrayBase is the base class for all arrays implementations EXCEPT ArrayDyn. Actually ArrayDyn will just provide a wrapper for an actual ArrayBase implementation. ArrayBase provides more operations such as pushDyn(), slice, etc. that can be called by ArrayDyn
  • Each specific Array then implements ArrayBase. In HL I have ArrayBasic<Float>, ArrayBasic<Int> which uses @:generic, then ArrayObj which contains any kind of "pointer" object. They implement both the "classic" Haxe Array API and some extra *Dyn() functions that comes from ArrayBase
  • Finally, ArrayDyn is the haxe Array<Dynamic> implementation, which operates on an underlying ArrayBase.

The trick is the following:

When creating a new ArrayDyn, we have an extra flag in it which is named allowReinterpret, which means that when we cast such ArrayDyn to an Array<Int> for instance, we will first reinterpret the underlying ArrayObj to an ArrayBasic<Int> before returning it, then set allowReinterpret to false, which means that after the first cast to a known type, the Array type becomes fixed, which prevent copying the whole array on each of such casts.

I think this cover all the problematic cases (JSON, Serialization, etc.)

This requires two kind of runtime support/emulation:

  • casting between Array types needs to be implemented/detected. I'm doing that in a dynamic fashion in HL by having an extra __cast method
  • accessing ArrayDyn.length needs to return the length of the underlying Array, even when typed as Dynamic. I'm implementing it using __get_field in HL. That's the reason length is not in ArrayAccess.

The full source code for HL solution is available here: https://github.com/HaxeFoundation/haxe/tree/hl/std/hl/types (ArrayBase, ArrayObj, ArrayDyn)

I would like to hear about @hughsando and @waneck comment on this, then if it's fine with both of you, I can write unit tests for it so we make sure that this behavior is correctly standardized on all static platforms.

ncannasse avatar Feb 23 '16 11:02 ncannasse

PS: one last thing I haven't yet worked on is referential equality : we also want our ArrayDyn == its underlying ArrayBase.

ncannasse avatar Feb 23 '16 11:02 ncannasse

I have implemented a similar thing with "VirtualArray" The array starts off unknown, and then as you add stuff to it, the underlying representation changes until you ultimately cast it, and then it becomes fixed. So if you add a bunch of ints, it is an int array. If you then add a float, it becomes a float array. If you add a null, it becomes a Dynamic array. This should help by avoiding boxing in a lot of cases. You can try this in hxcpp target by swapping the commenting of these likes: "cpp::ArrayBase" ^ suffix (* "cpp::VirtualArray" ^ suffix *) (side note - can have a global variable to switch this in ocaml, or do I need to pass a context everywhere?)

It seems to work ok, but I have not got it working in cppia.

On Tue, Feb 23, 2016 at 7:59 PM, Nicolas Cannasse [email protected] wrote:

PS: one last thing I haven't yet worked on is referential equality : we also want our ArrayDyn == its underlying Array.

— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-187671417 .

hughsando avatar Feb 23 '16 12:02 hughsando

@hughsando the problem with this approach is that your array keep changing its underlying type, for instance if you add two floats, then cast to Array<Int>, then add floats again, what is the behavior ? My proposal is that the array is either in an "unkown state" (dynamic) or in a known state, and once it is it cannot change anymore. And since the first cast to an not-Array-Dynamic will make it a known state, it guarantee that you never happen to have several copies of the underlying array which does not preserve side effects.

ncannasse avatar Feb 23 '16 20:02 ncannasse

The casting only ever happens in one direction: empty -> Int -> Float -> Dynamic (maybe skip Float or Int), or empty -> String -> Dynamic or empty -> Bool -> Dynamic. or just empty -> Dynamic. As soon as you cast, it locks it in forever (like your solution). The good thing about this is that the final cast (eg, to Array<Int>) is usually a no-op if you have only added ints - ie, zero changes. If you add a whole lot of ints, a single float and then an object, you could get 2 transitions, where perhaps none was required. But it is more likely to be a homogeneous array, so I would think that this rarer than the "not need to do anything" case. Adding two floats and then casting to Ints shows a breakdown of the compiler logic so throwing an error might be appropriate, but currently it will perform Float->Int and lock it in.

On Wed, Feb 24, 2016 at 4:01 AM, Nicolas Cannasse [email protected] wrote:

@hughsando https://github.com/hughsando the problem with this approach is that your array keep changing its underlying type, for instance if you add two floats, then cast to Array<Int>, then add floats again, what is the behavior ? My proposal is that the array is either in an "unkown state" (dynamic) or in a known state, and once it is it cannot change anymore. And since the first cast to an not-Array-Dynamic will make it a known state, it guarantee that you never happen to have several copies of the underlying array which does not preserve side effects.

— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-187878494 .

hughsando avatar Feb 24 '16 01:02 hughsando

Ok, that seems good to me, I could indeed optimize HL this way as well, it's not something really detectable by the enduser anyway.

Any reason it has not been actived by default in hxcpp yet ?

Also, still waiting to hear from @waneck :)

ncannasse avatar Feb 24 '16 12:02 ncannasse

Testing and cppia mainly - I wrote it, but did not have time to put out fires if it went wrong. Now I've moved over to "haxe mode", I will try to git it in in the next week or two.

Hugh

On Wed, Feb 24, 2016 at 8:49 PM, Nicolas Cannasse [email protected] wrote:

Ok, that seems good to me, I could indeed optimize HL this way as well, it's not something really detectable by the enduser anyway.

Any reason it has not been actived by default in hxcpp yet ?

Also, still waiting to hear from @waneck https://github.com/waneck :)

— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-188240459 .

hughsando avatar Feb 25 '16 01:02 hughsando

I just wanted to start adding tests for this but didn't get very far:

class Main {
    static function main() {
        var a:Array<Dynamic> = [1, 2, 3];
        trace("I'm just here to prevent the analyzer fusion");
        var b:Array<Int> = cast a;
        trace("Yeah me too");
        trace(a == b); // false
    }
}

@hughsando: Wasn't this supposed to work now?

Simn avatar Mar 29 '16 08:03 Simn

Looks like it is an issue with "op==". If you do: b.push(4); trace(a); you get 1,2,3,4

hughsando avatar Mar 29 '16 08:03 hughsando

I added the operators to hxcpp.

hughsando avatar Mar 29 '16 12:03 hughsando

That helped with the direct ==, but it still fails if we hide the arrays behind type parameters:

class Main {
    static function main() {
        var a:Array<Dynamic> = [1, 2, 3];
        trace("I'm just here to prevent the analyzer fusion");
        var b:Array<Int> = cast a;
        trace("Yeah me too");
        eq(a, b);
    }

    static function eq<T>(e1:T, e2:T) trace(e1 == e2); // false
}

Simn avatar Mar 29 '16 12:03 Simn

P.S @ncannasse: That fails on HL too, even in the interpreted version.

Simn avatar Mar 29 '16 12:03 Simn

There's not much of a point in me adding a test that fails on multiple targets. I'll assign this issue to the 3.3.0 final milestone so we can see what we want to do here after the RC.

Simn avatar Apr 02 '16 15:04 Simn

@Simn which targets apart hxcpp are concerned ? Java and C# I presume ?

@hughsando could you look into it for RC? I think this kind of non-equality could break people code when trying RC.

ncannasse avatar Apr 02 '16 16:04 ncannasse

I think hxcpp is fine now actually, so it's probably Java, C# and HL.

Simn avatar Apr 02 '16 16:04 Simn

Please submit the unit tests with #if !(hl || java || cs) for the parts that fail for now, and open a separate issue for these.

ncannasse avatar Apr 02 '16 18:04 ncannasse

I don't like these #if tests, it's too easy to forget about them. I've added it without the conditional compilation to a separate branch so we can forget about that instead.

Simn avatar Apr 02 '16 19:04 Simn

As expected the test fails on C# and HL: https://travis-ci.org/Simn/haxe/builds/120320083

Simn avatar Apr 02 '16 20:04 Simn

I'm assigning to @waneck because HL is not ready for 3.3 anyway

ncannasse avatar Apr 03 '16 13:04 ncannasse

I won't be able to get this done for 3.3. Pushing to 3.4

waneck avatar Apr 04 '16 00:04 waneck

Moved from https://github.com/HaxeFoundation/haxe/issues/5467 .

It looks like C# implementation copying all fields of original Array. (Using reflection for some reason). Doesn't that cause more issues? That's slow, and newly created array no longer refers to the same array. I don't want Haxe to copy Array in that case.

vroad avatar Jul 18 '16 12:07 vroad

Yes, this is #2103.

nadako avatar Jul 18 '16 12:07 nadako

Is reflection always needed to achieve this? I don't know where reflection calls came from. Are they automatically generated by Haxe?

vroad avatar Jul 18 '16 13:07 vroad

I think hxcpp is fine now actually, so it's probably Java, C# and HL.

@Simn Both of the test cases you posted here still trace false for me on hxcpp (Haxe 3.4.0 and hxcpp 3.4.49). Or was it fixed at some point and then broken again?

Gama11 avatar Feb 01 '17 11:02 Gama11

I have checked hxcpp again and both tests work there now. I also made sure that the analyzer doesn't inadvertently fix it.

I didn't test HL because all I ever get out of that these days is "Unsupported bytecode version". It's definitely still broken on Java and C# though.

Simn avatar May 03 '18 10:05 Simn

My CI says this fails on C# and PHP of all things.

Simn avatar May 03 '18 11:05 Simn

I confirmed HL to still be failing here as well.

Simn avatar May 03 '18 11:05 Simn

I have fixed HL support for this specific test in be1a102a910ec8446dce8c325dc23aeb9a768033

ncannasse avatar May 08 '18 14:05 ncannasse

This might break some runtime wrt Array<Dynamic> handlings, we need proper unit testing for Haxe 4.0, we will discuss if we can make java/c# compliant in time but we need to pass the tests on other platforms.

ncannasse avatar May 08 '18 20:05 ncannasse

@RealyUniqueName Could you look at the PHP failure?

class Main {
	@:analyzer(ignore)
	static public function main() {
		var a:Array<Dynamic> = [1, 2, 3];
		var b:Array<Int> = cast a;
		b.push(4);
		trace(a[3]); // null
	}
}

Note that the @:analyzer(ignore) is required to reproduce this.

Simn avatar Jun 05 '18 17:06 Simn

Java shouldn't have any issue like that.

As for C#, my real issue with implementing this is how should we deal with user-created type parameters? The possible solutions I can think are:

  1. We don't generate type parameters for hxGen C# types (e.g. make erase_generics the default)
  2. We keep our old behavior, so that Array is a special case
  3. We try to generalize the way we handle Array so it works for any generic class

I can't think of a way to generalize the Array handling, but I'd be interested to hear ideas. As for between 1 and 2, I'm also not sure what's best. Does anyone have strong opinions regarding them?

waneck avatar Jun 05 '18 17:06 waneck

What problems does erase_generics cause?

Simn avatar Jun 05 '18 17:06 Simn

I think it's mostly about performance and generated code compatibility with native C# code

waneck avatar Jun 05 '18 18:06 waneck

Arrays are special - in particular Array<Dynamic> is the one thing the the whole haxe type system that is inconsistent. I think if the above was "var a:Array = [1,2,3]" it would make everything consistent and logical for developers and backend writers Or maybe "IArray" interface. The solution hxcpp (and hl?) uses is to use an extra level of indirection for Array<Dynamic> - this punishes Arrays-of-actual-Dynamic because there is not way to tell it apart from the "magical array that can be an array-of-int" shown here.

On Wed, Jun 6, 2018 at 1:36 AM, Cauê Waneck [email protected] wrote:

Java shouldn't have any issue like that.

As for C#, my real issue with implementing this is how should we deal with user-created type parameters? The possible solutions I can think are:

  1. We don't generate type parameters for hxGen C# types (e.g. make erase_generics the default)
  2. We keep our old behavior, so that Array is a special case
  3. We try to generalize the way we handle Array so it works for any generic class

I can't think of a way to generalize the Array handling, but I'd be interested to hear ideas. As for between 1 and 2, I'm also not sure what's best. Does anyone have strong opinions regarding them?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-394796613, or mute the thread https://github.com/notifications/unsubscribe-auth/ABlp1lM3ircLKQDruzIvat-8RN-xTQjzks5t5sG0gaJpZM4Hgg8I .

hughsando avatar Jun 06 '18 00:06 hughsando

@RealyUniqueName Could you look at the PHP failure?

That's because Array.push() is inlined in std of php backend. That push is generated as:

$b = $a;
$this1 = $b->arr;
$idx = $b->length;
$this1[$idx] = 4;
++$b->length;

And $b->arr is a native PHP array. And native arrays are passed by copy, not by reference. I think I can fix this in std without removing inline. But this is not ok to automatically introduce temp vars for accessing object fields. I guess it also affects native structs on C# target.

RealyUniqueName avatar Jun 06 '18 06:06 RealyUniqueName

I've updated the branch again:

  • PHP: src/unit/TestArray.hx:12: 4 should be null
  • HL: src/unit/TestArray.hx:9: characters 5-11 : Don't know how to compare #hl.types.ArrayDyn and #hl.types.ArrayBytes_Int
  • C#: src/unit/TestArray.hx:9: false should be true src/unit/TestArray.hx:10: [1,2,3] should be [1,2,3,4] src/unit/TestArray.hx:11: 4 should be 3 src/unit/TestArray.hx:12: 4 should be 0

Simn avatar Sep 04 '18 16:09 Simn

Can you please point me to the branch? Can't find it )

RealyUniqueName avatar Sep 04 '18 18:09 RealyUniqueName

Sure, I keep these branches on my own GitHub because that tends to be better for CI: https://github.com/Simn/haxe/tree/array_variance_tests

Simn avatar Sep 04 '18 18:09 Simn

PHP fails because of the same problem described in my previous comment: https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-394954238

RealyUniqueName avatar Sep 05 '18 08:09 RealyUniqueName

Yes but is that for me or for you to fix? If it's a problem with temp vars we have to check if it can be reproduced manually (without inline).

Simn avatar Sep 05 '18 08:09 Simn

I think it's for you to fix. Because those temp vars will case the same problem for any field of non-scalar value-type.

RealyUniqueName avatar Sep 05 '18 09:09 RealyUniqueName

Then we have to mark the affected types with some metadata. Though I don't really like having to support copy-semantics just for this.

Note that this example also doesn't behave like the syntax would suggest:

class Main {
	@:analyzer(ignore)
	static public function main() {
		var a = [1, 2, 3];
		var b = @:privateAccess a.arr;
		b[3] = 4;
		trace(a[3]); // null
	}
}

Simn avatar Sep 05 '18 09:09 Simn

Note that this example also doesn't behave like the syntax would suggest

This is a defined behavior for php.NativeArray so it's up to the user to decide what to do with it. On the other hand inst.inlinedMethod() temp vars generation is out of the user's control, so it should behave the same way with and without inlining.

RealyUniqueName avatar Sep 05 '18 10:09 RealyUniqueName

This would mean that for certain types we never create a temp var. This is quite problematic for various cases I can think of.

IMO this has to be approached differently, e.g. by not using a field access expression on the native array type.

Simn avatar Sep 09 '18 19:09 Simn

Moved to 4.1 because it's a rather involved project.

Simn avatar Dec 01 '18 13:12 Simn

Any motion on this for the exciting 4.1 release? I'm hoping https://github.com/HaxeFoundation/haxe/issues/8314 gets cleared.

hoseyjoe avatar Feb 18 '20 00:02 hoseyjoe

This won't be resolved for 4.1. Which target are you concerned about?

Simn avatar Feb 18 '20 08:02 Simn

C# We started using haxe 4 in preview 3. We liked the syntax so we started incorporating it as we weren't going to release for a year. We figured nothing major would change. Now our Server side (C#) is compiling out of rc2 and our client side (JS) is compiling out of 4.0.5

hoseyjoe avatar Feb 19 '20 02:02 hoseyjoe

@kLabz What do you think about this? Any workaround you can recommend for #8314? I really don't want people to be stuck on RC versions.

Simn avatar Feb 19 '20 06:02 Simn

I don't think there's much that can be done except fix this issue.

The same kind of functionalities can be achieved, using c# utils (so not in a cross-target way), and without Array<T> (so it requires some rewrite):

import cs.system.collections.generic.List_1;
import newtonsoft.json.JsonConvert;

@:publicFields
class DataItem {
	var data:String;
	var strArray:List_1<String>;
	var intArray:List_1<Int>;
	var dynArray:List_1<Dynamic>;
	var anyArray:List_1<Any>;
}

class Main {
	public static function main():Void {
		var t:DataItem = JsonConvert.DeserializeObject(
			haxe.Json.stringify({data:"d", strArray:[], intArray: [], dynArray: [], anyArray: []})
		);

		var strArray = t.strArray;
		strArray.Add("via reference");
		t.strArray.Add("via getField");
		trace(strArray.Count); // 2
		trace(t.strArray.Count); // 2

		var strArray = t.strArray;
		strArray.Add("via reference");
		t.strArray.Add("via getField");
		trace(strArray.Count); // 4
		trace(t.strArray.Count); // 4

		var intArray = t.intArray;
		intArray.Add(0);
		t.intArray.Add(1);
		trace(intArray.Count); // 2
		trace(t.intArray.Count); // 2

		var dynArray = t.dynArray;
		dynArray.Add(0);
		dynArray.Add("test");
		trace(dynArray.Count); // 2
		trace(t.dynArray.Count); // 2

		var anyArray = t.anyArray;
		anyArray.Add(0);
		anyArray.Add("test");
		trace(anyArray.Count); // 2
		trace(t.anyArray.Count); // 2
	}
}

kLabz avatar Feb 19 '20 09:02 kLabz

@Simn Thanks. We are stable-ish. I assume it is a mighty pain in the arse issue since it has been lingering/hidden for 4 years.

hoseyjoe avatar Feb 27 '20 19:02 hoseyjoe

@Simn Guessing 4.2 is out?

hoseyjoe avatar May 28 '20 21:05 hoseyjoe

No.

Gama11 avatar May 29 '20 07:05 Gama11

Any magic moving on this pipe of love?

hoseyjoe avatar Dec 04 '20 21:12 hoseyjoe

Quarterly ping! Let me know how I can help if possible

hoseyjoe avatar Mar 31 '21 22:03 hoseyjoe

AFAIK nothing new here

RealyUniqueName avatar Apr 01 '21 18:04 RealyUniqueName

Just checking in on this

hoseyjoe avatar Oct 13 '21 03:10 hoseyjoe

Keeping the dream alive

hoseyjoe avatar Apr 16 '22 16:04 hoseyjoe

At this point I can say that unfortunately this won't be addressed for the C# target.

Simn avatar Apr 16 '22 16:04 Simn

Thank you @simn I appreciate the message. It seems like a big hole but apparently very difficult.

hoseyjoe avatar Apr 29 '22 16:04 hoseyjoe