bcc icon indicating copy to clipboard operation
bcc copied to clipboard

Operator overloading: slices [begin .. end]

Open GWRon opened this issue 3 years ago • 3 comments

When I saw that @scaremonger used the "myTIntMap[intKey]"-array-approach I asked myself if it would be useable to have the "array slicing" being kind of useable too.

newIntMap = oldIntMap[beginKey .. endKey]
'or maybe only (which would be "more similar" to arrays
newIntMap = oldIntMap[beginIndex .. endIndex]

This way it would be really nifty to extract eg a subset from an "sparse ID map" (1,5,6,10,11,12 -> oldMap[5 .. 11] would extract 5,6,10,11) - with array like behaviour one needed to find the "index/position" of a given key first.

Of course you might be able to do similar stuff in a T(xyz)Map on your own with a custom "extractMap(condition...)" function/method.

So just see this as a suggestion / brain fart / idea. Maybe it has benefits I am not even thinking about - or culprits not solveable to easily :)

... or is "slicing" overrideable already?

GWRon avatar Oct 25 '22 08:10 GWRon

... or is "slicing" overrideable already?

It isn't.

I considered this idea before, and I think that rather than adding .. as an overloadable operator, a better solution would to add a new "range" type to the standard library, which would represent expressions of the form x..y.

In other words, BRL.Blitz could get a new struct SRange like this:

Struct SRange
	Field ReadOnly startIndex:Int
	Field ReadOnly endIndex:Int
	
	Method New(startIndex:Int, endIndex:Int)
		Self.startIndex = startIndex
		Self.endIndex = endIndex
	End Method
End Struct

and the .. operator would just become syntactic sugar to create an instance of this struct:

myNewCollection = myOldCollection[5..11]
' is now the same as:
myNewCollection = myOldCollection[New SRange(5, 11)]

Since we already have overloadable index operators, with this, any collection type could support slicing by simply adding a Method Operator []:TMyCollection(range:SRange). No special syntax for "slice overloads" would be needed. Another upside is that you could store ranges in variables, pass and return them from functions etc., like any other value. The built-in array types would be updated to make this work for arrays as well:

Local a:Int[] = [1, 2, 3, 4, 5]
Local r:SRange = 1..3
Local b:Int[] = a[r]

The built-in array types would be updated to make code like this work with arrays and collections alike.

Issues:

  • How to support open ranges (ranges with an unspecified start or end or both), like 5..? When you read the values of a range, you would need some way of knowing if the start/end index actually exists, so an Int field for each them is not enough. Ideally, you wouldn't even be able to accidentally read the Int at all in this case. Possible solutions:
    • SRange could have additional boolean fields Field ReadOnly hasStartIndex:Int and Field ReadOnly hasEndIndex:Int. This wouldn't be a very clean solution though: you would still be able to access the startIndex field even when it shouldn't exist, and using it would cause bugs if you ever forgot to check hasStartIndex first.
    • Four different structs instead of just SRange: one for x..y, x.., ..y, .. each. This seems like it would be really bad to work with though (for example, you would have to write four Method Operator [] for your collection to support them), so this is probably a bad idea.
    • If BlitzMax had a generic Nullable/Maybe/Option type, the obvious solution would be to make the start/end index fields optional: Field ReadOnly hasStartIndex:SOptional<Int>. This means that that feature would have to be implemented first though (which has its own design challenges in terms of how it would interact with BlitzMax's existing Null and type casting).
    • Add Field ReadOnly hasStartIndex:Int and Field ReadOnly hasEndIndex:Int as above, but make startIndex and endIndex private and add two public methods Method GetStartIndex:Int() and Method GetEndIndex:Int(). These would return the index if it exists, but throw an exception if it doesn't, making mistakes easier to notice.
    • Represent the start and end index as another new struct: Field ReadOnly startIndex:SIndex. This looks a bit more orderly, but only really moves the problem down into SIndex, where it would still have to be solved with one of the other approaches. (e.g.: If myRange.startIndex.exists Then DoSomethingWith(myRange.startIndex.Value()) instead of If myRange.hasStartIndex Then DoSomethingWith(myRange.GetStartIndex())
  • What about indices with a type other than Int?
    • Maybe something requires very large collections that supports Long indices. Maybe ranges based on Float or Double, or some custom (number?) type could be useful somewhere. Considering that, should the struct actually be SRange<T> rather than SRange?

HurryStarfish avatar Oct 25 '22 14:10 HurryStarfish

Local a:Int[] = [1, 2, 3, 4, 5]
Local r:SRange = 1..3
Local b:Int[] = a[r]

"1..3" -- think this sooner or later might clash with ".." line concatenation if you allowed Local r:SRange = 1.. ("open end").

Regarding your proposed solutions: I consider the first option

Field ReadOnly hasStartIndex:Int and Field ReadOnly hasEndIndex:Int

as the "most simple" one - and also something we Blitzmax users are "used to" (we are used to primitives having a default "null" value and no real "None/Null")

GWRon avatar Oct 25 '22 15:10 GWRon

think this sooner or later might clash with ".." line concatenation

I can't really think of any realistic situation where this would result in code that still compiles (and does the wrong thing), so you'd at least notice it from getting a compile error... but yes, it might. In that case, you would have to write Local r:SRange = (1..) instead.

I consider the first option as the "most simple" one

It would be the simplest to implement right now, that's true. Personally, my favourite would be the third option (SOptional). Unfortunately, that would require such a feature first. But on the other hand, it would be a really useful feature to have in general.

Oh yeah, yet another option I forgot mention is the way it's done in .NET (this was implemented after I first considered this for BlitzMax). Their solution:

  • Range and Index structs
  • Index has two properties:
    • Value, which is an Int32
    • IsFromEnd, which is a Boolean

This means they don't use nullable/optional, and they have the additional feature of "counting from the end": a[...^3] in C# means "give me a slice of everything except the last 3 elements". But in exchange, it has the disadvantage that it isn't possible to distinguish [..x] from [0..x] (so collections always have to start at 0), and it only supports Int32.

HurryStarfish avatar Oct 25 '22 16:10 HurryStarfish