Cursively
Read sync input as IEnumerable<ImmutableArray<string>>
On CsvSyncInputBase:
public IEnumerable<ImmutableArray<string>> AsEnumerableFromUTF8()
{
    return AsEnumerableFromUTF8(UTF8FieldDecodingParameters.Default); // CsvReaderVisitorWithUTF8HeadersBase.DefaultMaxHeaderLength
}

public virtual IEnumerable<ImmutableArray<string>> AsEnumerableFromUTF8(UTF8FieldDecodingParameters fieldDecodingParameters)
{
    // ...
}
Base implementation probably needs to spin up a thread so that the visitor methods can block while our consumer processes the results.
That said, I think every subclass can do a lot better by reimplementing their input reading logic.
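The thread-based base implementation described above could look roughly like the following. This is only a sketch: `PushToPullSketch`, `AsEnumerable`, and the `produce` callback are hypothetical stand-ins for the real visitor plumbing, and a bounded `BlockingCollection<T>` is just one way to make the producer block while the consumer works.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Threading;
using System.Threading.Tasks;

public static class PushToPullSketch
{
    // Hypothetical helper: "produce" stands in for the push-style parse and
    // invokes the supplied callback once per completed record.
    public static IEnumerable<ImmutableArray<string>> AsEnumerable(
        Action<Action<ImmutableArray<string>>> produce)
    {
        using var cts = new CancellationTokenSource();

        // Capacity 1: Add blocks until the consumer has taken the previous record.
        using var queue = new BlockingCollection<ImmutableArray<string>>(boundedCapacity: 1);

        Task producer = Task.Factory.StartNew(() =>
        {
            try { produce(record => queue.Add(record, cts.Token)); }
            finally { queue.CompleteAdding(); }
        }, TaskCreationOptions.LongRunning);

        try
        {
            foreach (ImmutableArray<string> record in queue.GetConsumingEnumerable())
            {
                yield return record;
            }

            // Surface any parse failure on the consumer's thread.
            producer.GetAwaiter().GetResult();
        }
        finally
        {
            // If the consumer stopped early, unblock the producer and wait
            // for it to exit before the queue is disposed.
            cts.Cancel();
            try { producer.Wait(); } catch (AggregateException) { }
        }
    }
}
```

The cost is a dedicated thread plus a handoff per record, which is why a subclass that controls its own input reading can probably do better.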
Hmm, a more flexible API would be to have just one virtual method that returns something that can be configured like how the input classes work.
I kinda want to limit the default max field length and let callers control it.
Then again, I've come down elsewhere on the side of, "if you have some encoding other than UTF-8, then most helpers won't work for you". So only one parameter is needed.
Ugh. The ideal would be IEnumerable<IEnumerable<ReadOnlySpan<char>>> so that the consumer can choose whether or not to do a .ToString() themselves, at no meaningful extra cost on top of us allocating the string.
But of course, ReadOnlySpan<char> is a ref struct, so I can't do exactly that. ReadOnlyMemory<char> can get captured and stored, so I can't safely pool allocations there.
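To make the ReadOnlyMemory<char> problem concrete, here is a small sketch. A single reusable array stands in for a pooled allocation, and `PooledMemoryHazard` and `Demonstrate` are hypothetical names used only for illustration.

```csharp
using System;

public static class PooledMemoryHazard
{
    public static (string Before, string After) Demonstrate()
    {
        // One reusable buffer stands in for a pooled allocation.
        char[] buffer = new char[3];

        // The producer writes a field and hands out a view over the buffer.
        "abc".AsSpan().CopyTo(buffer);
        ReadOnlyMemory<char> captured = buffer.AsMemory(0, 3);
        string before = captured.ToString();

        // The producer reuses the same buffer for the next field, while the
        // consumer is still holding on to the earlier view.
        "xyz".AsSpan().CopyTo(buffer);
        string after = captured.ToString();

        return (before, after);
    }
}
```

`Before` is "abc" and `After` is "xyz": the stored view silently changes once the buffer is reused. Nothing stops a consumer from storing a ReadOnlyMemory<char> in a field or a list, so pooling behind it would be unsafe.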
Custom ref structs with GetEnumerator() can't be used with LINQ, and this feature comes down on the "productivity" side of things rather than the "bleeding-edge performance" side of things.
And IEnumerable<IEnumerable<string>> is slightly less productive than IEnumerable<string[]>, but the latter suggests a parameter to control the max number of fields in a record.
Ugh.
"So only one parameter is needed."
Ugh. Of course that's not true: we also have DecoderFallback.
Ugh.
IEnumerable<ImmutableArray<string>> might make more sense than IEnumerable<IEnumerable<string>>?
Someone might pull the next element from the outer sequence before fully consuming the previous element from the inner sequence (which is really easy to do with .AsParallel()), so the latter approach would require the outer enumerable's state machine to ensure that the current inner enumerable's records get buffered before moving on.
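A simplified model of that hazard, for illustration only: `InnerSequenceSketch`, `Lazy`, and `Eager` are hypothetical names, and a jagged array stands in for the parser. The lazy variant has no buffering, so each inner sequence reads whatever the shared "current record" slot holds at the moment it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

public static class InnerSequenceSketch
{
    // Lazy variant with no buffering: inner sequences share one slot.
    public static IEnumerable<IEnumerable<string>> Lazy(string[][] records)
    {
        string[] current = Array.Empty<string>();

        IEnumerable<string> Inner()
        {
            // Reads the slot when enumerated, not when the sequence was handed out.
            foreach (string field in current)
            {
                yield return field;
            }
        }

        foreach (string[] record in records)
        {
            current = record;
            yield return Inner();
        }
    }

    // Eager variant: each record is copied into its own immutable snapshot.
    public static IEnumerable<ImmutableArray<string>> Eager(string[][] records)
    {
        foreach (string[] record in records)
        {
            yield return ImmutableArray.Create(record);
        }
    }

    public static (string Lazy, string Eager) FirstRecordAfterOuterToList()
    {
        string[][] data = { new[] { "a", "b" }, new[] { "c", "d" } };

        // Consume the whole outer sequence before touching any inner one.
        List<IEnumerable<string>> lazy = Lazy(data).ToList();
        List<ImmutableArray<string>> eager = Eager(data).ToList();

        return (string.Join(",", lazy[0]), string.Join(",", eager[0]));
    }
}
```

Here the lazy variant reports "c,d" for the first record, while the eager one reports "a,b". Avoiding that with lazy inner sequences means buffering in the outer state machine anyway, at which point handing out an ImmutableArray<string> per record is the simpler contract.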