rmkaplan

Results 360 comments of rmkaplan

I did this last summer for internal XCCS to file UTF8/Unicode, so in that particular sense this can be closed. But this really should be generalized. What is needed is a...

One further issue: The case array. The current implementation seems to apply the case array to the raw bytes of the string and the file and then test the results...
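
To illustrate the concern about folding case on raw bytes, here is a minimal Python sketch (not the Medley code). It assumes a hypothetical 256-entry table `FOLD` standing in for a case array that only knows ASCII, and it assumes the file is UTF-8. The names `bytewise_find` and `charwise_find` are invented for this example.

```python
# Hypothetical stand-in for a case array: folds ASCII A-Z to a-z,
# leaves every other byte value unchanged.
FOLD = bytes((b + 32) if 65 <= b <= 90 else b for b in range(256))


def bytewise_find(data: bytes, pattern: bytes) -> int:
    """Fold both sides byte by byte, then search the raw bytes."""
    return data.translate(FOLD).find(pattern.translate(FOLD))


def charwise_find(data: bytes, pattern: str, encoding: str) -> int:
    """Decode first, fold case on characters, then search characters.

    Returns a 0-based character index into the lowercased text.
    """
    return data.decode(encoding).lower().find(pattern.lower())


data = "x ÉCOLE".encode("utf-8")
print(bytewise_find(data, "école".encode("utf-8")))  # no match on bytes
print(charwise_find(data, "école", "utf-8"))         # match on characters
```

The byte-level fold works for pure ASCII text but cannot equate "É" and "é", because in UTF-8 they differ in a continuation byte that the table does not treat as a letter.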

The problem is whether you are looking up bytes or characters.

> On May 8, 2021, at 12:16 PM, Larry Masinter wrote:
>
> > the efficient way to...

Right, and that’s what presumably the search string and the file both contain. But the file-characters are coded in different byte representation and with different mappings into the internal character-coding...
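
The byte-versus-character distinction can be seen in a small Python sketch. It is only an illustration with UTF-8 and Latin-1 as stand-in encodings (not XCCS or the NS file format), and the helper name `positions` is invented here. Positions are 0-based.

```python
def positions(text: str, pattern: str, encoding: str) -> tuple[int, int]:
    """Return (character position, byte position) of pattern in text
    when both are written out in the given encoding."""
    data = text.encode(encoding)
    char_pos = text.find(pattern)
    byte_pos = data.find(pattern.encode(encoding))
    return char_pos, byte_pos


text = "naïve café"

# Same characters, different byte layouts on the file.
print(positions(text, "café", "utf-8"))    # byte offset is past the 2-byte "ï"
print(positions(text, "café", "latin-1"))  # one byte per character

# A pattern encoded one way does not match a file encoded another way,
# even though both contain the same characters.
mismatch = text.encode("utf-8").find("café".encode("latin-1"))
print(mismatch)
```

A byte search is only meaningful once the pattern has been put into the same byte representation as the file, and the position it returns is a byte offset, which differs from the character offset whenever a multi-byte character precedes the match.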

Yes, this is still screwed up; I was looking at it last night and this morning. (But then I got distracted by another glitch: (OPENSTREAM T 'OUTPUT) produces a stream...

I thought I should say more about the current (= forever) issues with FILEPOS. It is described as behaving like STRPOS, except that it searches files instead of other strings....

The current code is incorrect in another way. It returns the wrong byte position if the search pattern begins with a SKIP character. I think it has a needless optimization...
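
As a reference point for what the result should be, here is a deliberately unoptimized Python sketch of searching with a skip (wildcard) byte. It is an assumption-laden illustration, not the FILEPOS implementation: the function name `find_with_skip` is invented, positions are 0-based, and the skip byte is assumed to match any single byte.

```python
from typing import Optional


def find_with_skip(data: bytes, pattern: bytes, skip: Optional[int] = None) -> int:
    """Return the offset of the first match of pattern in data, or -1.

    A pattern byte equal to `skip` matches any single data byte. The
    returned offset is where the whole pattern starts, including any
    leading skip bytes.
    """
    n, m = len(data), len(pattern)
    for start in range(n - m + 1):
        window = data[start:start + m]
        if all(p == skip or p == d for p, d in zip(pattern, window)):
            return start
    return -1


# The pattern "?cd" starts one byte before "cd", so the match is at the "b".
print(find_with_skip(b"abcde", b"?cd", ord("?")))
```

Any optimization, such as searching first for the literal part of the pattern, has to report the same offset as this brute-force version, including when the pattern begins with the skip byte.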

As noted in the comments above, FILEPOS is currently only a byte-sequence searcher, highly optimized (and even then providing incorrect results if the search pattern begins with the skip byte)....

What is "right" presumably involves the equivalent of a finite-state transducer in the middle of file searching, along with importing or creating a whole bunch of code to interpret Unicode features...

On further thought, though, it seems that the correct search of any NS-format file will always be the (relatively) slower (character, not byte, matching) case. Unlike UTF-8, ISO8859 etc., the...