asMSX Use memory buffers instead of temporary disk files to pass input to flex scanners and bison parser

I don't think it's a good idea to write a file to disk just so we can pass it to the flex scanners or the bison parser.

Here is an excerpt from the flex documentation:

Three routines are available for setting up input buffers for scanning in-memory strings instead of files. All of them create a new input buffer for scanning the string, and return a corresponding YY_BUFFER_STATE handle (which you should delete with yy_delete_buffer() when done with it). They also switch to the new buffer using yy_switch_to_buffer(), so the next call to yylex() will start scanning the string.

- yy_scan_string(const char *str): scans a NUL-terminated string.
- yy_scan_bytes(const char *bytes, int len): scans len bytes (including possibly NULs) starting at location bytes.

Note that both of these functions create and scan a copy of the string or bytes. (This may be desirable, since yylex() modifies the contents of the buffer it is scanning.) You can avoid the copy by using:

yy_scan_buffer(char *base, yy_size_t size) which scans in place the buffer starting at base, consisting of size bytes, the last two bytes of which must be YY_END_OF_BUFFER_CHAR (ASCII NUL). These last two bytes are not scanned; thus, scanning consists of base[0] through base[size-2], inclusive. If you fail to set up base in this manner (i.e., forget the final two YY_END_OF_BUFFER_CHAR bytes), then yy_scan_buffer() returns a nil pointer instead of creating a new input buffer. The type yy_size_t is an integral type to which you can cast an integer expression reflecting the size of the buffer.
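
As a rough illustration of the buffer layout described above, here is a minimal sketch in plain C. The helper name make_flex_buffer is hypothetical (it is not part of flex or asMSX); it only prepares a buffer ending in the two NUL bytes that yy_scan_buffer() requires. The flex calls themselves appear in a comment, because they only compile together with a generated scanner.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of flex or asMSX): copy `src` into a freshly
 * allocated buffer that ends with the two NUL bytes yy_scan_buffer() expects.
 * On success, *size_out holds the total size, including both terminators. */
char *make_flex_buffer(const char *src, size_t *size_out)
{
    assert(src != NULL && size_out != NULL);

    size_t len = strlen(src);
    char *buf = malloc(len + 2);
    if (buf == NULL)
        return NULL;

    memcpy(buf, src, len);
    buf[len] = '\0';      /* first YY_END_OF_BUFFER_CHAR */
    buf[len + 1] = '\0';  /* second YY_END_OF_BUFFER_CHAR */
    *size_out = len + 2;
    return buf;
}

/* Intended use next to a flex-generated scanner (kept as a comment because
 * it needs the generated lex.yy.c to compile):
 *
 *   size_t size;
 *   char *buf = make_flex_buffer("ld a,1\n", &size);
 *   YY_BUFFER_STATE st = yy_scan_buffer(buf, size);
 *   if (st != NULL) {
 *       yylex();
 *       yy_delete_buffer(st);
 *   }
 *   free(buf);  // the scanner does not own a buffer passed this way
 */
```

If the extra copy is acceptable, yy_scan_string() is simpler, since it takes an ordinary NUL-terminated string and manages its own copy.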

Mar 31 '19 02:03 oboroc

I agree. If we're able to scan the buffer directly, it would be faster.

Mar 31 '19 15:03 Fubukimaru

Speed is mostly irrelevant, since it is fast enough even with files. But reading input from memory would make building test cases for the scanner and parser much easier.
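
One possible way to feed a test case from memory without touching the scanner code is POSIX fmemopen(), which wraps a memory region in a FILE * that could then be assigned to yyin. This is a different technique from the yy_scan_* routines quoted above, and the sketch assumes a POSIX.1-2008 platform (fmemopen() is not available on Windows). The helper name open_memory_input is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical test helper: open `text` as a read-only stream backed by
 * memory. The caller must keep `text` alive until fclose() is called.
 * Assumes `text` is non-empty; some C libraries reject a zero-length buffer. */
FILE *open_memory_input(const char *text)
{
    assert(text != NULL);
    /* fmemopen() takes a void *, so const is cast away; mode "r" never writes. */
    return fmemopen((void *)text, strlen(text), "r");
}
```

A test could then do something like `yyin = open_memory_input("ld a,1\n");` before calling the parser, assuming the scanner reads from yyin in the usual way.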

Mar 31 '19 15:03 oboroc

https://man7.org/linux/man-pages/man2/mmap.2.html

Dec 06 '20 23:12 duhow

Hello!

On the other hand, I have been using these temporary files for debugging. Maybe we can find a middle ground on this.

Jan 10 '21 09:01 Fubukimaru

I swear I was dreaming about this topic last night :joy:

I suggest using an array of "default temp folders". On Linux it would be great to use /dev/shm, as it is a tmpfs filesystem backed by RAM, so you can store files in there.

- /dev/shm
- /var/tmp
- /tmp
- . # current dir

So for each dir, try to create a new tempdir asmsx-tmp, chmod 700, and create files in there. Mac OS X doesn't have this feature natively. It does allow creating RAM disk units, but that is not something we would want to do every time we run the program. With this directory loop, it would simply fall back to /tmp.
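
The directory loop could look roughly like this in POSIX C. It is only a sketch: pick_temp_dir is a hypothetical name, nothing like it exists in asMSX today, and Windows would need a separate code path.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: try each base directory in order and create (or reuse)
 * a private "asmsx-tmp" subdirectory with mode 0700. On success, writes the
 * chosen path to `out` and returns 0; returns -1 if no candidate worked. */
int pick_temp_dir(const char *const *bases, size_t count,
                  char *out, size_t out_size)
{
    assert(bases != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        int n = snprintf(out, out_size, "%s/asmsx-tmp", bases[i]);
        if (n < 0 || (size_t)n >= out_size)
            continue;  /* path would not fit in `out` */

        if (mkdir(out, 0700) == 0)
            return 0;  /* freshly created */

        if (errno == EEXIST) {
            /* Reuse an existing directory only if we can make it private. */
            struct stat st;
            if (stat(out, &st) == 0 && S_ISDIR(st.st_mode) &&
                chmod(out, 0700) == 0)
                return 0;
        }
    }
    return -1;
}
```

With the list above, the call would pass `{ "/dev/shm", "/var/tmp", "/tmp", "." }` as bases. On a system without /dev/shm, the loop moves on to the next entry.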

Lastly, Windows can also get this feature by using Memory-Mapped Files: https://docs.microsoft.com/es-es/dotnet/standard/io/memory-mapped-files

Or, again, just store the files in the same folder as the executable.

You can use build parameters to enable this function depending on whether the binary is built for Linux, Mac or Windows. Let me know if you need help editing the Makefile for that, but it should be easy to change :)

Jan 10 '21 10:01 duhow

I understand, but the point here is not to make it faster by using RAM, as the files are already tiny in modern computer terms. The main point was to keep the input in memory so that test functions could get the result directly.

Anyways, thanks for the input!

Jan 10 '21 19:01 Fubukimaru

Skipping this ✌️

Nov 04 '23 15:11 Fubukimaru
