
XMLSERVICE truncating very large xmlOutput

Open Wikus86 opened this issue 6 months ago • 17 comments

Hi,

I have a program that returns a DS array with 99999 elements. It seems the data is getting truncated.

On my side I can just reduce the number of array elements, since the array is far too big (old program). I just thought I would post it in case you would like to fix it.

I tried to trace where it goes wrong. I think it is in xmlservice-cli, but I have no idea how to debug it.

Looks like the max size is as below

Image

Thanks,

Wikus86 avatar May 05 '25 06:05 Wikus86

Isn't there still a 16 MB limit on memory in RPG? Or does it work with the ODBC transport?

I recall compiling RPG with Teraspace flags before. I think...

richardschoen avatar May 05 '25 11:05 richardschoen

https://www.ibm.com/docs/en/i/7.4.0?topic=model-specifying-teraspace-storage

richardschoen avatar May 05 '25 11:05 richardschoen

It does work when I set the transport mode to odbc or rest. It is only an issue when I set it to ssh.

Wikus86 avatar May 05 '25 11:05 Wikus86

It would appear to be a current limit.


    int xmlout_len = (16 * 1024 * 1024);
    char* xmlout = (char*) malloc(xmlout_len);

    if(!xmlout) {
        perror("error allocating XML output buffer");
        return 1;
    }

https://github.com/IBM/xmlservice/blob/main/utils/xmlservice-cli.c

richardschoen avatar May 05 '25 12:05 richardschoen

Yeah, I saw that, but correct me if I am wrong: 16 * 1024 * 1024 = 16777216 and the size of xmlOutput (the truncated output) is 12311439, giving me 4465777 bytes to play with? Or am I missing something?

Wikus86 avatar May 05 '25 12:05 Wikus86

xmlservice-cli goes through the RUNASCII function, which uses an internal buffer to convert to EBCDIC. This buffer is only 15M large: https://github.com/IBM/xmlservice/blob/main/src/plugrun.rpgle#L162-L163 (5001 * 3000 = 15,003,000 bytes). This still doesn't explain why it's truncating under 12M, though. What is the output in your screenshot from?
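To put the numbers mentioned so far side by side, here is a small sketch in plain JavaScript. It only does the arithmetic with the figures quoted in this thread and does not call XMLSERVICE; the variable names are made up for illustration.

```javascript
// Figures quoted in this thread; the variable names are illustrative only.
const cliBufferBytes = 16 * 1024 * 1024; // malloc size in xmlservice-cli.c
const runAsciiBufferBytes = 5001 * 3000; // conversion buffer in plugrun.rpgle
const observedOutputBytes = 12311439;    // length of the truncated xmlOutput

// The smaller of the two buffers is the tighter known limit.
const smallestLimit = Math.min(cliBufferBytes, runAsciiBufferBytes);
const headroomBytes = smallestLimit - observedOutputBytes;

console.log(`xmlservice-cli buffer: ${cliBufferBytes} bytes`);
console.log(`RUNASCII buffer:       ${runAsciiBufferBytes} bytes`);
console.log(`headroom below the smaller limit: ${headroomBytes} bytes`);
```

Even against the smaller buffer, the observed output is still a couple of megabytes short, which is why neither limit on its own explains the truncation point.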

kadler avatar May 05 '25 14:05 kadler

> xmlservice-cli goes through the RUNASCII function, which uses an internal buffer to convert to EBCDIC. This buffer is only 15M large: https://github.com/IBM/xmlservice/blob/main/src/plugrun.rpgle#L162-L163 (5001 * 3000 = 15,003,000 bytes). This still doesn't explain why it's truncating under 12M, though. What is the output in your screenshot from?

Hi, here is a link to the output https://github.com/Wikus86/xmlservicetest/blob/main/xmloutput.txt

Wikus86 avatar May 05 '25 14:05 Wikus86

That's good, but I meant the screenshot of size/size on disk. What were those stats from? Did you save the output to a file and just look at the file size or something?

kadler avatar May 05 '25 14:05 kadler

> That's good, but I meant the screenshot of size/size on disk. What were those stats from? Did you save the output to a file and just look at the file size or something?

Ahh sorry, yeah, that is exactly what I did.

I also looked at the length of the output, which is 12311439.

Image

Wikus86 avatar May 05 '25 14:05 Wikus86

What are the ds elements defined as?

kadler avatar May 05 '25 14:05 kadler

Image

Wikus86 avatar May 05 '25 14:05 Wikus86

I forgot that XMLSERVICE adds all that in the xml output anyway, lol. FYI, the XML output you shared is 12M of just XML metadata, since it looks like all the records are empty. Assuming the output is only ASCII characters, your max output size is about 21M: the XML metadata for each entry is roughly 130 bytes, and a fully populated entry adds 85 bytes of data (15 + 70) on top of that, across 99,999 entries.
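The estimate above can be reproduced with a few lines of JavaScript. This is only a rough sketch: the 130 bytes per entry is the approximate figure from this comment, and the field lengths (15A and 70A) are the ones used in the parameter definition later in this thread.

```javascript
// Rough size estimate; 130 bytes per entry is an approximation, not a measured value.
const entries = 99999;
const metadataBytesPerEntry = 130;  // approximate XML metadata per array element
const dataBytesPerEntry = 15 + 70;  // 15A subgroupcode + 70A description

// All records empty: only the XML metadata is returned.
const metadataOnlyBytes = entries * metadataBytesPerEntry;

// All records fully populated with single-byte (ASCII) characters.
const fullyPopulatedBytes = entries * (metadataBytesPerEntry + dataBytesPerEntry);

console.log(`metadata only:   ${metadataOnlyBytes} bytes`);
console.log(`fully populated: ${fullyPopulatedBytes} bytes`);
```

Under these assumptions the fully populated case is larger than both the 15M RUNASCII buffer and the 16M xmlservice-cli buffer.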

Is the SUBGROUP_LENGTH the number of valid entries in the array?

kadler avatar May 05 '25 14:05 kadler

That is a valid point. Yeah, I think the reason for the length field is to tell the calling program how many entries there are, so it knows to loop through only that many.

Wikus86 avatar May 05 '25 14:05 Wikus86

If that's the case, you could use dou/enddo to limit the amount of data returned. Assuming you never really need to return all 99,999 possible rows, this might be a good workaround.

You could try something like:

    pgm.addParam({
      type: 'ds',
      io: 'out',
      fields: [
        // Counter field: enddo names the counter that dou refers to.
        { type: '10i0', name: 'subgroup_length', value: 0, enddo: 'count' },

        // Array: dou limits the returned elements to the counter value.
        { type: 'ds', name: 'subgroup', dim: 99999, dou: 'count', fields: [
          { type: '15A', name: 'subgroupcode', value: '' },
          { type: '70A', name: 'description', value: '' },
        ]},
      ],
    });

kadler avatar May 05 '25 15:05 kadler

I saw the dou in the documentation but could not wrap my head around how it works.

Thanks, let me give that a try and will let you know.

On that note, is there a way to strip out elements that are not populated by using something like dou without a length field?

Wikus86 avatar May 05 '25 15:05 Wikus86

> I saw the dou in the documentation but could not wrap my head around how it works.

Yeah, it's confusing like so much of XMLSERVICE.

> On that note, is there a way to strip out elements that are not populated by using something like dou without a length field?

I don't think that works. The dou/enddo is designed to be like an RPG loop counter. It loops over the array and only returns the number of elements in the length field. If you don't have the length field, I think it just doesn't do anything. How would XMLSERVICE know which elements to strip out?
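To illustrate the loop-counter idea, here is a plain JavaScript sketch that mimics the behaviour described above. It is not XMLSERVICE code, and `applyDou` is a hypothetical helper invented for this example.

```javascript
// Illustration only: mimics the described dou/enddo behaviour in plain JavaScript.
// applyDou is a hypothetical helper, not part of XMLSERVICE or any toolkit.
function applyDou(count, elements) {
  // Clamp the counter so it never goes below 0 or past the array size.
  const limit = Math.max(0, Math.min(count, elements.length));
  return elements.slice(0, limit);
}

// A tiny stand-in for the 99999-element array: two populated rows, two blank rows.
const subgroup = [
  { subgroupcode: 'A1', description: 'First' },
  { subgroupcode: 'B2', description: 'Second' },
  { subgroupcode: '', description: '' },
  { subgroupcode: '', description: '' },
];

// With a length field of 2, only the first two elements come back.
const returned = applyDou(2, subgroup);
console.log(returned.length);
```

Without a counter there is nothing to pass as `count`, which is the point made above: something has to say where the valid data ends.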

kadler avatar May 05 '25 15:05 kadler

Thanks, the dou works. It is just breaking some logic in my program, but that is my problem lol :)

> I don't think that works. The dou/enddo is designed to be like an RPG loop counter. It loops over the array and only returns the number of elements in the length field. If you don't have the length field, I think it just doesn't do anything. How would XMLSERVICE know which elements to strip out?

lol, was just trying my luck. I will try to build something on my side for that. Thanks for all the assistance, very much appreciated.
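For the record, a client-side filter along these lines might look like the sketch below. It assumes the output has already been parsed into an array of plain objects with string fields; that shape and the `stripBlankEntries` name are assumptions for illustration, not something the toolkit provides.

```javascript
// Sketch of a client-side filter; the row shape is an assumption for illustration.
// stripBlankEntries is a hypothetical helper, not part of any toolkit.
function stripBlankEntries(rows) {
  // Keep a row if at least one of its fields contains non-blank text.
  return rows.filter((row) =>
    Object.values(row).some((value) => String(value).trim() !== ''));
}

const rows = [
  { subgroupcode: 'A1', description: 'First' },
  { subgroupcode: '   ', description: '' },
  { subgroupcode: 'B2', description: '' },
];

const populated = stripBlankEntries(rows);
console.log(populated.length);
```

Note that this runs after the XML has already been transferred, so it tidies up the result but does not make the payload any smaller.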

Wikus86 avatar May 05 '25 15:05 Wikus86