[Request] Integrate the EXPath Binary Module

Open Reino17 opened this issue 5 years ago • 16 comments

Hello Benito,

For a specific task I need a "bitwise exclusive or" function, but I realized xidel doesn't have one. So I created a function for that.

I was wondering if, in addition to the EXPath File Module, you'd be interested in integrating the EXPath Binary Module as well. Then I can use bin:xor() instead (although for integers that would be bin:xor(string-to-base64Binary(<int>),string-to-base64Binary(<int>)) I guess... hmm).

In the meantime can I have your professional opinion on the function I created? The following also includes notes leading up to that function.

$ printf '%s\n' $((33 ^ 73))
104

$ xidel -se 'let $a:=(33,73) return system(x"bash -c ""printf $(({$a[1]} ^ {$a[2]}))""")'
104

$ xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    ($a,$b) ! x:integer-to-base(.,2)
  };
  local:bin-xor(33,73)
'
100001
1001001

$ xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    let $bin:=($a,$b) ! x:integer-to-base(.,2)
    return
    $bin ! string-length()
  };
  local:bin-xor(33,73)
'
6
7

$ xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    let $bin:=($a,$b) ! x:integer-to-base(.,2)
    return
    max($bin ! string-length())
  };
  local:bin-xor(33,73)
'
7

$ xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    let $bin:=($a,$b) ! x:integer-to-base(.,2),
        $len:=max($bin ! string-length())
    return
    $bin ! concat(
      string-join(
        for $x in 1 to $len - string-length() return 0
      ),
      .
    )
  };
  local:bin-xor(33,73)
'
0100001
1001001

$ xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    let $bin:=($a,$b) ! x:integer-to-base(.,2),
        $len:=max($bin ! string-length()),
        $val:=$bin ! concat(
          string-join(for $x in 1 to $len - string-length() return 0),
          .
        )
    return
    string-join(
      for $x in 1 to $len return
      if (substring($val[1],$x,1) eq substring($val[2],$x,1)) then 0 else 1
    )
  };
  local:bin-xor(33,73)
'
1101000

$ xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    let $bin:=($a,$b) ! x:integer-to-base(.,2),
        $len:=max($bin ! string-length()),
        $val:=$bin ! concat(
          string-join(for $x in 1 to $len - string-length() return 0),
          .
        )
    return
    x:integer(
      string-join(
        for $x in 1 to $len return
        if (substring($val[1],$x,1) eq substring($val[2],$x,1)) then 0 else 1
      ),
      2
    )
  };
  local:bin-xor(33,73)
'
104

declare function local:bin-xor($a as integer,$b as integer) as integer {
  let $bin:=($a,$b) ! x:integer-to-base(.,2),
      $len:=max($bin ! string-length()),
      $val:=$bin ! concat(
        string-join(for $x in 1 to $len - string-length() return 0),
        .
      )
  return
  x:integer(
    string-join(
      for $x in 1 to $len return
      if (substring($val[1],$x,1) eq substring($val[2],$x,1)) then 0 else 1
    ),
    2
  )
};
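
For readers who want to check the approach outside Xidel, here is a minimal Python sketch of the same four steps (binary digit strings, left-padding, digit-by-digit comparison, parsing back from base 2). The name `bin_xor` is only illustrative, and the sketch assumes non-negative integers, as in the examples above.

```python
def bin_xor(a: int, b: int) -> int:
    """XOR two non-negative integers via their binary digit strings."""
    # Step 1: binary digit strings, like x:integer-to-base(., 2).
    digits = [format(n, "b") for n in (a, b)]
    # Step 2: left-pad both strings with zeros to the longer length.
    width = max(len(d) for d in digits)
    left, right = (d.zfill(width) for d in digits)
    # Step 3: equal digits give 0, differing digits give 1.
    xored = "".join("0" if x == y else "1" for x, y in zip(left, right))
    # Step 4: parse the result as a base-2 number, like x:integer(., 2).
    return int(xored, 2)


print(bin_xor(33, 73))  # 104
```

Comparing the result against Python's native `^` operator over a range of inputs is a quick way to confirm the string-based logic.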

What do you think? Would you have done it differently?

Reino17 avatar Jul 08 '20 22:07 Reino17

> you'd be interested in integrating the EXPath Binary Module as well.

Yes, but I got stuck implementing XPath 3.1

> What do you think? Would you have done it differently?

you could improve it a little:

xidel -s --xquery '
  declare function local:bin-xor($a,$b){
    let $bin:=($a,$b) ! x:integer-to-base(.,2),
        $len:=max($bin ! string-length()),
        $val:=$bin ! concat(
          string-join( ( 1 to $len - string-length()) ! 0),
          .
        ),
        $v1 := $val[1],
        $v2 := $val[2]
       
    return
    x:integer(
      string-join(
        for $x in 1 to $len return
        if (substring($v1,$x,1) eq substring($v2,$x,1)) then 0 else 1
      ),
      2
    )
  };
  ((1 to 100000) ! local:bin-xor(33,73))[7]
'

let variables are faster than accessing sequence elements.

benibela avatar Jul 09 '20 14:07 benibela

That's a clever way to repeat a function 100000 times!

$ time xidel -s --xquery 'declare function local:bin-xor($a,$b){let $bin:=($a,$b) ! x:integer-to-base(.,2),$len:=max($bin ! string-length()),$val:=$bin ! concat(string-join((1 to $len - string-length()) ! 0),.) return x:integer(string-join(for $x in 1 to $len return if (substring($val[1],$x,1) eq substring($val[2],$x,1)) then 0 else 1),2)}; ((1 to 100000) ! local:bin-xor(33,73))[1]'
104

real    0m27.156s
user    0m0.015s
sys     0m0.015s

$ time xidel -s --xquery 'declare function local:bin-xor($a,$b){let $bin:=($a,$b) ! x:integer-to-base(.,2),$len:=max($bin ! string-length()),$val:=$bin ! concat(string-join((1 to $len - string-length()) ! 0),.),$v1:=$val[1],$v2:=$val[2] return x:integer(string-join(for $x in 1 to $len return if (substring($v1,$x,1) eq substring($v2,$x,1)) then 0 else 1),2)}; ((1 to 100000) ! local:bin-xor(33,73))[1]'
104

real    0m24.391s
user    0m0.015s
sys     0m0.000s

You're right. It's about 10% faster. But let's be honest, you won't notice any difference when you run the function just once. xidel is already extremely fast! ;)

Thanks so far.

Reino17 avatar Jul 09 '20 22:07 Reino17

> That's a clever way to repeat a function 100000 times!

I had not intended to post this here

> real 0m27.156s user 0m0.015s sys 0m0.015s

That is weird. On my laptop real and user are the same at 4.5s

> xidel is already extremely fast! ;)

It is actually very slow, especially when you compare this int<->string conversion with native code. An XOR in assembly is something like a thousand times faster.

benibela avatar Jul 10 '20 19:07 benibela

> I had not intended to post this here

Well, I'm glad you did. ;)

> That is weird. On my laptop real and user are the same at 4.5s

Probably a Cygwin issue. I don't know.
(And my CPU is really old, which is why even your laptop CPU is many times faster.)

> It is actually very slow [...]

Obviously, but at least I have a solution for now.

Reino17 avatar Jul 11 '20 12:07 Reino17

http://www.benibela.de/documentation/internettools/xpath-functions.html#x-integer:

x:integer($arg as item(), $base as xs:integer) as xs:integer
Converts a string to an integer. It accepts base-prefixes like 0x or 0b, e.g 0xABCDEF (Xidel only)

I was just thinking (and maybe I should open a new "issue" for this)...
If x:integer() accepts base-prefixes, why not have x:integer-to-base() output them too?

Instead of...
x:integer-to-base($arg as xs:integer, $base as xs:integer) as xs:string
something like...
x:integer-to-base($arg as xs:integer, $base as xs:integer [ , $prefix as xs:boolean , [$length as xs:integer] ] ) as xs:string
...perhaps?

And the following expected results:

xidel -se 'x:integer("111",2)'
xidel -se 'x:integer("00000111",2)'
xidel -se 'x:integer("0b00000111")'
7

xidel -se '
  x:integer-to-base(7,2),
  x:integer-to-base(7,2,true()),
  x:integer-to-base(7,2,true(),8),
  x:integer-to-base(7,2,false(),8)
'
111
0b111
0b00000111
00000111

It would make local:bin-xor() a lot easier, by not having to make sure both values have the same number of digits.
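
To make the proposal concrete, here is a small Python sketch of how such a signature could behave. `integer_to_base` and its `prefix` and `length` parameters are hypothetical and only mirror the suggested extension; they are not an existing Xidel API. The sketch assumes non-negative integers and only knows the `0b` and `0x` prefixes mentioned in the x:integer() documentation.

```python
import string

# Digit symbols for bases 2..36.
DIGITS = string.digits + string.ascii_lowercase
# Only the prefixes named in the x:integer() documentation quoted above.
PREFIXES = {2: "0b", 16: "0x"}


def integer_to_base(value: int, base: int, prefix: bool = False, length: int = 0) -> str:
    """Hypothetical mirror of the proposed x:integer-to-base() signature."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in 2..36")
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    digits = ""
    while True:
        value, remainder = divmod(value, base)
        digits = DIGITS[remainder] + digits
        if value == 0:
            break
    # The padding counts digits only, so the prefix is added afterwards.
    digits = digits.zfill(length)
    return (PREFIXES.get(base, "") if prefix else "") + digits


for args in [(7, 2), (7, 2, True), (7, 2, True, 8), (7, 2, False, 8)]:
    print(integer_to_base(*args))
# 111
# 0b111
# 0b00000111
# 00000111
```

In this sketch the padding length excludes the prefix, which matches the expected output `0b00000111` listed above.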

What do you think?

Reino17 avatar Jul 12 '20 14:07 Reino17

For the prefix you can just write "0b" || . That is even shorter than , true()

benibela avatar Jul 12 '20 22:07 benibela

Sure, ok. Any thoughts on the length/padding parameter?

Reino17 avatar Jul 27 '20 12:07 Reino17

Not sure about that. When there are too many parameters it becomes confusing. (Although even the Free Pascal inttohex function has a padding parameter.)

format-integer is meant for more complex formatting, but it does not support different bases. The best thing would be to ask the W3C to allow different bases in format-integer, but I doubt they will change anything in the next few years.

benibela avatar Jul 29 '20 09:07 benibela

I timed 100,000 runs with $bin ! format-integer(.,string-join((1 to $len) ! 0)) instead of $bin ! concat(string-join((1 to $len - string-length()) ! 0),.) and it's actually about 12% slower. So, that's not worth the effort.
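
The two padding variants compared here can be sketched in Python as follows. The function names are illustrative, and Python's `format` is only a stand-in for format-integer with an all-zero picture string: in both cases the binary digit string is read as a decimal integer and printed with a minimum number of digits.

```python
def pad_by_concat(bits: str, width: int) -> str:
    # Mirrors concat(string-join((1 to $len - string-length()) ! 0), .):
    # prepend exactly as many zeros as are missing.
    return "0" * (width - len(bits)) + bits


def pad_by_picture(bits: str, width: int) -> str:
    # Stand-in for format-integer(., "00...0"): the digit string is read
    # as a decimal integer and printed with at least `width` digits.
    return format(int(bits), "0{}d".format(width))


print(pad_by_concat("100001", 7))   # 0100001
print(pad_by_picture("100001", 7))  # 0100001
```

Both give the same string; the picture-based variant takes a detour through a number, which is one plausible reason for it not being faster.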

Forget about the length/padding parameter. The initial question was about the EXPath Binary Module anyway.

Reino17 avatar Jul 31 '20 22:07 Reino17

I have implemented four functions of the module:

https://github.com/benibela/internettools/commit/f12cb45069f9bc276008f6401de0d1894f114990#diff-dc8588a65c0875a8535fd33db1999d07eeca49f80fa8ed79548b88d47d1f405eR146-R220

http://hg.code.sf.net/p/videlibri/code/rev/dc34769d65d1#l2.149

The other functions can be implemented straightforwardly in the same way.

It is the perfect exercise for someone to learn Pascal and how to add functions to Xidel.

benibela avatar Jan 30 '21 21:01 benibela

Alright. Cool.

$ xidel -s --module=xivid.xqm -e 'xivid:bin-xor(33,73)'
104

$ xidel -s --xmlns:bin="http://expath.org/ns/binary" -e '
  binary-to-string(
    bin:xor(
      string-to-base64Binary(33),
      string-to-base64Binary(73)
    )
  )
'
♦

Obviously I'm not using it correctly. :D

Reino17 avatar Jan 30 '21 23:01 Reino17

> Obviously I'm not using it correctly. :D

You need to convert the codepoints to a string first, and back later

If someone implements bin:from-octets and bin:to-octets they could be used, and you won't need any strings
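
A short Python sketch may help show what goes wrong in the attempt above. It assumes that bin:xor works byte by byte on two binary values of equal length; `xor_bytes` is an illustrative helper, not part of any module discussed here.

```python
def xor_bytes(left: bytes, right: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    if len(left) != len(right):
        raise ValueError("arguments must have the same length")
    return bytes(x ^ y for x, y in zip(left, right))


# Pitfall: the text "33" and "73" is two bytes each ("3","3" and "7","3"),
# so the XOR is taken over character codes, not over the numbers 33 and 73.
as_text = xor_bytes(b"33", b"73")
print(list(as_text))  # [4, 0]

# Fix: turn the codepoints 33 and 73 into the one-character strings "!" and "I".
as_chars = xor_bytes(chr(33).encode(), chr(73).encode())
print(as_chars.decode(), as_chars[0])  # h 104

# With octets there is no detour through strings at all.
as_octets = xor_bytes(bytes([33]), bytes([73]))
print(list(as_octets))  # [104]
```

The first result is the control byte 4 followed by a zero byte, which a console using code page 437 would likely display as ♦. The last variant corresponds to what bin:from-octets and bin:to-octets would allow.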

benibela avatar Jan 31 '21 11:01 benibela

$ xidel -s --xmlns:bin="http://expath.org/ns/binary" -e '
  bin:xor(
    string-to-base64Binary(x:cps(33)),
    string-to-base64Binary(x:cps(73))
  ) ! x:cps(binary-to-string(.))
'
104

Hah! How cumbersome. But, granted, it's still 8.5 times faster than my own xivid:bin-xor().

Reino17 avatar Jan 31 '21 11:01 Reino17

I must be going mad. No matter what I do...

--xmlns:bin="http://expath.org/ns/binary" -e 'bin:xor(...)'
-e 'declare namespace bin = "http://expath.org/ns/binary"; bin:xor(...)'
-e 'Q{http://expath.org/ns/binary}xor(...)'

(didn't know about the last one, but got that from 'components/pascal/data/xquery_module_binary.pas')

...I constantly get err:XPST0017: unknown function: bin:xor #2.
This is with r8090, but also with the very binary I used in the previous post.

Reino17 avatar Aug 27 '21 23:08 Reino17

The module is commented out, since it is incomplete:

https://github.com/benibela/xidel/blob/master/xidelbase.pas#L37

https://github.com/benibela/xidel/blob/master/xidelbase.pas#L3549

benibela avatar Aug 28 '21 11:08 benibela

Ah, sorry. For the binary I compiled back then I must've uncommented those lines, but totally forgot about them now.

Reino17 avatar Aug 28 '21 18:08 Reino17