R.matlab

Encoding error when getting a string array using getVariable()

Open Blockhead-yj opened this issue 4 years ago • 6 comments

Hi, everyone! I encountered a problem when getting a string array using the getVariable function. Here is some example code. My platform is PCWIN64 and my MATLAB version is R2020b. The error is "can only read in bytes in a non-UTF-8 MBCS locale". I have tested this and found that the problem occurs when the array is a string array rather than a char array. Since my MATLAB default encoding is 'GBK', I tried changing it to 'UTF-8' with slCharacterEncoding('UTF-8'), but it didn't work. I would appreciate any help or advice!

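library(R.matlab)

# Start a MATLAB server and connect to it from R
Matlab$startServer()
matlab <- Matlab()
open(matlab)
# Create a 2x1 string array (double quotes) in MATLAB and fetch it into R
evaluate(matlab,'a=["adsafdsa";"bfdgadfg"];')
a <- getVariable(matlab,'a')
close(matlab)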

Blockhead-yj, Dec 28 '20 13:12

PS: The encoding in RStudio is 'UTF-8', though I also tried 'WINDOWS-1252', my Windows 10 encoding.

Blockhead-yj, Dec 28 '20 14:12

Hi, I have very little time to work on this package, but please provide what traceback() outputs immediately after you get the error, and also your sessionInfo(). This helps narrow down where in the code the problem lies.
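Since the warning text mentions the locale, it may also help to include R's locale details alongside sessionInfo(). The following is only a sketch of what could be collected, using base R functions; traceback() is left as a comment because it is only meaningful when run interactively right after an error.

```r
# Collect locale and session details that are useful in an encoding report.
info <- l10n_info()                    # list with MBCS, UTF-8 and Latin-1 flags
locale <- Sys.getlocale("LC_CTYPE")    # character-type locale of this R session

cat("LC_CTYPE:", locale, "\n")
cat("Multibyte locale (MBCS):", info$MBCS, "\n")
cat("UTF-8 locale:", info[["UTF-8"]], "\n")

# Run interactively, immediately after an error occurs:
# traceback()

print(sessionInfo())
```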

HenrikBengtsson, Dec 28 '20 15:12

Sorry, I made a mistake: it was not an error but a warning.

Warning messages:
1: In readChar(con = con, nchars = nbrOfBytes) :
  can only read in bytes in a non-UTF-8 MBCS locale
2: In readMat(filename) :
  strings not representable in native encoding will be translated to UTF-8
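As far as I understand base R, the first warning is what readChar() emits when it is asked to count characters in a multibyte locale that is not UTF-8 (a GBK locale would be one such case), and it then falls back to counting bytes. The sketch below is not R.matlab's code; it only illustrates, with a made-up temporary file, that asking for bytes explicitly via useBytes = TRUE reads the same data in any locale.

```r
# Write two ASCII bytes to a temporary file (illustrative data only).
tmp <- tempfile()
writeBin(charToRaw("ab"), tmp)

# Read them back, explicitly treating nchars as a byte count.
con <- file(tmp, open = "rb")
s <- readChar(con, nchars = 2L, useBytes = TRUE)
close(con)
unlink(tmp)

print(s)
```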

The problem is that the variable I get back is a byte array. For example,

evaluate(matlab,'a=["a";"b"];')
a <- getVariable(matlab,'a')

a looks like this:

> a
$MCOS
[1] 73 74 72 69 6e 67

[[2]]
           [,1]
[1,] -587202560
[2,]          2
[3,]          1
[4,]          1
[5,]          1
[6,]          1

[[3]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
[1,]    0    1   73   77    0    0    0    0   14     0     0     0     8     3     0     0     6     0     0     0     8     0     0     0
     [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47]
[1,]     2     0     0     0     0     0     0     0     5     0     0     0     8     0     0     0     1     0     0     0     1     0     0
     [,48] [,49] [,50] [,51] [,52] [,53] [,54] [,55] [,56] [,57] [,58] [,59] [,60] [,61] [,62] [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70]
[1,]     0     1     0     0     0     0     0     0     0     5     0     4     0     5     0     0     0     1     0     0     0     5     0
     [,71] [,72] [,73] [,74] [,75] [,76] [,77] [,78] [,79] [,80] [,81] [,82] [,83] [,84] [,85] [,86] [,87] [,88] [,89] [,90] [,91] [,92] [,93]
...(total 936)

attr(,"header")
attr(,"header")$description
[1] "MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Tue Dec 29 11:30:02 2020                                        \b\001"

attr(,"header")$version
[1] "5"

attr(,"header")$endian
[1] "little"
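For what it is worth, the six raw bytes printed under $MCOS above are plain ASCII codes. A small sketch in base R that decodes them (the variable names are only illustrative):

```r
# The bytes shown under $MCOS in the output above, written as hex literals.
mcos_bytes <- as.raw(c(0x73, 0x74, 0x72, 0x69, 0x6e, 0x67))

# Interpret the raw bytes as ASCII text.
class_name <- rawToChar(mcos_bytes)
print(class_name)  # "string"
```

So the returned value appears to carry the name of the MATLAB class together with undecoded data, rather than the text of the array itself.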

But there is no problem if the variable is a char array rather than a string array.

evaluate(matlab,"a=['a';'b'];")
a <- getVariable(matlab,'a')

you get

> evaluate(matlab,"a=['a';'b'];")
> a <- getVariable(matlab,'a')
Warning message:
In readChar(con = con, nchars = nbrOfBytes) :
  can only read in bytes in a non-UTF-8 MBCS locale
> a
$a
     [,1]
[1,] "a" 
[2,] "b" 

attr(,"header")
attr(,"header")$description
[1] "MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Tue Dec 29 11:34:24 2020                                        "

attr(,"header")$version
[1] "5"

attr(,"header")$endian
[1] "little"

Blockhead-yj, Dec 29 '20 03:12

Thanks. So, my MATLAB skills are super rusty - like from 2005-ish.

Let's focus on:

> evaluate(matlab,"a=['a';'b'];")
> a <- getVariable(matlab,'a')
Warning message:
In readChar(con = con, nchars = nbrOfBytes) :
  can only read in bytes in a non-UTF-8 MBCS locale

This warning comes from readMat() reading the results from MATLAB. It would be useful to have that as a MAT file. Can you create that a in MATLAB, and then save it in MAT v6 format, and make it available somewhere for download? Something like:

>> a=['a'; 'b']';
>>  save('issue49-a.mat', '-v6', 'a');

HenrikBengtsson, Dec 29 '20 04:12

issue49.zip

Blockhead-yj, Dec 29 '20 05:12

Thank you for your kind reply!

Blockhead-yj, Dec 29 '20 05:12