Can you add base91 support to lz-string?

Open joeshae opened this issue 8 years ago • 11 comments

@pieroxy, can you add base91 support to lz-string? Thank you!

joeshae avatar Apr 03 '17 08:04 joeshae

Sure I can. What would be the use-case?

pieroxy avatar Apr 03 '17 09:04 pieroxy

I was previously using UTF-16 encoding for AJAX POST requests from Chrome to the server. Sometimes it works well, but sometimes the data received by the server is unexpectedly damaged and cannot be decoded, even if I set the HTTP header 'Content-Type' to 'application/octet-stream; charset=binary'. I think the browser may not handle some UTF-16 characters correctly in a POST request body. When I changed the encoding method from UTF-16 to Base64, everything was OK. To reduce the data transferred between server and client, I think Base91 would be better than Base64. Do you agree?

joeshae avatar Apr 03 '17 09:04 joeshae

I think there is a bug somewhere in your setup, and instead of fixing it you are trying to work around it with a half-assed solution. I'm not opposed to having a base91 encoder, but I think the real fix would be to find the bug. Have you tried a content type of the form text/plain; charset=UTF-8? Because that is what you are transmitting, after all.

pieroxy avatar Apr 04 '17 14:04 pieroxy

Yes, I have tried text/plain; charset=UTF-8, but Chrome insists on sending the HTTP header as text/plain; charset=utf-8.

Since there is already a Base64 encoder in lz-string, I don't think it's a bad idea to add Base91 support. Base91 encoding produces a shorter string than Base64 encoding, which might be a good reason to add it.

Anyway, I'll keep debugging my code.

joeshae avatar Apr 04 '17 15:04 joeshae
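
To put a rough number on the size claim above: Base64 carries exactly 6 bits per character, while an ideal 91-symbol alphabet carries log2(91), about 6.5 bits. The sketch below compares encoded lengths under that idealized assumption. `encodedLength` is a hypothetical helper, not part of lz-string, and a real base91 encoder may land slightly above this bound.

```javascript
// Hypothetical helper (not part of lz-string): number of characters needed
// to carry `byteCount` bytes when each character holds `bitsPerChar` bits.
function encodedLength(byteCount, bitsPerChar) {
  return Math.ceil((byteCount * 8) / bitsPerChar);
}

var payloadBytes = 3000; // arbitrary example size

// Base64: exactly 6 bits per character.
var base64Chars = encodedLength(payloadBytes, 6);

// Idealized 91-symbol alphabet: log2(91) bits per character (a lower bound).
var base91Chars = encodedLength(payloadBytes, Math.log2(91));

console.log('Base64:', base64Chars, 'chars');
console.log('Base91 (ideal):', base91Chars, 'chars');
console.log('Saving:', (100 * (1 - base91Chars / base64Chars)).toFixed(1) + '%');
```

Under these assumptions the saving over Base64 is roughly 8%, which is real but modest compared with fixing the transport problem itself.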

Are you submitting a form or using AJAX?

pieroxy avatar Apr 04 '17 16:04 pieroxy

using AJAX.

joeshae avatar Apr 05 '17 05:04 joeshae

@pieroxy I'm trying to add some code like this, but it doesn't seem to work. Could you please take a look? Thank you very much!

...
var keyStrBase91 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"';
...
var LZString = {
  compressToBase91 : function (input) {
    if (input == null) return "";
    return LZString._compress(input, 7, function(a){return keyStrBase91.charAt(a);});
  },
  decompressFromBase91 : function (input) {
    if (input == null) return "";
    if (input == "") return null;
    // input = input.replace(/ /g, "+");
    return LZString._decompress(input.length, 64, function(index) { return getBaseValue(keyStrBase91, input.charAt(index)); });
  },
...

joeshae avatar Apr 12 '17 21:04 joeshae
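
One likely reason the snippet above fails, offered as a sketch rather than a confirmed diagnosis: with 7 bits per character, the callback passed to `_compress` can receive any value from 0 to 127, but the alphabet has only 91 characters, and `charAt` returns an empty string for any index past the end. The check below uses only plain JavaScript and does not call lz-string; `countUnmapped` is a hypothetical helper.

```javascript
var keyStrBase91 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"';

// Hypothetical helper: count the values a field of `bitsPerChar` bits can
// take that have no corresponding character in the alphabet.
function countUnmapped(alphabet, bitsPerChar) {
  var missing = 0;
  for (var value = 0; value < (1 << bitsPerChar); value++) {
    // charAt returns '' when the index is past the end of the string.
    if (alphabet.charAt(value) === '') missing++;
  }
  return missing;
}

console.log('alphabet length:', keyStrBase91.length);
console.log('7-bit values without a character:', countUnmapped(keyStrBase91, 7));
console.log('6-bit values without a character:', countUnmapped(keyStrBase91, 6));
```

Any value without a character contributes nothing to the output, so the compressed string cannot be decoded reliably. A fixed number of bits per character needs a power-of-two alphabet (64 characters for 6 bits, 128 for 7), so a 91-character alphabet would need a different packing scheme than this callback-based approach.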

I think the problem @joeshae is facing still exists on the LZString side. FYI, below is the problem I am facing, and I am desperate to find a solution.

I have the following (ugly) characters, which I am trying to send from the client (JavaScript) to the server (Java). The image data below is what I send to the server in compressed form. JavaScript code:

function callServer() {
    debugger;
    var compressed = LZString.compress('/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAA0JCgsKCA0LCgsODg0PEyAVExISEyccHhcgLikxMC4pLSwzOko+MzZGNywtQFdBRkxOUlNSMj5aYVpQYEpRUk');
    alert('compressed : ' + compressed);
    var formData = "img=" + compressed;
    jQuery.ajax({
        url: "/RegisterServlet/servlet/Register",
        type: "POST",
        data: formData,
        cache: false,
        async: false,
        success: function () {
        },
        error: function () {
        }
    });
}

Java code

public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
{
    request.setCharacterEncoding(encoding);
    String compressedData = request.getParameter("img");
    String decompress = LZString.decompress(compressedData);
}

With the above code, I am able to receive the input sent from the script side in Java as compressedData, as follows: [attachment: received broken/changed chars]

But as you can see from the items marked in red in the attachment above, some of the image data I sent is broken, has become "?" marks, or looks different.

I need to receive exactly the image data that was sent. For reference, the input and output:

[attachments: brokenimages, uglychars]

Prabh2k7 avatar Sep 13 '17 11:09 Prabh2k7

Just an idea: I thought the encoding had to be one of utf-8, utf-16 (requires BOM), utf-16le, or utf-16be.

http://unicode.org/faq/utf_bom.html

So, for utf-16, what byte order is being used, and is it all being identified correctly?

rquadling avatar Sep 13 '17 16:09 rquadling

Also, any font rendering the data may display "?" if it doesn't have the glyph for a character. The data is still present.

rquadling avatar Sep 13 '17 16:09 rquadling

@rquadling, I get it. Maybe the data is present but has no glyph. Either way, when I pass that input to the decompress method, it should work. But because characters are missing or merged, the calculations inside the decompression algorithm all go wrong! In the browser/script this is not the case. Hence I think we have to make sure the characters arrive intact!

Prabh2k7 avatar Sep 14 '17 07:09 Prabh2k7
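
A possible explanation for the damaged characters, sketched under assumptions: `LZString.compress` packs 16 bits into each character, so its output can contain unpaired surrogates, which are not valid Unicode text and cannot be represented in a UTF-8 request body. The snippet below does not call lz-string; it uses the standard `TextEncoder` and `TextDecoder` with hand-written sample strings to show the effect. `survivesUtf8` is a hypothetical helper.

```javascript
// Hypothetical helper: encode a string to UTF-8 bytes and decode it back,
// roughly what happens to a text request body sent with charset=utf-8.
function survivesUtf8(text) {
  var bytes = new TextEncoder().encode(text);
  var decoded = new TextDecoder('utf-8').decode(bytes);
  return decoded === text;
}

var plainAscii = 'img=abc123';
var pairedSurrogates = 'img=\uD83D\uDE00'; // a valid surrogate pair
var loneSurrogate = 'img=\uD83D';          // an unpaired surrogate (hand-written sample)

console.log('ASCII survives:', survivesUtf8(plainAscii));
console.log('surrogate pair survives:', survivesUtf8(pairedSurrogates));
console.log('lone surrogate survives:', survivesUtf8(loneSurrogate));
```

The lone surrogate comes back as U+FFFD, the replacement character, which is consistent with the "?" marks reported above. The lz-string JavaScript library also provides `compressToEncodedURIComponent` and `decompressFromEncodedURIComponent`, whose output is plain ASCII intended for URLs and form bodies; whether the Java port in use offers a matching method is something to verify.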