javascript – PHP LZW Binary Decompression Function

Posted by: admin, July 12, 2020

Question:

I’ve searched online and couldn’t find an LZW decompression implementation in PHP that works with the data output by these JavaScript functions:

function lzw_encode(s) {
    var dict = {};
    var data = (s + "").split("");
    var out = [];
    var currChar;
    var phrase = data[0];
    var code = 256;
    for (var i=1; i<data.length; i++) {
        currChar=data[i];
        if (dict[phrase + currChar] != null) {
            phrase += currChar;
        }
        else {
            out.push(phrase.length > 1 ? dict[phrase] : phrase.charCodeAt(0));
            dict[phrase + currChar] = code;
            code++;
            phrase=currChar;
        }
    }
    out.push(phrase.length > 1 ? dict[phrase] : phrase.charCodeAt(0));
    for (var i=0; i<out.length; i++) {
        out[i] = String.fromCharCode(out[i]);
    }
    return out.join("");
}

function lzw_decode(s) {
    var dict = {};
    var data = (s + "").split("");
    var currChar = data[0];
    var oldPhrase = currChar;
    var out = [currChar];
    var code = 256;
    var phrase;
    for (var i=1; i<data.length; i++) {
        var currCode = data[i].charCodeAt(0);
        if (currCode < 256) {
            phrase = data[i];
        }
        else {
            phrase = dict[currCode] ? dict[currCode] : (oldPhrase + currChar);
        }
        out.push(phrase);
        currChar = phrase.charAt(0);
        dict[code] = oldPhrase + currChar;
        code++;
        oldPhrase = phrase;
    }
    return out.join("");
}
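The fallback `oldPhrase + currChar` in `lzw_decode` covers the one case where the decoder receives a code it has not added to its dictionary yet. The following is a minimal sketch of that case, not part of the original code: `decodeCodes` is a hypothetical helper that follows the same steps as `lzw_decode` but takes an array of numeric codes, and the input codes were worked out by hand for the string "abababab".

```javascript
// Hypothetical helper: same steps as lzw_decode above, but operating on
// numeric codes so the dictionary growth is easier to follow.
function decodeCodes(codes) {
    const dict = new Map();               // code -> phrase, built while decoding
    let currChar = String.fromCharCode(codes[0]);
    let oldPhrase = currChar;
    const out = [currChar];
    let nextCode = 256;                   // first free dictionary code

    for (let i = 1; i < codes.length; i++) {
        const code = codes[i];
        let phrase;
        if (code < 256) {
            phrase = String.fromCharCode(code);   // literal character
        } else if (dict.has(code)) {
            phrase = dict.get(code);              // known dictionary entry
        } else {
            // Code not defined yet: it can only be the entry that is about
            // to be created, i.e. the previous phrase plus its first character.
            phrase = oldPhrase + currChar;
        }
        out.push(phrase);
        currChar = phrase.charAt(0);
        dict.set(nextCode++, oldPhrase + currChar);
        oldPhrase = phrase;
    }
    return out.join("");
}

// 258 arrives before the decoder has defined it, which triggers the fallback.
console.log(decodeCodes([97, 98, 256, 258, 98])); // abababab
```

Here code 256 stands for "ab" and code 258 for "aba"; the decoder defines 258 in the same step in which it first has to output it.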

All I really need is a PHP decompression function that works with the output of the JavaScript lzw_encode function above.

The lzw_encode function above encodes “This is a test of the compression function” as “This Ă a test ofĈhe comprĊsion functěn”

The libraries I’ve found are either buggy (http://code.google.com/p/php-lzw/) or don’t accept UTF-8 (multi-byte) characters as input.

Any help would be greatly appreciated,

Thanks!

Answer:

I’ve ported the decoder to PHP and tested it:

function lzw_decode($s) {
  mb_internal_encoding('UTF-8');

  $dict = array();
  $currChar = mb_substr($s, 0, 1);
  $oldPhrase = $currChar;
  $out = array($currChar);
  $code = 256;
  $phrase = '';

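  // Compute the character count once instead of on every iteration.
  $len = mb_strlen($s);
  for ($i=1; $i < $len; $i++) {
      // Recover the character's UTF-16 code unit, mirroring JavaScript's charCodeAt().
      $currCode = (int) implode(unpack('N*', str_pad(iconv('UTF-8', 'UTF-16BE', mb_substr($s, $i, 1)), 4, "\x00", STR_PAD_LEFT)));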
      if($currCode < 256) {
          $phrase = mb_substr($s, $i, 1);
      } else {
          $phrase = isset($dict[$currCode]) ? $dict[$currCode] : ($oldPhrase.$currChar);
      }
      $out[] = $phrase;
      $currChar = mb_substr($phrase, 0, 1);
      $dict[$code] = $oldPhrase.$currChar;
      $code++;
      $oldPhrase = $phrase;
  }
  return implode('', $out);
}
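The `mb_*` calls and the UTF-16 conversion in the port are needed because the JavaScript encoder emits every code of 256 or more as a single character that takes more than one byte in UTF-8. The sketch below illustrates that mismatch from the JavaScript side. It assumes an environment that provides the standard `TextEncoder` (modern browsers and Node.js), and the sample codes are an illustrative hand-derived encoding of "abababab", not output taken from the original post.

```javascript
// Illustrative input: the codes 97, 98, 256, 258, 98 turned into characters
// the same way lzw_encode does it.
const encoded = String.fromCharCode(97, 98, 256, 258, 98);

// JavaScript strings count UTF-16 code units: one per LZW code here.
const codeUnits = encoded.length;

// Once serialized as UTF-8 (the usual form in which PHP receives the data),
// every code >= 256 occupies two or more bytes.
const utf8Bytes = Array.from(new TextEncoder().encode(encoded));

// charCodeAt() gives back the original code; the PHP port has to recover
// the same number from the multi-byte character.
const codes = Array.from(encoded, (ch) => ch.charCodeAt(0));

console.log(codeUnits);        // 5
console.log(utf8Bytes.length); // 7
console.log(codes.join(","));  // 97,98,256,258,98
```

A byte-oriented PHP function such as `strlen` or `substr` would see seven units instead of five, which is why the port indexes the input with `mb_substr` under a UTF-8 internal encoding.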

Answer:

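There is now a PHP extension for LZW decompression. Note that the example below unpacks a Unix compress (.Z) archive, which is a different format from the strings produced by the JavaScript functions in the question: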

lzw_decompress_file('3240_05_1948-1998.tar.Z', '3240_05_1948-1998.tar');
$archive = new PharData('3240_05_1948-1998.tar');
mkdir('unpacked');
$archive->extractTo('unpacked');