Home » Java » Decode Base64 data in Java

Decode Base64 data in Java

Posted by: admin November 2, 2017 Leave a comment

Questions:

I have an image that is Base64 encoded. What is the best way to decode that in Java? Hopefully using only the libraries included with Sun Java 6.

Answers:

As of v6, Java SE ships with JAXB. javax.xml.bind.DatatypeConverter has static methods that make this easy. See parseBase64Binary() and printBase64Binary().

Questions:
Answers:

As of Java 8, there is an officially supported API for Base64 encoding and decoding.
In time this will probably become the default choice.

The API includes the class java.util.Base64 and its nested classes. It supports three different flavors: basic, URL safe, and MIME.

Sample code using the “basic” encoding:

import java.util.Base64;

byte[] bytes = "Hello, World!".getBytes("UTF-8");
String encoded = Base64.getEncoder().encodeToString(bytes);
byte[] decoded = Base64.getDecoder().decode(encoded);

The documentation for java.util.Base64 includes several more methods for configuring encoders and decoders, and for using different classes as inputs and outputs (byte arrays, strings, ByteBuffers, java.io streams).

Questions:
Answers:

Here is a working example using Apache Commons codec:

import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.binary.StringUtils;

public String decode(String s) {
    return StringUtils.newStringUtf8(Base64.decodeBase64(s));
}
public String encode(String s) {
    return Base64.encodeBase64String(StringUtils.getBytesUtf8(s));
}

Maven / sbt repo: commons-codec, commons-codec, 1.8.

Questions:
Answers:

No need to use commons–Sun ships a base64 encoder with Java. You can import it as such:

import sun.misc.BASE64Decoder;

And then use it like this:

BASE64Decoder decoder = new BASE64Decoder();
byte[] decodedBytes = decoder.decodeBuffer(encodedBytes);

Where encodedBytes is either a java.lang.String or a java.io.InputStream. Just beware that the sun.* classes are not “officially supported” by Sun.

EDIT: Who knew this would be the most controversial answer I’d ever post? I do know that sun.* packages are not supported or guaranteed to continue existing, and I do know about Commons and use it all the time. However, the poster asked for a class that that was “included with Sun Java 6,” and that’s what I was trying to answer. I agree that Commons is the best way to go in general.

EDIT 2: As amir75 points out below, Java 6+ ships with JAXB, which contains supported code to encode/decode Base64. Please see Jeremy Ross’ answer below.

Questions:
Answers:

Specifically in Commons Codec: class Base64 to decode(byte[] array) or encode(byte[] array)

Questions:
Answers:

Guava now has Base64 decoding built in.

Use BaseEncoding.base64().decode()

As for dealing with possible whitespace in input use

BaseEncoding.base64().decode(CharMatcher.WHITESPACE.removeFrom(...));

See this discussion for more information

Questions:
Answers:

My solution is fastest and easiest.

public class MyBase64 {

    private final static char[] ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    private static int[]  toInt   = new int[128];

    static {
        for(int i=0; i< ALPHABET.length; i++){
            toInt[ALPHABET[i]]= i;
        }
    }

    /**
     * Translates the specified byte array into Base64 string.
     *
     * @param buf the byte array (not null)
     * @return the translated Base64 string (not null)
     */
    public static String encode(byte[] buf){
        int size = buf.length;
        char[] ar = new char[((size + 2) / 3) * 4];
        int a = 0;
        int i=0;
        while(i < size){
            byte b0 = buf[i++];
            byte b1 = (i < size) ? buf[i++] : 0;
            byte b2 = (i < size) ? buf[i++] : 0;

            int mask = 0x3F;
            ar[a++] = ALPHABET[(b0 >> 2) & mask];
            ar[a++] = ALPHABET[((b0 << 4) | ((b1 & 0xFF) >> 4)) & mask];
            ar[a++] = ALPHABET[((b1 << 2) | ((b2 & 0xFF) >> 6)) & mask];
            ar[a++] = ALPHABET[b2 & mask];
        }
        switch(size % 3){
            case 1: ar[--a]  = '=';
            case 2: ar[--a]  = '=';
        }
        return new String(ar);
    }

    /**
     * Translates the specified Base64 string into a byte array.
     *
     * @param s the Base64 string (not null)
     * @return the byte array (not null)
     */
    public static byte[] decode(String s){
        int delta = s.endsWith( "==" ) ? 2 : s.endsWith( "=" ) ? 1 : 0;
        byte[] buffer = new byte[s.length()*3/4 - delta];
        int mask = 0xFF;
        int index = 0;
        for(int i=0; i< s.length(); i+=4){
            int c0 = toInt[s.charAt( i )];
            int c1 = toInt[s.charAt( i + 1)];
            buffer[index++]= (byte)(((c0 << 2) | (c1 >> 4)) & mask);
            if(index >= buffer.length){
                return buffer;
            }
            int c2 = toInt[s.charAt( i + 2)];
            buffer[index++]= (byte)(((c1 << 4) | (c2 >> 2)) & mask);
            if(index >= buffer.length){
                return buffer;
            }
            int c3 = toInt[s.charAt( i + 3 )];
            buffer[index++]= (byte)(((c2 << 6) | c3) & mask);
        }
        return buffer;
    } 

}

Questions:
Answers:

Here’s my own implementation, if it could be useful to someone :

Questions:
Answers:

As an alternative to sun.misc.BASE64Decoder or non-core libraries, look at javax.mail.internet.MimeUtility.decode().

Jist:

Link with full code: Encode/Decode to/from Base64

Questions:
Answers:

Given a test encode/decode example of javax.xml.bind.DatatypeConverter using methods parseBase64Binary() and printBase64Binary() referring to @jeremy-ross and @nightfirecat answer.

Result:

Questions:
Answers:

Another late answer, but my benchmarking shows that Jetty’s implementation of Base64 encoder is pretty fast. Not as fast as MiGBase64 but faster than iHarder Base64.

I also did some benchmarks:

These are runs/sec so higher is better.

Questions:
Answers:

If You are prefering performance based solution then you can use “MiGBase64”

http://migbase64.sourceforge.net/

Questions:
Answers:

Hope this helps you:

Or:

java.util.prefs.Base64 works on local rt.jar,

But it is not in The JRE Class White List

and not in Available classes not listed in the GAE/J white-list

What a pity!

PS. In android, it’s easy because that android.util.Base64 has been included since Android API Level 8.

Questions:
Answers:

This is a late answer, but Joshua Bloch committed his Base64 class (when he was working for Sun, ahem, Oracle) under the java.util.prefs package. This class existed since JDK 1.4.

E.g.

Questions:
Answers:

You can write or download file from encoded Base64 string:

Worked for me and hopefully for you also…

Questions:
Answers:

If you are already adding AWS SDK for Java, you can use com.amazonaws.util.Base64.

Questions:
Answers:

The Java 8 implementation of java.util.Base64 has no dependencies on other Java 8 specific classes.

I am not certain if this will work for Java 6 project, but it is possible to copy and paste the Base64.java file into a Java 7 project and compile it with no modification other than importing java.util.Arrays and java.util.Objects.

Note the Base64.java file is covered by the GNU GPL2

Questions:
Answers:

I used android.util.base64 that works pretty good without any dependances:

Usage:


package com.test;

import java.io.UnsupportedEncodingException;
/**
* Utilities for encoding and decoding the Base64 representation of
* binary data.  See RFCs <a
* href="http://www.ietf.org/rfc/rfc2045.txt">2045</a> and <a
* href="http://www.ietf.org/rfc/rfc3548.txt">3548</a>.
*/
public class Base64 {
public static final int DEFAULT = 0;
public static final int NO_PADDING = 1;
public static final int NO_WRAP = 2;
public static final int CRLF = 4;
public static final int URL_SAFE = 8;
public static final int NO_CLOSE = 16;
//  --------------------------------------------------------
//  shared code
//  --------------------------------------------------------
/* package */ static abstract class Coder {
public byte[] output;
public int op;
public abstract boolean process(byte[] input, int offset, int len, boolean finish);
public abstract int maxOutputSize(int len);
}
//  --------------------------------------------------------
//  decoding
//  --------------------------------------------------------
public static byte[] decode(String str, int flags) {
return decode(str.getBytes(), flags);
}
public static byte[] decode(byte[] input, int flags) {
return decode(input, 0, input.length, flags);
}
public static byte[] decode(byte[] input, int offset, int len, int flags) {
// Allocate space for the most data the input could represent.
// (It could contain less if it contains whitespace, etc.)
Decoder decoder = new Decoder(flags, new byte[len*3/4]);
if (!decoder.process(input, offset, len, true)) {
throw new IllegalArgumentException("bad base-64");
}
// Maybe we got lucky and allocated exactly enough output space.
if (decoder.op == decoder.output.length) {
return decoder.output;
}
// Need to shorten the array, so allocate a new one of the
// right size and copy.
byte[] temp = new byte[decoder.op];
System.arraycopy(decoder.output, 0, temp, 0, decoder.op);
return temp;
}
static class Decoder extends Coder {       
private static final int DECODE[] = {
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1, -1, 63,
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -2, -1, -1,
-1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, -1,
-1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
};
/**
* Decode lookup table for the "web safe" variant (RFC 3548
* sec. 4) where - and _ replace + and /.
*/
private static final int DECODE_WEBSAFE[] = {
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1,
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -2, -1, -1,
-1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, 63,
-1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
};
/** Non-data values in the DECODE arrays. */
private static final int SKIP = -1;
private static final int EQUALS = -2;
private int state;   // state number (0 to 6)
private int value;
final private int[] alphabet;
public Decoder(int flags, byte[] output) {
this.output = output;
alphabet = ((flags & URL_SAFE) == 0) ? DECODE : DECODE_WEBSAFE;
state = 0;
value = 0;
}
public int maxOutputSize(int len) {
return len * 3/4 + 10;
}
/**
* Decode another block of input data.
*
* @return true if the state machine is still healthy.  false if
*         bad base-64 data has been detected in the input stream.
*/
public boolean process(byte[] input, int offset, int len, boolean finish) {
if (this.state == 6) return false;
int p = offset;
len += offset;
int state = this.state;
int value = this.value;
int op = 0;
final byte[] output = this.output;
final int[] alphabet = this.alphabet;
while (p < len) {                
if (state == 0) {
while (p+4 <= len &&
(value = ((alphabet[input[p] & 0xff] << 18) |
(alphabet[input[p+1] & 0xff] << 12) |
(alphabet[input[p+2] & 0xff] << 6) |
(alphabet[input[p+3] & 0xff]))) >= 0) {
output[op+2] = (byte) value;
output[op+1] = (byte) (value >> 8);
output[op] = (byte) (value >> 16);
op += 3;
p += 4;
}
if (p >= len) break;
}
int d = alphabet[input[p++] & 0xff];
switch (state) {
case 0:
if (d >= 0) {
value = d;
++state;
} else if (d != SKIP) {
this.state = 6;
return false;
}
break;
case 1:
if (d >= 0) {
value = (value << 6) | d;
++state;
} else if (d != SKIP) {
this.state = 6;
return false;
}
break;
case 2:
if (d >= 0) {
value = (value << 6) | d;
++state;
} else if (d == EQUALS) {
// Emit the last (partial) output tuple;
// expect exactly one more padding character.
output[op++] = (byte) (value >> 4);
state = 4;
} else if (d != SKIP) {
this.state = 6;
return false;
}
break;
case 3:
if (d >= 0) {
// Emit the output triple and return to state 0.
value = (value << 6) | d;
output[op+2] = (byte) value;
output[op+1] = (byte) (value >> 8);
output[op] = (byte) (value >> 16);
op += 3;
state = 0;
} else if (d == EQUALS) {
// Emit the last (partial) output tuple;
// expect no further data or padding characters.
output[op+1] = (byte) (value >> 2);
output[op] = (byte) (value >> 10);
op += 2;
state = 5;
} else if (d != SKIP) {
this.state = 6;
return false;
}
break;
case 4:
if (d == EQUALS) {
++state;
} else if (d != SKIP) {
this.state = 6;
return false;
}
break;
case 5:
if (d != SKIP) {
this.state = 6;
return false;
}
break;
}
}
if (!finish) {
// We're out of input, but a future call could provide
// more.
this.state = state;
this.value = value;
this.op = op;
return true;
}
switch (state) {
case 0:
break;
case 1:
this.state = 6;
return false;
case 2:
output[op++] = (byte) (value >> 4);
break;
case 3:
output[op++] = (byte) (value >> 10);
output[op++] = (byte) (value >> 2);
break;
case 4:
this.state = 6;
return false;
case 5:
break;
}
this.state = state;
this.op = op;
return true;
}
}
//  --------------------------------------------------------
//  encoding
//  --------------------------------------------------------    
public static String encodeToString(byte[] input, int flags) {
try {
return new String(encode(input, flags), "US-ASCII");
} catch (UnsupportedEncodingException e) {
// US-ASCII is guaranteed to be available.
throw new AssertionError(e);
}
}
public static String encodeToString(byte[] input, int offset, int len, int flags) {
try {
return new String(encode(input, offset, len, flags), "US-ASCII");
} catch (UnsupportedEncodingException e) {
// US-ASCII is guaranteed to be available.
throw new AssertionError(e);
}
}
public static byte[] encode(byte[] input, int flags) {
return encode(input, 0, input.length, flags);
}
public static byte[] encode(byte[] input, int offset, int len, int flags) {
Encoder encoder = new Encoder(flags, null);
// Compute the exact length of the array we will produce.
int output_len = len / 3 * 4;
// Account for the tail of the data and the padding bytes, if any.
if (encoder.do_padding) {
if (len % 3 > 0) {
output_len += 4;
}
} else {
switch (len % 3) {
case 0: break;
case 1: output_len += 2; break;
case 2: output_len += 3; break;
}
}
// Account for the newlines, if any.
if (encoder.do_newline && len > 0) {
output_len += (((len-1) / (3 * Encoder.LINE_GROUPS)) + 1) *
(encoder.do_cr ? 2 : 1);
}
encoder.output = new byte[output_len];
encoder.process(input, offset, len, true);
assert encoder.op == output_len;
return encoder.output;
}
/* package */ static class Encoder extends Coder {
/**
* Emit a new line every this many output tuples.  Corresponds to
* a 76-character line length (the maximum allowable according to
* <a href="http://www.ietf.org/rfc/rfc2045.txt">RFC 2045</a>).
*/
public static final int LINE_GROUPS = 19;
/**
* Lookup table for turning Base64 alphabet positions (6 bits)
* into output bytes.
*/
private static final byte ENCODE[] = {
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/',
};
/**
* Lookup table for turning Base64 alphabet positions (6 bits)
* into output bytes.
*/
private static final byte ENCODE_WEBSAFE[] = {
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '-', '_',
};
final private byte[] tail;
/* package */ int tailLen;
private int count;
final public boolean do_padding;
final public boolean do_newline;
final public boolean do_cr;
final private byte[] alphabet;
public Encoder(int flags, byte[] output) {
this.output = output;
do_padding = (flags & NO_PADDING) == 0;
do_newline = (flags & NO_WRAP) == 0;
do_cr = (flags & CRLF) != 0;
alphabet = ((flags & URL_SAFE) == 0) ? ENCODE : ENCODE_WEBSAFE;
tail = new byte[2];
tailLen = 0;
count = do_newline ? LINE_GROUPS : -1;
}
/**
* @return an overestimate for the number of bytes {@code
* len} bytes could encode to.
*/
public int maxOutputSize(int len) {
return len * 8/5 + 10;
}
public boolean process(byte[] input, int offset, int len, boolean finish) {
// Using local variables makes the encoder about 9% faster.
final byte[] alphabet = this.alphabet;
final byte[] output = this.output;
int op = 0;
int count = this.count;
int p = offset;
len += offset;
int v = -1;
// First we need to concatenate the tail of the previous call
// with any input bytes available now and see if we can empty
// the tail.
switch (tailLen) {
case 0:
// There was no tail.
break;
case 1:
if (p+2 <= len) {
// A 1-byte tail with at least 2 bytes of
// input available now.
v = ((tail[0] & 0xff) << 16) |
((input[p++] & 0xff) << 8) |
(input[p++] & 0xff);
tailLen = 0;
};
break;
case 2:
if (p+1 <= len) {
// A 2-byte tail with at least 1 byte of input.
v = ((tail[0] & 0xff) << 16) |
((tail[1] & 0xff) << 8) |
(input[p++] & 0xff);
tailLen = 0;
}
break;
}
if (v != -1) {
output[op++] = alphabet[(v >> 18) & 0x3f];
output[op++] = alphabet[(v >> 12) & 0x3f];
output[op++] = alphabet[(v >> 6) & 0x3f];
output[op++] = alphabet[v & 0x3f];
if (--count == 0) {
if (do_cr) output[op++] = '\r';
output[op++] = '\n';
count = LINE_GROUPS;
}
}
// At this point either there is no tail, or there are fewer
// than 3 bytes of input available.
// The main loop, turning 3 input bytes into 4 output bytes on
// each iteration.
while (p+3 <= len) {
v = ((input[p] & 0xff) << 16) |
((input[p+1] & 0xff) << 8) |
(input[p+2] & 0xff);
output[op] = alphabet[(v >> 18) & 0x3f];
output[op+1] = alphabet[(v >> 12) & 0x3f];
output[op+2] = alphabet[(v >> 6) & 0x3f];
output[op+3] = alphabet[v & 0x3f];
p += 3;
op += 4;
if (--count == 0) {
if (do_cr) output[op++] = '\r';
output[op++] = '\n';
count = LINE_GROUPS;
}
}
if (finish) {
if (p-tailLen == len-1) {
int t = 0;
v = ((tailLen > 0 ? tail[t++] : input[p++]) & 0xff) << 4;
tailLen -= t;
output[op++] = alphabet[(v >> 6) & 0x3f];
output[op++] = alphabet[v & 0x3f];
if (do_padding) {
output[op++] = '=';
output[op++] = '=';
}
if (do_newline) {
if (do_cr) output[op++] = '\r';
output[op++] = '\n';
}
} else if (p-tailLen == len-2) {
int t = 0;
v = (((tailLen > 1 ? tail[t++] : input[p++]) & 0xff) << 10) |
(((tailLen > 0 ? tail[t++] : input[p++]) & 0xff) << 2);
tailLen -= t;
output[op++] = alphabet[(v >> 12) & 0x3f];
output[op++] = alphabet[(v >> 6) & 0x3f];
output[op++] = alphabet[v & 0x3f];
if (do_padding) {
output[op++] = '=';
}
if (do_newline) {
if (do_cr) output[op++] = '\r';
output[op++] = '\n';
}
} else if (do_newline && op > 0 && count != LINE_GROUPS) {
if (do_cr) output[op++] = '\r';
output[op++] = '\n';
}
assert tailLen == 0;
assert p == len;
} else {
// Save the leftovers in tail to be consumed on the next
// call to encodeInternal.
if (p == len-1) {
tail[tailLen++] = input[p];
} else if (p == len-2) {
tail[tailLen++] = input[p];
tail[tailLen++] = input[p+1];
}
}
this.op = op;
this.count = count;
return true;
}
}
private Base64() { }   // don't instantiate
}