
Where to get “UTF-8” string literal in Java?

Posted by: admin November 2, 2017

Questions:

I’m trying to use a constant instead of a string literal in this piece of code:

new InputStreamReader(new FileInputStream(file), "UTF-8")

"UTF-8" appears in the code rather often, and it would be much better to refer to a static final variable instead. Do you know where I can find such a variable in the JDK?

BTW, on second thought, such constants are bad design: Public Static Literals … Are Not a Solution for Data Duplication

Answers:

In Java 1.7+, java.nio.charset.StandardCharsets defines Charset constants, including UTF_8.

import java.nio.charset.StandardCharsets;

...

String charsetName = StandardCharsets.UTF_8.name(); // "UTF-8"

On Android, this requires minSdk 19 or higher.
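Applied to the code in the question, the constant can be passed directly to the reader, because InputStreamReader has a constructor that accepts a Charset. The following is a minimal sketch, assuming Java 7 or later; the class name Utf8ReaderExample and the helper readFirstLine are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical example class; not part of the JDK.
final class Utf8ReaderExample {

    // Reads the first line of a file, decoding its bytes as UTF-8.
    static String readFirstLine(File file) throws IOException {
        // The Charset overload replaces the "UTF-8" string literal.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file to a temporary location, then read it back.
        File file = File.createTempFile("utf8-example", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "h\u00e9llo w\u00f6rld".getBytes(StandardCharsets.UTF_8));
        System.out.println(readFirstLine(file));
    }
}
```

With the Charset overload there is no need to call name() at all; that is only required for APIs that accept nothing but a charset name.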

Answers:

I now use the org.apache.commons.lang3.CharEncoding.UTF_8 constant from commons-lang.

Answers:

The Google Guava library (which I’d highly recommend anyway, if you’re doing work in Java) has a Charsets class with static fields like Charsets.UTF_8, Charsets.UTF_16, etc.

Since Java 7, you should just use java.nio.charset.StandardCharsets instead, which provides comparable constants.

Note that these constants aren’t strings; they’re actual Charset instances. Most standard APIs that take a charset name also have an overload that takes a Charset object, which you should use instead.
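One practical difference between the two kinds of overload can be sketched with String.getBytes: the variant that takes a charset name declares the checked UnsupportedEncodingException, while the variant that takes a Charset does not. The class and method names below are hypothetical and serve only as illustration.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical example class comparing the two overload styles.
final class CharsetOverloadExample {

    // Name-based overload: the compiler forces handling of a checked
    // exception, even though UTF-8 is always supported.
    static byte[] encodeByName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Charset-based overload: no checked exception and no string literal.
    static byte[] encodeByCharset(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding also has a Charset-based constructor.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Both encode methods produce the same bytes; the Charset version simply avoids the try/catch and the risk of a mistyped charset name.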

Answers:

In case this page comes up in someone’s web search: as of Java 1.7, you can use java.nio.charset.StandardCharsets to get access to constant definitions of the standard charsets.

Answers:

There are none (at least in the standard Java library). Character sets vary from platform to platform so there isn’t a standard list of them in Java.

There are some third-party libraries that contain these constants, though. One of them is Guava (Google core libraries): http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/base/Charsets.html

Answers:

You can use the Charset.defaultCharset() API or the file.encoding system property.

But if you want your own constant, you’ll need to define it yourself.
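A self-defined constant could look like the sketch below. The class name Encodings and its members are hypothetical. Keep in mind that Charset.defaultCharset() and file.encoding report whatever the platform and JVM settings dictate, which may or may not be UTF-8, so they are not a substitute for naming UTF-8 explicitly.

```java
import java.nio.charset.Charset;

// Hypothetical holder class for a project-wide charset constant.
final class Encodings {

    // The charset name, for APIs that only accept a String.
    static final String UTF_8_NAME = "UTF-8";

    // The Charset instance, for APIs that accept a Charset.
    // UTF-8 is required to be supported by every Java platform.
    static final Charset UTF_8 = Charset.forName(UTF_8_NAME);

    private Encodings() {
        // Constants only; not meant to be instantiated.
    }

    // Reports the JVM default charset and the file.encoding property.
    // The result depends on the platform and JVM configuration.
    static String describeDefault() {
        return Charset.defaultCharset().name()
                + " / " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println("Explicit: " + UTF_8.name());
        System.out.println("Default:  " + describeDefault());
    }
}
```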

Answers:

This constant is also available (along with others, such as UTF-16 and US-ASCII) in the class org.apache.commons.codec.CharEncoding.

Answers:

If you are using OkHttp for Java/Android, you can use the following constant:

import com.squareup.okhttp.internal.Util;
import java.nio.charset.Charset;

Charset utf8 = Util.UTF_8;            // Charset
String utf8Name = Util.UTF_8.name();  // String