
Getting A File's Mime Type In Java

Posted by: admin November 2, 2017

Questions:

I was just wondering how most people fetch a mime type from a file in Java? So far I’ve tried two utils: JMimeMagic & Mime-Util.

The first gave me memory exceptions, the second doesn’t close its streams off properly. I was just wondering if anyone else had a method/library that they used and worked correctly?

Answers:

In Java 7 you can now just use Files.probeContentType(path).

Answers:

Unfortunately,

mimeType = file.toURL().openConnection().getContentType();

does not work, since this use of URL leaves a file locked, so that, for example, it is undeletable.

However, you have this:

mimeType = URLConnection.guessContentTypeFromName(file.getName());

and also the following, which has the advantage of going beyond the file extension and taking a peek at the content:

try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {
    // guessContentTypeFromStream needs mark/reset support, which BufferedInputStream provides
    mimeType = URLConnection.guessContentTypeFromStream(is);
}

However, as suggested by the comment above, the built-in table of mime-types is quite limited, not including, for example, MSWord and PDF. So, if you want to generalize, you’ll need to go beyond the built-in libraries, using, e.g., Mime-Util (which is a great library, using both file extension and content).
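
The two built-in calls above can be combined so that the content is checked first and the file name is only a fallback. This is a minimal sketch, assuming a hypothetical helper class BuiltInMimeGuesser that is not part of the JDK; it is still limited to the small built-in table mentioned above.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;

// Hypothetical helper, not part of the JDK: combines the two built-in guessers.
class BuiltInMimeGuesser {

    /** Guesses from the content first, then from the name; returns null if both fail. */
    static String guess(InputStream content, String fileName) throws IOException {
        // guessContentTypeFromStream needs mark/reset, so wrap streams that lack it
        InputStream in = content.markSupported() ? content : new BufferedInputStream(content);
        String fromContent = URLConnection.guessContentTypeFromStream(in);
        if (fromContent != null) {
            return fromContent;
        }
        // Fall back on the extension-based lookup in the JDK's built-in table
        return URLConnection.guessContentTypeFromName(fileName);
    }
}
```

The caller remains responsible for closing the stream it passes in.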

Answers:

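The JAF API is part of JDK 6. Look at the javax.activation package. (It was removed from the JDK in Java 11, so on newer versions it has to be added as a separate dependency.)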

The most interesting classes are javax.activation.MimeType, which holds an actual MIME type, and javax.activation.MimetypesFileTypeMap, whose instances can resolve the MIME type of a file as a String:

String fileName = "/path/to/file";
MimetypesFileTypeMap mimeTypesMap = new MimetypesFileTypeMap();

// only by file name
String mimeType = mimeTypesMap.getContentType(fileName);

// or by actual File instance
File file = new File(fileName);
mimeType = mimeTypesMap.getContentType(file);

Answers:

If you’re an Android developer, you can use a utility class android.webkit.MimeTypeMap which maps MIME-types to file extensions and vice versa.

The following code snippet may help you.

private static String getMimeType(String fileUrl) {
    String extension = MimeTypeMap.getFileExtensionFromUrl(fileUrl);
    return MimeTypeMap.getSingleton().getMimeTypeFromExtension(extension);
}

Answers:

From roseindia:

FileNameMap fileNameMap = URLConnection.getFileNameMap();
String mimeType = fileNameMap.getContentTypeFor("alert.gif");

Answers:

Apache Tika offers in tika-core MIME type detection based on magic markers in the stream prefix. tika-core does not fetch other dependencies, which makes it as lightweight as the currently unmaintained Mime Type Detection Utility.

Simple code example (Java 7), using the variables theInputStream and theFileName:

try (InputStream is = theInputStream;
        BufferedInputStream bis = new BufferedInputStream(is);) {
    AutoDetectParser parser = new AutoDetectParser();
    Detector detector = parser.getDetector();
    Metadata md = new Metadata();
    md.add(Metadata.RESOURCE_NAME_KEY, theFileName);
    MediaType mediaType = detector.detect(bis, md);
    return mediaType.toString();
}

Please note that MediaType.detect(…) cannot be used directly (TIKA-1120). More hints are provided at https://tika.apache.org/0.10/detection.html.

Answers:

With Apache Tika you need only three lines of code:

File file = new File("/path/to/file");
Tika tika = new Tika();
System.out.println(tika.detect(file));

If you have a Groovy console, just paste and run this code to play with it:

@Grab('org.apache.tika:tika-core:1.14')
import org.apache.tika.Tika;

def tika = new Tika()
def file = new File("/path/to/file")
println tika.detect(file)

Keep in mind that its API is rich and that it can parse “anything”. As of tika-core 1.14, you have:

String  detect(byte[] prefix)
String  detect(byte[] prefix, String name)
String  detect(File file)
String  detect(InputStream stream)
String  detect(InputStream stream, Metadata metadata)
String  detect(InputStream stream, String name)
String  detect(Path path)
String  detect(String name)
String  detect(URL url)

See the apidocs for more information.

Answers:

If you are stuck with Java 5 or 6, you can use this utility class from the Servoy open source product:

https://github.com/Servoy/servoy-client/blob/e7f5bce3c3dc0f0eb1cd240fce48c75143a25432/servoy_shared/src/com/servoy/j2db/util/MimeTypes.java#L34

You only need this function:

public static String getContentType(byte[] data, String name)

It probes the first bytes of the content and returns the content type based on that content, not on the file extension.

Answers:

I was just wondering how most people fetch a mime type from a file in Java?

I’ve published my SimpleMagic Java package which allows content-type (mime-type) determination from files and byte arrays. It is designed to read and run the Unix file(1) command magic files that are a part of most Unix-like OS configurations.

I tried Apache Tika but it is huge with tons of dependencies, URLConnection doesn’t use the bytes of the files, and MimetypesFileTypeMap also just looks at file names.

With SimpleMagic you can do something like:

// create a magic utility using the internal magic file
ContentInfoUtil util = new ContentInfoUtil();
// if you want to use a different config file(s), you can load them by hand:
// ContentInfoUtil util = new ContentInfoUtil("/etc/magic");
...
ContentInfo info = util.findMatch("/tmp/upload.tmp");
// or
ContentInfo info = util.findMatch(inputStream);
// or
ContentInfo info = util.findMatch(contentByteArray);

// null if no match
if (info != null) {
   String mimeType = info.getMimeType();
}

Answers:

I tried several ways to do it, including the first ones mentioned by @Joshua Fox. Some don’t recognize common MIME types such as PDF, and others could not be trusted with disguised files (I tried a RAR file with its extension changed to TIF). The solution I found, which @Joshua Fox also mentions briefly, is to use MimeUtil2, like this:

MimeUtil2 mimeUtil = new MimeUtil2();
mimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
String mimeType = MimeUtil2.getMostSpecificMimeType(mimeUtil.getMimeTypes(file)).toString();

Answers:

It is better to use two-layer validation for file uploads.

First, check the MIME type and validate it.

Second, convert the first 4 bytes of the file to hexadecimal and compare them with the known magic numbers. Combining both checks is a much more reliable way to validate files than trusting the MIME type alone.
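
The second layer could look roughly like the sketch below. The class name MagicNumberCheck and its small signature table are assumptions made for illustration; a real validator would need a far more complete list of magic numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: the signature table is illustrative and far from complete.
class MagicNumberCheck {
    private static final Map<String, String> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("25504446", "application/pdf"); // "%PDF"
        SIGNATURES.put("89504E47", "image/png");       // 0x89 "PNG"
        SIGNATURES.put("47494638", "image/gif");       // "GIF8"
        SIGNATURES.put("504B0304", "application/zip"); // "PK" 0x03 0x04
    }

    /** Converts up to the first four bytes to upper-case hexadecimal. */
    static String firstFourBytesAsHex(byte[] data) {
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < Math.min(4, data.length); i++) {
            // Mask with 0xFF so negative bytes print as two hex digits
            hex.append(String.format("%02X", data[i] & 0xFF));
        }
        return hex.toString();
    }

    /** True only if the magic number is known and matches the claimed MIME type. */
    static boolean matchesClaimedType(byte[] data, String claimedMimeType) {
        String detected = SIGNATURES.get(firstFourBytesAsHex(data));
        return detected != null && detected.equals(claimedMimeType);
    }
}
```

With this sketch, a RAR archive uploaded under a claimed type of image/tiff is rejected, because its leading bytes do not map to that type in the table.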

Answers:

This is the simplest way I found for doing this:

byte[] byteArray = ...
InputStream is = new BufferedInputStream(new ByteArrayInputStream(byteArray));
String mimeType = URLConnection.guessContentTypeFromStream(is);

Answers:

If you work on Linux, you can call the command-line tool file --mime-type:

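import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

static String mimetype(String file) throws IOException, InterruptedException {
    // 1. run the command; -b prints only the MIME type, without the file name prefix
    Process process = new ProcessBuilder("file", "-b", "--mime-type", file).start();

    // 2. read the first line of the command's output
    String output;
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(process.getInputStream()))) {
        output = reader.readLine();
    }
    process.waitFor();

    // 3. return the trimmed MIME type, or "" if the command printed nothing
    return output != null ? output.trim() : "";
}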

Then

mimetype("/home/nyapp.war") //  'application/zip'

mimetype("/var/www/ggg/au.mp3") //  'audio/mp3'

Answers:

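In Spring, given a MultipartFile file (org.springframework.web.multipart.MultipartFile), you can call:

file.getContentType();

Note that this returns the content type declared in the upload request, not one detected from the file's content.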

Answers:

After trying various other libraries I settled with mime-util.

<dependency>
      <groupId>eu.medsea.mimeutil</groupId>
      <artifactId>mime-util</artifactId>
      <version>2.1.3</version>
</dependency>

File file = new File("D:/test.tif");
MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
Collection<?> mimeTypes = MimeUtil.getMimeTypes(file);
System.out.println(mimeTypes);

Answers:

To chip in with my 5 cents:

TL;DR

I use MimetypesFileTypeMap and add any MIME type that I specifically need and that is missing to the mime.types file.

And now, the long read:

First of all, the list of MIME types is huge, see here: https://www.iana.org/assignments/media-types/media-types.xhtml

I like to use standard facilities provided by JDK first, and if that doesn’t work, I’ll go and look for something else.

Determine file type from file extension

Since 1.6, Java has MimetypesFileTypeMap, as pointed out in one of the answers above, and it is the simplest way to determine the MIME type:

new MimetypesFileTypeMap().getContentType( fileName );

In its vanilla implementation this does not do much (for example, it works for .html but not for .png). It is, however, super simple to add any content type you may need:

  1. Create a file named ‘mime.types’ in the META-INF folder of your project.
  2. Add a line for every MIME type you need that the default implementation doesn’t provide (there are hundreds of MIME types and the list grows as time goes by).

Example entries for png and js files would be:

image/png png PNG
application/javascript js

For more details on the mime.types file format, see: https://docs.oracle.com/javase/7/docs/api/javax/activation/MimetypesFileTypeMap.html

Determine file type from file content

Since 1.7, Java has java.nio.file.spi.FileTypeDetector, which defines a standard API for determining a file type in an implementation-specific way.

To fetch the MIME type of a file, you would simply use Files and do this in your code:

Files.probeContentType(Paths.get("either file name or full path goes here"));

The API definition allows implementations to determine a file's MIME type either from the file name or from the file content (magic bytes). That is why the probeContentType() method throws IOException: an implementation of this API may use the Path provided to it to actually open the associated file.

Again, the vanilla implementation of this (the one that comes with the JDK) leaves a lot to be desired.

In some ideal world in a galaxy far, far away, all the libraries that try to solve this file-to-MIME-type problem would simply implement java.nio.file.spi.FileTypeDetector; you would drop the preferred implementing library’s jar file into your classpath and that would be it.
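
To give an idea of what such an implementation involves, here is a minimal sketch of a content-based detector. The class name PdfMagicDetector is hypothetical and it recognizes only PDF. To be picked up by Files.probeContentType() it would also have to be a public class listed in a META-INF/services/java.nio.file.spi.FileTypeDetector file, which is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.spi.FileTypeDetector;
import java.util.Arrays;

// Hypothetical detector: recognizes only PDF, by its leading "%PDF" bytes.
class PdfMagicDetector extends FileTypeDetector {
    private static final byte[] PDF_MAGIC = {'%', 'P', 'D', 'F'};

    @Override
    public String probeContentType(Path path) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(path)) {
            // read() may return fewer bytes than requested, so loop until full or EOF
            int n;
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) != -1) {
                filled += n;
            }
        }
        boolean isPdf = filled == head.length && Arrays.equals(head, PDF_MAGIC);
        // Returning null means "not recognized", so other detectors can be tried
        return isPdf ? "application/pdf" : null;
    }
}
```

Because the detector opens the file, it is an instance of the case described above where probeContentType() can throw IOException.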

In the real world, the one where you need a TL;DR section, you should find the library with the most stars next to its name and use it. For this particular case, I don’t need one (yet 😉 ).

Answers:
public String getFileContentType(String fileName) {
    String fileType = "Undetermined";
    final File file = new File(fileName);
    try
    {
        fileType = Files.probeContentType(file.toPath());
    }
    catch (IOException ioException)
    {
        System.out.println(
                "ERROR: Unable to determine file type for " + fileName
                        + " due to exception " + ioException);
    }
    return fileType;
}