php – Will md5(file_contents_as_string) equal md5_file(/path/to/file)?

Posted by: admin April 23, 2020

Question:

If I do:

<?php echo md5(file_get_contents("/path/to/file")) ?>

…will this always produce the same hash as:

<?php echo md5_file("/path/to/file") ?>

Answers:

Yes, they return the same hash:

var_dump(md5(file_get_contents(__FILE__)));
var_dump(md5_file(__FILE__));

which returns this in my case:

string(32) "4d2aec3ae83694513cb9bde0617deeea"
string(32) "4d2aec3ae83694513cb9bde0617deeea"

Edit:
Take a look at the source code of both functions: https://github.com/php/php-src/blob/master/ext/standard/md5.c (Line 47 & 76). They both use the same functions to generate the hash except that the md5_file() function opens the file first.
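
As a rough illustration of that point, the sketch below hashes the same file three ways: md5() on the full contents, md5_file() on the path, and an incremental hash over fixed-size chunks using the hash extension. The helper name md5_stream_sketch and the temporary file are assumptions made for this example; they are not part of PHP's source.

```php
<?php
// Hypothetical helper (not part of PHP): hash a file chunk by chunk,
// without loading the whole file into memory.
function md5_stream_sketch(string $path, int $chunkSize = 8192): string
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $ctx = hash_init('md5');
    while (!feof($fh)) {
        $chunk = fread($fh, $chunkSize);
        if ($chunk === false) {
            break;
        }
        hash_update($ctx, $chunk);
    }
    fclose($fh);

    return hash_final($ctx);
}

// Temporary file used only for this demonstration.
$path = tempnam(sys_get_temp_dir(), 'md5');
file_put_contents($path, str_repeat("some test data\n", 2000));

$viaString = md5(file_get_contents($path)); // whole file read into a string
$viaFile   = md5_file($path);               // PHP opens and reads the file
$viaStream = md5_stream_sketch($path);      // chunked read, same bytes

var_dump($viaString === $viaFile);  // bool(true)
var_dump($viaFile === $viaStream);  // bool(true)

unlink($path);
```

All three see the same bytes, so the digests match. The practical difference is that file_get_contents() has to hold the entire file in memory first, which may matter for large files.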

2nd Edit:
Basically, md5_file() generates the hash from the file contents, not from file metadata such as the filename. This is the same way md5sum works on Linux systems.
See this example:

$ echo foobar > foo.txt
$ md5sum foo.txt
14758f1afd44c09b7992073ccf00b43d  foo.txt
$ mv foo.txt bar.txt
$ md5sum bar.txt
14758f1afd44c09b7992073ccf00b43d  bar.txt
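
The same experiment can be sketched in PHP. Note that echo appends a newline, so the file holds "foobar\n" rather than "foobar". The temporary directory and variable names below are assumptions made for this example.

```php
<?php
// Work in a fresh temporary directory so no existing files are touched.
$dir = sys_get_temp_dir() . '/md5_rename_' . uniqid();
mkdir($dir);

$old = "$dir/foo.txt";
$new = "$dir/bar.txt";

// `echo foobar > foo.txt` writes the text plus a trailing newline.
file_put_contents($old, "foobar\n");
$before = md5_file($old);

rename($old, $new);
$after = md5_file($new);

var_dump($before === $after);           // bool(true): the name is not hashed
var_dump($before === md5("foobar\n"));  // bool(true): only the bytes matter
var_dump($before === md5("foobar"));    // bool(false): the newline is content

unlink($new);
rmdir($dir);
```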

Answer:

md5_file() simply hashes the content of a file with MD5.

Consider this old PHP implementation of md5_file (the principle is still the same):

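function php_compat_md5_file($filename, $raw_output = false)
{
    // ...
    // removed protections (argument checks elided in this excerpt)

    // Assumed from context: the elided code opens the file in binary mode as $fh
    $fh = fopen($filename, 'rb');
    if ($fh === false) {
        return false;
    }

    // Read the whole file: in one call if the size is known,
    // otherwise in 8192-byte chunks until end of file
    if ($fsize = @filesize($filename)) {
        $data = fread($fh, $fsize);
    } else {
        $data = '';
        while (!feof($fh)) {
            $data .= fread($fh, 8192);
        }
    }

    fclose($fh);

    // Hash the contents exactly as md5() would hash any string
    $data = md5($data);
    if ($raw_output === true) {
        // Convert the 32-character hex digest to 16 raw bytes
        $data = pack('H*', $data);
    }

    return $data;
}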

So if you hash a string with md5(), you will always get the same result as md5_file() on a file with exactly the same bytes (same content, same encoding).

In other words, whether you hash the content of a file with md5(file_get_contents()), use md5_file(), or run the md5sum command on a file with the same content, you will always get the same result.
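
The raw-output branch of the compatibility function relies on pack('H*', ...) turning the 32-character hex digest into 16 raw bytes. A small sketch of that equivalence, assuming a PHP version where md5() accepts a second boolean argument for binary output (the variable names are made up for this example):

```php
<?php
$data = "stackoverflow";

$hex       = md5($data);        // 32 hexadecimal characters
$rawPacked = pack('H*', $hex);  // hex digest converted to 16 raw bytes
$rawNative = md5($data, true);  // raw digest straight from md5()

var_dump(strlen($hex));                  // int(32)
var_dump(strlen($rawNative));            // int(16)
var_dump($rawPacked === $rawNative);     // bool(true)
var_dump(bin2hex($rawNative) === $hex);  // bool(true)
```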

For example, you can rename a file without changing its hash, and two different files with the same content will produce the same MD5 hash.

Consider two files named 1.txt and 2.txt, each containing “stackoverflow” (without the quotes):

md5_file("1.txt");
md5_file("2.txt");

would both return

73868cb1848a216984dca1b6b0ee37bc

You will get exactly the same result from md5("stackoverflow"), md5(file_get_contents("1.txt")), or md5(file_get_contents("2.txt")).
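
A minimal sketch of this comparison, assuming the two files are created in a temporary directory and contain no trailing newline (the directory name and the $hashes array are made up for this example):

```php
<?php
// Temporary directory so the example does not touch real files.
$dir = sys_get_temp_dir() . '/md5_same_' . uniqid();
mkdir($dir);

// Two differently named files with identical content (no trailing newline).
file_put_contents("$dir/1.txt", "stackoverflow");
file_put_contents("$dir/2.txt", "stackoverflow");

$hashes = [
    md5_file("$dir/1.txt"),
    md5_file("$dir/2.txt"),
    md5(file_get_contents("$dir/1.txt")),
    md5(file_get_contents("$dir/2.txt")),
    md5("stackoverflow"),
];

// All five values collapse to a single distinct hash.
$distinct = array_unique($hashes);
var_dump(count($distinct));  // int(1)

unlink("$dir/1.txt");
unlink("$dir/2.txt");
rmdir($dir);
```

If an editor silently appends a newline when saving, the file hashes would still match each other but would no longer match md5("stackoverflow").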

Answer:

"based on the file contents, not on the file metadata like the BOM or filename"

That is not correct about the BOM.
The BOM is part of the file content; for UTF-8 you can see its three bytes (EF BB BF) in any editor that is not Unicode-aware.
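
To make this concrete, here is a small sketch that writes the same text with and without a UTF-8 BOM and compares the hashes. The sample text and temporary files are assumptions made for this example.

```php
<?php
$text = "hello";
$bom  = "\xEF\xBB\xBF";  // UTF-8 byte order mark

// Temporary files used only for this demonstration.
$plain   = tempnam(sys_get_temp_dir(), 'bom');
$withBom = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($plain, $text);
file_put_contents($withBom, $bom . $text);

$sizeDiff      = filesize($withBom) - filesize($plain);
$sameHash      = md5_file($plain) === md5_file($withBom);
$matchesString = md5_file($withBom) === md5($bom . $text);

var_dump($sizeDiff);       // int(3): the BOM adds three bytes
var_dump($sameHash);       // bool(false): the BOM changes the hash
var_dump($matchesString);  // bool(true): md5() agrees once the BOM is included

unlink($plain);
unlink($withBom);
```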

Answer:

Yes, I tried it several times.
In my case, the code:

<?php echo md5(file_get_contents("1.php")) ?>
<br/>
<?php echo md5_file("1.php") ?>

produces this output:

660d4e394937c10cd1c16a98f44457c2
660d4e394937c10cd1c16a98f44457c2

The two lines are identical.