
How to write to a file in a large PHP application (multiple questions)

Posted by: admin, July 12, 2020

Questions:

What is the best way to write to files in a large PHP application? Let's say there are lots of writes needed per second. What is the best way to go about this?

Could I just open the file and append the data? Or should I open, lock, write, and unlock?

What will happen if the file is being worked on and other data needs to be written? Will that write be lost, or will it be saved? And if it is saved, will it halt the application?

If you have been, thank you for reading!

Answers:

I have a high-performance, multi-threaded application where all threads write (append) to a single log file. So far I have not noticed any problems with that: each thread writes multiple times per second and nothing gets lost. I think just appending to a huge file should be no issue. But if you want to modify existing content, especially with concurrency, I would go with locking; otherwise a big mess can happen.

Answer:

Here’s a simple example that highlights the danger of simultaneous writes:

<?php
for($i = 0; $i < 100; $i++) {
 $pid = pcntl_fork();
 //only spawn more children if we're not a child ourselves
 if(!$pid)
  break;
}

$fh = fopen('test.txt', 'a');

//The following is a simple attempt to get multiple processes to start at the same time.
$until = round(ceil(time() / 10.0) * 10);
echo "Sleeping until $until\n";
time_sleep_until($until);

$myPid = posix_getpid();
//create a line starting with pid, followed by 10,000 copies of
//a "random" char based on pid.
$line = $myPid . str_repeat(chr(ord('A')+$myPid%25), 10000) . "\n";
for($i = 0; $i < 1; $i++) {
    fwrite($fh, $line);
}

fclose($fh);

echo "done\n";

If appends were safe, you should get a file with 101 lines (one per child plus one from the parent), each roughly 10,000 characters long and beginning with an integer. And sometimes, when you run this script, that’s exactly what you’ll get. Sometimes, however, a few appends will conflict and the output will get mangled.

You can find corrupted lines with grep '^[^0-9]' test.txt
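The grep above only catches lines that start with a non-digit. A stricter check can be sketched in PHP, assuming the line format produced by the script above (a PID followed by 10,000 copies of one letter). The function name `countCorruptLines` is made up for this illustration.

```php
<?php
// Hypothetical helper: counts lines that do not look like
// "<pid><one letter A-Y repeated $repeat times>".
function countCorruptLines(string $path, int $repeat = 10000): int
{
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $corrupt = 0;
    while (($line = fgets($fh)) !== false) {
        $line = rtrim($line, "\n");
        // Length of the leading run of digits (the PID).
        $digits = strspn($line, '0123456789');
        $body = substr($line, $digits);
        $ok = $digits > 0
            && strlen($body) === $repeat
            && $body[0] >= 'A' && $body[0] <= 'Y'
            // The whole body must consist of its first character.
            && strspn($body, $body[0]) === $repeat;
        if (!$ok) {
            $corrupt++;
        }
    }
    fclose($fh);
    return $corrupt;
}
```

Calling `countCorruptLines('test.txt')` after a run should return 0 if no appends were interleaved.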

This is because file append is only atomic if:

  1. You make a single fwrite() call
  2. and that fwrite() is smaller than PIPE_BUF (at least 512 bytes under POSIX; 4096 bytes on Linux)
  3. and you write to a fully POSIX-compliant filesystem

If you make more than a single call to fwrite during your log append, or you write more than about 4k, all bets are off.
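A minimal sketch of staying within those conditions: build the whole record first, cap its size, and issue exactly one fwrite(). The 4096-byte limit is an assumption (it is the Linux value of PIPE_BUF), and `appendRecord` and `MAX_RECORD_BYTES` are names invented for this example. This does not by itself guarantee atomicity on every filesystem.

```php
<?php
// Assumed limit: 4096 is PIPE_BUF on Linux; other systems may differ.
const MAX_RECORD_BYTES = 4096;

// Hypothetical helper: formats one record and appends it with one fwrite().
function appendRecord(string $path, string $message): bool
{
    // Replace embedded line breaks so each record stays on one line.
    $line = date('c') . ' ' . str_replace(["\r", "\n"], ' ', $message);
    // Truncate so the line plus its newline stays below the limit.
    // Note: substr() counts bytes and may cut a multibyte character.
    $line = substr($line, 0, MAX_RECORD_BYTES - 2) . "\n";

    $fh = fopen($path, 'a'); // 'a' opens the file in append mode
    if ($fh === false) {
        return false;
    }
    $written = fwrite($fh, $line); // the only write for this record
    fclose($fh);
    return $written === strlen($line);
}
```

Records longer than the limit are truncated rather than split, since splitting would mean more than one write per record.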

Now, as to whether or not this matters: are you okay with having a few corrupt lines in your log under heavy load? Honestly, most of the time this is perfectly acceptable, and you can avoid the overhead of file locking.

Answer:

If concurrency is an issue, you should really be using databases.

Answer:

If you’re just writing logs, take a look at the syslog() function, since syslog provides an API.
You could also delegate writes to a dedicated backend and do the job asynchronously.
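A minimal sketch of the syslog route, using PHP's built-in openlog(), syslog() and closelog(). The identifier 'myapp' and the wrapper name `logToSyslog` are placeholders, not names from the original post. Where the messages end up depends on the system's syslog configuration.

```php
<?php
// Hypothetical wrapper: sends one message to the system logger.
// 'myapp' is an assumed identifier chosen for this example.
function logToSyslog(string $message, int $priority = LOG_INFO): bool
{
    // LOG_PID adds the process ID to each entry; LOG_USER is the
    // generic facility for user-level messages.
    openlog('myapp', LOG_PID, LOG_USER);
    $ok = syslog($priority, $message);
    closelog();
    return $ok;
}

logToSyslog('order processed');
```

The application hands the message to the logging daemon and moves on, so it never has to coordinate file access with other PHP processes.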

Answer:

These are my 2p.

Unless a single file is needed for a specific reason, I would avoid appending everything to one huge file. Instead, I would rotate the file by time and size. A couple of configuration parameters (wrap_time and wrap_size) could be defined for this.

Also, I would probably introduce some buffering to avoid waiting for each write operation to complete.

PHP is probably not the best-suited language for this kind of operation, but it is still possible.
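One way the wrap_time and wrap_size idea could look, as a rough sketch: the file name is derived from the current time window, and a part number is bumped once a file reaches the size limit. The function name, the file naming pattern and the units (seconds and bytes) are all assumptions made for this example.

```php
<?php
// Hypothetical helper: picks the file to append to.
// $wrapTime is the window length in seconds, $wrapSize the limit in bytes.
function currentLogPath(string $dir, string $base, int $wrapTime, int $wrapSize, ?int $now = null): string
{
    $now = $now ?? time();
    // All timestamps inside the same window map to the same start time.
    $bucket = $now - ($now % $wrapTime);
    $stamp = gmdate('Ymd-His', $bucket);

    // Within a window, move on to the next part once a file is full.
    $part = 0;
    do {
        $path = sprintf('%s/%s-%s.%d.log', $dir, $base, $stamp, $part);
        $part++;
        clearstatcache(true, $path); // avoid a stale cached file size
    } while (is_file($path) && filesize($path) >= $wrapSize);

    return $path;
}
```

A caller would append to `currentLogPath('/var/log/myapp', 'app', 3600, 10485760)` for hourly files of at most about 10 MB each. Two processes can still pick the same file at the same moment, so this only limits file size; it does not replace locking.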
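A small sketch of what such buffering might look like: lines are collected in memory and appended in one batch. The class name `BufferedLog` and the default batch size of 100 are assumptions for this example. The trade-off is that buffered lines are lost if the process is killed before a flush.

```php
<?php
// Hypothetical class: collects lines in memory and appends them in batches.
final class BufferedLog
{
    /** @var string[] */
    private $lines = [];
    private $path;
    private $maxLines;

    public function __construct(string $path, int $maxLines = 100)
    {
        $this->path = $path;
        $this->maxLines = $maxLines;
    }

    public function add(string $line): void
    {
        $this->lines[] = $line . "\n";
        if (count($this->lines) >= $this->maxLines) {
            $this->flush();
        }
    }

    public function flush(): void
    {
        if ($this->lines === []) {
            return;
        }
        // One call writes the whole batch; LOCK_EX takes an exclusive
        // advisory lock on the file for the duration of the write.
        file_put_contents($this->path, implode('', $this->lines), FILE_APPEND | LOCK_EX);
        $this->lines = [];
    }

    public function __destruct()
    {
        $this->flush(); // write whatever is left when the object goes away
    }
}
```

With `$log = new BufferedLog('app.log', 50);`, each `$log->add(...)` call returns immediately, and the file is touched only once per 50 lines or at the end of the request.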

Answer:

Use flock()

See this question
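For reference, a minimal sketch of the open, lock, write, unlock sequence with flock(). The helper name `lockedAppend` is made up for this example. Note that flock() locks are advisory: they only help if every writer to the file uses them.

```php
<?php
// Hypothetical helper: open, lock, write, unlock.
function lockedAppend(string $path, string $data): bool
{
    $fh = fopen($path, 'a');
    if ($fh === false) {
        return false;
    }
    // LOCK_EX blocks until no other process holds a lock on the file.
    if (!flock($fh, LOCK_EX)) {
        fclose($fh);
        return false;
    }
    $ok = fwrite($fh, $data) === strlen($data);
    fflush($fh);         // push the data out before releasing the lock
    flock($fh, LOCK_UN); // let the next writer in
    fclose($fh);
    return $ok;
}
```

Because the lock is held across the whole write, a record larger than a few kilobytes is not interleaved with records from other processes that use the same helper.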

Answer:

If you just need to append data, PHP should be fine with that, as the filesystem should take care of simultaneous appends.