Home » Php » Why does TRANSACTION / COMMIT improve performance so much with PHP/MySQL (InnoDB)?

Why does TRANSACTION / COMMIT improve performance so much with PHP/MySQL (InnoDB)?

Posted by: admin April 23, 2020 Leave a comment

Questions:

I’ve been working with importing large CSV files of data; usually less than 100,000 records. I’m working with PHP and MySQL (InnoDB tables). I needed to use PHP to transform some fields and do some text processing prior to the MySQL INSERTs (part of process_note_data() in code below). MySQL’s LOAD DATA was not feasible, so please do not suggest it.

I recently tried to improve the speed of this process by using MySQL transactions using START TRANSACTION and COMMIT. The performance increase was surprising. Processing time(s) dropped by a factor of 20. So, a 20 minute process only took about 1 minute.

QUESTIONS.

1.) Does anyone understand why there was such performance increase (20 mins to 1 min)?

2.) Should I be concerned about how big the transaction may get with 100,000 records?

3.) Should I be concerned with a large number of inserts and/or updates in the transaction?

/*
 * Customer Notes Data:
 * Rows are either a meeting, call or note!
 */
$row = 1;
$data = array();
$fields = array();
$line = '';

$db->query('SET autocommit=0;');
$db->query('START TRANSACTION;');

if (($handle = fopen("modules/".$currentModule."/Data/customernote.csv", "r")) !== FALSE) {
  while (($data = fgetcsv($handle, 4096, ',', '"')) !== FALSE && $row < 999000) {
    //Row 1 - CSV header row with field names
    if ($row == 1) {
      $csv_fields = $data;
    } elseif ($row > 1) {
      $fields = $this->process_note_data($data, $csv_fields, $row);
    }
    $row++;
  } // end while
  fclose($handle);
}

$db->query('COMMIT;');
$db->query('SET autocommit=1;');

Note: The text/field processing is done in the call to $this->process_note_data() which then calls another helper class that has the INSERT statement code. I didn’t have enough room to include all of the code. $db->query() is a typical database object for MySQL queries.

How to&Answers:
  1. Please check this link:

    https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-transaction-management.html

    InnoDB must flush the log to disk at each transaction commit if that transaction made modifications to the database. When each change is followed by a commit (as with the default autocommit setting), the I/O throughput of the storage device puts a cap on the number of potential operations per second.

  2. Big transactions may affect performance during commit (check above)

  3. Only in case of rollback, however it may be optimized using some settings (check the link)

Answer:

My own little test in .Net (4 fields pr. records):

INSERT 1 record, no transaction:60 ms

INSERT 1 record, using transaction:158 ms

INSERT 200 records using transactions, commit after each record:17778
ms

INSERT 200 records using no transactions:4940 ms

INSERT 200 records using transactions, only commit after last
record:4552 ms

INSERT 1000 records using transactions, only commit after last record:21795 ms

Client in Denmark, server in Belgium (Google cloud f1-micro).

I meant to put this in a comment but the formatting is not good….so here is my apology in advance 😉