php – Optimizing EXCEL + MySQL Processing

Posted by: admin May 14, 2020

Questions:

I have a module in my application where a user uploads an Excel sheet with around 1000-2000 rows. I am using excel-reader to read the Excel file.

The Excel file has the following columns:

1) SKU_CODE
2) PRODUCT_NAME
3) OLD_INVENTORY
4) NEW_INVENTORY
5) STATUS

I have a MySQL table, inventory, which holds the data for each SKU code:

1) SKU_CODE : VARCHAR(100) Primary key
2) NEW_INVENTORY : INT
3) STATUS : 0/1 BOOLEAN

I have two options:

Option 1: Process all the records in PHP, extract all the SKU codes, and run one MySQL IN query:

 Select * from inventory where SKU_CODE in ('xxx','www','zzz'.....so on ~ 1000-2000 values);

 - Single query

Option 2: Process the records one by one, querying the current data for each SKU:

Select * from inventory where SKU_CODE = 'xxx';
..
...
around 1000-2000 queries

Can you please help me choose the best way to do this, with an explanation, so that I can be confident the module is well built?

Answers:

You should find a middle ground: choose a BATCH_SIZE and query the database in batches of that size.
An example batch size could be 5000.
So if your Excel file contains 2000 rows, all the data is returned in a single query.
If it contains 19000 rows, you run four queries: SKU codes 1-5000, 5001-10000, 10001-15000, and 15001-19000.
Tune BATCH_SIZE according to your own benchmarks.
Fewer queries mean fewer round trips to the database, which usually helps.
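
The batching idea could be sketched roughly as follows. This is a minimal sketch, not a benchmarked implementation: the function names buildInQuery and fetchInventoryInBatches, the default batch size, and the use of PDO are assumptions made for illustration. Only the inventory table and its columns come from the question.

```php
<?php

/**
 * Build a parameterized IN query for one batch of SKU codes.
 * Returns [sql, params]. Placeholders avoid quoting SKU codes by hand.
 * (Hypothetical helper, not part of any library.)
 */
function buildInQuery(array $skuCodes): array
{
    if ($skuCodes === []) {
        throw new InvalidArgumentException('At least one SKU code is required.');
    }
    // One "?" per SKU code, e.g. "?,?,?" for three codes.
    $placeholders = implode(',', array_fill(0, count($skuCodes), '?'));
    $sql = "SELECT SKU_CODE, NEW_INVENTORY, STATUS FROM inventory WHERE SKU_CODE IN ($placeholders)";

    return [$sql, array_values($skuCodes)];
}

/**
 * Fetch the inventory rows for all SKU codes, running one query per batch.
 * Returns the rows keyed by SKU_CODE. Assumes $pdo is an open connection.
 */
function fetchInventoryInBatches(PDO $pdo, array $skuCodes, int $batchSize = 5000): array
{
    $rowsBySku = [];
    // Drop duplicate codes, then split the list into chunks of $batchSize.
    foreach (array_chunk(array_unique($skuCodes), $batchSize) as $batch) {
        [$sql, $params] = buildInQuery($batch);
        $stmt = $pdo->prepare($sql);
        $stmt->execute($params);
        foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
            $rowsBySku[$row['SKU_CODE']] = $row;
        }
    }

    return $rowsBySku;
}
```

With a batch size at or above the number of rows in the file, this collapses to the single query of Option 1, so the same code covers both the small and the large case.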

Answer:

As you’ve probably realized, both options have their pros and cons. On a properly indexed table, both should perform fairly well.

Option 1 is most likely faster, and is a good choice if you’re sure that the number of SKUs will always stay fairly limited and that users can only act on the result after the entire file has been processed.

Option 2 has a very important advantage in that you can process each record in your Excel file separately. This offers some interesting options: you can begin generating output for each row as you read it from the Excel file, instead of having to parse the entire file in one go and then run the big query.
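
The per-row approach of Option 2 could look roughly like the sketch below. It is illustrative only: processRows and the $lookup callable are hypothetical names, the row layout (an associative array using the question's column names) is an assumption about what the Excel reader returns, and the comparison is a placeholder for whatever the module actually does with each record.

```php
<?php

/**
 * Process Excel rows one at a time, yielding a result for each row as soon
 * as it is available instead of waiting for the whole file.
 *
 * $rows   : iterable of arrays with 'SKU_CODE' and 'NEW_INVENTORY' keys (assumed layout)
 * $lookup : callable taking a SKU code and returning the current inventory
 *           row as an array, or null when the SKU is not in the table
 */
function processRows(iterable $rows, callable $lookup): Generator
{
    foreach ($rows as $row) {
        $sku = (string) $row['SKU_CODE'];
        $current = $lookup($sku);

        if ($current === null) {
            yield ['sku' => $sku, 'result' => 'unknown_sku'];
        } elseif ((int) $current['NEW_INVENTORY'] === (int) $row['NEW_INVENTORY']) {
            yield ['sku' => $sku, 'result' => 'unchanged'];
        } else {
            yield ['sku' => $sku, 'result' => 'changed'];
        }
    }
}

// A lookup backed by one reusable prepared statement might look like this
// (assumes an open PDO connection in $pdo):
//
// $stmt = $pdo->prepare('SELECT NEW_INVENTORY, STATUS FROM inventory WHERE SKU_CODE = ?');
// $lookup = function (string $sku) use ($stmt): ?array {
//     $stmt->execute([$sku]);
//     $row = $stmt->fetch(PDO::FETCH_ASSOC);
//     return $row === false ? null : $row;
// };
```

Preparing the statement once and executing it per SKU keeps the cost of each of the 1000-2000 queries low, and because processRows is a generator, the caller can report progress or write output after every row.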