
php – Laravel Artisan command multithreading?

Posted by: admin, February 25, 2020

Questions:

I have a command that scrapes roughly 300K webpages, and it takes forever to run: there are a lot of pages, and the website throttles requests from where I’m running the server. The scraper’s process is

POST Website > Scrape > Collect into Array > Write to DB

Every step after the POST is delayed, since the first step alone takes forever. So I’m looking to run multiple workers at once. The options I’m looking at are AsyncOperation and Laravel’s queue workers, but I’m not exactly sure how I would implement either of those.

Answers:

You most likely want to use the queue/worker system, which is explained in detail here:
https://laravel.com/docs/6.x/queues

One possible setup uses Supervisor (a Linux process monitor), which makes sure that the php artisan queue:work command keeps running in the background and gets restarted if it fails.

In the Supervisor configuration file /etc/supervisor/conf.d/laravel-worker.conf, you can then set numprocs=4 to run 4 instances of this command.
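As a rough sketch, such a file could look like the following. It is modelled on the example in the Laravel queue documentation. The application path /var/www/app, the www-data user, the log file location and the redis connection name are assumptions for illustration, not values from the question.

```ini
; /etc/supervisor/conf.d/laravel-worker.conf
; Paths, user and connection name below are placeholders; adjust them to your server.
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/app/artisan queue:work redis --sleep=3 --tries=3
autostart=true
; restart the worker whenever the process exits
autorestart=true
user=www-data
; run 4 queue:work processes in parallel
numprocs=4
redirect_stderr=true
stdout_logfile=/var/www/app/storage/logs/worker.log
```

After saving the file, Supervisor typically has to pick up the new program with supervisorctl reread and supervisorctl update, and the workers are then started with supervisorctl start laravel-worker:*.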

Basic queue explanation

All of this depends on a queue backend, which for Laravel could be Redis (which I can recommend), Beanstalkd, a regular database table called “jobs” (which might not be the best solution for production environments), or any other implementation you choose.

Say you are running 4 workers: as soon as a job becomes available in your queue, one of the running queue:work processes will pick it up and reserve it. Multiple jobs in the queue may thus be reserved by different workers at the same time.
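To make this concrete for the scraper, here is a minimal sketch of a queued job that handles a single page. The class name ScrapePage, the scraped_pages table with a unique url column, and the scrape() placeholder are all hypothetical; the traits and the ShouldQueue interface are the ones Laravel generates with php artisan make:job.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\DB;

// Hypothetical job: one instance per page, so each worker processes one URL at a time.
class ScrapePage implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /** @var string URL of the page this job will scrape */
    protected $url;

    public function __construct(string $url)
    {
        $this->url = $url;
    }

    // Called by a queue:work process once it has reserved this job.
    public function handle()
    {
        $row = $this->scrape($this->url);

        // Assumption: a "scraped_pages" table with a unique "url" column.
        // Keying the write on the URL keeps it independent of the order
        // in which the workers happen to finish.
        DB::table('scraped_pages')->updateOrInsert(
            ['url' => $this->url],
            $row
        );
    }

    // Placeholder for the existing POST + scrape logic;
    // returns the column values to store for this page.
    private function scrape(string $url): array
    {
        return ['scraped_at' => now()];
    }
}
```

The Artisan command then only needs to loop over the URLs and call ScrapePage::dispatch($url) for each one. The slow POST step happens inside the workers, so with 4 workers up to 4 pages are in flight at the same time.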

Note that multiple processes run in parallel, which means that if you push 3 jobs to the queue, you can’t assume that they will be handled in the order 1-2-3. They are started in this order, but they might not finish in this order. Keep that in mind for any read or write operations, such as database queries. Depending on your needs, you can set the number of processes to 1 to ensure the correct order of execution, but this may limit your throughput considerably.