
Prevent Sending Duplicate record from MySQL and PHP

Posted by: admin July 12, 2020

Questions:

I have a table, ad_banner_queue, which I use to generate a queue based on the weightage of ads. Ads are inserted into the advertisement table. A new queue is generated once every ad in the existing queue has been delivered to users.

Now the issue is: how do I prevent sending duplicate ads when requests arrive at the same time and RAND() returns the same record?

Below is the code:

<?php
/* To Get the random Ad */
public function getBanner($params) {
    /* Fetch the Random from table */
    $ads_queue = (new \yii\db\Query())
            ->select('ad_quque_id, banner_image, unique_code')
            ->from('ad_banner_queue')
            ->join('inner join', 'advertisement', 'ad_banner_queue.ad_id = advertisement.ad_id')
            ->where('is_sent=0')
            ->orderBy('RAND()')
            ->one();

    /* In case of queue is not there generate the new queue */
    if ($ads_queue === false) {
        $output = $this->generateAdQueue();
        //In case of something went wrong while generating the queue
        if ($output == false) {
            return array();
        }

        //Now fetch the record again
        $ads_queue = (new \yii\db\Query())
                ->select('ad_quque_id, banner_image, unique_code')
                ->from('ad_banner_queue')
                ->join('inner join', 'advertisement', 'ad_banner_queue.ad_id = advertisement.ad_id')
                ->where('is_sent=0')
                ->orderBy('RAND()')
                ->one();
    }

    /* Now, marked that one as is_sent */
    Yii::$app->db->createCommand()->update('ad_banner_queue', ['is_sent' => 1], 'ad_quque_id =:ad_quque_id', array(':ad_quque_id' => $ads_queue['ad_quque_id']))->execute();
    return $ads_queue;
}

/**
 * Below will Generate the Queue if not exist
 */
public function generateAdQueue() {
    /* First check that there is an existing queue; if so, don't generate it */
    $data_exist = (new \yii\db\Query())
            ->select('ad_quque_id')
            ->from('ad_banner_queue')
            ->where('is_sent=0')
            ->scalar();
    if ($data_exist === false) {
        /* Delete all other entries */
        (new \yii\db\Query())
                ->createCommand()
                ->delete('ad_banner_queue')
                ->execute();

        /* Fetch all banner */
        $ads = (new \yii\db\Query())
                ->select('ad_id, unique_code, ad_name, banner_image,ad_delivery_weightage')
                ->from('advertisement')
                ->where('status_id in (8)') //Means only fetch Approved ads
                ->all();
        if (!empty($ads)) {
            foreach ($ads as $ad) {
                /* Make entry as per that weightage, example, if weightage is 10 then make entry 10 times */
                $ins_fields = array();
                for ($i = 1; $i <= $ad['ad_delivery_weightage']; $i++) {
                    $ins_fields[] = array($ad['ad_id']);
                }
                Yii::$app->db->createCommand()->batchInsert('ad_banner_queue', ['ad_id'], $ins_fields)->execute();
            }
            return true;
        } else {
            return false;
        }
    } else {
        return false;
    }
}
Answers:

I take it you mean that different “people” making simultaneous requests should not get the same random row? I have not tested this, but the most robust way to avoid even the small chance of two concurrent requests selecting the same record is probably to lock the table and perform the read and the update in a transaction. You need a storage engine that supports transactions, such as InnoDB.

The correct way to use LOCK TABLES and UNLOCK TABLES with transactional tables, such as InnoDB tables, is to begin a transaction with SET autocommit = 0 (not START TRANSACTION), followed by LOCK TABLES, and to not call UNLOCK TABLES until you have committed the transaction explicitly.

For example, if you need to read and write to your table in one go, you can do this:

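SET autocommit = 0;
-- Any other table the session touches while locks are held (e.g. advertisement) must be listed here too
LOCK TABLES ad_banner_queue AS ad_banner_queue_w WRITE, ad_banner_queue AS ad_banner_queue_r READ;
-- Reset the variable first: SELECT ... INTO leaves it unchanged when no row matches
SET @picked_id = NULL;
-- Pick one unsent row through the READ alias, then mark it as sent through the WRITE alias
SELECT ad_quque_id INTO @picked_id FROM ad_banner_queue AS ad_banner_queue_r WHERE is_sent = 0 ORDER BY RAND() LIMIT 1;
UPDATE ad_banner_queue AS ad_banner_queue_w SET is_sent = 1 WHERE ad_quque_id = @picked_id;
COMMIT;
UNLOCK TABLES;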

The reason we lock with an alias is that you cannot refer to a locked table multiple times in a single query using the same name. So we use aliases instead and obtain a separate lock for the table and each alias.

Answer:

Although this might seem a trivial question, it is not. There are several ways to handle it and each has its own downsides. Broadly, you can approach the issue from three different angles:

Live with it

The chances of a repeated pull are low in real life, so think hard about whether the extra work is worth it just to make sure an ad is not shown twice in a row. Remember too that caches exist: you might rack your brain making ad delivery atomic only to find out that a browser, proxy, or cache is serving a repeated ad 🙁

Handle it on database

You can leave the database responsible for keeping the data safe and coherent (that is, after all, the main task of a database). There are several ways of doing it:

  • Locks (as previously suggested). I personally dislike using locks with PHP and MySQL: you pay a performance penalty and risk deadlocks. Still, it is a solution: run a SELECT ... FOR UPDATE on your queue table so that no one else reads the row until you have updated it. The problem is that this effectively locks the whole table while it runs, and you need to be careful with your DB driver and autocommit.
  • Cursors. Cursors are database structures created for exactly this kind of job: you create a cursor and safely traverse it with its functions. Using cursors from PHP can be tricky, again because of transactions, and you need to know very well what you are doing to avoid errors.
  • Cursors and stored procedures. The best way to handle this in the database is to manage the cursors inside the database itself, which is what stored procedures are for: create one procedure to pull a new item from your cursor and another to refill it once everything has been consumed.

Handle it on PHP side

In this case you need to implement your own queue in PHP. There are several ways to do so, but the main problem is implementing multiprocess-safe atomic operations in your app. I personally dislike using any kind of lock unless you are 100% sure of your app's execution flow, because you might end up locking everything. Anyway, there are three options here:

  • Semaphores or mutexes, either the ones bundled with PHP or third-party ones. Timeouts and locks can become hell, and problems with them are hard to detect, so as stated above I’d avoid them.

  • PHP message queues. I think this is the safest way as long as you run your app on a *nix system: send all the available ads to the message queue instead of creating a table in the database, and regenerate the queue once all ads have been consumed. The drawbacks are that your server cannot be distributed and that you might lose the current queue state if you don’t save it before a restart.

  • A third-party queue system. Depending on your app's workload or interactions you might need a queue management system, and it is a must if you want a distributed system. Using a message queue system for this issue might sound like overkill, but this kind of approach can be a life-saver.
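
As a rough illustration of the PHP message queue option, here is a minimal sketch using the SysV message functions from the sysvmsg extension. The function names fillAdQueue() and popAd(), the queue key, and the ad ids are assumptions made up for this example, and the entries are shuffled before sending because the queue itself is first-in, first-out. Each message is handed to exactly one receiver, which is what prevents two requests from getting the same entry.

```php
<?php
// Sketch only: requires the sysvmsg extension on a *nix host.
// fillAdQueue(), popAd() and the queue key are hypothetical names.

/** Push each ad id onto the queue as many times as its weight, in random order. */
function fillAdQueue($queue, array $weights): void
{
    $entries = [];
    foreach ($weights as $adId => $weight) {
        for ($i = 0; $i < $weight; $i++) {
            $entries[] = $adId;
        }
    }
    shuffle($entries); // the queue is FIFO, so randomize before sending
    foreach ($entries as $adId) {
        msg_send($queue, 1, $adId); // message type 1, payload = ad id
    }
}

/** Pop one ad id without blocking; returns null when the queue is empty. */
function popAd($queue): ?int
{
    $type = null;
    $adId = null;
    $ok = msg_receive($queue, 1, $type, 1024, $adId, true, MSG_IPC_NOWAIT);
    return $ok ? (int) $adId : null;
}

if (function_exists('msg_get_queue')) {
    $queue = msg_get_queue(0x0AD5EED, 0600);   // arbitrary key for the demo
    fillAdQueue($queue, [101 => 2, 102 => 1]); // ad 101 has weight 2, ad 102 weight 1
    $next = popAd($queue);                     // one of the three queued entries
    msg_remove_queue($queue);                  // clean up the demo queue
}
```

When popAd() returns null, the caller would regenerate the queue, much as generateAdQueue() does for the table in the question.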

Summary

If you can’t live with it and are proficient enough with databases, I’d go for stored procedures and cursors. You don’t need to rack your brain over concurrency; the database will handle it as long as you are using an ACID-compliant engine (so not MyISAM, for example).

If you want to avoid coding in the database, and your system runs on *nix and is not going to be distributed, you can give msg_queues a try.

If you think your system may become distributed some day, or you do not want to rely on old SysV mechanisms, you can try a message broker like RabbitMQ. These tools are addictive: once you start using them you find new uses for them daily.

Answer:

You can use the mutex component to ensure that only one process at a time tries to pop an ad from the queue.

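$banner = [];
// One shared key for the whole queue, so every request competes for the same lock
$key = __CLASS__ . '::getBanner()';
// Wait up to 1 second for the lock; on timeout $banner stays empty
if (Yii::$app->mutex->acquire($key, 1)) {
    try {
        $banner = $this->getBanner($params);
    } finally {
        // Always release the lock, even if getBanner() throws
        Yii::$app->mutex->release($key);
    }
}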

However, be aware that this may greatly reduce performance, especially if you want to process multiple requests at the same time. You may want to consider a different technology for such a queue, since relational databases are not a good fit for this task. A Redis-based queue with SPOP may be a much better choice.

Answer:

Presumably, the ads are presented from separate pages. HTTP is “stateless”, so you cannot expect one page to know which ads have previously been displayed. So, you have to either pass this info from page to page, or store it somewhere in the database associated with the individual user.

You also want some randomizing? Let’s do both things at the same time.

What is the “state”? There is an “initial state”, at which time you randomly pick the first ad to display. And you pass that info on to the next page (in the url or in a cookie or in the database).

The other “state” looks at the previous state and computes which ad to display next. (Eventually, you need to worry about running out of ads — will you start over? Will you re-randomize? Etc.)

But how to avoid showing the same “random” ad twice in a row?

  • You have N ads: SELECT COUNT(*) ...
  • You picked ad number J as the first ad to display, a simple application of RAND(), either in SQL or in the app.
  • Pick a number, M, such that M and N are “relatively prime”.
  • The “next” ad is number (J := (J + M) mod N). This will cycle through all the ads without duplicating until all have been shown. Again, this can be done in SQL or in the app.
  • Pass J and M from one page to the next.

To get the J’th row: Either have the rows uniquely and consecutively numbered; or use ORDER BY ... LIMIT 1 OFFSET J. (Caveat: it may be tricky to fill J into the SQL.)

No table locks, no mutexes, just passing info from one page to the next.
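
The stepping rule above can be sketched in plain PHP. The helper names gcd(), pickStep() and nextAd() are made up for this illustration, and ads are assumed to be numbered 0 to N-1 so that the modulo maps directly onto an ad number or an OFFSET.

```php
<?php
// Hypothetical helpers illustrating the (J + M) mod N walk; not part of the original answer.

/** Greatest common divisor via Euclid's algorithm. */
function gcd(int $a, int $b): int
{
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return $a;
}

/** Pick a step M in [1, N-1] that is relatively prime to N (N = number of ads). */
function pickStep(int $n): int
{
    if ($n <= 2) {
        return 1; // with 1 or 2 ads, a step of 1 is the only useful choice
    }
    do {
        $m = random_int(1, $n - 1);
    } while (gcd($m, $n) !== 1);
    return $m;
}

/** Advance the state: J := (J + M) mod N. */
function nextAd(int $j, int $m, int $n): int
{
    return ($j + $m) % $n;
}

// Example: 10 ads, starting at J = 3 with step M = 7 (gcd(7, 10) = 1).
$n = 10;
$j = 3;
$m = 7;
$shown = [];
for ($i = 0; $i < $n; $i++) {
    $shown[] = $j;
    $j = nextAd($j, $m, $n);
}
// $shown is [3, 0, 7, 4, 1, 8, 5, 2, 9, 6]: every ad exactly once, and $j is back at 3.
```

In a real page flow, J and M would travel in the URL, a cookie, or the session rather than in local variables.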

Answer:

You should create a separate DB table and use it to mark the ads each user has received. Before sending an ad to a user, check whether they have already received it.
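
A minimal sketch of that idea follows, assuming a hypothetical user_ad_log table keyed by (user_id, ad_id) and a made-up helper nextAdForUser(). It uses an in-memory SQLite database through PDO only so that it runs on its own; with MySQL the query would use RAND() instead of RANDOM().

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; table and function names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE advertisement (ad_id INTEGER PRIMARY KEY, banner_image TEXT)');
$pdo->exec('CREATE TABLE user_ad_log (user_id INTEGER, ad_id INTEGER, PRIMARY KEY (user_id, ad_id))');
$pdo->exec("INSERT INTO advertisement (ad_id, banner_image) VALUES (1, 'a.png'), (2, 'b.png'), (3, 'c.png')");

/**
 * Return a random ad the user has not received yet and record it,
 * or null when the user has already received every ad.
 */
function nextAdForUser(PDO $pdo, int $userId): ?array
{
    // RANDOM() is SQLite syntax; MySQL would use RAND().
    $select = $pdo->prepare(
        'SELECT a.ad_id, a.banner_image FROM advertisement a
         WHERE NOT EXISTS (
             SELECT 1 FROM user_ad_log l WHERE l.user_id = :uid AND l.ad_id = a.ad_id
         )
         ORDER BY RANDOM() LIMIT 1'
    );
    $select->execute([':uid' => $userId]);
    $ad = $select->fetch(PDO::FETCH_ASSOC);
    if ($ad === false) {
        return null;
    }
    // The primary key on (user_id, ad_id) rejects a second record for the same pair.
    $insert = $pdo->prepare('INSERT INTO user_ad_log (user_id, ad_id) VALUES (:uid, :aid)');
    $insert->execute([':uid' => $userId, ':aid' => $ad['ad_id']]);
    return $ad;
}
```

Calling nextAdForUser($pdo, 42) three times returns the three ads in some order, and a fourth call returns null, at which point the application decides whether to clear that user's log and start over.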

Answer:

You can use a transaction with SELECT ... FOR UPDATE to lock the rows and run the queries consistently. For instance:

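public function getAds()
{
    $db = Yii::$app->db;
    $transaction = $db->beginTransaction(\yii\db\Transaction::REPEATABLE_READ);
    try {
        // FOR UPDATE must be the last clause, after ORDER BY and LIMIT, so it is
        // appended to the generated SQL instead of being placed inside where().
        $sql = (new \yii\db\Query())
            ->select('ad_quque_id, banner_image, unique_code')
            ->from('ad_banner_queue')
            ->join('inner join', 'advertisement', 'ad_banner_queue.ad_id = advertisement.ad_id')
            ->where('is_sent=0')
            ->orderBy('RAND()')
            ->limit(1)
            ->createCommand($db)
            ->getRawSql();
        $ads_queue = $db->createCommand($sql . ' FOR UPDATE')->queryOne();
        if ($ads_queue === false) {
            // Nothing left in the queue; end the transaction and let the caller regenerate it
            $transaction->commit();
            return null;
        }
        // Mark the locked row as sent before releasing the lock with commit()
        $db->createCommand()->update('ad_banner_queue', ['is_sent' => 1], 'ad_quque_id =:ad_quque_id', array(':ad_quque_id' => $ads_queue['ad_quque_id']))->execute();
        $transaction->commit();
        return $ads_queue;
    } catch (\Exception $e) {
        $transaction->rollBack();
        throw $e;
    }
}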

public function getBanner($params)
{
    $ads_queue = $this->getAds();
    if (is_null($ads_queue)) {
        $output = $this->generateAdQueue();        
        if ($output == false) {
            return array();
        }
        $ads_queue = $this->getAds();
    }
    return $ads_queue;
}

Answer:

Make your index unique, or add a check that inspects the data and sees whether it is a duplicate.
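
One possible reading of the unique-index suggestion, sketched under stated assumptions: a hypothetical sent_ad table holds one row per delivered queue entry, with a unique index on ad_quque_id, so that only the first request to insert a given entry succeeds. The table, the index name and claimAd() are invented for this example, and SQLite in memory stands in for MySQL so the snippet runs on its own.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; sent_ad and claimAd() are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE sent_ad (ad_quque_id INTEGER NOT NULL)');
$pdo->exec('CREATE UNIQUE INDEX uq_sent_ad ON sent_ad (ad_quque_id)');

/**
 * Try to claim a queue entry. Returns true for the first caller and false
 * when another request has already claimed the same entry.
 */
function claimAd(PDO $pdo, int $queueId): bool
{
    try {
        $stmt = $pdo->prepare('INSERT INTO sent_ad (ad_quque_id) VALUES (:id)');
        $stmt->execute([':id' => $queueId]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 is an integrity constraint violation (here: the unique index).
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e;
    }
}
```

A request that gets false back would pick another random entry and try again, so the database rather than the application decides which of two simultaneous requests wins.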

Hope this helps. Good luck