
php – How effective is the honeypot technique against spam?

Posted by: admin April 23, 2020

Questions:

By “honeypot”, I mean more or less this practice:

<!-- Register form -->
<style>
    .hideme {
        display: none;
        visibility: hidden;
    }
</style>
<form action="register.php" method="post">
    Your email: <input type="text" name="u-email" />
    Choose a password: <input type="password" name="passwd" />
    <div class="hideme">
        Please, leave this field blank: <input type="text" name="email" autocomplete="off" />
        <!-- the visible label is for text-browser users -->
    </div>
    <input type="submit" value="Register" />
</form>

<?php
// register.php
if (!empty($_POST['email'])) {
    die("You spammer!");
}
// otherwise, do the form validation and go on.
?>


Obviously, the real fields are named with random hashes, and the honeypot fields can have various names (email, user, website, homepage, etc.) that a spambot will usually fill in.
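As a sketch of that randomized-name idea (the helper function and secret handling here are my own illustration, not from the question), the real field names can be derived from a per-visitor secret so a bot cannot hard-code them, while the honeypot keeps its tempting name:

```php
<?php
// Hypothetical sketch: derive opaque per-visitor field names from a secret,
// so a bot cannot hard-code the real field names in advance.
function field_name(string $secret, string $label): string
{
    // Stable for a given secret, meaningless to a bot scraping the markup.
    return 'f_' . substr(hash('sha256', $secret . $label), 0, 12);
}

// In practice the secret would be stored in the visitor's session.
$secret = bin2hex(random_bytes(8));

$emailField  = field_name($secret, 'u-email');
$passwdField = field_name($secret, 'passwd');

// Real fields get the opaque names; the honeypot keeps its tempting name.
echo '<input type="text" name="' . $emailField . '" />' . "\n";
echo '<input type="password" name="' . $passwdField . '" />' . "\n";
echo '<div class="hideme"><input type="text" name="email" /></div>' . "\n";
```

On submit, the server recomputes the names from the stored secret to read the real values.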

I love this technique because it doesn’t annoy the user with a CAPTCHA.

Do any of you have some experience with this technique? Is it effective?

Answers:

It works relatively well; however, if a bot creator targets your page specifically, they will see the trap (or even have a routine set up to check for it) and will most likely modify their bot accordingly.

My preference is to use reCAPTCHA, but the approach above will stop some bots.

Answer:

Old question, but I thought I’d chime in, as I’ve been maintaining a module for Drupal (Honeypot) that uses the honeypot spam-prevention method alongside time-based protection (users can’t submit the form in less than X seconds, and X increases exponentially with each consecutive failed submission). Using these two methods, I have heard of many, many sites that have eliminated almost all automated spam.

I have had better success with Honeypot + timestamp than I have with any CAPTCHA-based solution, because not only am I blocking most spammers, I’m also not punishing my users.
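A minimal PHP sketch of the two checks combined (the function names, session keys, and thresholds are my own illustration, not the Drupal module’s actual code) could look like this:

```php
<?php
// Illustrative sketch of honeypot + time-gate in the spirit of the Drupal
// Honeypot module; names and thresholds here are my own, not its API.

// The minimum fill-out time doubles with each consecutive failed attempt.
function time_limit(int $baseSeconds, int $failures): int
{
    return $baseSeconds * (2 ** $failures);
}

// Reject if the hidden field was filled OR the form came back too quickly.
function is_spam(string $honeypotValue, int $secondsElapsed, int $limit): bool
{
    return $honeypotValue !== '' || $secondsElapsed < $limit;
}

// Usage at submit time (session bookkeeping abbreviated):
$failures = $_SESSION['hp_failures'] ?? 0;
$limit    = time_limit(5, $failures);            // 5s, 10s, 20s, ...
$elapsed  = time() - (int) ($_POST['ts'] ?? 0);  // hidden render timestamp

if (is_spam($_POST['email'] ?? '', $elapsed, $limit)) {
    $_SESSION['hp_failures'] = $failures + 1;    // next attempt must wait longer
    die('Form submission rejected.');
}
$_SESSION['hp_failures'] = 0;                    // success resets the backoff
```

Note that a hidden timestamp field can be forged, so in production the render time should be kept server-side (or signed) rather than trusted from the form.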

Answer:

With the technique below, I block 100% of spam.

  1. Honeypot with display:none.
    If it is triggered, run an extra script to collect the IP address and write it into the .htaccess file on a `deny from` line.
  2. Count the number of URLs in the comment field.
    If this check fails, warn only, because it can be a human.
  3. Measure the time taken to post.
    If it is less than 5 seconds, show an error message and let them try again, because a human can write pretty fast with an auto-filling plugin.
  4. Trim the .htaccess file daily with crontab so the deny lines won’t go over 30 (adjust accordingly).

Denying access by IP address is very effective because bots keep trying to sneak in from the same IPs (if they change IP, I put the new IP in .htaccess, so no problem). I trim the .htaccess file daily with an automatic cron job so the file won’t grow too big. I adjust the number of blocked IPs so that a bot on the same IP stays blocked for about a week. I have noticed the same IP being used by a bot for three days, attacking several times.

Trick #1 blocks about 99% and #2 blocks about 1%; bots don’t get past those two, so #3 might not even be necessary.
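The checks above might be sketched like this in PHP (the file path, field names, and thresholds are illustrative assumptions; writing to .htaccess in production also needs careful permissions and locking):

```php
<?php
// Illustrative sketch of checks #1-#3 above; paths, field names, and
// thresholds are my own assumptions, not the answerer's actual code.

// #2: count URLs in the comment field.
function count_urls(string $comment): int
{
    return preg_match_all('#https?://#i', $comment);
}

// #1: on a honeypot hit, append the offender's IP to .htaccess.
// (A daily cron job would trim this file, per step #4.)
function ban_ip(string $ip, string $htaccess = '/var/www/.htaccess'): void
{
    file_put_contents($htaccess, "deny from $ip\n", FILE_APPEND | LOCK_EX);
}

$honeypotHit = ($_POST['email'] ?? '') !== '';
$urlCount    = count_urls($_POST['comment'] ?? '');
$elapsed     = time() - (int) ($_POST['ts'] ?? 0);

if ($honeypotHit) {
    ban_ip($_SERVER['REMOTE_ADDR'] ?? '0.0.0.0'); // #1: hard block
    die('Blocked.');
} elseif ($urlCount > 2) {
    echo 'Too many links - please edit your comment.'; // #2: warn only
} elseif ($elapsed < 5) {
    echo 'Submitted too fast - please try again.';     // #3: soft retry
}
```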

Answer:

I’ve used the honeypot captcha on three forms since about 2010, and it’s been stunningly effective with no modifications until very recently. We’ve just made some changes that we think will stop most of the spambots, at least until they get more sophisticated. In broad strokes, here’s the way we’ve set it up:

One input field on each form is hidden (display:none specified in the CSS class attribute) with a default value of “”. For screen readers and such, the hidden input’s label makes it clear that the field must be left empty. Because the field is empty by default, we can use server-side code (ColdFusion in our case, but it could be any language) to stop the form submission if anything at all is in that field. When we interrupt the submission that way, we give the same user feedback as if it were successful (“Thank you for your comment” or something similar), so there is no outward indication of failure.
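In PHP terms (the answer above uses ColdFusion; the field name and helper here are hypothetical), the silent-rejection idea boils down to this:

```php
<?php
// Hypothetical PHP rendition of the check described above; the hidden
// field name and the persistence step are illustrative assumptions.

// Save only when the hidden honeypot field stayed empty.
function should_save(string $trapValue): bool
{
    return $trapValue === '';
}

$trap = $_POST['confirm_email'] ?? ''; // hidden honeypot input

if (should_save($trap)) {
    // ...persist the comment here...
}

// Identical feedback either way: a bot gets no signal that it was filtered.
echo 'Thank you for your comment.';
```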

But over time, the bots wised up and the simplest of our forms was getting hammered with spam. The forms with front-end validation held up well, and I suppose that’s because they also don’t accept just any old text input, but require an email address to be structured like an email address, and so on. The one form that proved vulnerable had only a text input for comments and two optional inputs for contact information (phone number and email); importantly, I think, none of those inputs included front-end validation.

It will be easy enough to add that validation, and we’ll do that soon. For now, though, we’ve added what others have suggested in the way of a “time trap.” We set a time variable when the page loads and compare that timestamp to the time the form is submitted. At the moment we’re allowing submission after 10 seconds on the page, though some people have suggested three seconds. We’ll make adjustments as needed. I want to see what effect this alone has on the spam traffic before adding the front-end validation.

So the quick summary of my experience is this: The honeypot works pretty well as it was originally conceived. (I don’t recall where I found it first, but this post is very similar to the first I saw about it more than a decade ago.) It seems even more effective with the addition of client-side validation enabled by HTML5. And we think it will be even better with the server-side limits we’ve now imposed on those too-hasty submissions.

Lastly, I’ll mention that solutions like reCaptcha are off the table for us. We spent significant time developing a web app using Google’s map API, and it worked great until Google changed their API without warning and without transition advice. We won’t marry the same abusive spouse twice.