View Full Version : Idea to stop spam bots
keyboard
12-02-2012, 08:26 PM
Has anyone thought of using .mousemove() (jquery) to detect spam bots?
Because (I'm assuming) spam bots wouldn't need to move the mouse while navigating a website (and therefore wouldn't), while normal users do, couldn't you use this to trigger some extra security to stop them? (Like a harder validation step that is too time-consuming or annoying to serve up to everyone.)
Something like
$('body').one('mousemove', function () {
    // fires on the first mouse movement only --
    // set "not a spam bot" here (e.g. fill a hidden form field)
});
Admittedly, this could cause some problems for mobile users... but maybe you only run it for non-mobile devices.
djr33
12-02-2012, 09:14 PM
Well, JS doesn't run for spam bots. At least it doesn't always. So this method is good in that it detects something real users would do, rather than trying to detect the bots themselves. But the downside is that it's client side, so they could read the code and fake the same signal. If someone is specifically targeting your site, it wouldn't help. It would catch generic bots, though. It would catch some real users too, but I guess falling back to harder validation might work out.
Mouse-less users aren't limited to mobile, either.
However:
http://imgs.xkcd.com/comics/a_new_captcha_approach.png
keyboard
12-02-2012, 09:38 PM
No.... No I didn't.
That was a funny tv show... Especially when the asteroid hit the earth... Oh back in the young'en days :p
Hmmm.... I didn't think of that Daniel.
Maybe not having javascript on is in itself enough to trigger extra security... Would noscript tags (don't judge me) work on a spam bot application?
I assume not... but maybe...
djr33
12-02-2012, 09:41 PM
You'd have to create an exception for non-bots. Use JS to add a hidden field to the form with some value, either "not-a-bot" or maybe an MD5 string that will match something stored in a session var on the server.
Thus, if someone HAS this, it's NOT a bot. If they don't, it MIGHT be a bot.
Unless of course, as I said, the bots learn to use the material because it is client side.
Perhaps AJAX... they could still theoretically get it, but it wouldn't be in the raw source code, so it would take a lot more work for them. How's that sound?
keyboard
12-02-2012, 09:43 PM
I was thinking a php session?
If the user has javascript disabled, then the session var won't get set, providing a backup... (E.g., if the session var is set and equals true, skip the extra security; else apply the extra security.)
And then write the session value using ajax.
djr33
12-02-2012, 09:45 PM
Do bots not accept cookies? Some advanced ones might. (And maybe JS too.)
keyboard
12-02-2012, 09:51 PM
Errrr.... Daniel - -
Well, JS doesn't run for spam bots.
And if they've got javascript disabled, they couldn't accept cookies...
If I were making a bot, I'd disable javascript... it'd mean that half the validation on the website wouldn't work (the other half being backend), increasing the chances of getting a successful hit.
I'm fairly sure that it was you who said that you couldn't use cookies to deny access to a site Daniel. (I was working on a way to stop brute force attacks and I was using a cookie to log the number of form submissions. You said that I couldn't do that because spam bots wouldn't do cookies or something like that)
I'd actually say that it's the less advanced bots who accept cookies and the more advanced ones that don't.... maybe.
But programming-wise, it's not very hard to disable javascript in a web-browsing application. (Depending, of course, on the language of their choice.)
djr33
12-02-2012, 10:20 PM
Cookies and Javascript aren't the same thing. They work independently. A well-programmed bot WILL accept cookies for exactly that reason. Otherwise CAPTCHAs would always work even if they displayed just "1" each time-- they use a session to store the CAPTCHA's correct response.
By "advanced" for bots I'm referring to how they're programmed. It's very easy to program a "bot" that just grabs the HTML as text from a page. I can do that in about 30 seconds with PHP, plus however much time it takes to add the functionality you'd like, such as submitting a form via POST. But it is much more difficult to build a browser for that bot-- cookies, Javascript, HTML, etc. For example, a good CAPTCHA question would involve questions about the geometry of the page-- even though the bot technically has that information, it wouldn't see it as shapes, just as plain text never rendered into graphics. The relatively easy way to build a bot that can do those things is to attach it to a browser, with the bot guiding the browser. The hard way is to build that functionality from scratch just for the bot.
keyboard
12-02-2012, 10:26 PM
Cookies rely on javascript.
If you don't have javascript on, cookies won't work.
Not all captchas work like that, Daniel. reCAPTCHA doesn't.
I thought there are bots out there that actually analyse the image and try to work out what it says?
That's why all the letters on them are slanted and so on.
It's actually surprisingly easy to make a web browser.
Using the .NET library, the browsing engines are pre-built, so you can start there and add in more functionality.
djr33
12-02-2012, 10:42 PM
No, that's simply untrue. They're completely unrelated. You can use cookies from PHP or, I think, send them in HTTP headers directly. ReCAPTCHA does something different, true. It uses Javascript. I believe it still uses sessions or something like sessions, though. I can't remember the details.
There are OCR programs to read text from an image. I don't know how many bots actually try that, but, yes, that's why the letters are obfuscated.
Well, it's still a lot more work than just creating a text-only bot that reads HTML pages.
Cookies rely on javascript.
If you don't have javascript on, cookies won't work.
Cookies work over HTTP (http://en.wikipedia.org/wiki/HTTP_cookie). JavaScript can create them, but is not required for them to function.
...You can use cookies from PHP or, I think, send them in HTTP headers directly.
PHP simply sets the proper HTTP headers. (i.e., you could do the same thing using header().)
I thought there are bots out there that actually analyse the image and try to work out what it says?
That's why all the letters on them are slanted and so on.
True, but not with javascript (that'd be pretty inefficient and not very effective (http://ejohn.org/blog/ocr-and-neural-nets-in-javascript/)). AFAIK, python and java are common bot languages (for more advanced bots; there are tons of php bots running around that may or may not actually qualify as "bots").
djr33
12-03-2012, 12:11 AM
PHP simply sets the proper HTTP headers. (i.e., You could do the same thing using header().)
Right. I meant that you don't even need PHP. You could do something with .htaccess or just in the server settings for what to do with HTTP requests. (And of course any other serverside language.)
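To make the point concrete: the Cookie header a browser sends back is just text, and any server-side code can read it without the browser ever running JavaScript. A minimal, illustrative parser (not production-grade):

```javascript
// Cookies travel as plain HTTP headers. This parses the raw "Cookie"
// request header a browser sends back -- no client-side JavaScript involved.
function parseCookieHeader(header) {
    const cookies = {};
    for (const part of (header || '').split(';')) {
        const eq = part.indexOf('=');
        if (eq === -1) continue; // skip malformed fragments
        const name = part.slice(0, eq).trim();
        cookies[name] = decodeURIComponent(part.slice(eq + 1).trim());
    }
    return cookies;
}
```

The response side is just another header line, e.g. `Set-Cookie: seen=1; Path=/`, which is all that PHP's setcookie() emits under the hood.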
coRpSE
08-16-2013, 12:28 PM
I know it's an old topic, but about 5 - 6 months ago I designed a system for the Nuke CMS which works in 3 ways.
Text Removal - This is an addition that puts in a section with a flashing "Antibot" label off to the left of it, and a form field pre-filled with the text "Delete All Of This Text!". I also named that field "company", in the hope that if a bot does remove the text, it will fill the field back in with a company name or something. As long as there is anything in that field, it will kill the operation and stop them in their tracks.
Hidden Form Field - This snippet puts in a fake hidden form field. The field remains blank and hidden to all users; if a bot fills in the hidden field, it will again stop him in his tracks and he will go no further. I also hid the field using JS, because from what I've been reading, most bots read the HTML and can tell whether an input is hidden or not. I know I could have used CSS, but I wanted to make it easier on the users to install, with as few edits as possible.
Wait Script - This puts in a function so that if they click the "Continue" button too fast, it will kill their registration and stop them right in their tracks. After watching a video of someone using one of the bot programs, I had the idea for the wait script. Most bots will fill out the registration info within a few seconds (usually about 1 - 2 seconds), whereas a human will take between 25 - 60 seconds depending on what is needed for registration. So for the human side of it, there is a JS countdown timer over the "Continue" button that tells them to please wait until the timer is done; once the time is up, it tells them they are okay to click the "Continue" button. In my script, I set the default timer to 15 seconds, but it's not hard to change.
Those were the three key features I put into this system, and if any of them is done incorrectly, it stops the user and tells them what they did wrong. I was asked if I was going to put in a banning system, but I did not want to do that because of the high possibility of false positives. But for the CMS I wrote this for, I did design an admin side: each time the system is set off, it records the date and time, the email they tried using, the username, and the IP, and stores that info in the database so the admin of the CMS can view it whenever he wants. From there, if he feels it was a bot, he has the info needed to block it if he decides to do so.
Just these three things, which were not hard to put together and implement into the CMS, have so far proven 100% successful, going by the responses I have gotten from people using it. Some people reported they were getting 100+ attacks per day, and it has completely eliminated the attacks. Do I believe this is a permanent solution for bots? Absolutely not. But for those getting attacked on their sites, it may be worth a look, since these three areas have been successful so far.
The system I did for the CMS is found here: http://www.clanthemes.com/ftopict-8823-help-stop-bots-from-registering-on-your-site--part-ii.html
Basically, it's only designed for the Evo Xtreme and RavenNuke CMS, but the ideas wouldn't be hard to implement into pretty much any CMS once you figure out the registration steps.
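For what it's worth, the server-side half of those three checks could be sketched like this in plain JavaScript (the "company" field name and the 15-second default come from the post above; everything else -- function name, "honeypot" field name, return strings -- is illustrative, not the actual Nuke CMS code):

```javascript
// Rough sketch of the three registration checks described above.
// post: submitted form fields; session.formShownAt: when the form was
// rendered (ms); now: time of submission (ms). All names illustrative.
function checkRegistration(post, session, now) {
    // 1. Text removal: the visible "company" field is pre-filled with
    //    "Delete All Of This Text!" -- a human empties it; a bot leaves
    //    it alone or fills in an actual company name.
    if (post.company !== '') return 'text-removal failed';
    // 2. Honeypot: the hidden field must stay blank; only bots fill it.
    if (post.honeypot !== '') return 'honeypot tripped';
    // 3. Wait script: bots submit within 1 - 2 seconds; humans take far
    //    longer, so reject anything faster than the 15-second timer.
    if (now - session.formShownAt < 15000) return 'submitted too fast';
    return 'ok';
}
```

Each branch reports what went wrong, matching the post's point that a false positive should be told what to fix rather than silently blocked.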
@coRpSE
Welcome to DD. As a newcomer, "hot" links are generally frowned upon (since they are often nothing more than spam and self-promotion). I have left the URL in your post as text because it is -more or less- on topic.
The "hidden form field" you describe is generally known as a "honeypot." Yes, they're very effective. Timers are also common, effective anti-bot tools.
The "Remove this text" field, however, is likely to cause some confusion among "real" users. The best approaches don't even make themselves known to humans. If you're going to include something like this, I suggest simply using a captcha - they're more straightforward and familiar to users.
coRpSE
08-16-2013, 10:47 PM
@coRpSE
Welcome to DD. As a newcomer, "hot" links are generally frowned upon (since they are often nothing more than spam and self-promotion). I have left the URL in your post as text because it is -more or less- on topic.
Thanks, I did not know; so many forums out there with so many different rules.
The "hidden form field" you describe is generally known as a "honeypot." Yes, they're very effective. Timers are also common, effective anti-bot tools.
The "Remove this text" field, however, is likely to cause some confusion among "real" users. The best approaches don't even make themselves known to humans. If you're going to include something like this, I suggest simply using a captcha - they're more straightforward and familiar to users.
To the surprise of a lot of people, it's the timer that catches most of the false positives. I have had about 60 people register on my site since I put this system on, with little to no issue (even those that barely know how to turn on their computer). Granted, it may confuse some people; that is why, if you forget to remove the text, it tells you what you did wrong, and all you have to do is click back in your browser and fix it. As for the timer, I did not know that was common. In my 11 years of registering on forums and sites and whatnot, I have never come upon one. Granted, they could have had it hidden, whereas I used a visible timer above the "Continue" button, using javascript to display when it's clear to click "Continue".
As for captchas, I have found they are almost worthless now. Captchas may work on some of the older bots, but most of the newer bot programs get right through logic captchas and basic captchas like reCaptcha and the standard captchas found on CMSs. Taking it out of the realm of the basics is what needs to be done, because if everyone keeps doing the same thing, the bots will get by. As long as the methods change things up and expand the ways bots are stopped, it will be hard for the people who create the bot programs to keep them updated and working on a wide selection of sites.
Everyone has their own ideas about ways to stop bots, but I find that captchas are getting outdated and working less and less. In my honest opinion, a lot of those captcha systems are harder than the system I have put together, and less effective. Even on my site, with two forms of captcha (logic and basic) along with email activation, I have been hit by forum bots. That's why I put this system together, and since I put it on I have had 0 issues. Granted, there have been some false positives, but those users were able to get past it easily, with no repeat issues.
djr33
08-19-2013, 07:27 PM
There's a very important distinction to be made: what can a bot do automatically, and what can it do with some specialization/help?
None of what you described would stop a well-programmed bot, based on someone familiar with your website writing a bot for it. But it would stop any generic bot (unless it's particularly clever, not that most are).
In my experience, the best thing you can do is something novel that a generic bot won't expect. Beyond that, don't bother trying to stop a bot that is programmed specifically for your site.
A CAPTCHA (or something like it) is the only real effective way to stop all automated attacks. Certainly there are potential issues of how strong it is, but it's the only method that can actually make a bot unable to use your website, even if a human is there to help it figure out the little tricks (like hidden fields).
So personally, I look at it like this:
1. Do a few things to stop the enormous volume of generic spam. Basically anything will stop that. Those messages are generated by particularly stupid bots that cruise around the internet looking for any kind of form to submit. They're easily stopped.
2. Use human moderation/filtering to deal with the rest. Adding a captcha or something similar can limit more of this, but doing too much will also confuse legitimate users.
In the end, I think that's the most efficient system.
coRpSE
09-04-2013, 09:42 PM
There's a very important distinction to be made: what can a bot do automatically, and what can it do with some specialization/help?
None of what you described would stop a well-programmed bot, based on someone familiar with your website writing a bot for it. But it would stop any generic bot (unless it's particularly clever, not that most are).
In my experience, the best thing you can do is something novel that a generic bot won't expect. Beyond that, don't bother trying to stop a bot that is programmed specifically for your site.
A CAPTCHA (or something like it) is the only real effective way to stop all automated attacks. Certainly there are potential issues of how strong it is, but it's the only method that can actually make a bot unable to use your website, even if a human is there to help it figure out the little tricks (like hidden fields).
So personally, I look at it like this:
1. Do a few things to stop the enormous volume of generic spam. Basically anything will stop that. Those messages are generated by particularly stupid bots that cruise around the internet looking for any kind of form to submit. They're easily stopped.
2. Use human moderation/filtering to deal with the rest. Adding a captcha or something similar can limit more of this, but doing too much will also confuse legitimate users.
In the end, I think that's the most efficient system.
Most bots are not specifically targeting one style of site. In my experience, most automated bots are generalized to target a larger base of sites. One of the most productive and best-selling bots I found is what I used as the basis for what I put together; researching it and watching youtube videos of people using it gave me the ideas for the system. Captchas worked in the past, but automated bots can get by them pretty easily now. Some of the more generic automated bots may be stopped by them, but we are finding they are less and less effective. Right now, on my site, when registering, I have two forms of captcha, and the bots have gotten by both of them. Captchas are getting more and more over-complicated, to the point where people hate them and just leave after failing them because they can't read them.
Example: an admin of a clan site contacted me 3 months ago on the CMS support site I work at; he was referred by his host to talk to me. After talking to him in chat, he told me that he had email activation, a captcha, and a second captcha system installed like I did, and the attacks were not slowing down on his site. He implemented my system, and since then he has had 0 bots register. So when you say a captcha system is the only "real effective way to stop all automated attacks", I would have to completely disagree. Captchas are effective to a degree, as is everything else you can put out, but nothing is ever 100% effective. Even systems like ReCAPTCHA are not 100% effective. When it comes to spammers with human assistance, or who are human, nothing out there will stop them, not even captchas.
All in all, every little precaution that can be taken should be taken if you really want to protect your site, instead of running on the assumption that your site is protected. I always tell people to assume their site is vulnerable, not to let their guard down, and never to believe that anything is 100% foolproof, because anyone with even basic knowledge knows that is BS.
djr33
09-04-2013, 10:59 PM
I stand by what I said. It's about a balance of effort-- complete spam protection involves some human assistance. The question is how to best balance two kinds of effort:
1. The effort involved in manually deleting spam.
2. The effort involved in designing an automated system to prevent spam.
The best answer is the one that involves the least effort as a sum of (1) and (2).
He implemented my system and since then he has had 0 bots register.
That's great, and I believe you. I had a no-protection form on my website a few years ago and started getting 10 spam emails per day. I added an incredibly simple (but unique) anti-spam measure [you can see it on my website if you want], and I have had 0 spam since as well.
It's not very hard to prevent spam like that-- all you need is anything that differs from the expectation of normal websites. In fact, this includes CAPTCHAs. It's better protection in some cases to have something unique than to have a theoretical strong, but common CAPTCHA, because the spammers will be trying to defeat the common CAPTCHA because it's on a lot of websites, while the spammers won't think about your website or design a bot to attack it.
What is important here is the distinction between:
1. Generic, automatic bots that don't specifically target your site.
2. Bots that are specialized for your website.
The methods you've described are not very good for (2), but they will work very well for (1).
Personally, I think this is the most important. There are three kinds of attacks:
1. Generic, automatic bots.
2. Targeted bots.
3. Humans.
You can prevent (1) as I said. You can try to prevent (2) by using something like a CAPTCHA (see below). And you can't do anything about (3).
So when you say a captcha system is the only "real effective way to stop all automated attacks", I would have to completely disagree.
Disagree if you want, but I'm not wrong. It's true that some CAPTCHAs are defeated by bots. It may even be true that the age of CAPTCHAs is over (I'd be happy, as a website visitor) because bots are smarter than them. BUT... that doesn't mean any other method is technically, or theoretically, better at preventing bots. If you held a bot-designing competition, CAPTCHAs are basically the only system out there that bots can't pass but humans can (reliably).
There are lots of methods that prevent type (1) above, but very few that can effectively block type (2). If someone wants to attack your website, you will not be able to stop them with anything aside from some kind of human-targeted puzzle like a CAPTCHA.
The real trick is finding a better CAPTCHA. One interesting type (though possibly hard for humans) is to say "click on the cat" and display 4 images. Computers can't do that very well, at least not yet. (However, some problems: computers can guess, and out of a multiple-choice set of, say, 4 images, they will be right about 25% of the time. That's not great protection.) The issue is that methods like this rely on language, so your visitors must speak English. A solution to that problem would be very useful.
Captchas are effective to a degree, as is everything else you can put out, but nothing is ever 100% effective.
Correct. But they actually block bots, rather than just confusing them. A bot can easily be programmed to circumvent most options, but there is no way to get around a CAPTCHA: it must actually solve it. Therefore, even if it's not 100% effective, it's still better than everything else, which is about 0% effective against a targeted attack.
Even systems like ReCAPTCHA are not 100% effective.
ReCAPTCHA is pretty good. The reason it doesn't work that well is that it is so popular, which makes bot-makers interested in breaking it. The lesson: don't rely on the most popular technology. (Imagine if you could use a special kind of key to lock your website. Now imagine there was a single best kind of key. Now imagine everyone had that exact key. Now imagine how hard it would be to break into all websites -- very easy, once you figure out the key!)
In this case, the situation is exactly the opposite of "safety in numbers". In fact, the internet may be the first place in history where that applies. (And perhaps with diseases.)
When it comes to spammers with human assistance, or who are human, nothing out there will stop them, not even captchas.
Obviously not. But your methods won't stop them either. As you said, nothing will. That's not an argument against CAPTCHAs; it's an argument against trying to stop human spam (type 3 above). The only effective methods there are to: 1) filter suspicious messages (eg, keywords), and 2) deal with the rest manually.
All in all, every little precaution that can be taken should be taken if you really want to protect your site, instead of running on the assumption that your site is protected...
Of course. That's why I discussed a balance of effort above. Nothing is 100% effective, so it is a waste of time and effort to look for a perfect solution. I'm not trying to describe a perfect solution; I'm describing a good (perhaps the best) one. It involves minimizing effort. Here, I make some effort to stop generic bots (by making my website unique), and I may also try to stop specialized bots by adding a CAPTCHA or something like it. For especially well-designed specialized bots, and of course for humans, I don't make any attempt to stop them. That's a waste of time. The best method is applied, and that's done. Whatever extra spam comes through can be dealt with by me (or whoever is helping me on the website) manually. That's how these things work.
Finally, you have to be aware of scale. As a moderator here, I'm aware of just how strong the force of attacking spammers can be. This is a relatively popular website (within the top 3000 in the world), so it's not surprising the spammers make a lot of effort to attack it. We have automated methods in place, but not all spam can be stopped. It just can't be. Therefore, the moderators manually delete spam when it comes (and we do a good job-- it's very often deleted within 5 minutes of being posted, so there's not much incentive to continue posting it!). On a smaller site (my personal website for example), it's just not a problem like that. So if you really want to prove that your methods work as well as you claim they do, you'd need to try them on a giant site like facebook. The point is... your methods wouldn't work well at all, because a specialized bot could destroy them with little effort. A CAPTCHA would work very well, until someone put in a lot of time, and then a specialized bot would eventually be able to break it. In terms of strength, a CAPTCHA is better. In terms of practical purposes, it depends on the situation. The biggest variability in situation is how much spam you would get without any protection, and whether anyone cares enough to design a spam bot for your website.
Most bots are not specifically targeting one style of site.
Depends on the kind of site you have. If you run WordPress, for example, you will be targeted by wp-specific bots at some point.
reinadeoz
09-05-2013, 02:17 AM
In your quest to stop spam, you're kind of spamming the forums. Ironic, huh? Maybe you should edit your first post with new ideas, because every time I see a new post I think I got an approval code, and I didn't. I don't really think posting this many times, when editing threads is allowed, is necessary.
djr33
09-05-2013, 02:42 AM
reinadeoz, how can this possibly bother you? I mean, this is your first post on the forums. Did you subscribe to this thread for notifications before responding?
As far as I can tell, your post just borders on flaming and doesn't contribute to the discussion. You are right that coRpSE has posted similar information a few times, but it hasn't yet reached the point of spam.
everytime I see a new post I think I got an approval code and I didn't...
What do you mean by "approval code"? What are you looking for?
I don't really think posting this many times when editing threads is allowed is necessary.
Actually, new posts are easier to read. These aren't edits to his earlier posts; they're responses to other people's posts. It's a discussion. If you're not interested in it, then, as Daniel implied, make sure you're not subscribed to this thread.
Powered by vBulletin® Version 4.2.2 Copyright © 2021 vBulletin Solutions, Inc. All rights reserved.