Emails harvested through PHP code?
BLiZZaRD
11-17-2006, 10:17 AM
I am in an all out war with the spam bots and email harvesters.
I have deleted ALL email links from my site. I set up a contact form, and that is it as far as the end user can tell.
Here is my question, as I know little about bot harvesters or whatever you call them:
The contact form is all PHP and relies on two files. One holds all the email addresses and sits outside my web root, on the same level as my public_html folder rather than inside it. The other holds the "instructions" for the form and has no reference to the first file.
I also have a php page I made that will check my inbox and display the current number of emails in my various inboxes. Through the source code no email address is visible. You can see the page here (http://cleverwasteoftime.com/cwotmail.php)
These are the only 2 spots where I have all these email addresses on my server. I am still getting a lot of spam to all the addresses, but I don't know the "life" of these spam bots. I just deleted and made the changes over the last couple days. Is it possible for the bots to get to either of these email addresses (specifically worried about the one outside the root folder), and if not does that mean that spam bots have gotten my email address earlier in the week or month and still send mail to it?
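A rough sketch of the setup described in this post, for readers who want to picture it. The file names, the `resolveRecipient()` helper and the example.com addresses are all assumptions made for illustration, not the poster's actual code. The idea is that the form only ever submits a short key, and the key is mapped to a real address on the server:

```php
<?php
// Hypothetical layout (an assumption, not the poster's actual files):
//   /home/user/contact_addresses.php    <- outside the web root, returns an array
//   /home/user/public_html/contact.php  <- the form handler sketched below

/**
 * Resolve a public recipient key (e.g. "support") to a private address.
 * Returns null for unknown keys so visitors cannot supply arbitrary addresses.
 */
function resolveRecipient(array $addresses, string $key): ?string
{
    return $addresses[$key] ?? null;
}

// In a real handler the map would be loaded from outside the web root, e.g.:
//   $addresses = require dirname(__DIR__) . '/contact_addresses.php';
// An inline array stands in for that file so the sketch runs on its own.
$addresses = [
    'webmaster' => 'webmaster@example.com',
    'support'   => 'support@example.com',
];

$to = resolveRecipient($addresses, 'support');
if ($to !== null) {
    // mail($to, $subject, $body);  // send here; never echo $to back to the page
}
```

As long as the handler never prints the resolved address, nothing a harvester can request over HTTP contains it.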
djr33
11-17-2006, 10:27 AM
This is like abstinence after getting AIDS. (Tried to come up with a better analogy, but I'm tired... gimme a break :p)
Hiding the emails won't fix it once it's a problem.
They can't get it from the php source or the file outside the public_html directory.
However, they already have your email. Done.
You have two options--
live with spam, since they now have your email and it's getting spread around to various spammers, companies, etc... it's on some big list.
Or, get a new email address and be more careful next time. I got rid of an old address for this reason.
Welcome to the wonderful world of spam. Ew.
I'd recommend gmail... it has a great spam filter, so you really don't get that much. //sidenote.
By the way, I suppose that over enough time, your email might fade away, but it also seems that once they have it, it'll just keep spreading and keep getting worse. Not really sure what you can do.
BLiZZaRD
11-17-2006, 10:39 AM
Well, it isn't bad yet, which is why I started the whole process. New emails are an option for some/most of the addresses there, but I wanted to stop the new ones from coming in.
If they can't get it through php then I feel a lot better about it.
I understand about the lists and all that, just wanted to nip as much in the bud as I could, and wasn't sure if what I have done is the best way to go. Seems it is though, and I feel comfortable with that.
I have gmail, and all my mail is forwarded to that and that spam filter is great, but I still have all my site only addys on the server and I have to go clean them out once in a while, just wanting to lessen that burden a little :D
Thanks.
djr33
11-17-2006, 10:40 AM
Yep. Sounds good. From what I can tell, you're doing the right thing with all of that.
BLiZZaRD
11-17-2006, 10:57 AM
That's good to hear! :) I am still way behind the learning curve on the web design thing, but I am learning, and for only doing it for about 1 1/2 years I think I am catching on pretty quickly. Of course I would rather not make mistakes, but what better way to learn?
Acey99
11-17-2006, 07:14 PM
You could consider hiding the form from bots
$ua = getenv("HTTP_USER_AGENT");
// 1 if the user agent string contains "bot" (case-insensitive), otherwise 0
$bot = preg_match("/bot/i", $ua) ? 1 : 0;
if (!$bot) {
    # show form
}
else {
    # show nothing
}
Attached is a great browser class that'll help.
Because, of course, every spambot sends a user agent string that says "I'M A SPAMBOT! BLOCK ME!" :p
Acey99
11-17-2006, 08:00 PM
No, every bot either sends a user agent or it doesn't,
and since we can make a "Valid" list we can kinda make an "invalid" list
look at most bots
http://en.wikipedia.org/wiki/Useragent#Bots
what do they have in common?
look at the browsers, what do they have in common ?
Yeah I know The Mozilla Browsers can Spoof them, but isn't that a good thing to see if this script can block bots?
According to Wikipedia they all have a user agent string.
Acey99
11-17-2006, 08:02 PM
add this code to the browser class (near the end)
function isValidBrowser(){
return ( ($this->isIE() || $this->isMoz() || $this->isOpera()) && !$this->isBot() ) ? 1 : 0;
}
Those are innocent do-gooder bots :) Nasty spambots won't be so nice. There's no comprehensive listing of them because they tend to send user-agent strings that mimic those of popular browsers.
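To make the objection concrete, here is a small sketch using the same "contains bot" test as the snippet earlier in the thread. The function name `looksLikeBot()` and the two user agent strings are illustrative assumptions; the second string stands in for what a harvester might send:

```php
<?php
// Same test as the snippet earlier in the thread: flag any user agent containing "bot".
function looksLikeBot(string $userAgent): bool
{
    return preg_match('/bot/i', $userAgent) === 1;
}

// A well-behaved crawler identifies itself ...
$crawler = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
// ... while a harvester can send whatever string it likes (illustrative value).
$spoofed = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)';

var_dump(looksLikeBot($crawler)); // bool(true)
var_dump(looksLikeBot($spoofed)); // bool(false): the harvester would still see the form
```

The check only stops clients that choose to announce themselves, which is the opposite of the group it is meant to stop.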
djr33
11-17-2006, 08:45 PM
Viruses don't play nice. They don't let you block them. That assumption is naïve and wrong.
(Is it the double dots over the i in naive? I always forget...)
It is :)
We're not technically talking "viruses" here. But yes, same concept applies. These aren't "nice" bots. They're not going to tell you what they are or what they're going to do, and they will try everything they can to look like a normal user.
mwinter
11-17-2006, 10:36 PM
Attached is a great browser class that'll help.
No server-side user agent detection code could ever be called "great"; it's even less reliable than attempting it client-side.
Mike
Acey99
11-17-2006, 10:53 PM
the other alternative is to have a "validation" image
Thus needing a db (MySQL)
and the GD Library (installed/compiled) ?
& ok, so it's a decent browser detection class.
Again, read my post on effective CAPTCHAs (http://dynamicdrive.com/forums/showthread.php?t=14168).
djr33
11-18-2006, 04:22 AM
Acey, CAPTCHAs are totally irrelevant here.
We're talking about a document containing email addresses. It would be accessed by a bot. No user is supposed to see it either, so there's no need to include a CAPTCHA that would then allow them to see it.
The question is just how to store the emails on the server and have them available for his needs but not allow spam bots to collect them (or malicious users, for that matter).
The problem was solved in the first post; he was simply asking if what he did was correct. And it was.
BLiZZaRD
11-18-2006, 08:18 AM
Aye, and I felt pretty secure storing the file with the email addys outside the /root/ but I wanted to verify this was indeed inaccessible (without server admin passwords of course)
Acey99
11-21-2006, 05:04 PM
ah, I understand
try html encoding them
or try some encoding method of your own. I use this:
function EncryptText($iVal,$hex=""){
    $out = "";
    $sl = strlen($iVal)-1;
    for($i=0;$i<=$sl;$i++){
        $ch = substr($iVal,$i,1);
        if ($hex) $out .= "&#x".dechex(ord($ch)).";";
        else $out .= "&#".ord($ch).";";
    }
    return $out;
}
bots don't understand &#xNN; or &#NN;
or what to do with them - yet;
you could also write your own encoder/decoder that would accomplish the same thing.
Acey99
11-21-2006, 05:33 PM
other things you can try:
if using a linux/unix box, hide the emails in a . file (like .htaccess)
don't use a file called emails (or have email in the filename)
also hide the file somewhere bots can't get to (like /var/ or /etc), those are bad, but you get the idea.
use base64_encode() / base64_decode() on the text file.
use a database to store the emails, also encrypted with base64_encode() / base64_decode()
I have tons of other Ideas.
djr33
11-23-2006, 01:23 AM
Many bots DO understand that.
Anything you do that you then undo can also be undone by bots.
The problem is solved.
Directories above 'public_html' cannot be accessed.
Easy answer too.
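A minimal sketch of why entity encoding is so easy to undo. `entityEncode()` is a hypothetical stand-in that mimics the decimal mode of the EncryptText() function posted above, and the address is a placeholder; `html_entity_decode()` is the standard PHP function:

```php
<?php
// Entity-encode a string the way EncryptText() above does in decimal mode:
// one &#NN; reference per byte.
function entityEncode(string $text): string
{
    $out = '';
    foreach (str_split($text) as $ch) {
        $out .= '&#' . ord($ch) . ';';
    }
    return $out;
}

$address = 'someone@example.com';   // placeholder address
$encoded = entityEncode($address);  // starts with "&#115;&#111;..."

// A single built-in call reverses the encoding.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($decoded === $address); // bool(true)
```

Any harvester that runs page text through an HTML parser or an equivalent decode step before pattern matching gets the plain address back.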
mwinter
11-23-2006, 01:42 AM
if using a linux/unix box, hide the emails in a . file (like .htaccess)
don't use a file called emails (or have email in the filename)
Obfuscation is not security, and using daft names only makes maintenance harder.
also hide the file somewhere bot's can't get to (like /var/ or /etc), those are bad, but you get the idea.
That's one of the things djr33 suggested at the start of this thread, and BLiZZaRD had already done that, anyway.
use base64_encode() / base64_decode() on the text file.
Why? What's that supposed to accomplish?
use a database to store the emails, also encrypted with base64_encode() / base64_decode()
Base64 encoding is not a form of encryption. It is a means of transmitting 8-bit data across 7-bit protocols (like SMTP). Though it does so without regard for human readability, that is far from calling it a cipher. Again, obfuscation is not security.
Mike
function EncryptText($iVal,$hex=""){
    $out = "";
    $sl = strlen($iVal)-1;
    for($i=0;$i<=$sl;$i++){
        $ch = substr($iVal,$i,1);
        if ($hex) $out .= "&#x".dechex(ord($ch)).";";
        else $out .= "&#".ord($ch).";";
    }
    return $out;
}
This is better than nothing, and has no disadvantages like a CAPTCHA does, but will still be ineffective against most bots.
The problem is solved.
Directories above 'public_html' cannot be accessed.
"Usually" :) Also, it must be noted that this "solution" falls apart if the email addresses are ever actually output :)
Acey99
11-27-2006, 11:17 PM
Obfuscation is not security, and using daft names only makes maintenance harder.
That's one of the things djr33 suggested at the start of this thread, and BLiZZaRD had already done that, anyway.
Why? What's that supposed to accomplish?
Base64 encoding is not a form of encryption. It is a means of transmitting 8-bit data across 7-bit protocols (like SMTP). Though it does so without regard for human readability, that is far from calling it a cipher. Again, obfuscation is not security.
Mike
Hopefully the last post on this..
ZnJhbmtAZnVydGVyLmNvbQ==
do you think a bot will look at that & say - oh look, an email address ?
I don't. And I agree: put it in a non-accessible place like, say, /myfolder & make that folder :)
or you think a bot will look in a DB like Postgres or MySQL for emails and again with the above (ZnJhbmtAZnVydGVyLmNvbQ==) encrypted into it ?
I personally hide them in a weird place - not on my server!
djr33
11-28-2006, 12:12 AM
If a bot knows where to look, the obfuscation will be ineffective, just like a password is once a bot knows it.
However, I do agree that if the obfuscation of the email address makes it not look like an email address, that may help, unless the word "email" appears on the page, in which case the bot will search for it.
Also, in the case of converting to html characters ("encrypting") as opposed to straight text, bots may very well see that as clearly as we/they see plain text, so that isn't doing much.
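The same goes for base64 once a bot bothers to try it. The sketch below is an assumption about how a harvester could work, not a description of any real one, and `findHiddenEmails()` is a made-up name. It decodes anything that looks like base64 and keeps whatever validates as an email address, using the string posted earlier in the thread as input:

```php
<?php
// Pull every base64-looking token out of some text, decode it, and keep
// whatever validates as an email address.
function findHiddenEmails(string $text): array
{
    $found = [];
    preg_match_all('/[A-Za-z0-9+\/]{8,}={0,2}/', $text, $matches);
    foreach ($matches[0] as $token) {
        $decoded = base64_decode($token, true); // strict mode: false on invalid input
        if ($decoded !== false && filter_var($decoded, FILTER_VALIDATE_EMAIL) !== false) {
            $found[] = $decoded;
        }
    }
    return $found;
}

$page = 'contact: ZnJhbmtAZnVydGVyLmNvbQ== (encoded)';
print_r(findHiddenEmails($page)); // the array holds one entry: frank@furter.com
```

No key or secret is involved, so the encoding only protects the address from harvesters that never attempt a decode.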