
Resolved: How do you retrieve a hashed password?



james438
01-25-2013, 03:23 AM
A while back I heard that a web designer should never store a plain text password anywhere. I can see how that can be done, but let's say the visitor to your site has forgotten his password. I would imagine that his password is stored in the database in md5 format. How do you retrieve that password in plain text format so that it can be emailed to the person requesting it?

On another note I just noticed that you can now rate up or down comments on php.net!

I get a kick out of little things like that.

EDIT: On a related note there are md5 decryption programs out there. Would double or triple md5 encryption be a good enough deterrent for the average hacker? Either way, I imagine that just getting hold of any ol' user's hashed password is difficult enough.

traq
01-25-2013, 04:38 AM
...let's say the visitor to your site has forgotten his password. I would imagine that his password is stored in the database in md5 format. How do you retrieve that password in plain text format so that it can be emailed to the person requesting it?

you don't.

not ever. If it is even possible (for you, as an admin/developer, even with full site/DB access), then your website has a critical security flaw.
passwords should be reset, not recovered.

here's a good way to approach it:
user clicks your "forgot password" link.
website generates a random password hash and stores it in the DB, along with the user account to associate it with and the time it was created.
website sends an email with a "password reset" link, which includes the hash (in the URL or query string).
user visits the password reset URL.
website gets the hash and compares it to the DB records.
if there is a match, and it's not too old (they should expire quickly; an hour is more than long enough), the user is allowed to set a new password.
DB record is invalidated.
user logs in with new password.
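
The reset flow above can be sketched roughly as follows. This is a minimal illustration, not a finished implementation: the `$resets` array is a hypothetical stand-in for a database table, the function names are made up for this example, and storing only a SHA-256 digest of the token (rather than the token itself) is an added assumption that goes slightly beyond the steps listed.

```php
<?php
// Hypothetical in-memory stand-in for a "password_resets" table.
// Keys are SHA-256 digests of the token; the raw token only goes into the email.
$resets = [];

// Steps 2-3: create a random token, remember who it belongs to and when it was made.
function createResetToken(array &$resets, int $userId, int $now): string
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters
    $resets[hash('sha256', $token)] = ['user_id' => $userId, 'created' => $now];
    return $token; // goes into the emailed reset URL
}

// Steps 5-7: look the token up, check its age, and invalidate the record.
function redeemResetToken(array &$resets, string $token, int $now, int $ttl = 3600): ?int
{
    $key = hash('sha256', $token);
    if (!isset($resets[$key])) {
        return null; // unknown or already used
    }
    $row = $resets[$key];
    unset($resets[$key]); // single use: the record is gone after one attempt
    if ($now - $row['created'] > $ttl) {
        return null; // expired
    }
    return $row['user_id']; // this user may now set a new password
}
```

In a real site `$now` would come from `time()` and the array would be replaced by database queries; the one-hour default for `$ttl` follows the expiry suggested above.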

Regarding md5, it's "good enough" for little-to-no security sites, but using other algorithms is preferable. SHA-512 or Blowfish (using PHP's crypt() (http://php.net/crypt) function) is best (Whirlpool is a good algorithm too, but it's available through hash(), not crypt()), and you've got the right idea with multiple hashing - even "weak" hashing functions can be made fairly secure this way. It's more like "hundreds," though, not "double" or "triple"; and salting is critical also.

receive the password the user chose.
generate a salt: an arbitrary string unique to the particular user.
A salt should be at least as long as the hash generated by the hashing function you're using.
combine the password and salt.
Some implementations break the salt into pieces and intersperse it with the password string.
You might even introduce more entropy by using different approaches based on the length (or other computed value) of the password.
hash the salted password.
You should do this many times (speed is not an advantage: it should be as slow as practically possible).
You might re-introduce the salt or vary the total number/sequence of hash iterations based on the password/salt (as above).
This is called stretching.

Obviously, you'll have to store the salt in your DB along with the final password hash, so you can recreate this process reliably for each password.
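
One way to get a per-user salt plus stretching without hand-rolling the loop is PBKDF2, which PHP exposes as `hash_pbkdf2()`. The sketch below is only an illustration under assumed parameters: the function names, the SHA-512 choice, and the iteration count are picks for this example, not a recommendation from the thread.

```php
<?php
// Assumed parameters for illustration only; tune the iteration count to your hardware.
const PBKDF2_ALGO = 'sha512';
const PBKDF2_ITERATIONS = 100000;

// Create a fresh random salt and a stretched hash for a new password.
function hashPassword(string $password): array
{
    $salt = random_bytes(32); // unique per user/password
    $hash = hash_pbkdf2(PBKDF2_ALGO, $password, $salt, PBKDF2_ITERATIONS, 0, true);
    // Both values are stored with the user record.
    return ['salt' => bin2hex($salt), 'hash' => bin2hex($hash)];
}

// Repeat the same process with the stored salt and compare the results.
function checkPassword(string $password, array $stored): bool
{
    $salt = hex2bin($stored['salt']);
    $hash = hash_pbkdf2(PBKDF2_ALGO, $password, $salt, PBKDF2_ITERATIONS, 0, true);
    return hash_equals(hex2bin($stored['hash']), $hash); // constant-time comparison
}
```

On PHP 5.5 and later, the built-in `password_hash()` and `password_verify()` functions handle salt generation, stretching, and storage format in a single string, and are usually the simpler choice.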

james438
01-25-2013, 05:50 AM
Ah, I was thinking along the wrong lines as far as password retrieval. It took me about 10 seconds to find a website to decrypt md5. I believe I already am salting my passwords. I think you have been to my site before and know that it is just a casual site mostly for my own use, but others can register and post in the forum or write an article if they so choose. In other words a casual site, so the need for high tech security is not critical. I'm curious if it is server heavy to use crypt() in a 100 or 300+ loop to hash a salted password. I'll try to read up more on the link you gave.

djr33
01-25-2013, 06:03 AM
Traq covered most of it.

My brief thoughts to add:
1. Hashing is one way. That's the design. It allows you to know if the two inputs were the same (technically, only with a very very high probability-- there's potential for two inputs to have the same output, but very very rarely, to the point where it's probably irrelevant-- after all, there are only 16^32 possible combinations and the input can be any length.)
2. There is no "decryption"-- instead, the only way is to compare known pairs-- there are some massive databases out there of MD5 hashes and inputs, so you can do a lookup based just on the database of known pairs (generated by brute force or popularity). So if you have an odd enough input it can't be reversed because they won't have the input.
3. Salting is a good trick.
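
To make point 2 concrete, here is a toy version of such a lookup database. The word list and function names are hypothetical; real tables are simply much larger.

```php
<?php
// Build a tiny "known pairs" table: hash => original input.
function buildLookupTable(array $words): array
{
    $table = [];
    foreach ($words as $word) {
        $table[md5($word)] = $word;
    }
    return $table;
}

// "Reversing" a hash is just a lookup; nothing is decrypted.
function reverseLookup(array $table, string $hash): ?string
{
    return $table[$hash] ?? null;
}

$table = buildLookupTable(['password', '123456', 'letmein']);
var_dump(reverseLookup($table, md5('password')));         // found, because the input was in the list
var_dump(reverseLookup($table, md5('x9!unusual-input'))); // null, because it was not
```

The second lookup fails only because that input was never added to the table, which is why a salt (making every input unusual) helps.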


Edit: saw your latest post, james.
That does sound server intensive to me, and I'm not sure why 100 loops would actually help. There's a point where you're just doing a 32-character string over and over again, so the odds of guessing it won't change at some point, I think. And I don't really see why looping it 2-3 times with salt won't be enough. I'd also suggest adding global salt on your site, just something like your domain name. That way "password" won't have the same hash as it usually would on any other site, regardless of the other aspects of your hash algorithm.

However, traq probably has a good reason for suggesting doing it so many times, so I'm open to hearing that-- I just personally wouldn't have thought it made sense to do it that much.

traq
01-25-2013, 07:02 AM
reason being:

many attacks involve lookup tables: lists of words/phrases and their resultant hashes.
Moving away from one-time hashing of passwords starts to protect against the risk of finding matching passwords right off the bat.

When it comes to brute force attacks, the idea is to make the attempt so slow that there's no point in doing it. A hundred iterations seems like a long time to you, but you're only doing it once every time a user logs in. It only takes a fraction of a second. But if you're trying to crack passwords, you have to expect to spend that "fraction of a second" possibly billions of times for each hash.
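
The scaling argument can be put into rough numbers. Every figure below is hypothetical and chosen only to show how the attacker's cost grows with the iteration count; the function name is made up for this sketch.

```php
<?php
// Rough worst-case time to try every candidate against ONE stolen hash.
function secondsToExhaust(float $candidates, float $secondsPerHash, int $iterations): float
{
    return $candidates * $secondsPerHash * $iterations;
}

// Hypothetical figures, not measurements.
$candidates = 1e9;  // guesses the attacker is prepared to try
$perHash    = 1e-6; // assumed cost of one hash call, in seconds

printf("1 iteration:    %.0f seconds\n", secondsToExhaust($candidates, $perHash, 1));
printf("100 iterations: %.0f seconds\n", secondsToExhaust($candidates, $perHash, 100));
```

The login page pays the per-password cost once per login, while the attacker pays it once per guess, so the multiplier hurts the attacker far more than the site.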

this is a good article (http://www.openwall.com/articles/PHP-Users-Passwords) that explains better than I can.


...I'd also suggest adding global salt on your site, just something like your domain name. That way "password" won't have the same hash as it usually would on any other site, regardless of the other aspects of your hash algorithm.

it's better to have password-specific salts, for the same reason you cite above: if two users on your site have the same password, they'll still have completely different hashes.
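
A small sketch of the difference, assuming a plain salted SHA-256 for brevity (no stretching) and made-up names and values:

```php
<?php
// One salted hash; stretching is left out to keep the comparison short.
function saltedHash(string $password, string $salt): string
{
    return hash('sha256', $salt . $password);
}

$siteSalt  = 'mysite.com';               // same for every user
$aliceSalt = bin2hex(random_bytes(16));  // stored in Alice's DB row
$bobSalt   = bin2hex(random_bytes(16));  // stored in Bob's DB row

// Site-wide salt only: two users with the same password get the same hash.
$siteOnlyMatch = saltedHash('hunter2', $siteSalt) === saltedHash('hunter2', $siteSalt);

// Per-user salts: the same password produces unrelated hashes.
$perUserMatch = saltedHash('hunter2', $aliceSalt) === saltedHash('hunter2', $bobSalt);

var_dump($siteOnlyMatch); // true
var_dump($perUserMatch);  // false
```

With only the site-wide salt, cracking one hash reveals every account that shares that password; with per-user salts each hash has to be attacked separately.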

--------------

Keep in mind that, once you've moved away from plain text, you've defeated the risk of giving someone's password away (intentionally or not).

After that, it's all about limiting the damage in case your hashes are leaked/stolen (a very real concern, especially on shared servers). Once they're stolen, the attacker has a lot of time to crack them, and almost certainly will if they're intent enough. You need to slow them down as much as possible to give yourself time to suspend everyone's accounts and direct them to set new passwords.

djr33
01-25-2013, 08:17 AM
Interesting points. Let's see--

1. I don't see why 100 iterations would be a practical difference. Brute force is very hard to do if you require reasonably long passwords. I can see the argument here, but I don't think it's practical. There are bigger concerns with a site/security, right? But technically, yes, slowing them down would be good.

2. I don't mean you'd only use site-wide salt. Maybe it's overkill to have both. User-specific salt is useful. But having site-wide salt would accomplish the necessary security of making lookup tables irrelevant. If you append "mysite.com" to each password input, then it just means there's no point in looking anything up in a database because it's very unlikely any passwords ending in "mysite.com" will exist there. User-specific salt should be used too. (And perhaps with user specific salt, there's no point in site-wide salt.)

Personally, I think something like this is more than sufficient:

// Hash the raw password, then mix in a site-wide salt and a per-user salt.
$pass = md5($pass);
$pass = md5($pass.'mysite.com');
$pass = md5($pass.$username);
Note that I added the salt in each time to a hash, rather than directly to the password. I don't know if there's any technical reason that's really more secure, but to me it seems less direct, and it certainly won't hurt.
(And you could use any hash algorithm you want-- I'm just demonstrating it with md5, which I think is still secure as long as you stay away from anything that might be in a database. I guess at some point there might be an almost complete database of md5 strings out there (at least up to a certain length), but with a few iterations it still seems relatively secure. Do such complete databases exist? My impression was that they were only based on limited lists of inputs, not brute force to complete databases.)


There's also the security of them not knowing your algorithm, assuming your site is secure in the first place. If it's not, then you can be hacked in other ways anyway. The only problem with that logic I can think of would be for software you distribute to others. In that case, it would be vulnerable because the hacker would know what you're working with. (So, it might be a good idea to modify the hashing algorithm in whatever 3rd party software you use, like PHP forums, etc.)

james438
01-25-2013, 09:20 AM
Sorry djr33, but I could crack the code you posted in a few short minutes using a md5 decryption program I just found. crypt() sounds much more secure. Traq, if it is only or mostly about a delay against brute force attacks it sounds like it would be much less processor heavy if I add a wait command. I will and do salt my admin login encryptions though.

Thank you for the article. There is a fair amount of information to go over there.

djr33
01-25-2013, 04:28 PM
Interesting--
1. The decryption program you found can do any input? Not just a (potentially long) list of frequent ones? If so, then md5 is effectively dead. That's important to know. I was under the impression that md5('password') was weak (while somethingelse('password') might be relatively strong), while md5('random-new-string-never-used-before') was still (relatively) strong.

2. There's also the secrecy factor. Assuming no one can see the code you're using, using md5(md5($s)) might be enough. Or, more reasonably, something harder to guess. Maybe md5(sha1(md5($s))), or something else that mixes things up in an unpredictable way.


There are three pieces to the password puzzle:
--The hashed string (potentially known)
--The original password (unknown, except maybe by brute force)
--The algorithm (best if unknown, only known if they see your code)

And you need any two of those to figure out the other one, which is technically possible although in some cases very difficult (eg, only via brute force). If the algorithm is compromised (and it is known!), then that's a major security flaw indeed.

fastsol1
01-25-2013, 08:06 PM
Here is a good post to read also from another forum that I am on. There are a few of the guys on there that are very good at security and the understanding of it.
https://phpacademy.org/forum/topic/php-security-how-to-safely-store-your-passwords-15356

traq
01-25-2013, 09:02 PM
There are bigger concerns with a site/security, right?
Of course. SSL being the main one; everything you do is useless if it's being sent over HTTP in plain text.

Also using tokens to identify your forms, and rate-limiting login attempts.
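
The form-token part might look something like this. It is a sketch only: the functions take the session array as a parameter so they can be tried outside a web request (in a page you would pass `$_SESSION` after `session_start()`), and the names and the `form_token` key are invented for this example. Rate limiting is not shown.

```php
<?php
// Create a token, remember it in the session, and return it for a hidden <input>.
function issueFormToken(array &$session): string
{
    $session['form_token'] = bin2hex(random_bytes(32));
    return $session['form_token'];
}

// Accept a submission only if it carries the token we issued; each token works once.
function isValidFormToken(array &$session, string $submitted): bool
{
    $expected = $session['form_token'] ?? '';
    unset($session['form_token']);
    return $expected !== '' && hash_equals($expected, $submitted);
}
```

A form posted from another site cannot include the right value, because that site never sees the token stored in the visitor's session.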


2. I don't mean you'd only use site-wide salt. Maybe it's overkill to have both. User-specific salt is useful. But having site-wide salt would accomplish the necessary security of making lookup tables irrelevant.
Actually, I don't think it's overkill. I'm working on a class right now that will use two salts: one that can be determined client-side (though not the same one for everyone, it will be based on the input), and a stronger one that will be stored in the DB. This way, I'll be able to hash it "halfway," send it to the DB, and have the hash completed there. It saves a round-trip (I don't need to query the DB to get the salt).


There's also the security of them not knowing your algorithm, assuming your site is secure in the first place.
Hashing really accomplishes two different goals:

1) it should be impossible to produce a plain-text version of the password, either deliberately or by any sort of injection attack/ social hacking/ other trickery. For this purpose, strength is not as critical.

2) it should be difficult (hopefully "difficult enough" to be considered impractical) to crack the hashes if they are stolen or leaked. That's where a strong algorithm/process is needed.
...and don't just figure "you're not a target." james, I'm sure, knows that that's simply not true of anyone. Beyond that, consider the fact that most people use the same password for everything. That's not your fault, of course, and you can't be expected to be responsible for it, but you've probably got someone's bank or email password in there.


if it is only or mostly about a delay against brute force attacks it sounds like it would be much less processor heavy if I add a wait command.
no; it's processor-time we're talking about, here. Most brute force password cracking efforts take place on stolen hashes, not on your live site - so it's just a matter of not executing the wait, whereas extra processing can't be skipped.

fastsol1, I'm reading that now.

james438
01-25-2013, 09:53 PM
I use a different password/username for everything :) There may be one or two old exceptions to that, but I like to use something different every time I register some place. Hackers like to attack any weak security. If they can attack your site, even if it is the most generic site in the world, they will and repurpose it into whatever they want.

james438
01-25-2013, 11:35 PM
How would you use cookies to save a successful login so that the user can stay logged in even after leaving the site using crypt()? I'm sure the solution will come to me if I think on it a while. I figure that the password is not saved in the cookie in plain text, but using crypt is not making sense either. With md5 it was a lot simpler.

It just seems to me that it would not work to store the password in the database in crypt() format. It would have to be something like md5.

fastsol1
01-26-2013, 01:26 AM
Well from my knowledge and workings, you can do a persistent login with cookies but you would need to re-verify the person when they come back to the site to make sure they are who they say they are. I personally set a cookie for their username and another for their unique id in the db. I don't put their password either plain or hashed in the cookie so that they can't take their known password and the hash and do a reverse crack to figure out the algorithm steps I take. Then when they come back to the site I validate those cookies against the database and then turn everything over to sessions while they are there so they can't do more poking on the security once they arrive. Some sites also validate the IP but that is a little too much for the normal site to worry about cause so many people use mobile devices now that the IP is constantly changing.

I have found and read many of the tutorials for login systems and most if not all really don't do an excellent job of it securely. On my sites I have it re-validate the session even after 30 minutes of inactivity and store all passwords in a hash() whirlpool plus salt and unique salt to the user.

In my opinion, checking the username and user id from the cookies is pretty fool proof as long as you also have the db secured against sql injection cause user names and user ids will always be unique to one person and any trying to get a different combination will simply fail unless they just truly get lucky with the combination. I also require strong passwords with at least one capital letter, at least 7 characters long and at least one number.

Honestly a person could go on and on about this subject but if you take the normal precautions you will be fine. If your site gets big then by all means drastically improve your security so you don't become another victim of those rogue sites that deliberately attack sites just to show people they could, they do have some pretty smart people working there. My understanding is that Facebook uses O-Auth or something like that, that they developed I think, it is said to be very very secure and there are tutorials out there on how to login a person through the system as long as they have a facebook account. In that instance they are using their o-auth to do all the validation.

djr33
01-26-2013, 01:27 AM
Traq, interesting points. Makes sense to me. I think you missed my point about not knowing the algorithm, though. Let's say you use md5(md5(string)) (or anything else). Short of actually recursively unencrypting (via lookup tables), there's no way to work out what the original password is. The only reason that isn't very strong is because it's just using md5 on a loop. It seems very easy to me to create a unique/unknown hashing algorithm. For example, let's say this: md5(string) then reverse the first and last characters of that resulting string. Now, unless someone hacks your server and can see your algorithm they will never* figure out what you're doing. (*You'd actually have to make it less transparent than that-- if they had a password and a hash, then they could work out by just guessing that you did what I described. But if you make it a little less visually obvious, then what I said holds.)

Is this not true? Can you think of any reason that doesn't make sense?


Admittedly, there is one problem: if someone hacks your site, and some of your users use these passwords on other websites, then by hacking your site and working all of this out, the passwords will be compromised for all sites they're used on. But that's the users' fault anyway.



James, you should never allow the hash to give direct access to the account. If you do that, then it becomes the password and there is no decrypting necessary-- they can manually set the cookie to that value (assuming they can get their hands on the hash) and then they will have access.

There are two possibilities as I see it:
1. Use a weaker intermediate hash as a client-side somewhat non-secure login token stored in a cookie. This would then AGAIN go through hashing on the server (for the original point of it), but it would already be hashed on the client's machine so the password could not be read directly. However, aside from privacy concerns, this is actually as non-secure as just storing the password itself in a cookie-- that hash will still allow someone else access to the site. Probably best to avoid this.

2. Don't store the password or the hash anywhere. Upon logging in (that is, when the user submits a password that generates a hash matching the hash on file), then you would create a session (or database entry) and store a value for that user to be logged in. It can be as simple as $loggedin= 1; (or realistically, in a DB query or $_SESSION['loggedin'] = 1;), along with whatever account it is associated with (or if you prefer use the User ID as the value of that variable). Do not store that on the client side obviously, because they could fake that. Store it securely on the server-- the user never sees it (and by the way, unless you print/echo it, sessions are hidden from the user, so that's fine, short of the server being hacked).
Now, the login is based on the session (or DB entry). You will need to still connect the user to that session (or DB entry) and this can be done with a cookie (or by other things like remembering the IP address, but cookies are the most common and probably best way). And sessions automatically make those cookies for you, so you're set.
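
Option 2 could be sketched like this. The `$users` array is a hypothetical stand-in for a database lookup, the function names are invented, and the session is passed in as an array so the code can run outside a web request; in a page you would call `session_start()` and pass `$_SESSION`.

```php
<?php
// $users maps username => ['id' => int, 'hash' => string from password_hash()].
function attemptLogin(array &$session, array $users, string $username, string $password): bool
{
    if (!isset($users[$username])) {
        return false; // no such account
    }
    if (!password_verify($password, $users[$username]['hash'])) {
        return false; // wrong password
    }
    // Only the user id is kept, and only on the server side.
    $session['user_id'] = $users[$username]['id'];
    return true;
}

// Later requests consult the session, never the password or its hash.
function currentUserId(array $session): ?int
{
    return $session['user_id'] ?? null;
}
```

Calling `session_regenerate_id(true)` right after a successful login is a common extra step, so that a session id seen before login is not the one that ends up authenticated.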

There is still a security concern-- if someone steals the session ID cookie from your computer, they can steal your session-- they'll be temporarily logged in as you. However:
1. That only lasts as long as that session. (Note that you wouldn't want to allow direct access to changing your password without re-inserting the old password for this reason!!)
2. You can try to prevent this by using HTTPS.
3. You can blame the user for getting hacked, and it's only for one session anyway.
4. You can require the same IP address for that session, if you'd like. The downsides: i) a network would share an IP address, so a hacker in a coffee shop would still be able to gain access with that session ID; ii) the user will need to keep logging in repeatedly if they're changing IP addresses-- this is rarely relevant but can be for mobile users who are switching between wifi and mobile networks often.
5. The password itself is never stolen or visible, and they cannot control how long that session ID is valid for-- if you can somehow catch this, you could stop it immediately, and the whole time the password is still secure and the user can just log in again. (This is one reason, I think, that banks and so forth automatically log you out after inactivity-- it just makes it all that much harder for someone to steal your session and do anything with it.)

james438
01-26-2013, 03:13 AM
The cookies used on my site are all tied to their ip address, which is stored in the database and updated every time the user logs in. I find that tying cookies to ip addresses is very handy because if I log in to my site on my home computer, but forget to log out of my friend's computer that I was visiting earlier in the day the cookie on my friend's computer will no longer work.

djr33
01-26-2013, 03:33 AM
I've done that on sites before and I like it. I'm a bit hesitant due to mobile users, and there's the thought that multiple users behind one IP are still non-secure, but overall it seems at least beneficial (if not perfect) in many cases. It's a tradeoff of how many times you require users to log in and how much security you want. Often it makes sense, I think.

For the record, you could also impose a 20 minute or 3 hour or 5 day limit on the access regardless of IP, if this is a truly high security website. (That could be extended with activity, or not.)

traq
01-26-2013, 03:38 AM
How would you use cookies to save a successful login so that the user can stay logged in even after leaving the site using crypt()? I'm sure the solution will come to me if I think on it a while. I figure that the password is not saved in the cookie in plain text, but using crypt is not making sense either. With md5 it was a lot simpler.
Don't store the hash (or password, or anything related to it). It shouldn't be exposed, and it's unnecessary anyway.

If they're already logged in, all you need is a unique identifier: the session id works just fine (even if you don't store any info with it).
Set the session cookie to expire some amount of time in the future, and store the id in the DB. You could even store their old session along with it and pick up where they left off.

In short, you're not really using the cookie, or the info it contains, for authentication - you're just making the user's session longer, and interruptable.

Which brings up another good practice - whenever a user tries to perform a sensitive action, you re-authenticate. If they want to view their user cp, and didn't log in _this_ session, they need to confirm their password. If they want to change their email or password, they need to confirm their current password "no matter what."


It just seems to me that it would not work to store the password in the database in crypt() format. It would have to be something like md5.
I'm not sure how you mean. Whatever algorithm you use with crypt() will give you a hash (of some length), just like md5 does.


Well from my knowledge and workings, you can do a persistent login with cookies but you would need to re-verify the person when they come back to the site to make sure they are who they say they are. I personally set a cookie for their username and another for their unique id in the db.
There's no real reason to set two cookies. If one is stolen, it's all but certain the other was also.

How do you verify them (or do you just mean you check the id in the cookie against the DB record)?


I don't put their password either plain or hashed in the cookie so that they can't take their known password and the hash and do a reverse crack to figure out the algorithm steps I take.
Absolutely.


In my opinion, checking the username and user id from the cookies is pretty fool proof as long as you also have the db secured against sql injection cause user names and user ids will always be unique to one person and any trying to get a different combination will simply fail unless they just truly get lucky with the combination.
Instead of giving them the user id and name, give them a token (e.g., hash the user name and id). If you're going to give out the user id in a cookie, you have to assume that it will eventually be leaked - and therefore, can't be trusted for authentication. If you hash the username+id, you might add another unique value to the mix - say, the timestamp when they logged in and/or the browser's UA (user-agent) string. That way, it'll be harder to guess what the hash is composed of (since, every time they log in, the hash in the cookie will change).
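
A rough sketch of such a token follows. Using `hash_hmac()` with a server-side secret is an addition for this example (the post only describes hashing the values together), and the function name, separator, and sample values are all hypothetical.

```php
<?php
// Combine the values mentioned above and key the hash with a secret
// that never leaves the server (an assumption added for this sketch).
function makeLoginToken(string $username, int $userId, int $loginTime, string $userAgent, string $secret): string
{
    $data = implode('|', [$username, $userId, $loginTime, $userAgent]);
    return hash_hmac('sha256', $data, $secret);
}

// The token goes in the cookie and is also stored with the user's DB record.
$token = makeLoginToken('james438', 17, 1359170000, 'ExampleBrowser/1.0', 'hypothetical-server-secret');
```

On a later visit the cookie value would be compared to the stored token with `hash_equals()`. As noted further down the thread, this recognizes the user but does not replace asking for the password before sensitive actions.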


I also require strong passwords with at least one capital letter, at least 7 characters long and at least one number.
These sorts of requirements can be self-defeating: an attacker can narrow the list of possible passwords if they know that they must contain characters of certain classes. A minimum length, absolutely - but "at least one of" and "one of" makes it harder for the user to remember the password, too. I always recommend passphrases (http://xkcd.com/936/), but there's no way to "enforce" any of this without causing more problems.
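
The narrowing can be counted directly. The sketch below assumes passwords of exactly 7 characters drawn from letters and digits only (62 symbols), which is a simplification of the rule quoted above, and uses inclusion-exclusion to count how many candidates satisfy "at least one capital and at least one number".

```php
<?php
$length = 7;

$all     = 62 ** $length; // any mix of a-z, A-Z, 0-9
$noUpper = 36 ** $length; // lowercase and digits only (breaks the capital rule)
$noDigit = 52 ** $length; // letters only (breaks the number rule)
$neither = 26 ** $length; // lowercase only (counted in both groups above)

// Inclusion-exclusion: remove both rule-breaking groups, add back the overlap.
$allowed = $all - $noUpper - $noDigit + $neither;

printf("all candidates:     %d\n", $all);
printf("allowed candidates: %d\n", $allowed);
printf("share still valid:  %.1f%%\n", 100 * $allowed / $all);
```

Under these assumptions roughly 69% of the candidates remain, so an attacker who knows the rule can skip about 31% of the guesses.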


Honestly a person could go on and on about this subject but if you take the normal precautions you will be fine.
Agreed - "normal" meaning "reasonable," of course; not "typical."


... using their o-auth to do all the validation.
OAuth isn't an authentication tool, itself, it's a protocol for allowing third-party authentication (it's for trusting someone else to authenticate). It can be implemented with any login system, really. FB, twitter, most big players implement it.


I think you missed my point about not knowing the algorithm, though... let's say this: md5(string) then reverse the first and last characters of that resulting string. Now, unless someone hacks your server and can see your algorithm they will never* figure out what you're doing.
Of course - in this case, we're talking about "making it difficult" to crack stolen hashes (i.e., limiting/delaying damage from a successful break-in). If someone got your DB, they didn't necessarily get your site also, but it's a good bet.

Additionally, there's some contradictory wisdom:
...you should use stuff designed by people who know what they're doing, rather than trying to do it all yourself.
...if you use something designed by others, the attackers probably know about it (and better than you do) too.

Widely used software (like WordPress, Joomla, etc.) has this disadvantage.


James, you should never allow the hash to give direct access to the account. If you do that, then it becomes the password and there is no decrypting necessary-- they can manually set the cookie to that value (assuming they can get their hands on the hash) and then they will have access.
Right. You can use a "logged in" token to recognize the user (who they're claiming to be), but not to authenticate them (prove who they are).


Here is a good post to read also from another forum that I am on. There are a few of the guys on there that are very good at security and the understanding of it.
https://phpacademy.org/forum/topic/php-security-how-to-safely-store-your-passwords-15356
Good article, for the most part. I don't claim to be a crypto expert (more like a passive enthusiast or something), but a few things in that article don't strike me right.

First off, his attack on "double-hashing" contradicts most advice I've heard anywhere, some of it from people I consider authoritative. "Stretching" is usually considered indispensable for creating "difficult" hashes.

His argument about increasing the probability of a collision doesn't quite make sense, either: if you increase the odds from 1 / (n) to 1 / (n/2), and so forth, by extension you'll eventually reach 1 / (n/n)... essentially, the point where *any* password will match any other password. While that may statistically (eventually) (maybe?) be true, I don't see it as a practical concern. (I think -not 100% sure- that this doesn't hold up anyway: it's like buying 100,000 lottery tickets for consecutive drawings. It doesn't increase your odds. Each time, you're back to square one.)

(Speaking of the lottery, do you know just how big n is? Even with SHA-1, collisions are *highly* unlikely:
A higher probability exists that every member of your programming team will be attacked, and killed, by wolves, in unrelated incidents, on the same night.
I forget where I read that, but it's real statistics.) In any case, the author contradicts himself later on:
PBKDF2 (Password-Based Key Derivation Function v2)
Works off MD5, however this takes the current text and re-hashes thousands of times using md5. So it's actually really good, ...

james438
01-26-2013, 03:44 AM
This thread has become more involved than I originally intended, but it has increased my understanding of hashing and passwords. I have a few ideas now on how I want to improve security on my site. Namely I'll be creating my own hash that is a variant of one of the currently used hashes out there, keep using ip addresses as I have been, stop storing hashed passwords in cookies, and one or two other things that would be best left unmentioned.

EDIT: Just saw your post, traq, and am currently reading it.

EDIT: With crypt() the value is always different.


<?php
$test="password";
$test=crypt($test);
echo "$test";
?>

Now refresh the page a few times.

traq
01-26-2013, 04:18 AM
With crypt() the value is always different.
<?php
$test="password";
$test=crypt($test);
echo "$test";
?>
Now refresh the page a few times.

You're leaving out the $salt argument, so PHP is generating a new (random) salt each time.

more (complicated) info (http://php.net/crypt)


// Passing the same salt each time makes crypt() return the same hash on every call.
$test = 'password';
for( $i=0; $i<10; $i++ ){
    print crypt( $test,'salt' )."<br>";
}
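
Because the salt is embedded at the front of the string crypt() returns, a stored hash can be checked by passing it back in as the salt argument. The sketch below assumes the Blowfish ("$2y$") format with a fixed, made-up salt so that the result is repeatable; a real site would use a random salt per password, and the function name is invented for this example.

```php
<?php
// Fixed bcrypt salt for demonstration only: "$2y$", cost "10", then 22 salt characters.
$salt   = '$2y$10$' . 'abcdefghijklmnopqrstuu';
$stored = crypt('password', $salt); // this string is what goes in the DB

// crypt() reads the algorithm, cost, and salt from the prefix of $stored,
// so hashing the login attempt with $stored as the "salt" reproduces the process.
function verifyWithCrypt(string $input, string $stored): bool
{
    return hash_equals($stored, crypt($input, $stored));
}

var_dump(verifyWithCrypt('password', $stored)); // true
var_dump(verifyWithCrypt('passw0rd', $stored)); // false
```

This is why a crypt() hash works in the database just as an md5 hash does: the comparison is still "hash the input the same way and see whether the results match".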

djr33
01-26-2013, 04:41 AM
James, I haven't tried it, but I can't imagine it is supposed to give different results. I'd suggest checking over that script in some detail. The PHP.net page gives some examples including specifying which algorithm you're using, and that might be important. Admittedly that's one of the most confusing explanations I've seen for a function because it has so many non-argument parameters.


Traq, I looked over that article (only) to check on the potential problem of double-hashing. I'm quite confused by it too. But I think there's a point. Here's how I see it:

--Benefits of double-hashing (or similar techniques):
It means you're getting unique hashes (or at least they're less common). Hopefully those hashes aren't out there floating around in a database somewhere. For this reason, it makes more sense to me to do some weirder things than just double hashing (perhaps, in the middle of it, reverse some characters in the string, or whatever you want). In short, it just adds layers that would need to be unlayered/decrypted for anyone to track it. It also is a serious advantage with them not knowing your algorithm (but don't use just md5(md5()), haha, since that's obvious).

--Problems of double-hashing or more:
Every non-initial iteration has the same kind of input, a hash. Assuming that similar inputs generate similar hashes*, the odds of two hashes giving the same output hash are higher than that of two unrelated passwords in the first place. Therefore, "collisions" themselves aren't the problem, but that there will be multiple original passwords that could give you access. With brute force, that just means that it's relatively faster to find a password that happens to work, even if it isn't the same one originally entered by the user-- but because of the collisions it still looks the same to your server.
(*I have to assume that this is part of the concern. If two almost identical inputs are NOT more likely to give identical hashes than two completely unrelated inputs, then I can't see why this would be less secure in that sense. A perfect hashing algorithm would give completely unrelated results. But maybe the algorithms aren't perfect?)
Alternatively, there's another possibility, which might be what they were focusing on in the link. By iterating the algorithm many times, the odds increase that one of the loops generates the same hash as a previous loop. If that happens, then there will be an infinitely repeating pattern. (Think back to long division, where eventually the numbers began to repeat and you could stop because you knew it would keep repeating.) Regardless, I'm not sure why this would be less secure than only one iteration, although it may be asymptotically less beneficial to add more and more iterations. (Maybe if there might be any sort of unbalanced nature to the algorithms (one output is more probable than another) this could eventually lead to potential collisions and many inputs with the same output.)

Beyond that, my brain can't really handle working out the probabilities at the moment.



From another perspective there is a legitimate (theoretical, not practical) concern here-- by re-hashing, you are limiting it to the 16^32 possible inputs that hashes might be. Therefore, that's fewer than the N^L (N=possible symbols; L=length) passwords that could have been used. So on purely mathematical grounds, there are fewer possible inputs, fewer possibilities for brute force, etc. The problem with this concern is that 16^32 is a huge number! And even if that's not "enough", then we have to concede that most passwords are less than 32 characters anyway. The result is something like 100 symbols with maybe up to 16 places = 100^16. That's 1e32, and 16^32 is 3.4e38. Admittedly, the possibility of using 100 characters is helping the shorter passwords come somewhere near the scale of how many possible hashed strings there are, but it's still 6 powers of ten away-- that's like one million compared to one. The point is... although it's "limited" in a theoretical sense, it's still plenty big for the real world.
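The size comparison above can be checked with a few lines of PHP. This is only a rough sketch: the figures of 100 symbols and 16 places are the illustrative assumptions from the paragraph, not measurements of real passwords.

```php
<?php
// Number of distinct 32-character hexadecimal strings (the MD5 output space).
$hashSpace = pow(16, 32);
// Number of distinct 16-character passwords drawn from 100 symbols (assumed figures).
$passwordSpace = pow(100, 16);

// Compare orders of magnitude via base-10 logarithms.
$hashDigits = 32 * log10(16);        // roughly 38.5
$passwordDigits = 16 * log10(100);   // 32

printf("hash space:     %.2e\n", $hashSpace);
printf("password space: %.2e\n", $passwordSpace);
printf("difference:     %.1f orders of magnitude\n", $hashDigits - $passwordDigits);
```

The gap of about six orders of magnitude is where the "one million compared to one" figure comes from.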
A similar issue is that of using sha1(md5(x)). Sure, using md5() in there limits the amount of information going into sha1(). But it's still as powerful as the original md5(). It's not LESS powerful than the md5().

Basically... there are some mathematical issues here and if you started to iterate thousands of times you might find that the more iterations were each less and less effective. But I don't see any of this actually compromising the hash so that it's worse than only hashing once. The only way that would be the case is if the hashing algorithms are not properly distributed and some results are more likely than others. But still, that would probably only cause issues at very large numbers of iterations.




Traq, as you said the odds do go way up as you repeat more times that you would eventually happen to land on the right hash somewhere in the iterations. But the way for it to actually work is to land on the right hash at the end of the iterations. I don't see how having the right hash after 50 iterations would be at all helpful for you to find the right one after 100. It might seem like you're "close" if you did get the right hash at the wrong time, but when you actually try to use that to access the server, the server won't care that it's close at one point in time-- it'll just check the last value, which likely won't be that one. The weakness would only apply if looping began to occur, which again only would happen if it was a badly designed hashing algorithm, I think. (Technically it might necessarily loop, but that's going to be after probably billions of iterations.)

traq
01-26-2013, 05:02 AM
by re-hashing, you are limiting it to the 16^32 possible inputs that hashes might be. Therefore, that's fewer than the N^L (N=possible symbols; L=length) passwords that could have been used. So on purely mathematical grounds, there are fewer possible inputs, fewer possibilities for brute force, etc.
true. but the second (third, fourth, nth) hash does not have fewer possible outcomes than the first.


The problem with this concern is that 16^32 is a huge number!
Exactly.


And even if that's not "enough", then we have to concede that most passwords are less than 32 characters anyway.
Exactly.


the odds do go way up as you repeat more times that you would eventually happen to land on the right hash somewhere in the iterations. But the way for it to actually work is to land on the right hash at the end of the iterations. I don't see how having the right hash after 50 iterations would be at all helpful for you to find the right one after 100.
Exactly.


Something else to consider: part of the reason that hashes are non-de-hash-able (or, like, whatever) is that they're lossy. Flip this bit, rotate these ones, log() that one, throw away every fifth value that is a power of two. (Random example, of course, not any real algorithm.)

james438
01-26-2013, 05:09 AM
My posts are now almost off topic. If you want I can start a new thread. Continuing on with my current area of confusion:

I notice that:

<?php
$test="password";
$test=crypt($test,'saltt');
echo "$test";
?>

produces the same results as:


<?php
$test="password";
$test=crypt($test,'salt');
echo "$test";
?>

djr33
01-26-2013, 05:45 AM
James, I don't mind a couple topics in one, since it's all going in the same direction. But let us know if we're being distracting :)

As for your test code, I really don't know what's going on there. It sounds like something may be going wrong with your server. Maybe check the configuration for crypt()... traq, any ideas?


Traq, three things to add. I've been doing quite a bit of reading (including getting stuck in a Wikipedia loop for a while):
1. I figured out what wasn't making sense earlier. Hash algorithms aren't perfect.
http://en.wikipedia.org/wiki/Random_oracle
A "Random Oracle" would be perfect. But it doesn't (can't?) exist:

In cryptography, a random oracle is an oracle (a theoretical black box) that responds to every query with a (truly) random response chosen uniformly from its output domain, except that for any specific query, it responds the same way every time it receives that query. Put another way, a random oracle is a mathematical function mapping every possible query to a random response from its output domain.
...
No real function can implement a true random oracle.
This means that all hashing algorithms (that exist, potentially ever could exist) are biased in one way or another. Therefore, more iterations means more impact of that bias. I'd assume that asymptotically, the result of md5(md5(...)) would end up at the same/similar values regardless of the starting input. But... these are HUGE numbers. Doing two or three iterations won't hurt at all I don't think. And doing 100 or 1000 might not. Doing 1 billion might.

2. There's an important detail to be added to your point above:

Something else to consider: part of the reason that hashes are non-de-hash-able (or, like, whatever) is that they're lossy. Flip this bit, rotate these ones, log() that one, throw away every fifth value that is a power of two. (Random example, of course, not any real algorithm.)
There's no non-theoretical reason for knowing someone's actual password* (and you're right; that's hard). Lossy means we can't deterministically work out what the input was because there isn't enough information in the output to reconstruct it. But that doesn't really matter-- we're not trying to reconstruct the input; we're trying to reconstruct an input. Still, what you said probably applies, but at the most abstract level, there is enough information in the hash to reconstruct an input (via brute force) given what we know about it: it is some value X such that as an input to the algorithm it generates that hash. Even given the strongest brute force systems you can imagine and infinite time, you'd never be able to determine the original password-- it's lossy and there's no way to know; but you could find some password that works just as well.

In some sense, the lossiness is bad because then we're not using as much information as we could to make a complicated hash (and by coincidence there may be other passwords we could use equally well). On the other hand, the lossiness is a good thing because it obfuscates the original input and makes it harder to guess what's going on. Part of hashing well, I'd imagine, is not leaving any clues in the form of the hash that tell you about the algorithm or the input. Having a set length output for all inputs is useful in that.

(*The only practical reason to know someone's real password is so you know what they used-- perhaps to know more about them (what is their favorite pet's name, anyway!?) or more importantly to gain access to their other accounts if they use the same password. Having an algorithmically equivalent alternative only works with the same algorithm-- so if you want access to their email or bank accounts, you'd need to know the original password unless you're hoping the email or bank websites use the same exact algorithm as the one you hacked.)

3. Aside from lookup tables (or "rainbow tables"), the weaknesses of MD5, SHA1 and other algorithms are due to collision attacks. That is, in less time than brute force, people have managed to find an input Y that has the same hash as another input X. This is problematic when it could be used as a method to create verifying but fake data (eg, a checksum for a download would match but the data would be off, and might contain a virus) or to verify something like a security certificate. However, it doesn't appear to actually directly relate to what we're doing here with passwords. It means it might take less time to come up with a password that works, but it doesn't sound like it actually "breaks" the algorithm for use in passwords as it is used here. But.... I'm not sure about that. Basically the descriptions focus on the attacks being used for other purposes and only hint at it maybe not being relevant for passwords. The lookup tables, to the extent that they do exist, are the real problem for these algorithms.

traq
01-26-2013, 06:33 AM
My posts are now almost off topic. If you want I can start a new thread. Continuing on with my current area of confusion:

I notice that:

<?php
$test="password";
$test=crypt($test,'saltt');
echo "$test";
?>

produces the same results as:


<?php
$test="password";
$test=crypt($test,'salt');
echo "$test";
?>

The salts need to be constructed in a specific way to implement particular hashing methods. "just a string" leads crypt() to use DES, which only cares about the first two characters (try "alt" and "always", "salt" and "safety"). The way you need to format the salt is complicated. I've used implementations built by others, but I'm still at the stage where trying to use crypt() directly frustrates me. But I'm gettin' there. That big blue box at the top of the crypt() man page bears reading a few dozen times.
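A quick way to see the two-character behaviour is to compare a few salts side by side. This is a minimal sketch that assumes crypt() falls back to standard DES for plain-string salts, as described above; the salt strings are the ones suggested in the post.

```php
<?php
// With a plain-string salt, crypt() uses standard DES,
// which reads only the first two characters of the salt.
$a = crypt('password', 'salt');
$b = crypt('password', 'saltt');    // same first two characters: "sa"
$c = crypt('password', 'safety');   // also "sa"
$d = crypt('password', 'alt');      // different first two characters: "al"

var_dump($a === $b);          // same hash as 'salt'
var_dump($a === $c);          // same hash as 'salt'
var_dump($a === $d);          // different hash
var_dump(substr($a, 0, 2));   // the two salt characters sit at the front of the hash
```

Because the salt is written at the front of the result, the stored hash can later be passed back to crypt() as the salt argument when checking a password.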

james438
01-26-2013, 08:01 AM
If I am understanding you correctly with salting crypt I need only concern myself with the first two characters, but that there are ways to get crypt() to use more or less than two characters or different characters by formatting it differently and that currently you are not fully familiar with crypt() syntax to do that, correct?

djr33
01-26-2013, 08:18 AM
The crypt() function is weird. Look at some of the examples on the PHP page.

The "salt" parameter not only allows actually adding salt, but it also allows you to insert special instructions on how the algorithm (or which algorithm?) is used. There are also constants you can use to select which algorithm is the default.

This is why I said earlier that the crypt() function is one of the strangest or most difficult I've seen. It looks useful and in the end not all that complicated, but I don't get why these things aren't just easier-- give it three arguments and be done with it :)

james438
01-26-2013, 08:50 AM
I have looked over that page several times, but I may have been skimming it as well. Several terms are new to me as security is less of a hobby for me than other aspects of coding. I'm getting there too, but slowly. I have to push myself harder to learn this than with other topics, but it is important that I learn at least a little more than I already know so that I can improve the security I have so that it is acceptable for the type of website I have.

traq
01-26-2013, 08:55 PM
The crypt() function is weird.
Very much so.


The "salt" parameter not only allows actually adding salt, but it also allows you to insert special instructions on how the algorithm (or which algorithm?) is used. There are also constants you can use to select which algorithm is the default.
which algorithm. And actually, you can't use the constants to define which one - it's determined by the format of the $salt. The constants only allow you to check which algorithms are available on your system.
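As a rough illustration of "the format of the $salt decides", the sketch below builds one Blowfish-style and one SHA-512-style salt. The fixed salt strings are hypothetical placeholders for the example only (real salts should be random per user), and the "$2y$" prefix assumes PHP 5.3.7 or later.

```php
<?php
$password = 'password';

// Hypothetical fixed salts, for illustration only.
$bcryptSalt = '$2y$10$' . 'abcdefghijklmnopqrstuv';           // "$2y$", cost 10, then 22 salt characters
$sha512Salt = '$6$rounds=5000$' . 'examplesaltvalue' . '$';   // "$6$", optional rounds, up to 16 salt characters

// The CRYPT_* constants only report availability (1 or 0); they do not select anything.
if (CRYPT_BLOWFISH === 1) {
    $bcryptHash = crypt($password, $bcryptSalt);
    echo $bcryptHash, "\n";   // begins with the "$2y$10$" prefix
}
if (CRYPT_SHA512 === 1) {
    $sha512Hash = crypt($password, $sha512Salt);
    echo $sha512Hash, "\n";   // begins with the "$6$" prefix
}
```

Single quotes matter here: inside double quotes PHP would try to interpolate the "$" sequences as variables.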


I have looked over that page several times, but I may have been skimming it as well. Several terms are new to me as security is less of a hobby for me than other aspects of coding. I'm getting there too, but slowly. I have to push myself harder to learn this than with other topics, but it is important that I learn at least a little more than I already know so that I can improve the security I have so that it is acceptable for the type of website I have.
As implied above, "it's not just you." It's weird. It's a heavily nuanced function that wasn't designed well in the first place and has been fiddled with a lot since then. I don't really like it, but it is the one you need to use. :)

djr33
01-26-2013, 09:20 PM
which algorithm. And actually, you can't use the constants to define which one - it's determined by the format of the $salt. The constants only allow you to check which algorithms are available on your system.
Huh, ok. I did not get that out of the page. Thanks for clearing it up. It seemed weird that we'd have to use constants (but there are a few functions like that). Usually I can read a php.net page and generally understand it :rolleyes:

And I agree-- this looks like the best one to use, even if it's hard to use. And it's not that hard-- you just need to set up the salt once with your favorite algorithm, then you can cut and paste without worrying about the details.

traq
01-27-2013, 04:09 AM
Usually I can read a php.net page and generally understand it :rolleyes:

I know exactly what you mean. That damn function makes my head hurt.

james438
01-31-2013, 03:31 AM
Update:

database: The password is stored in the database hashed and salted with crypt with two characters removed. The username is stored in plain text as well as their ip address.

Cookies: When the member logs in their username is stored in a cookie in plain text. The password is stored in a separate cookie, which is a crypted, salted, combined with the user's ip address, and then has two characters removed. Both cookies need to be correct. The password one is tied to the user's ip address. If a user logs in on a different computer with a different ip address then the cookie on the first computer will no longer be valid.

I don't see any reason to encrypt a member's username.

Question: When a person registers their username an email is sent which tells them what their username and password is. How do you use email confirmation?

Note: If I manually alter a cookie (you can do this quite easily with the Opera browser) so that the username is a valid username, but the password is one that is used by someone else the user can still post just fine. I'll need to correct this. If you try to log in with a correct username and someone else's password you won't be logged in. The cookie has to be modified manually in order to do this.

traq
01-31-2013, 05:18 AM
database: The password is stored in the database hashed and salted with crypt with two characters removed. The username is stored in plain text as well as their ip address.
cool.


Cookies: When the member logs in their username is stored in a cookie in plain text. The password is stored in a separate cookie, which is a crypted, salted, combined with the user's ip address, and then has two characters removed. Both cookies need to be correct. The password one is tied to the user's ip address. If a user logs in on a different computer with a different ip address then the cookie on the first computer will no longer be valid.
Is this the same hash that you store in the DB?

If so, create it some other way. You shouldn't be handing that out.

There's really no reason for a second cookie anyway. It doesn't add any security, because if one cookie is compromised, then it's likely *all* cookies were compromised.

Check the password when the user logs in, then forget about it. All you need for page-to-page visits is to remember that they did log in, and you can do that with $_SESSION['is_logged_in'] or similar. Don't worry about verifying their identity again unless they're about to do something sensitive, and then, simply ask them to give their password again.
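A minimal sketch of that "check once, then remember" idea follows. The function names are made up for illustration, the session is passed in as an array so the logic can be tried outside a web request, and hash_equals() assumes PHP 5.6 or later.

```php
<?php
// Hypothetical helper: check the password once and record the result in the session.
function attemptLogin(array &$session, $submittedPassword, $storedHash)
{
    // crypt() reads the algorithm and salt back out of the stored hash.
    $candidate = crypt($submittedPassword, $storedHash);
    $session['is_logged_in'] = hash_equals($storedHash, $candidate);
    return $session['is_logged_in'];
}

// Hypothetical helper: later pages only ask whether the flag is set.
function isLoggedIn(array $session)
{
    return isset($session['is_logged_in']) && $session['is_logged_in'] === true;
}

// On a real page this would look something like:
//   session_start();
//   attemptLogin($_SESSION, $_POST['password'], $hashFromDatabase);
```

Nothing derived from the password is handed to the browser; the cookie PHP sets for the session carries only the session id.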


I don't see any reason to encrypt a member's username.
no.


Question: When a person registers their username an email is sent which tells them what their username and password is. How do you use email confirmation?
Don't send them their password. Two approaches:

1) they choose a password when they register, and have to remember what it was.

2) they choose a password afterwards - when you send them their email confirmation, include a "first-time" log in link (exactly the same as if they'd clicked on "I forgot my password", like we discussed earlier). Then, they choose a password for themselves.

james438
01-31-2013, 06:30 AM
Is this the same hash that you store in the DB?

I am happy to say it is not :). Or rather it is the same one from the database, but that hashed password is first salted, encrypted with the ip address, and has two characters removed so that the hash stored in the cookie is not the same as the one in the database.

Thanks for the feedback! It sounds like I am on the right track :). I use method 1 currently, but I can't help but wonder if method 2 is better.

I was trying to remember why I even have the member's username stored in a cookie then I remembered that it is used to remember the theme he is using for the site whether it be one of the premade ones or one he designed himself. It really is not needed. I'll try to combine the two cookies, but not right away, because there are several files I need to go through. Not a big deal.

I should add that I use cookies as opposed to sessions. A long time ago I recall that I had trouble getting the session to last past the restart of a user's computer. I want users to be able to stay logged in without having to log back in every day or every month even. I have always found it annoying to have to log back into a site multiple times a day and don't want that for my site. I would prefer to counter that with other measures such as limiting the number of articles a user can post in a 24 hour period, limit the number of registered users per ip address, etc. I also sanitize my sql queries. I have several other security measures in place as well.

djr33
01-31-2013, 04:16 PM
I agree with what traq said. Not much to add, but--

database: The password is stored in the database hashed and salted with crypt with two characters removed. The username is stored in plain text as well as their ip address.
Removing two characters seems bad to me. If you have a 32 character hash, then you have 16^32 possibilities. If you have removed two characters, then you have 16^30 possibilities, which is significantly smaller. (The former has 256 times as many possibilities as the latter.)
Sure, 16^30 is still a huge number. But this means that any password that has a hash that has the same first 30 digits, regardless of the last 2, will count as the right password. That makes it a lot easier to hack with brute force.

Instead of removing any characters, I'd suggest doing some other kind of manipulation, perhaps reversing the string of the hash, or switching several characters, etc. Or, if the hash algorithm by itself is strong enough (in a way that MD5 is not), then you may not need to do anything (except that salt is always a good idea).

james438
02-03-2013, 05:55 AM
I took your advice to heart and stopped shortening the hash by two characters and have instead switched several characters. This was complicated because I then had to create a password reset script using email confirmation for the people that have registered on my site. I've been at it the last few days and I have just finished writing it. At least I can't seem to find any more bugs. I do still need to update my site so that the salt can be changed with one file edit instead of about 6. I also need to add some notation to the password reset script.

On a different aspect of security, is it much of a security risk to let users register their username, email, and password and be instantly logged in (usernames and passwords must still be unique) or would it be better to sacrifice a little convenience for increased security by requiring the member to confirm their registration via email?

The reason I removed the first two letters from the encrypted passwords is because the first two letters were the salt letters. That seemed a bit of a security risk to me and I am a little puzzled as to why crypt() behaves that way. I certainly agree that removing the letters just made brute force a whole lot easier, which is why I put the two characters back and opted for moving some characters around instead.

djr33
02-03-2013, 06:30 AM
I took your advice to heart and stopped shortening the hash by two characters and have instead switched several characters. This was complicated because I then had to create a password reset script using email confirmation for the people that have registered on my site. I've been at it the last few days and I have just finished writing it. At least I can't seem to find any more bugs. I do still need to update my site so that the salt can be changed with one file edit instead of about 6. I also need to add some notation to the password reset script.
It would actually be more work for you, but you could allow them to continue with the old passwords by using both algorithms based on a date stored somewhere in your database (or just a binary value of which password system they're using). You should recommend that they upgrade for security, but it would allow them to not need to redo everything. What you did is probably fine, though.
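One way the "both algorithms" idea could look is sketched below. Everything here is an assumption for illustration: the record layout, the 'old'/'new' flag, and the function names are hypothetical, the old scheme is taken to be plain crypt() output without any character shuffling, and random_bytes() needs PHP 7 or later.

```php
<?php
// Hypothetical helper: a random 22-character salt in the alphabet bcrypt expects.
function newSalt()
{
    return substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);
}

// Hypothetical user record: 'scheme' marks which system produced 'hash'.
function checkPassword(array &$user, $password)
{
    // crypt() reads the algorithm and salt back out of the stored hash.
    $ok = hash_equals($user['hash'], crypt($password, $user['hash']));
    if ($ok && $user['scheme'] === 'old') {
        // Correct password under the old scheme: re-hash it with the new one
        // now, while the plain text is briefly available.
        $user['hash'] = crypt($password, '$2y$10$' . newSalt());
        $user['scheme'] = 'new';
    }
    return $ok;
}
```

Users migrate silently the next time they log in, so nobody is forced through a password reset.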

On a different aspect of security, is it much of a security risk to let users register their username, email, and password and be instantly logged in (usernames and passwords must still be unique) or would it be better to sacrifice a little convenience for increased security by requiring the member to confirm their registration via email?
There's no security issue at all*. It's just a matter of spam. Like several other measures you can take (eg, a CAPTCHA), requiring email verification will increase the difficulty for spammers.
(*At least there's none in theory. It's imaginable that a badly written script could do something odd like let them log in without an account if they can find some odd loophole.)
Also, usually sites require that you retype your password to actually log in. I'm not really sure why that is. It seems that as long as it's soon after registering, there's no real advantage there. Perhaps it's to be sure they don't forget the password (and they can store it in their browsers).


The reason I removed the first two letters from the encrypted passwords is because the first two letters were the salt letters. That seemed a bit of a security risk to me and I am a little puzzled as to why crypt() behaves that way. I certainly agree that removing the letters just made brute force a whole lot easier, which is why I put the two characters back and opted for moving some characters around instead.
They were always the same output? In that case they don't add much security. But there shouldn't be any recognizable relationship between character-location in the string and the input string...

james438
02-03-2013, 07:20 AM
They were always the same output? In that case they don't add much security. But there shouldn't be any recognizable relationship between character-location in the string and the input string...

The first two characters of the hash were always the salt characters used. I tried this several times. It was a little confusing why this would be and a little annoying too. I figure it should be fine if the hash characters are rearranged a little.

As far as the registration part I can see spam as a potential issue, but I suppose it is fine for now. I can work out various anti-spam measures later when spam becomes more of an issue.

I'm going to mark this as resolved now :).

djr33
02-03-2013, 07:30 AM
As you've seen, updating a password hash algorithm can be problematic, but updating anti-spam measures can be done at any time (including a verify-by-email requirement). Sounds fine to me.

Not sure about the salt issue though-- maybe traq does.

traq
02-03-2013, 08:35 AM
...convenience?

I dunno. It's not necessarily a security risk, though - the salt isn't a decryption key, it's just extra entropy for the hash algorithm. And you have to store it somewhere, and it's typically stored close to the hash (in the same DB) anyway.

james438
02-03-2013, 06:58 PM
If you are interested would either of you be willing to register on my site here: http://www.animeviews.com/loginuser.php and then log out. Pretend that you can't remember your password and try resetting it? It works ok when I try it, but I want to be sure that it is working. I can delete your account afterwards if you want.

traq
02-04-2013, 12:12 AM
seems to work just fine the first time.

If I use the password recovery link a second time, it gives me the "choose a new password" form, but then says "check your email" after I submit it (the password is *not* changed). Using the password recovery link more than once, I would expect to get an "expired link" notice and an option to send a new "forgot password" email.
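For what it's worth, the single-use behaviour could be sketched roughly like this. It is not how the site under test works; the array stands in for a database table, all names are hypothetical, and random_bytes() assumes PHP 7 or later.

```php
<?php
// In-memory stand-in for a table of reset requests (hypothetical structure).
$resetRequests = array();

function createResetToken(array &$store, $userId, $now)
{
    $token = bin2hex(random_bytes(16));         // random value that goes in the emailed link
    $store[hash('sha256', $token)] = array(     // only a hash of the token is stored
        'user' => $userId, 'created' => $now, 'used' => false,
    );
    return $token;
}

function redeemResetToken(array &$store, $token, $now, $maxAge = 3600)
{
    $key = hash('sha256', $token);
    if (!isset($store[$key])) {
        return 'unknown';
    }
    if ($store[$key]['used'] || $now - $store[$key]['created'] > $maxAge) {
        return 'expired';                       // show "expired link" plus a way to request a new one
    }
    $store[$key]['used'] = true;                // invalidate so the link works only once
    return 'ok';
}
```

Running the same kind of check before showing the "choose a new password" form, and not only when it is submitted, would avoid the confusing second pass.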

The "please check your email" and "your password has been reset" messages are dead ends - just the text, no formatting, no menus, no "Home" button, nothing. I assume this is just temporary, and that the messages will show on normal site pages eventually.

The one thing I **do not** like is that my password is emailed to me *in clear text* when I register.

Otherwise, well done :)

james438
02-04-2013, 02:23 AM
Yeah, I meant to remove that emailed password/username thing. Using the password recovery form should always work though. I'll try debugging it some more. Or were you talking about trying to reuse the same emailed link multiple times?

The blank pages with simple text messages is a place holder. Thanks for the help and encouragement everyone. I'll let you know if I find any bugs.

traq
02-04-2013, 05:57 AM
I was talking about reusing the same link.
It didn't work, and that's a good thing - but it did seem confused. :)

traq
06-16-2013, 07:27 AM
UPDATE:: PHP 5.5 will have native password hashing functions (http://php.net/intro.password) that will make this conversation (and all associated confusion) obsolete!

BONUS:: there is a userland compatibility patch (https://github.com/ircmaxell/password_compat) that you can use now (w/version 5.3.7+)!!

functions (same names/signatures for native and userland functions):

<?php
/**
* password_hash(): Hash the password using the specified algorithm
*
* @param string $password The password to hash
* @param int $algo The algorithm to use (Defined by PASSWORD_* constants)
* @param array $options The options for the algorithm to use
*
* @return string|false The hashed password, or false on error.
*/
// example usage:

$hash = password_hash( 'password',PASSWORD_BCRYPT );
// returns something like "$2y$10$a6o9xrystDhNxm3PAxaS5.GxojspgIrhgb5tFSey7aIHHtzQCWxKK" (the salt is random, so it differs every time), ready to save in your DB!

/**
* password_verify(): Verify a password against a hash using a timing attack resistant approach
*
* @param string $password The password to verify
* @param string $hash The hash to verify against
*
* @return boolean If the password matches the hash
*/
// example usage:

$match = password_verify( 'password',$hash );
// returns TRUE - the password and hash match! Log them in!

############

// other functions are useful, but less immediately so:

// password_get_info(): Get information about options used to create a hash.
// password_needs_rehash(): Determine if the password hash needs to be rehashed according to the options provided

Celebrate!

james438
06-17-2013, 03:22 AM
It was actually easy to install on my shared hosting account. It is a single php file that I can include (or require).

Just one question. How should I set the salt for the PASSWORD_BCRYPT used?

Thanks for the useful tip!

traq
06-17-2013, 04:59 AM
The third param of password_hash() is an associative array, $options. You can pass a specific salt to the function like so:
<?php

$hash_with_my_own_salt = password_hash( 'password',PASSWORD_BCRYPT,array( 'salt'=>'someUnique22charString' ) );

However, if you don't include a salt, one will be generated automatically. It is pretty well-implemented, so I'd recommend allowing the function to generate its own salts. Also, password_hash() returns the same value as crypt() (http://php.net/crypt), so you don't need to store the salt separately (because it's included in the hash, password_verify() automatically knows what salt and algo to use).
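A short sketch of the "let the function pick the salt" approach is below. The cost values are arbitrary examples. Worth knowing for later PHP versions: the 'salt' option was deprecated in PHP 7.0 and is ignored as of PHP 8.0, so leaving it out is also the forward-compatible choice.

```php
<?php
// Let password_hash() generate the salt; only the cost is set here (10 is an example value).
$options = array('cost' => 10);
$hash = password_hash('password', PASSWORD_BCRYPT, $options);

// The salt, cost and algorithm are all embedded in $hash, so verification needs nothing else.
var_dump(password_verify('password', $hash));
var_dump(password_verify('wrong', $hash));

// Reports whether $hash was made with settings other than the ones passed in,
// which is the moment to re-hash after a successful login.
var_dump(password_needs_rehash($hash, PASSWORD_BCRYPT, array('cost' => 11)));
```

Two calls to password_hash() with the same password give different strings because each gets its own salt, and password_verify() still accepts both.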

james438
06-17-2013, 07:47 AM
Nice, no more need for salts :). I still want to test this out more before updating my current password system, but it still looks promising.

traq
12-30-2013, 04:04 AM
Resurrecting this again…

Here's a good talk on the very subject (http://www.youtube.com/watch?v=0WPny7wk960), with lots of insight on the concepts (not so much on actual code).