
Resolved mkdir urlencode



ggalan
10-23-2011, 05:59 PM
i am trying to use foreign-language characters in my make-directory function.
when i run


$nTitle = urlencode($_POST['entryItm']);
$newDir = mkdir("./CONTENT/" . $nTitle , 01777, true);


it creates a folder "%E5%B8%BD%E5%AD%90"

but when i try to access the folder through header like
mydomain.com/CONTENT/%E5%B8%BD%E5%AD%90/SAM_0070.JPG

i get
mydomain.com/CONTENT/帽子/SAM_0070.JPG

+ error 404.
how can i reference this folder?

traq
10-23-2011, 06:25 PM
why are you url-encoding the directory name?

try
mydomain.com/CONTENT/%25E5%25B8%25BD%25E5%25AD%2590/SAM_0070.JPG

ggalan
10-23-2011, 06:52 PM
thanks but how would i get it from my encoding to yours? meaning what can i do to get it from
%E5%B8%BD%E5%AD%90
to
%25E5%25B8%25BD%25E5%25AD%2590

im encoding so i can reference the directory created using foreign characters, otherwise i get something like this

手袋

traq
10-23-2011, 07:04 PM
ggalan wrote: "thanks but how would i get it from my encoding to yours? meaning what can i do to get it from %E5%B8%BD%E5%AD%90 to %25E5%25B8%25BD%25E5%25AD%2590"

print urlencode("%E5%B8%BD%E5%AD%90");
// prints %25E5%25B8%25BD%25E5%25AD%2590


ggalan wrote: "im encoding so i can reference the directory created using foreign characters, otherwise i get something like this &#25163;&#34955;"

well, I think you've got a problem there.

you're actually naming the directory with the percent-encoded escapes, not the characters they stand for. the server decodes the URL once before it looks on disk, which is why you have to double-url-encode things to reach the directory.

depending on your machine/system, you could name directories using the actual characters. PHP won't mind. The bigger problem (unless the intended audience is language-specific) is having users who can type those characters into the address bar, though that would still be easier than typing the escapes.
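
A minimal sketch of the double-encoding effect, assuming the web server percent-decodes the request path exactly once before it looks for the directory. The byte string is the UTF-8 form of the two characters from the first post; the variable names are only for illustration.

```php
<?php
// UTF-8 bytes of the two characters used in the original post.
$title = "\xE5\xB8\xBD\xE5\xAD\x90";

// What the original code did: the directory on disk is literally named
// with the percent-escapes.
$dirOnDisk = urlencode($title);        // "%E5%B8%BD%E5%AD%90"

// To reach that directory, the link has to escape the "%" signs again,
// because the server will decode the path once.
$inLink = rawurlencode($dirOnDisk);    // "%25E5%25B8%25BD%25E5%25AD%2590"

// Decoding once (roughly what the server does) gives the on-disk name back.
$afterServerDecode = rawurldecode($inLink);
```

If the directory were created with the raw characters instead, a single encoding pass in the link would be enough.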

ggalan
10-23-2011, 07:24 PM
hmmm, i guess i need something like this but this doesnt work



<img src = './../CONTENT/' . print urlencode("%E5%B8%BD%E5%AD%90") . '/title.jpg' />

djr33
10-23-2011, 07:28 PM
I agree with traq. This doesn't make sense: you want to use the characters, but you're not using them. I understand the desire to use the characters, but it doesn't help if you don't actually use them-- you would be better just using random names.
Also, you probably can use the characters directly, even with urlencode() in the link, but the way you're doing it the name is already encoded when the directory is created, so the link needs a second pass and the % signs themselves end up encoded (as %25). There are reasons to avoid using characters like that in a URL (although I don't believe there are any true technical limitations, not to say it will always work), but double urlencoding them isn't helping anything.

One option would be to write the characters out in their Latin-alphabet forms. This could be very difficult if you don't have an easy way to automatically convert them, but maybe you could find some software to do that. I'm not sure if that's Japanese or Chinese you're using, but either way there are generally recognizable transcription formats into the Latin-alphabet (a-z) so you could use that.

ggalan
10-23-2011, 07:35 PM
i think my question has gotten abstracted
im simply trying to use special characters to make a directory.
if i dont encode then i get something like this

手袋

is that ok to work with?

djr33
10-23-2011, 07:57 PM
I'm not sure I understand. Why are you using urlencode() then?

urlencode() is specifically (and only) to be used when you are creating a link for a browser. So, for example, "test directory" would be "test%20directory" in the link (strictly, that is what rawurlencode() produces; urlencode() itself gives "test+directory", which is meant for query strings). You would still name the directory "test directory" but use the format "test%20directory" when you generate a link to it. If you do this while creating the directory, then it will mean you need to use it twice and it will never look like the right URL. Also, sometimes you don't need to use urlencode() if you can use the characters directly, but I think this varies by browser. Regardless, if you do use urlencode() you won't see the characters, so I don't understand the advantage.
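
As a rough sketch of that split (raw name on disk, encoding only in the link): the helper names dirPathFor() and linkFor() are made up for this example, and the ./CONTENT/ layout is taken from the first post. It assumes the filesystem accepts the raw name.

```php
<?php
// Hypothetical helper: the path handed to mkdir() uses the raw name.
// basename() drops any directory parts so the name cannot climb out of CONTENT.
function dirPathFor(string $name): string {
    return "./CONTENT/" . basename($name);
}

// Hypothetical helper: only the link is percent-encoded, one segment at a time.
function linkFor(string $name, string $file): string {
    return "/CONTENT/" . rawurlencode($name) . "/" . rawurlencode($file);
}

$path = dirPathFor("test directory");                 // "./CONTENT/test directory"
$href = linkFor("test directory", "SAM_0070.JPG");    // "/CONTENT/test%20directory/SAM_0070.JPG"

// For comparison: urlencode() uses "+" for spaces, which suits query strings.
$query = urlencode("test directory");                 // "test+directory"
```

rawurlencode() is used for the path segments because it writes a space as %20, which is the form expected in a URL path.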


Now, if you do want to create a directory named that, you should be able to do it without any trouble. If it is naming it with an unexpected format, then there is a character encoding conversion problem. You need to determine what the settings are for PHP, for your server, and for your webpages, and convert between them. It might just be easier to use trial and error in PHP to find the right combination for conversion.

No, you can't use what you get now, or it will always be in a different encoding and you can't refer to it with the original characters except through the exact same situation (so you probably won't be able to access it when you need to).

This is a system configuration issue, in PHP and/or your server's operating system. In some systems it might not be possible to use characters like that, but in most systems I've used it is possible, but it depends on your system, I guess.
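
One way to do that trial and error is to look at the actual bytes at each step. This is only a sketch: describeBytes() is a made-up helper, it assumes the mbstring extension is loaded, and the ISO-8859-1 conversion is just an example pair, not a claim about what your server uses.

```php
<?php
// Hypothetical helper: show the raw bytes of a string and whether they
// form valid UTF-8, so the value can be compared between steps
// (form input, PHP, filesystem).
function describeBytes(string $s): string {
    $valid = mb_check_encoding($s, 'UTF-8') ? 'valid UTF-8' : 'NOT valid UTF-8';
    return bin2hex($s) . ' (' . $valid . ')';
}

// The two characters from the first post, as UTF-8 bytes.
$utf8 = "\xE5\xB8\xBD\xE5\xAD\x90";
$reportUtf8 = describeBytes($utf8);          // "e5b8bde5ad90 (valid UTF-8)"

// A single ISO-8859-1 byte (e with acute accent) is not valid UTF-8 on its own.
$latin1 = "\xE9";
$reportLatin1 = describeBytes($latin1);      // "e9 (NOT valid UTF-8)"

// Converting between encodings once the source encoding is known.
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$reportConverted = describeBytes($converted); // "c3a9 (valid UTF-8)"
```

Logging the output of such a helper right after reading $_POST and right before mkdir() should show at which step the bytes change.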

ggalan
10-23-2011, 08:00 PM
how about this, i enter some foreign characters like


$nTitle = htmlspecialchars($_POST['entryItm']);
$newDir = mkdir("./CONTENT/" . $nTitle , 01777, true);

which gives me a directory called

手袋

then since i didnt encode anything my output gets


mydomain.com/CONTENT/帽子/SAM_0070.JPG


how can i go from that to this?


mydomain.com/CONTENT/手袋/SAM_0070.JPG

ggalan
10-23-2011, 08:11 PM
i think i posted this last question before i realized there was a reply.

how would you handle something like this using foreign characters?
i cant do this

"One option would be to write the characters out in their Latin-alphabet forms." (djr33)

btw, i got the encode idea from here
http://stackoverflow.com/questions/1525830/how-do-i-use-filesystem-functions-in-php-using-utf-8-strings

djr33
10-23-2011, 08:52 PM
The idea in the link you posted makes sense if you have a few characters that are strange (such as accent marks in Spanish or French), but if you have no characters that will be properly displayed (as in Chinese or Japanese, or anything else with a completely different writing system), there is no point in using that. I'd suggest using md5() to create a "random" name that probably won't overlap with any other names. It won't look pretty, but it won't be any worse than urlencode().
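
A minimal sketch of the md5() idea, assuming the original title is stored somewhere else (a database row, for example) so it can still be displayed. hashedDirName() is a made-up helper name.

```php
<?php
// Hypothetical helper: derive an ASCII-only directory name from any title.
// md5() returns 32 lowercase hex characters, which are safe on any filesystem.
function hashedDirName(string $title): string {
    return md5($title);
}

$dir = "./CONTENT/" . hashedDirName("hello");
// "./CONTENT/5d41402abc4b2a76b9719d911017c592"

// The same title always gives the same name, so the directory can be
// found again from the title without storing the hash.
$again = hashedDirName("hello");
```

Note that the name is not really random: it is derived from the title, so two entries with the same title would land in the same directory.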

You can't use the Latin-alphabet forms? I don't see why not, except that it might be difficult to convert them. Google translate can do this automatically, but with their recent decision to not allow use of the API except as a paid service, that might not work for you. I'm sure that if you're willing to pay you can find a way to convert that way. And you might be surprised how many of your users can understand the Romanization of the Asian scripts-- that's generally how they type anyway, based on the sounds of the characters.


You can't use the badly encoded string because it won't make sense. You need to investigate the character encodings at EVERY step in your process and figure out where it is going wrong. As I said before, you need to convert to the correct format as needed. My first guess would be to convert from UTF8 (the input) to whatever your OS uses by default, but there might be more complications involved.

traq
10-24-2011, 03:33 AM
if this

手袋
and this

帽子
result from the same input, then you've definitely got character encoding issues. If you're handling characters like this, you need to be using utf-8 for everything. Getting php (or a database, or your text editor), to use utf-8 is relatively easy. Getting your server's filesystem to play along when you create the directories might be more difficult (especially if you don't admin your own server).
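
For what it's worth, the &#...; form looks like HTML numeric character references, which browsers tend to send when the form's page is not declared as utf-8 (that cause is a guess from the symptoms, not something confirmed in this thread). A small sketch of turning them back into real utf-8 characters with html_entity_decode():

```php
<?php
// The directory name reported in the thread, as numeric character references.
$stored = "&#25163;&#34955;";

// Decode the references into actual UTF-8 characters.
$chars = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

// Inspect the resulting bytes: two characters, three bytes each.
$hex = bin2hex($chars);   // "e6898be8a28b"
```

Serving the form page as utf-8 should avoid the references in the first place, which is safer than decoding them afterwards.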

Honestly, whatever solution you find will probably need to be server-specific (non-portable). I would recommend using a different naming scheme for your actual directories, and using "pretty urls" -either via php or .htaccess- to map the desired paths (with the desired characters) to the correct locations.
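
A rough sketch of the php side of that mapping, assuming all requests under /CONTENT/ are routed to one script (the rewrite rule itself is not shown) and that the real directories are named with md5() of the title. storagePathFor() and the layout are made up for this example.

```php
<?php
// Hypothetical helper: translate a "pretty" request path such as
// /CONTENT/<title>/<file> into the real, ASCII-only location on disk.
function storagePathFor(string $requestUri): ?string {
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;                      // malformed URL or no path
    }
    $parts = explode('/', trim($path, '/'));
    if (count($parts) !== 3 || $parts[0] !== 'CONTENT') {
        return null;                      // not a path this sketch handles
    }
    $title = rawurldecode($parts[1]);           // title with the desired characters
    $file  = basename(rawurldecode($parts[2])); // strip any directory parts
    return './CONTENT/' . md5($title) . '/' . $file;
}

$real = storagePathFor('/CONTENT/hello/SAM_0070.JPG');
// "./CONTENT/5d41402abc4b2a76b9719d911017c592/SAM_0070.JPG"
```

The visible URL keeps the title the visitor expects, while the filesystem only ever sees plain ASCII names.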



wow, read this (http://www.joelonsoftware.com/articles/Unicode.html).