Language translation with special chars, str_replace, preg_replace or strtr?



motormichael12
07-25-2008, 12:25 AM
I am trying to make a web page language translator, something like Babel Fish or Google Translate. I was wondering which would be better: str_replace, preg_replace or strtr?

I have the following code:

<?php

$o_string = 'Kmeň';
// array form, so the whole word is replaced rather than single bytes
$string[0] = strtr($o_string, array('Kmeň' => 'Tribe'));
$string[1] = preg_replace('#Kmeň#', 'Tribe', $o_string);
$string[2] = str_replace('Kmeň', 'Tribe', $o_string);

for($i=0;$i<count($string);$i++)
{
echo "$i - ";
echo str_replace('<', '&lt;', $string[$i]);
echo "<br>";
}
?>

All three return the following:


0 - Tribe
1 - Tribe
2 - Tribe
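
A note on strtr(): called with three string arguments it maps single bytes from the second argument onto the third, so it is not a whole-word replacement even when the output happens to look right. The array form replaces whole substrings. A minimal sketch of the difference, with 'Keen' as a made-up test word and the file assumed to be saved as UTF-8:

```php
<?php
// Three-argument form: a byte map (K->T, m->r, e->i, ...).
// Any word containing those bytes is altered, not just 'Kmeň'.
$mapped = strtr('Keen', 'Kmeň', 'Tribe');

// Array form: only the whole key 'Kmeň' is replaced.
$replaced = strtr('Keen Kmeň', array('Kmeň' => 'Tribe'));

echo $mapped, "\n";   // Tiin
echo $replaced, "\n"; // Keen Tribe
```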

I was wondering which of these would be better based on:

Speed (loading, processing)
Functionality (supports more characters, etc.)
PHP version availability
Case-insensitivity
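
On the case-insensitivity point, a minimal sketch: str_ireplace() (PHP 5 and later) is the case-insensitive counterpart of str_replace(), but how it treats non-ASCII letters such as Ň depends on the PHP version and locale, so for accented words a regex with the i and u modifiers is the safer assumption. This assumes the script and the text are both UTF-8; the sample strings are made up.

```php
<?php
// ASCII words: str_ireplace() ignores case.
$ascii = str_ireplace('tribe', 'Kmeň', 'TRIBE and Tribe');

// Accented words: the u modifier treats pattern and subject as UTF-8,
// and i then matches Ň and ň as the same letter.
$accented = preg_replace('/kmeň/iu', 'Tribe', 'KMEŇ, Kmeň, kmeň');

echo $ascii, "\n";    // Kmeň and Kmeň
echo $accented, "\n"; // Tribe, Tribe, Tribe
```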

Also, since this will be translating pages, I would like to know if there is any way to capitalize the first word of each sentence on the page, as well as the first word inside its own tag, such as <th>blah</th>.

james438
07-25-2008, 01:09 AM
I wouldn't use preg_replace(). It is more of a function of last resort: it is comparatively processor heavy and should really be reserved for patterns that you can't find using str_replace() or strtr(). str_replace() and strtr() are both pretty good as far as speed and the processing power they need.
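
The speed claim is easy to check rather than take on faith. A rough timing sketch, assuming a made-up sample string and an arbitrary iteration count; the helper names are hypothetical, and absolute numbers vary by PHP version and machine, so none are shown here.

```php
<?php
// Hypothetical sample text and iteration count, chosen only for illustration.
$text = str_repeat('Kmeň je tu. ', 200);
$runs = 2000;

// Time one replacement strategy over $runs iterations; returns seconds.
function time_replace($callback, $text, $runs) {
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        call_user_func($callback, $text);
    }
    return microtime(true) - $start;
}

function with_str_replace($t)  { return str_replace('Kmeň', 'Tribe', $t); }
function with_strtr($t)        { return strtr($t, array('Kmeň' => 'Tribe')); }
function with_preg_replace($t) { return preg_replace('#Kmeň#', 'Tribe', $t); }

foreach (array('with_str_replace', 'with_strtr', 'with_preg_replace') as $fn) {
    printf("%-18s %.4f s\n", $fn, time_replace($fn, $text, $runs));
}
```

All three helpers produce the same output for this input, so the timings compare like with like.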

To capitalize the first letter of every sentence, something like the following would work, up to a point:

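<?php
$sample="this is me. this is sentence two. hi.";
// Split into sentences on period-plus-space.
$sample=explode('. ',$sample);
// Capitalize the first letter of each sentence.
$val1=array();
foreach ($sample as $val) {$val1[]=ucfirst($val);}
// Put the sentences back together.
$sample=implode(". ",$val1);
echo $sample;
?>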

preg_replace() might actually be the better choice, though, once you consider that people sometimes put one space after a period and sometimes two, and that there are new paragraphs. Then again, with a little more code you could split the text into paragraphs first and then do the same for sentences with only one space after the period.
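
The one-or-two-spaces and new-paragraph case can be sketched with preg_replace_callback(), which runs a function on each match. This is only one way to do it: the pattern and the sample text are assumptions, it only handles plain a-z letters, and anonymous functions need PHP 5.3 or later.

```php
<?php
$sample = "this is me.  this is sentence two.\nnew paragraph. hi.";

// Match the start of the text, or a period followed by any run of
// whitespace (one space, two spaces, or a line break), then one lowercase letter.
$result = preg_replace_callback(
    '/(^|\.\s+)([a-z])/',
    function ($m) {
        // $m[1] is the prefix to keep, $m[2] the letter to capitalize.
        return $m[1] . strtoupper($m[2]);
    },
    $sample
);

echo $result;
```

The original spacing is kept as typed; only the first letter of each sentence changes.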

motormichael12
07-25-2008, 01:52 AM
Well, for words I would use strtr or str_replace, but if I need to, could I use something like this in preg_replace for the double-spacing problem?


#\.\s(?:\s)?(\w)#

Like do this:

$body = preg_replace_callback('#\.\s(?:\s)?(\w)#', function ($m) { return '. ' . ucfirst($m[1]); }, $body);

motormichael12
07-25-2008, 03:48 AM
I got the capitalization working with this code (probably not the best, but it is just a script for some friends to use, nothing professional):


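<?php
$sample = "these have.  two.  spaces between them. but this. and this. only. have one.<b>and tags?</b> i do not know";

// Collapse two spaces after a period into one.
$sample=str_replace('.  ', '. ', $sample);
// Make sure every '>' is followed by exactly one space so it can be used as a split marker.
$sample=str_replace('> ', '>', $sample);
$sample=str_replace('>', '> ', $sample);

// Capitalize the first letter of each sentence.
$sample=explode('. ',$sample);
$samplef='';
for($i=0;$i<count($sample);$i++){
$samplef .= ucfirst($sample[$i]);
if($i != (count($sample)-1)){$samplef .= ". ";}
}

// Capitalize the first letter after each tag.
$sample=explode('> ',$samplef);
$samplef='';
for($x=0;$x<count($sample);$x++){
$samplef .= ucfirst($sample[$x]);
if($x != (count($sample)-1)){$samplef .= "> ";}
}

echo $samplef;
?>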
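
For the text-inside-tags case, a hedged alternative sketch that leaves the markup untouched instead of inserting a space after every >. The function names are hypothetical, the input is assumed to be UTF-8, and the pattern is illustrative only; it does not try to be a real HTML parser (a > inside an attribute value would confuse it).

```php
<?php
// Uppercase one letter; uses mbstring when it is loaded so that
// accented letters work, and falls back to strtoupper() otherwise.
function upper_letter($ch) {
    return function_exists('mb_strtoupper')
        ? mb_strtoupper($ch, 'UTF-8')
        : strtoupper($ch);
}

// Capitalize the first lowercase letter that follows a '>' (and optional
// whitespace), leaving the tags themselves unchanged.
function capitalize_after_tags($html) {
    return preg_replace_callback(
        '/(>\s*)(\p{Ll})/u',
        function ($m) {
            return $m[1] . upper_letter($m[2]);
        },
        $html
    );
}

echo capitalize_after_tags("<th>blah</th> <b>and tags?</b> i do not know");
// <th>Blah</th> <b>And tags?</b> I do not know
```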

james438
07-25-2008, 03:56 AM
I would still stay away from preg_replace() and use something like:
$body=str_replace('  ',' ',$body);

However, if I wanted to use preg_replace() I would do:
$body = preg_replace('/\.\s{1,}/', '. ', $body);
I am not terribly knowledgeable about PCRE, but I am assuming that is what you were trying to do in your example above.

Sorry, I keep looking at your code, but I can't see what you are trying to do. ?: is a construct I have not been able to figure out yet. From what I can tell you are using "#" instead of "/" and you are trying to capture one or more whitespace characters for capitalization, but that is just a guess.
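
For reference, (?:...) is PCRE's non-capturing group: it groups part of a pattern without storing it as a numbered match, and # is simply an alternative delimiter to /. A small sketch with a made-up subject string:

```php
<?php
$subject = "end.  next";

// Capturing group: the optional second space takes slot 1, the word slot 2.
preg_match('#\.\s(\s)?(\w+)#', $subject, $capturing);

// Non-capturing group: (?:\s)? still matches the space but is not stored,
// so the word ends up in slot 1.
preg_match('#\.\s(?:\s)?(\w+)#', $subject, $non_capturing);

// The same pattern with / as the delimiter behaves identically.
preg_match('/\.\s(?:\s)?(\w+)/', $subject, $slash);

echo $capturing[2], ' ', $non_capturing[1], ' ', $slash[1]; // next next next
```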

EDIT: sorry, hadn't read your last post yet.

motormichael12
07-25-2008, 04:03 AM
Yeah, that preg_replace is only going to be used once, to capitalize all sentences. All the words will go through strtr because they will be in one big array, so it can just go through and replace each key with its value.
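
The big-array idea can be sketched as below. With an array, strtr() tries the longest keys first and never re-translates text it has already replaced, whereas str_replace() with arrays applies each pair in turn to the result of the previous one. The dictionary entries here are invented for illustration.

```php
<?php
// Hypothetical word list; a real one would hold the full vocabulary.
$dictionary = array(
    'Kmeň'  => 'Tribe',
    'Tribe' => 'Clan',   // deliberately overlaps with the entry above
);

$text = 'Kmeň';

// strtr(): one pass, already replaced text is left alone.
$once = strtr($text, $dictionary);

// str_replace(): pairs run in order, so the new 'Tribe' is replaced again.
$chained = str_replace(array_keys($dictionary), array_values($dictionary), $text);

echo $once, "\n";    // Tribe
echo $chained, "\n"; // Clan
```

For a translator, where a translated word may itself be a key in the list, the single-pass behavior of strtr() is usually the one that is wanted.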

james438
07-25-2008, 04:15 AM
Try looking your code over a bit more to see if you can remove redundant code, and think of ways to simplify what you have as much as you can. You are on the right track, but I think you need to slow down a bit and refine what you have.

I am glad you are avoiding preg_replace(). Personally I really like it, because I find I need it: in my other post on extracting the color used in a particular CSS style, for example, or when I want my browsers to display spaces instead of condensing them, or when I want to clean up code or reformat a document so that certain scripts are displayed as-is and not executed while others are executed. Even so, I avoid using it unless I really need to.

I am eager to see what your refined code looks like.

motormichael12
07-25-2008, 04:21 AM
Well, I use Sothink HTML editor, and the tragic part is I can't use the special characters in it. I have to use Windows Notepad and save the file as UTF-8 :O

Before that I had to use preg_replace for any symbols, since I couldn't type them there. I was doing things like '#Kme.#' with a bunch of different regexes, but there were always different problems. Now that I know Notepad is useful for something I can use strtr; before, I had about 30 preg_replace lines and had covered maybe one percent of it all.
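
One likely source of those problems, offered as an assumption since the failing patterns are not shown: without the u modifier a dot in a pattern matches a single byte, while ň takes two bytes in UTF-8. The sketch writes ň as the escape \xC5\x88 so it survives any editor.

```php
<?php
$word = "Kme\xC5\x88"; // "Kmeň" encoded as UTF-8

// Byte mode: '.' consumes only the first byte of ň, leaving 0x88 behind.
$byte_mode = preg_replace('#Kme.#', 'Tribe', $word);

// UTF-8 mode: '.' consumes the whole character.
$utf8_mode = preg_replace('#Kme.#u', 'Tribe', $word);

var_dump($byte_mode === "Tribe\x88"); // bool(true)
var_dump($utf8_mode === 'Tribe');     // bool(true)
```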

james438
07-25-2008, 04:29 AM
Are you able to use special characters now? If not, you may want to try FireFTP (an add-on for Firefox). I have found it quite useful and now use it as my primary website editing tool.

motormichael12
07-25-2008, 04:46 AM
I can insert them now in Notepad and save as UTF-8 and it will keep them, so now I just have to add all the words :)