eregi_replace; recognize hyperlinks automatically
jonas-e
08-19-2007, 05:42 PM
Hi there
I would like to automatically convert web addresses in text from my database into HTML hyperlinks. The pattern to recognize is that an address starts with "http" (or "www") and ends at any word separator. I have googled the issue without finding anything that works for me. I tried this:
$text = eregi_replace(
'\\[http]([-_./a-z0-9!&%#?+,\'=:;@~]+)\\[\b]',
'<a href="\\1">\\1</a>', $text);
- as [\b] is supposed to be the regular expression for any word-separator. I tried a couple of other options too without success.
Can anyone see what's wrong?
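A small sketch of why the bracketed form does not behave like a word separator, at least in the PCRE (preg_*) functions: outside a character class \b is a zero-width word boundary, but inside brackets PCRE reads it as the backspace character. Likewise, an escaped "\[" asks for a literal bracket rather than opening a character class. The sample string and variable names below are made up for illustration.

```php
<?php
// Illustrative sample text; not taken from the thread's database.
$sample = 'see http://example.com/page now';

// Outside a character class, \b is a zero-width word boundary.
$outside = preg_match('/\bhttp\b/', $sample);

// Inside a character class, PCRE treats [\b] as the backspace character (0x08),
// so this only matches when a literal backspace follows "page".
$inside = preg_match('/page[\b]/', $sample);
$insideWithBackspace = preg_match('/page[\b]/', "page\x08");

var_dump($outside, $inside, $insideWithBackspace);
```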
$text = preg_replace('/https?:\\/\\/([-_.\\/\w\d!&%#?+\\,\\\\'=:;@~]+)/i', '<a href="$1">$1</a>', $text);
jonas-e
08-20-2007, 07:59 PM
Thanks!
I haven't tried it out yet but will very soon.
Jonas
james438
08-20-2007, 08:30 PM
Two tiny problems with it. I love it Twey, but I noticed it was missing a \.
$text = preg_replace('/https?:\\/\\/([-_.\\/\w\d!&%#?+\\,\\\\\'=:;@~]+)/i', '<a href="$1">$1</a>', $text);
I really am only an elementary student of this stuff. There needs to be an alternate statement to allow for missing 'http://' in case a person left that out.
Even when using the code listed with the '/' put in I get:
$text="http://www.what.com"
http://www.mysite.com/www.what.com
Where mysite is the webmaster's address.
EDIT: Part of the problem seems to be on my end. I am not sure why my links are all being prefaced with my website's url.
EDIT: nvm. problem fixed, but still having trouble with the regular expression.
EDIT: kk, got it. Try the following:
$text = preg_replace('/https?:\\/\\/([-_.\\/\w\d!&%#?+\\,\\\\\'=:;@~]+)/', '<a href=http://$1>$1</a>', $text);
:) I changed <a href="$1"> to <a href=http://$1>. Gonna add this one to my collection. It is a pretty cool idea, thanks :)
I almost have the one for the www.site.com addresses where there is no "http://" typed in, but I can't quite figure out how to remove the . in front of the hyperlinked text. Otherwise it works except for the cosmetic problem.
$text = preg_replace('/wwws?([-_.\\/\w\d!&%#?+\\,\\\\\'=:;@~]+)/', '<a href=http://www$1>$1</a>', $text);
Anyone know how to work out the bugs? For example, the two don't seem to work together.
preg_replace('/(https?:\\/\\/[-_.\\/\w\d!&%#?+\\,\\\\\'=:;@~]+)/i', '<a href="$1">$1</a>', $text);
james438
08-22-2007, 04:23 AM
It's a bit sloppy, but here is what I am using to hyperlink urls that are of both formats.
$text=preg_replace('/(https?:\\/\\/[-_.\\/\w\d!&%#?+\\,\\\\\'=:;@~]+)/i', '<a href="$1">$1</a>', $text);
$text=preg_replace('/[^\/](wwws?[-_.\\/\w\d!&%#?+\\,\\\\\'=:;@~]+)/i', '<a href="$1"> $1</a>', $text);
The only problem is that this assumes the typed-in "www" URL (as opposed to the "http" URL) is preceded by a space.
Try:
/^((?:https?:\/\/|www\d?\.)(?:www\.)?[^\b]+)/
james438
08-22-2007, 08:01 PM
I am just not quite at that level yet. How would the code look once it is integrated into the script? Are we talking about one preg_replace() or two? For example I do not understand the ?:term?: command just yet or why you are looking for a digit right after "www" as in the case "?:www\d?\."
Exactly the same as it does now, just with the regex replaced:
$text=preg_replace('/^((?:https?:\/\/|www\d?\.)(?:www\.)?[\w\d.%+-]+)/i', '<a href="$1">$1</a>', $text);
Quote: Are we talking about one preg_replace() or two?
One.
Quote: For example I do not understand the ?:term?: command
It's just a group, like () but without capturing.
Quote: just yet or why you are looking for a digit right after "www" as in the case "?:www\d?\."
A lot of sites number the servers that they use for load balancing, as in www3.mysite.com. I thought it better to be safe than sorry.
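To make the difference between (?:...) and (...) concrete, here is a small sketch. The pattern is a simplified variant of the one posted above, and the host name is just the example from the reply.

```php
<?php
// Example host from the reply above; any numbered "www" host would do.
$input = 'www3.mysite.com';

// (?:...) groups without capturing, so only the outer (...) fills $m[1].
preg_match('/^((?:https?:\/\/|www\d?\.)[\w.-]+)/i', $input, $m);

// $m holds two entries: the whole match and the single captured group.
echo count($m), ' ', $m[1], "\n";

// The same pattern with a capturing inner group yields one extra entry.
preg_match('/^((https?:\/\/|www\d?\.)[\w.-]+)/i', $input, $n);
echo count($n), ' ', $n[2], "\n";
```

The optional \d is what lets "www3." match here; without it the pattern would reject numbered hosts.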
james438
08-23-2007, 01:12 PM
Sorry, just not getting the code you posted to work. I recently learned a lot about captures in some more depth, but it was a hard concept to understand for me at first. Next up are some of the other commands that I don't often use like ?:, \b, ?=, or (?!pattern) and the like. I do have a working version anyway that I posted above that will convert both formats into hyperlinked urls. I just use two preg_replaces and a str_replace or two for cosmetic reasons.
jonas-e
11-06-2007, 10:50 AM
Been busy with other things - I haven't tried out all of your options above - some of them created problems. I figured out this method which works fairly well:
$text = "
Here is a text - www.ellehauge.net - it has some links with e.g. comma, www.one.com, in them.
Some links look like this: http://mail.google.com - mostly they end with a space or
carriage return www.unis.no
<br /> - but they may also end with a period: http://ellehauge.net. You may even put
the links in brackets (www.skred-svalbard.no) (http://one.com).
From time to time, links use a secure protocol like https://gmail.com - tricky, isn't it?
";
echo '<p>'.$text.'<br /> <br /></p>';
// anything which starts with "http" and has URL chars
$text = eregi_replace(
'http([-_./a-z0-9!&%#?+,\'=:;@~]+)',
'<a href="http\\1" rel="ext">http\\1</a>', $text);
// anything which starts with "www." and has URL chars
$text = eregi_replace(
'www.([-_./a-z0-9!&%#?+,\'=:;@~]+)',
'<a href="http://www.\\1" rel="ext">www.\\1</a>', $text);
// remove "," from the end of hrefs
$text=str_replace('," rel="ext">','" rel="ext">',$text);
// remove "." from the end of hrefs
$text=str_replace('." rel="ext">','" rel="ext">',$text);
echo '<p>'.$text.'</p>';
- except that I still get hyperlinks where the "," and "." still APPEAR within the link - although it has been removed from the link itself. I.e. technically it works - but it doesn't look great. Any ideas how to get the comma/period out of the text?
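One possible way to keep the trailing comma or period out of both the href and the visible link text is to do the trimming inside a callback, so the punctuation is re-emitted after the closing tag. This is only a sketch: the function name linkify_trim is made up, the URL character class is simplified, and it assumes the input is plain text rather than existing HTML.

```php
<?php
// Hypothetical helper; a sketch, not the thread's final function.
function linkify_trim(string $text): string
{
    return preg_replace_callback(
        '~\b(?:https?://|www\.)[^\s<>()]+~i',
        function (array $m): string {
            // Split off trailing "." or "," so it stays outside the anchor.
            $url = rtrim($m[0], '.,');
            $tail = substr($m[0], strlen($url));
            // Bare "www." addresses need a scheme in the href.
            $href = stripos($url, 'http') === 0 ? $url : 'http://' . $url;
            return '<a href="' . htmlspecialchars($href) . '" rel="ext">'
                . htmlspecialchars($url) . '</a>' . $tail;
        },
        $text
    );
}

echo linkify_trim('See www.one.com, or http://ellehauge.net.');
```

Because the whole address is matched in one pass, an "http://www." address is consumed by the first alternative and never reaches the "www." one.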
I might try some of your solutions above if they prove to work.
NB: Ignore the <<rel="ext">> - it is for some JavaScript that avoids using the deprecated "target:_blank" ...
jonas-e
11-07-2007, 03:58 PM
Yo - I seem to have found a way which is nearly foolproof:
$text = "
Here is a text - www.ellehauge.net - it has some links with e.g. comma, www.one.com,
in it. Some links look like this: http://mail.google.com - mostly they end with a
space or carriage return www.unis.no
<br /> - but they may also end with a period: http://ellehauge.net. You may even put
the links in brackets (www.skred-svalbard.no) (http://one.com).
From time to time, links use a secure protocol like https://gmail.com |
This.one.is.a.trick. Sub-domaines: http://test.ellehauge.net |
www.test.ellehauge.net | Files: www.unis.no/photo.jpg |
Vars: www.unis.no?one=1&~two=2 | No.: www.unis2_check.no/doc_under_score.php |
www3.one.com | another tricky one:
http://ellehauge.net/cv_by_id.php?id%5B%5D=105&id%5B%5D=6&id%5B%5D=100.<br />
Here is my [link=blog.php]blog»[endlink] <br />
This one contains http AND www: http://www.unis.no
";
echo '<p style="text-align:left;">'.$text.'</p></div>';
function hyper($text) {
// replace those starting with http(s)
$text = preg_replace(
'~((https?)(://[-/a-zA-Z0-9_\?=&;:#%!@\~]+)(\.[-/a-zA-Z0-9_\?=&;:#%!@\~]+){1,})~',
'<a href="\1" rel="ext">\1</a>', $text);
// replace those starting with www (but NOT http://www!)
$text = preg_replace(
'~(\s|\()((www[0-9]?)(\.[-/a-zA-Z0-9_\?=&;:#%!@\~]+){1,})~',
'\1<a href="http://\2" rel="ext">\2</a>', $text);
// links within the same site
$text = eregi_replace(
'\\[link=([-_./a-zA-Z0-9!&%#?+,\'=:;@~]+)]([^\\[]+)\\[endlink]',
'<a href="\\1">\\2</a>', $text); // internal links - NOT http or www!
return $text;
}
$content=hyper($text);
echo '<p>'.$content.'</p>';
There might still be some faults in it - let me know if you find any ...
blm126
11-07-2007, 08:04 PM
All the regular expressions above seem to forget that a site doesn't have to have www in it. I could set up an address like https://myforums.sweetsite.co.uk . That address is perfectly valid. I threw the following expression together quickly.
<?php
$test = "
www.ebay.com
www.somesite.com
somesite.co.uk
http://thiis9cool.nr
https://www2.coolhuh.com. Interesting Huh?
";
$test_with_links = preg_replace("/(http(s?):\/\/)?([a-zA-Z0-9\.]+\.[a-zA-Z]{2,3})/",'<a href="http$2://$3">http$2://$3</a>',$test);
?>
I included my test text, just to show what it should handle.
jonas-e
11-08-2007, 08:27 AM
Hi blm126
I just tested my function on your example https://myforums.sweetsite.co.uk - it works fine - due to the {1,} part.
(://[-/a-zA-Z0-9_\?=&;:#%!@\~]+)
- defines the first part after the http(s) starting with "://"
(\.[-/a-zA-Z0-9_\?=&;:#%!@\~]+){1,}
- lets you have as many ".***" elements as you like after that.
(\s|\()
- in front of "((www[0-9]?)" prevents expressions like "http(s)://www.**" from being processed twice - otherwise giving odd results ...
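As a possible alternative to the (\s|\() prefix, a negative lookbehind can express the same guard without consuming a character, which also lets a "www" address at the very start of the text be linked. This is a sketch of the second pass only; the character class is copied from the function above, and the sample strings are illustrative.

```php
<?php
// Second pass only: skip "www" when it is preceded by a word character, "/" or ".".
$pattern = '~(?<![\w/.])(www[0-9]?(?:\.[-/a-zA-Z0-9_?=&;:#%!@\~]+)+)~';
$replace = '<a href="http://$1" rel="ext">$1</a>';

// A bare address at the very start of the string is linked.
echo preg_replace($pattern, $replace, 'www.unis.no is bare'), "\n";

// An address already preceded by "//" is left for the http(s) pass.
echo preg_replace($pattern, $replace, 'http://www.unis.no'), "\n";
```

Since the lookbehind is zero-width, the replacement no longer needs to put \1 back in front of the link.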
james438
11-15-2007, 01:49 AM
I hope I am not repeating what Jonas-e just said, but is there a way to modify the code to incorporate
echo preg_replace("#(<a\s[^>]+>http://\S+</a>)|(<[^>]+http://[^>]+>)|http://\S+#ie",'"$0"=="$1" || "$0"=="$2" ? "$0" : "<a href=\"$0\">$0</a>"',$text);
into blm126's expression?
I have been away from coding for a bit, but the above code will hyperlink urls and ignore the ones that are used for image src and ones that are already hyperlinked. It does not hyperlink ones that do not have the http:// prefix however, but then again that might be a good thing too. I found this example at http://www.tote-taste.de/X-Project/regex/eval.html.
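For anyone reading this on a newer PHP: the /e modifier used above was removed in PHP 7, so the same idea has to go through preg_replace_callback. The sketch below keeps the quoted pattern unchanged apart from dropping "e"; the function name link_unlinked is made up, and it still only handles addresses with the http:// prefix.

```php
<?php
// Sketch assuming PHP 7 or later, where the /e modifier no longer exists.
function link_unlinked(string $text): string
{
    return preg_replace_callback(
        '#(<a\s[^>]+>http://\S+</a>)|(<[^>]+http://[^>]+>)|http://\S+#i',
        function (array $m): string {
            // Groups 1 and 2 match existing anchors and tags such as <img src=...>;
            // when either took part in the match, return the text untouched.
            if (!empty($m[1]) || !empty($m[2])) {
                return $m[0];
            }
            return '<a href="' . $m[0] . '">' . $m[0] . '</a>';
        },
        $text
    );
}

echo link_unlinked('<img src="http://a.example/x.png"> http://b.example/page');
```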
taylornolen
11-15-2007, 08:07 PM
I have a question about this...
I tried to figure it out but there doesn't seem to be any documentation I could find within the last hour pertaining to this.
I have an internal script that takes user input which can include internal network URLs to a share on the network like "\\fileserver\file\spreadsheet.xls". I cannot for the life of me figure out how to write the preg_replace parameters to get it to find the link then replace with the appropriate hyperlink tag. Has anyone done this? Can it be done? I always end up stripping the leading \.
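A possible starting point, offered as a sketch rather than a tested answer for that script: the leading backslashes are easy to lose because each one has to be escaped once for the PHP string and once more for the regex. The function name link_unc and the file:// form of the href are assumptions, and the pattern does not handle paths containing spaces.

```php
<?php
// Hypothetical helper; the file:// href form is an assumption, not from the thread.
function link_unc(string $text): string
{
    // In a single-quoted PHP string '\\\\' is two backslashes, which the regex
    // engine reads as one literal backslash; eight are needed for the leading "\\".
    $pattern = '~\\\\\\\\[\w.$-]+(?:\\\\[\w.$-]+)+~';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Turn \\host\share\file into file://host/share/file for the href.
            $href = 'file:' . str_replace('\\', '/', $m[0]);
            return '<a href="' . htmlspecialchars($href) . '">'
                . htmlspecialchars($m[0]) . '</a>';
        },
        $text
    );
}

echo link_unc('Open \\\\fileserver\\file\\spreadsheet.xls today');
```

If the backslashes are already missing before the text reaches preg_replace, the cause may lie in how the input is unescaped earlier in the script rather than in the pattern itself.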