Get Title



navid
01-15-2007, 12:49 PM
Hi,
I need to get the title of an external page.
For example: the user enters www.google.com
and my code returns the title of google.com.

(By title I mean the text inside the <title> tag, e.g. <title>Hello</title> // Hello)
Can anybody help me?

thetestingsite
01-15-2007, 03:57 PM
There was a thread about something similar to this before, I think it was in the PHP section. I'll try to look for that thread and post a link to it here, but please do a simple search first.

trippin
01-16-2007, 08:34 AM
<script language="JavaScript">
function whatIsTheTitle()
{
document.write(document.title);
}
</script>
<title>JavaScript Lessons</title>

</head>

<body>

<input type="button" value="Show Title" onClick="whatIsTheTitle()">

</body>

jscheuer1
01-16-2007, 08:42 AM
That would work great if you wanted to obliterate the page. Oh, wait - you'd have to get Google's permission to put that code on their page. I somehow don't think Google would go for that.

djr33
01-16-2007, 09:16 AM
<?php
ob_start();
include('http://google.com');
$page = ob_get_contents();
ob_end_clean();
list($null,$title) = explode('<title>',$page,2);
list($title,$null) = explode('</title>',$title,2);
echo $title;
?>
Untested. Copied from working/tested code, though.


EDIT: and here it is with comments:

<?php
ob_start(); //start output buffer
include('http://google.com'); //get page data, and 'output' [to buffer]
$page = ob_get_contents(); //$page = output buffer's contents
ob_end_clean(); //end output buffer
list($null,$title) = explode('<title>',$page,2); //split at <title>, $title = 2nd half
list($title,$null) = explode('</title>',$title,2); //split at </title>, $title = 1st half
echo $title; //output title as string
?>
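
The explode() calls above only find a literal, lowercase <title> tag with no attributes, and they leave any surrounding whitespace in place. A minimal sketch of a more tolerant extraction, assuming the page's HTML is already in a string; extractTitle() is a hypothetical helper, not part of the code posted in this thread:

```php
<?php
// Hypothetical helper (not from the thread): return the text of the first
// <title> element in an HTML string, or null if the page has none.
function extractTitle($html)
{
    // i: also match <TITLE>; s: let the title span several lines.
    // [^>]* tolerates attributes on the opening tag.
    if (preg_match('~<title\b[^>]*>(.*?)</title>~is', $html, $m)) {
        // Collapse runs of whitespace, then decode entities such as &amp;
        $title = trim(preg_replace('/\s+/', ' ', $m[1]));
        return html_entity_decode($title, ENT_QUOTES, 'UTF-8');
    }
    return null;
}

echo extractTitle("<html><head><TITLE>\n Hello &amp; welcome </TITLE></head></html>");
// prints: Hello & welcome
```

Returning null for a missing title lets the caller decide what to show instead, for example the URL itself.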

jscheuer1
01-16-2007, 11:19 AM
I think you should test it.

djr33
01-16-2007, 11:34 PM
Well, one variable was wrong: I had $list left over from the script I borrowed it from. But it works fine. The reason I didn't test was that I can't say it'll work in all cases: that depends on how the title appears in the page's text, and I'm not sure exactly how that could vary. Testing on the page(s) required will show that.
The above code is fixed with the right variables.

So.... seems to work.

I changed the code a bit for the test page, to do some extra error checking:
<?php
$url= $_GET['url'];
if ($url == '') {$url = 'http://google.com'; }
ob_start();
if (@include($url)) {
$page = ob_get_contents();
}
else {$page = '<title>Page not found.</title>'; }
ob_end_clean();
list($null,$title) = explode('<title>',$page,2);
list($title,$null) = explode('</title>',$title,2);
if ($title == '') $title = $url;
echo $title;
?>

comments:
<?php
$url= $_GET['url']; //...page.php?url=URLHERE
if ($url == '') {$url = 'http://google.com'; } //if no url was given, use google
ob_start(); //start output buffer
if (@include($url)) { //if the include returns true, ie if page exists
$page = ob_get_contents(); //get html of page
} //endif
else {$page = '<title>Page not found.</title>'; } //if page doesn't exist, use this fake title
ob_end_clean(); //end output buffer
list($null,$title) = explode('<title>',$page,2); //chop off start before <title>
list($title,$null) = explode('</title>',$title,2); //chop off end after </title>
if ($title == '') $title = $url; //if title is blank (no title on page), use url
echo $title; //output title
?>

When using this, no need to include the get stuff, but some of the error checking might help, especially the line that uses the url if the title doesn't exist, as some pages don't have titles on them.

Test here:
http://ci-pro.com/misc/phptest/titletest.php
For a custom URL:
http://ci-pro.com/misc/phptest/titletest.php?url=http://your.custom/site/etc
(Note that due to parsing of the URI, it might not work if the URI has, for example, spaces, etc.)
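
One way around that parsing problem is to encode the target URL before putting it in the query string. A small sketch, assuming the test page is reachable as titletest.php; the target URL below is made up for illustration:

```php
<?php
// Hypothetical example: build a link to the test page for an arbitrary target.
$target = 'http://your.custom/site/my page.html?a=1&b=2';

// urlencode() escapes spaces, '&', '?', '=' and '/' so the whole target
// arrives intact in $_GET['url']; PHP decodes it again automatically.
$link = 'titletest.php?url=' . urlencode($target);

echo $link, "\n";
// prints: titletest.php?url=http%3A%2F%2Fyour.custom%2Fsite%2Fmy+page.html%3Fa%3D1%26b%3D2
```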

jscheuer1
01-17-2007, 12:32 AM
It now works on a server I master a couple of sites on but only for sites on that server????

djr33
01-17-2007, 02:09 AM
Try it on my server... outside sites work.

include() is one of the few functions that works with outside files. However, I believe that if there's some security setting enabled, then it won't. I can't remember the details.

jscheuer1
01-17-2007, 04:57 AM
I meant already to have said that: of course it works on your server. It's just a pity that one cannot say with confidence that it will be suitable for any given situation without investigating. Makes it hard sometimes to know if it is the code itself or the server. This also makes it an open question as to whether or not it will do the trick for navid. I realize that PHP is like this and that there are also other reasons why any particular bit of PHP code might not work on any particular PHP enabled server; this is a new twist on that for me though.

djr33
01-17-2007, 05:23 AM
It's not the code... it's the server. Now, my code might not be standard, or something, but, basically, it's just fine for the server that I'm running it on. According to the documentation, this is right. Also, since it's just an issue of security settings, it SHOULD work on your server if you turn that off, and, also, note that it isn't that my code won't work, but that no code will work because of security limitations.

The thing about PHP that is great is that it always works, no matter who is using it or with which browser, unlike javascript. But in changing servers, it acts like the pain of javascript compatibility, so you'll have trouble. This is sometimes true with windows<>linux switches and also with different versions. However, it also has a lot to do with security/settings on the server. If your server isn't allowing for remote files, then there's nothing that can be done, no matter what the code is, short of going around that setting (iow changing it).

jscheuer1
01-17-2007, 07:16 AM
I agree. :) I was just reiterating some of what was discussed earlier about the utility of PHP as a solution in the forums here and elsewhere on DD. I seriously doubt that the server admin on the particular server I am talking about would be willing to change the security settings. There may be a workaround for just the domains I master there and, if I wanted to use this, I would ask Tony - he might say yes. This really isn't for me though. Isn't there some other way in PHP to fetch an offsite page?

djr33
01-17-2007, 07:35 AM
From my limited knowledge, the reason to use include in this case is because other functions, like file() and file_get_contents(), are only for local files on the server.
include works around it, but, only if that security thing is enabled. It's not uncommon for it to be enabled, though, and, I don't think, too much of a security risk.

I believe that Ajax, javascript, iframes (in some ways) and such have the same type of security limitations, no?

I don't know a huge amount about this, but just what I've picked up along the way. Let's see what Twey has to say when he checks the thread.
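
For what it is worth, file() and file_get_contents() go through the same URL wrappers as include(), so when allow_url_fopen is enabled they accept http:// URLs too. They also return the page as a string instead of evaluating it as PHP, which is safer when the URL comes from a visitor. A minimal sketch; fetchPage() and its timeout default are hypothetical, not something posted in this thread:

```php
<?php
// Hypothetical helper: fetch a page as a string without executing it.
// Returns false if the page could not be read.
function fetchPage($url, $timeout = 5)
{
    // The 'http' context option 'timeout' (in seconds) keeps a slow remote
    // server from hanging the script; it is ignored for local files.
    $context = stream_context_create(array('http' => array('timeout' => $timeout)));

    // @ suppresses the warning PHP raises when the read fails.
    return @file_get_contents($url, false, $context);
}

// Usage (remote URLs only work when allow_url_fopen is enabled):
// $page = fetchPage('http://google.com');
// if ($page === false) { echo 'Page not found.'; }
```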

jscheuer1
01-17-2007, 08:25 AM
Yes, all those things have that limitation. PHP is often mentioned (by myself and others) as a possible workaround but, it appears at the moment that it may also at times be limited in this way. I'm mostly wanting to know what to tell people about PHP in these circumstances.

djr33
01-17-2007, 08:44 AM
Ok here's the info from PHP.net

http://www.php.net/manual/en/function.include.php says
Windows versions of PHP prior to PHP 4.3.0 do not support accessing remote files via this function, even if allow_url_fopen is enabled.

Which leads to:

http://www.php.net/manual/en/ref.filesystem.php#ini.allow-url-fopen


allow_url_fopen boolean

This option enables the URL-aware fopen wrappers that enable accessing URL object like files. Default wrappers are provided for the access of remote files using the ftp or http protocol, some extensions like zlib may register additional wrappers.

Note: This setting can only be set in php.ini due to security reasons.

Note: This option was introduced immediately after the release of version 4.0.3. For versions up to and including 4.0.3 you can only disable this feature at compile time by using the configuration switch --disable-url-fopen-wrapper.


On Windows versions prior to PHP 4.3.0, the following functions do not support remote file accessing: include(), include_once(), require(), require_once() and the imagecreatefromXXX functions in the Reference LXIII, Image Functions extension.



So... what I take from that is
1. You must have PHP 4+ (I think almost everyone does).
2. You must have it enabled.... so... take that as you want...
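
Whether the setting is enabled can be checked from a script before trying a remote include. A minimal sketch; remoteFilesAllowed() is a hypothetical helper name. (On PHP 5.2 and later, include() of a remote file additionally depends on a separate allow_url_include setting.)

```php
<?php
// Hypothetical helper: report whether this server lets PHP open remote URLs.
function remoteFilesAllowed()
{
    // ini_get() returns the setting as a string such as "1" or "",
    // so normalise it to a real boolean.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

echo remoteFilesAllowed()
    ? "allow_url_fopen is on: http:// URLs can be opened.\n"
    : "allow_url_fopen is off: remote includes and reads will fail.\n";
```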

Twey
01-17-2007, 11:12 PM
define('BUFSIZ', 256);

function getRemotePage($server, $host, $request = '/', $port = 80) {
  $f = fsockopen($server, $port);
  if(!$f) // could not connect
    return false;
  // HTTP/1.0, so that the server does not answer with a chunked body
  $rq = 'GET ' . $request . ' HTTP/1.0' . "\r\n" .
        'Host: ' . $host . "\r\n" .
        'Connection: close' . "\r\n" .
        "\r\n";
  $response = '';

  fwrite($f, $rq);
  while(!feof($f)) // read the whole response, headers included
    $response .= fread($f, BUFSIZ);
  fclose($f);

  // the first blank line separates the headers from the page
  $start = strpos($response, "\r\n\r\n");
  return $start === false ? '' : substr($response, $start + 4);
}

Should work, but not tested. Example usage:

echo(getRemotePage('google.com', 'www.google.com', '/search?q=moneys'));

... should behave like:

include('http://www.google.com/search?q=moneys');

djr33
01-18-2007, 12:08 AM
Interesting. Would there be a simpler way to do it, using just one parameter for the function?

John will have to test, to get around the issue of the setting on his server.

jscheuer1
01-18-2007, 08:02 AM
It seems to have the same limitation. This is an Apache server with PHP 4.3, I believe. I say "seems to have" because no one has verified yet whether it works for Google on their server.

Twey
01-18-2007, 09:17 AM
"Interesting. Would there be a simpler way to do it, using just one parameter for the function?"
Yes, but I didn't have time to write a decent URL parser :)

"It seems to have the same limitation."
Hm? There's no security option (at least mentioned on php.net) to disable fsockopen(). Is your server in safe mode, perhaps?
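
PHP's built-in parse_url() can do the URL parsing, so a single URL can be split into the arguments getRemotePage() takes. A rough sketch, assuming plain http:// URLs; splitUrl() is a hypothetical helper:

```php
<?php
// Hypothetical helper: split one URL into the pieces getRemotePage() expects.
// Returns array($host, $request, $port), or false if the URL has no host.
function splitUrl($url)
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return false; // e.g. 'www.google.com' without 'http://' parses as a path
    }
    $port = isset($parts['port']) ? $parts['port'] : 80;

    // parse_url() keeps the leading slash in 'path'; default to the root page.
    $request = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $request .= '?' . $parts['query'];
    }
    return array($parts['host'], $request, $port);
}

list($host, $request, $port) = splitUrl('http://www.google.com/search?q=moneys');
echo "$host $request $port\n"; // prints: www.google.com /search?q=moneys 80
```

The result could then be passed on as getRemotePage($host, $host, $request, $port).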

djr33
01-18-2007, 09:47 AM
Maybe that's the problem, (safemode), not a php_ini / remote files issue.

No problem, Twey, on the URL parser... not complaining there... just wasn't sure why there were so many options. :)

Twey
01-18-2007, 09:58 AM
If the server is in safemode, I'm afraid I can think of no way of doing it.

djr33
01-18-2007, 10:33 AM
That would, then, seem to be the answer, John. But if it's in safe mode, then that's just a problem they have to deal with, like wanting to use PHP on a server with no PHP parser... just won't work.

jscheuer1
01-18-2007, 01:58 PM
"Interesting. Would there be a simpler way to do it, using just one parameter for the function?"

"Yes, but I didn't have time to write a decent URL parser :)
Hm? There's no security option (at least mentioned on php.net) to disable fsockopen(). Is your server in safe mode, perhaps?"

What would be the best way to use the code as written to just get the default page of a domain?

"That would, then, seem to be the answer, John. But if it's in safe mode, then that's just a problem they have to deal with, like wanting to use PHP on a server with no PHP parser... just won't work."

Well, does this mean that you tested it and that it worked for Google on your server? I'm happy with the answer, as long as I know with reasonable certainty that it is the answer.

Twey
01-18-2007, 02:28 PM
"What would be the best way to use the code as written to just get the default page of a domain?"
Yes: pass "/" as the request, or don't pass that parameter: it's the default.

navid
01-18-2007, 07:25 PM
<?php
$url = $_GET['url'];
echo get_url_title($_GET['url']);

echo "<br>".$_GET['url'];

function get_url_title($url, $timeout = 2)
{
    $url = parse_url($url);
    if(!in_array($url['scheme'],array('','http')))
        return "1";
    $fp = fsockopen ($url['host'], ($url['port'] > 0 ? $url['port'] : 80), $errno, $errstr, $timeout);
    if (!$fp)
    {
        //return " 2";
        echo "$errstr ($errno)\n";
    }
    else
    {
        fputs ($fp, "GET /".$url['path'].($url['query'] ? '?'.$url['query'] : '')." HTTP/1.0\r\nHost: ".$url['host']."\r\n\r\n");
        $d = '';
        while (!feof($fp))
        {
            $d .= fgets ($fp,2048);
            if(preg_match('~(</head>|<body>|(<title>\s*(.*?)\s*</title>))~i', $d, $m))
                break;
        }
        fclose ($fp);
        return $m[3].$m[1];
    }
}
?>

Hi again,
Thanks for your responses.
I have used this code.
Testing:

http://www.eyalat.com/NM/navid.php?url=http://www.php.net/manual/en/function.include.php

Thanks

navid
01-18-2007, 09:38 PM
It works, but there is a problem with the trailing /.
If I enter the URL of a directory, for example www.google.com/search,
I get an error.
I have to enter www.google.com/search/ instead.

Is there any solution?

djr33
01-18-2007, 09:47 PM
My first guess is that you are missing http:// as part of the url. See if that helps.
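
The missing http:// matters because parse_url() treats a URL without a scheme as a bare path, so there is no host to connect to. A small sketch of adding the scheme before parsing; ensureScheme() is a hypothetical helper, and defaulting to http:// is an assumption:

```php
<?php
// Hypothetical helper: make sure user input carries a scheme before parsing.
function ensureScheme($url)
{
    $url = trim($url);
    // Anything that does not already start with "scheme://" gets http://
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
        $url = 'http://' . $url;
    }
    return $url;
}

var_dump(parse_url('www.google.com/search'));               // only a 'path' key, no 'host'
var_dump(parse_url(ensureScheme('www.google.com/search'))); // 'scheme', 'host' and 'path'
```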

Twey
01-18-2007, 09:52 PM
Hm, I didn't know PHP had a built-in URL-parsing function. Learn something new every day :)

navid
01-19-2007, 12:57 PM
"My first guess is that you are missing http:// as part of the url. See if that helps."

That is true, but I still have to put a / at the end of the URL!