
Thread: Grabbing A Website

  1. #1
    Join Date
    Mar 2010
    Location
    Florida
    Posts
    512
    Thanks
    9
    Thanked 61 Times in 59 Posts

    Grabbing A Website

    So I am trying to grab a website from my own domain. However, I get an error like this:

    XMLHttpRequest cannot load http://thebcelements.com/NewWebsite/?_=1431538964656. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://localhost' is therefore not allowed access.

    I am not sure if this is JavaScript related or something else. I can grab the same website on my local machine just fine.
    -DW [Deadweight]
    Resolving your thread: First Post: => EDIT => Lower right: => GO ADVANCED => Top Advance Editor drop down: => PREFIX:Resolved

  2. #2
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Just curious, but what do you mean by "grab" a website?
    To choose the lesser of two evils is still to choose evil. My personal site

  3. #3
    Join Date
    Mar 2010
    Location
    Florida
    Posts
    512
    Thanks
    9
    Thanked 61 Times in 59 Posts

    I will show you:
    Code:
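    // Fetch a page via AJAX and replace the current document's body with it.
    // area: base URL of the remote page; file: optional file name, or the string 'null'.
    function loadObject(area, file) {
        $.ajax({
            url: file != 'null' ? area + file : area,
            cache: false, // jQuery appends a _=timestamp parameter to bypass the browser cache
            dataType: "html",
            success: function (data) {
                // Take everything in the head that follows the closing title tag.
                var head_start = data.indexOf('</title>') + '</title>'.length;
                var head_end = data.indexOf('</head>');
                var head_html = data.substring(head_start, head_end);

                // Prefix src and href values with the base URL
                // (this assumes every such path is relative to area).
                head_html = head_html.replace(/src="/g, 'src="' + area);
                head_html = head_html.replace(/href="/g, 'href="' + area);

                // Take everything between the body tags
                // (this assumes a bare <body> tag with no attributes).
                var body_start = data.indexOf('<body>') + '<body>'.length;
                var body_end = data.indexOf('</body>');
                var body_html = data.substring(body_start, body_end);

                $("body").html(head_html + body_html);
            }
        });
    }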
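
The function above prefixes every src and href value with the base URL, which only works when all of those paths are relative. A minimal sketch of a more forgiving approach, assuming an environment with the standard URL constructor (modern browsers and Node.js): let URL resolve each path against the page's address. The name resolveAgainst and the example.com addresses are made up for illustration.

```javascript
// Resolve a path found in a fetched page against the URL the page came from.
// resolveAgainst, baseUrl, and path are illustrative names, not from the thread.
function resolveAgainst(baseUrl, path) {
  // The URL constructor resolves relative paths and leaves absolute URLs alone.
  return new URL(path, baseUrl).href;
}

const base = 'http://example.com/NewWebsite/';
console.log(resolveAgainst(base, 'css/site.css'));
// → http://example.com/NewWebsite/css/site.css
console.log(resolveAgainst(base, '/js/app.js'));
// → http://example.com/js/app.js
console.log(resolveAgainst(base, 'http://cdn.example.net/lib.js'));
// → http://cdn.example.net/lib.js
```

This does nothing about the cross-origin error itself; it only addresses the path rewriting once the page has been fetched.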
    -DW [Deadweight]

  4. #4
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    That looks more like grabbing a page than an entire site. Regardless, when using AJAX we are mostly bound by the same-origin policy. It is possible to get around that if both domains are in on it: the error message you quoted refers to CORS, where the server being requested sends an Access-Control-Allow-Origin header that names the requesting origin. Older approaches that I know of required certificates and were complicated. The same-origin policy applies not only to AJAX, but also to virtually any cross-domain JavaScript scripting.
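
The error in the first post says that no Access-Control-Allow-Origin header is present on the response. A minimal sketch of how a server could decide whether to send that header, assuming a Node.js server on the remote domain (the thread does not say what the real server runs). The function name corsHeaders and the list of allowed origins are hypothetical.

```javascript
// Build response headers that let pages from the allowed origins read this
// server's responses via AJAX. corsHeaders and allowedOrigins are made-up names.
function corsHeaders(requestOrigin, allowedOrigins) {
  const headers = { 'Content-Type': 'text/html; charset=utf-8' };
  if (allowedOrigins.includes(requestOrigin)) {
    // Echo the requesting origin back only when it is on the allow list.
    headers['Access-Control-Allow-Origin'] = requestOrigin;
    // Tell caches that the response varies with the Origin request header.
    headers['Vary'] = 'Origin';
  }
  return headers;
}

const allowed = ['http://localhost'];
console.log(corsHeaders('http://localhost', allowed));
console.log(corsHeaders('http://elsewhere.example', allowed));
```

With Node's built-in http module, the result could be passed along as res.writeHead(200, corsHeaders(req.headers.origin, allowed)). Other servers set the same header through their own configuration.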

    Now, when you are working on the localhost with a virtual (or even a live) server like WAMP, XAMPP, etc., everything on localhost is considered to be on a single domain, so you can cross script and do all the AJAX you like, as long as all of the pages involved are on that 'server' and addressed as localhost. The same is also true for any single live domain. However, if you have more than one domain on a live server, even if you own both of them, browsers will see them as two distinct origins, because an origin is defined by protocol, host name, and port, not by who owns the sites or where the files live. In fact, even if it is the same root folder, accessing http://www.thedomain.com from http://thedomain.com will run afoul of the same-origin policy.
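
To make the comparison concrete, here is a small sketch that checks whether two addresses share an origin, using the standard URL class (available in modern browsers and Node.js). The helper name sameOrigin is made up for illustration.

```javascript
// Two URLs share an origin when protocol, host name, and port all match.
// sameOrigin is an illustrative helper, not a browser API.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Different folders on one host are still the same origin.
console.log(sameOrigin('http://thedomain.com/a.html', 'http://thedomain.com/folder/b.html')); // true
// Adding www changes the host name, so the origin differs.
console.log(sameOrigin('http://www.thedomain.com/', 'http://thedomain.com/')); // false
// The case from the first post: a localhost page requesting a live domain.
console.log(sameOrigin('http://localhost/', 'http://thebcelements.com/NewWebsite/')); // false
```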

    If you are working simply from local files with no server (pages opened directly from disk as file:// URLs), it differs by browser how that's treated. Often it's all considered one origin. Just about as often, each folder is treated as a separate origin, even child folders.

    Getting back to where actual domains and servers are involved, information can be sent cross domain via query strings and post data. With query strings you are pretty limited as to the number of characters, the values (unless they are very simple) almost always must be URL-encoded with encodeURIComponent and decoded on the other end, and the string is visible in the address bar. With post data you need server side code to receive it.

    If you have server side code available, and assuming the permissions on both domains are liberal enough, you can more easily grab a page by using the various file read functions to read the offsite file on the server side, and then process that data via server and/or client side code.

    At this point another consideration is copyright. In many cases you do not own the copyright to the remote site's content. In cases like that, even though you might be technically able to grab it, if you then publish that content, you are violating copyright. You can, though, in most cases use this content for your own purposes, as long as you are the only person able to view it. If you own both sites and their content already, or have permission to publish the other site's content, this is not a problem.
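
Going back to the query string point above, here is a minimal sketch of encoding values before putting them in a URL and reading them back on the other side. The helper buildUrl, the address example.com/receive.php, and the parameter names are all hypothetical.

```javascript
// Append URL-encoded parameters to a base address.
// buildUrl and the example parameters are illustrative, not from the thread.
function buildUrl(base, params) {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + '=' + encodeURIComponent(params[key]))
    .join('&');
  return query ? base + '?' + query : base;
}

const url = buildUrl('http://example.com/receive.php', { name: 'Tom & Jerry', note: 'a=b?' });
console.log(url);
// → http://example.com/receive.php?name=Tom%20%26%20Jerry&note=a%3Db%3F

// The receiving side decodes the values; URLSearchParams does this automatically.
console.log(new URL(url).searchParams.get('name'));
// → Tom & Jerry
```

Without the encoding, the & and = inside the values would be read as separators and the data would arrive mangled.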
    - John
    ________________________

    Show Additional Thanks: International Rescue Committee - Donate or: The Ocean Conservancy - Donate or: PayPal - Donate

  5. #5
    Join Date
    Mar 2010
    Location
    Florida
    Posts
    512
    Thanks
    9
    Thanked 61 Times in 59 Posts

    Okay let me ask you something else.
    This might be a little confusing.

    You have two different domains. Domain A (DA) and Domain B (DB).
    You have a script on DA and you copy it into the head of DB. When I load DB, the script would run and load the index page (hosted on DA) into the DB page. Is that possible (without using iframes)?
    -DW [Deadweight]

  6. #6
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Without frames (iframe, frame, and I would include the object tag here as well as a sort of frame, because it can act like one), and using only JavaScript and/or AJAX, no, unless the server for DA is configured to send an Access-Control-Allow-Origin header that permits DB.
    - John

  7. #7
    Join Date
    Mar 2010
    Location
    Florida
    Posts
    512
    Thanks
    9
    Thanked 61 Times in 59 Posts

    What about using PHP with AJAX?
    -DW [Deadweight]

  8. #8
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Sure. If you have PHP on the domain that wants to grab the other domain, you can use file() or file_get_contents(), etc. (sometimes more extreme measures are needed depending upon the length and characteristics of the grabbed content) to grab the other domain's page, and then use AJAX either to initiate that and/or to process the content returned. But AJAX isn't needed. The content can be processed using PHP and then included on another PHP page. You really only need AJAX if you want to import the result into an HTML page without setting that extension for parsing via PHP (usually only .php is processed by PHP).

    However, the grabbing domain must have permission to use its file processing commands like file(), etc. on files from remote domains (in PHP this is governed by the allow_url_fopen setting), and the remote domain must not have this action blocked.
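
The server-side idea above, sketched in JavaScript for Node.js rather than PHP so that it matches the code earlier in the thread. This is a swapped-in equivalent of the file_get_contents() approach, not the PHP itself. It assumes Node.js 18 or later, where a global fetch is available. The name grabPage and the fetchImpl parameter are made up for illustration.

```javascript
// Fetch a remote page from server-side code, where the browser's
// same-origin policy does not apply.
// grabPage is an illustrative name; fetchImpl defaults to the global fetch
// that Node.js 18+ provides and can be replaced for testing.
async function grabPage(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    // Surface HTTP errors (404, 500, and so on) instead of returning an error page.
    throw new Error('Request failed with status ' + response.status);
  }
  return response.text();
}

// Example use (requires network access, so it is left commented out):
// grabPage('http://example.com/').then((html) => console.log(html.length));
```

The grabbing server then serves the returned HTML to its own visitors, so the browser only ever talks to one origin.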

    And, as mentioned in my other post, that's just the technical side. If you do not have permission/the right to publish the grabbed content on the grabbing domain, you are probably violating copyright and/or other laws.
    - John

  9. #9
    Join Date
    Mar 2010
    Location
    Florida
    Posts
    512
    Thanks
    9
    Thanked 61 Times in 59 Posts

    They will have the right to copy the website because their website (pages) will be hosted on my domain. I would then give them code to use so that their website appears on their own domain without the files being stored there.
    -DW [Deadweight]

  10. #10
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    That's certainly doable. But it can get tricky with paths for links, CSS, JS, and other resources. What's usually done in cases like that is that the user from the remote site actually comes to your domain. Information about them or what they're doing (if any) is posted to the page they enter your domain on. Any information that needs to go back to the site they came from is posted back either in more or less real time via AJAX and PHP, or as a form submission as they return to the originating site.

    If there are a lot of pages involved, this is generally easier and faster.

    If there is just one relatively simple third party service you are offering, then it might be feasible. But even with that, consider how PayPal does it: the user goes to PayPal to make payment to the vendor and owner of the site they came from, carrying post data with them about what they are paying for and to whom, and is then returned to the vendor's site with post data about the transaction. Separate data is available to the vendor by logging in to their PayPal account to see records of all transactions, and both vendor and buyer receive confirmation emails.

    If you could give a concrete example, I could be of more help. It is possible that fetching the page(s) from your domain via PHP as we have been discussing would be best. It would require the client domain have PHP with permission set to allow that. There's more time lag though* with that approach than with almost any other way I can think of for doing something like this.

    *The client domain must request the content from your domain, download it to their server, then process it on their server and then serve it to their user.

    The best use of something like this would be for data, say sports scores or stock prices that you host, that they could fetch and present to their users. Even at that, it would be better to have them fetch a data file (XML is good for this), then parse it and present the data to their user in a table on an otherwise ready-made page on their end.
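
A rough sketch of the "fetch a data file, parse it, present it in a table" idea. To keep it short and runnable without extra libraries, it parses JSON rather than the XML suggested above, which is a substitution on my part. The field names team and score and the helper names are invented for the example.

```javascript
// Escape text so that data from the feed cannot inject markup into the page.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn parsed rows into an HTML table. The team and score fields are hypothetical.
function renderTable(rows) {
  const body = rows
    .map((row) => '<tr><td>' + escapeHtml(row.team) + '</td><td>' + escapeHtml(row.score) + '</td></tr>')
    .join('');
  return '<table>' + body + '</table>';
}

// In practice this string would be the fetched data file.
const feed = '[{"team":"Home","score":3},{"team":"Away <B>","score":1}]';
console.log(renderTable(JSON.parse(feed)));
// → <table><tr><td>Home</td><td>3</td></tr><tr><td>Away &lt;B&gt;</td><td>1</td></tr></table>
```

Because only data crosses between the sites, the client page keeps its own layout, links, and styles, which avoids the path problems mentioned earlier in this post.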
    - John
