
Thread: Rip text from website

  1. #1
    Join Date
    Sep 2010
    Posts
    12
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default Rip text from website

    Hi,

    I need some sort of script that, when run, will rip the text from this website http://m.2dayfm.com.au/playing and append it to a text file. Each time it is run, it will keep appending. I can schedule the script using Windows.

    It would be good too if just the songs could be appended and not any other text, but this part's not important.

  2. #2
    Join Date
    May 2007
    Location
    Boston,ma
    Posts
    2,127
    Thanks
    173
    Thanked 207 Times in 205 Posts

    Default

    Do you have access to php?
    Corrections to my coding/thoughts welcome.

  3. #3
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Default

    This can probably be done with PHP. But depending upon what that content is and what you intend to do with it, it may be a violation of copyright.

    What would you be using this for?

    You would need a PHP server, or one that can run on your OS. If you're using Windows, that could be WAMP. Then you just get the file from the remote server and append it to a file on your server. You could filter it before appending it using various PHP string functions (probably preg_replace()) in an attempt to save only the sort of information you're looking for.
    - John

  4. #4
    Join Date
    Sep 2010
    Posts
    12
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default

    Yep, I've got access to PHP, as in a web server. But if I use this, the script would need to be set to run every 5 minutes, for example, or at whatever interval I specify.

    Would you have an example of a PHP script that would do this?

    As for your question of why: I'm using this website as an example, a base so to speak. There might be others, but once I get the script, changing the URL it reads would be pretty easy.

  5. #5
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Default

    That would be what I believe is known as a cron job. Something the server does every whatever. I don't know the specifics of that.
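
    If scheduling on the server side turns out to be awkward, the "loop every X minutes" idea can also be sketched in PHP itself. This is only a rough sketch: run_repeatedly() is a made-up helper name, not a built-in, and the task it calls is whatever grab-and-append code you end up with. It is probably best run from the command line, since scripts run through a web server are usually subject to an execution time limit.

```php
<?php
// Hypothetical helper (not a PHP built-in): call $task $times times,
// pausing $intervalSeconds between runs.
function run_repeatedly($task, $intervalSeconds, $times)
{
    $completed = 0;
    for ($i = 0; $i < $times; $i++) {
        call_user_func($task);
        $completed++;
        // Pause between runs, but not after the final one.
        if ($i < $times - 1 && $intervalSeconds > 0) {
            sleep($intervalSeconds);
        }
    }
    return $completed;
}

// Example (assumed names): run a grab function every 5 minutes, 12 times.
// run_repeatedly('grab_and_append', 300, 12);
```

    A scheduler (cron, or the Windows task scheduler) that starts a short script each time is generally the sturdier choice; the loop is just the simplest thing that works without one.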

    Back to my question - I don't really care what site you are grabbing information from. What concerns me is what you are doing or intend to be doing with that information once you get it. What do you want the information for? What will you do with it once you have it?
    - John

  6. #6
    Join Date
    Sep 2010
    Posts
    12
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default

    Hey John, this shouldn't be a concern of yours, I come here for help. Can you please help by posting me a script or pointing me in the right direction?

    I need some sort of script that, when run, will rip the songs from this website http://www.2dayfm.com.au/nowplaying/iframe and append them to a text file. Each time it is run, it will keep appending. It can either be scheduled to run using Windows, or the script can be set to loop or run every X minutes.
    Last edited by Marcymarc; 10-04-2010 at 07:47 PM.

  7. #7
    Join Date
    Apr 2008
    Location
    So.Cal
    Posts
    3,643
    Thanks
    63
    Thanked 516 Times in 502 Posts
    Blog Entries
    5

    Default

    Marcymarc,

    John is simply trying to point out potential problems with what you're trying to do. If I were in your position, I would appreciate his concern.

    Furthermore, it is entirely appropriate for him to be concerned with what he is helping someone to do. Everyone here helps others because they want to. If your purpose is legit, I'm sure he won't have any problem helping you figure it out.

    Beyond the ethical concerns, what you are trying to do will probably have a significant impact on what the best solution will be. If you want help, you need to be as forthcoming as possible.

  8. #8
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Default

    Yes. It's against forum rules to help with illegal requests. Since you will not answer the question, we pretty much have to assume that your purpose here is illegal. From the rules (http://www.dynamicdrive.com/forums/rules.htm):

    • 1.4) No illegal requests- Do not post requests that are illegal or break the usage terms of the service in question, such as where to download warez, disable pop up ads on your free host etc.
    That would include taking copyrighted material such as song or play lists and displaying them on your own site. I checked the terms of service of that site, and something like that is prohibited. If you have permission to do so, show us some proof.
    - John

  9. #9
    Join Date
    Sep 2010
    Posts
    12
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default

    The reason I need the information is so I can keep track of what music is played on other radio stations, like a report. Instead of listening to the radio in real time and writing down songs one by one, having a script pump it to a text file is easier for me; I can then sit down and run my eye over the list in one go.

    Once the songs are published on a site, I can take this information for my own use; as long as I don't re-use or sell the information, it's okay. It's the same as me listening to the radio or recording the radio: people are allowed to record radio for their own use.

    I run a radio station too, and I need to keep track of what other stations are playing. Quite a normal thing to do in the industry.
    Last edited by Marcymarc; 10-05-2010 at 06:56 AM.

  10. #10
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Default

    Quote: Originally Posted by Marcymarc
    I run a radio station too, I need to keep track of what other stations are playing. Quite a normal thing to do in the industry.
    Good enough for me.

    OK, let's get back to business here. I'm not aware of any specific program for this. The exact commands required depend upon the PHP version, but that can be worked out either by using a fallback for older versions or by your knowing what version of PHP you have. What version do you have?

    Not long ago I worked out a method whereby one website could capture a page of another. There is a form. The user inputs the desired page and submits. The page's content is fed back to the user with a Flash application superimposed over it in the lower right corner. There's quite a bit of detail to how this is done, but you don't need to know all that. The basic thing is to grab the page:

    PHP Code:
    $url = 'http://www.somedomain.com/';
    $requested_page = file_get_contents($url);
    and write to the log file:

    PHP Code:
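    $file = 'somedomain.log'; // May be any filename that you like, or obtained from the $url variable in some fashion.
    // version_compare() avoids comparing the version string as a number.
    if (version_compare(PHP_VERSION, '5.1.0', '>=')) {
        // Append to the log, holding an exclusive lock while writing.
        file_put_contents($file, $requested_page, FILE_APPEND | LOCK_EX);
    } else {
        // Fallback for older versions: open in append mode and write.
        $afile = fopen($file, 'a');
        fwrite($afile, $requested_page);
        fclose($afile);
    }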
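
    The two snippets above can be combined into one function that also copes with a failed fetch. Treat this as a sketch under assumptions rather than finished code: append_page() is a name invented here, the timestamp line is an addition of mine so that repeated runs can be told apart in the log, and fetching a remote URL this way only works if allow_url_fopen is enabled in your PHP configuration.

```php
<?php
// Hypothetical helper: fetch $source and append it to $logFile.
// Returns the number of bytes appended, or false if the fetch or write failed.
function append_page($source, $logFile)
{
    // file_get_contents() returns false on failure; @ hides the warning
    // so the failure can be handled here instead.
    $page = @file_get_contents($source);
    if ($page === false) {
        return false;
    }
    // Prefix each entry with the time it was grabbed (an assumption, not
    // part of the original snippets).
    $entry = '[' . date('Y-m-d H:i:s') . "]\n" . $page . "\n";
    return file_put_contents($logFile, $entry, FILE_APPEND | LOCK_EX);
}

// Example with the placeholder names used above:
// append_page('http://www.somedomain.com/', 'somedomain.log');
```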

    There are all sorts of things one could do with the $requested_page variable to alter it (like strip out HTML tags, and/or everything except the data you are looking for) after it's obtained and before writing it to the log file.
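
    As one possible way of stripping out the HTML, here is a rough sketch that reduces a page to its non-empty lines of text. extract_text_lines() is a made-up name, and I'm assuming nothing about the actual markup of the page being grabbed, so picking out only the song lines would still need a pattern written against the real page.

```php
<?php
// Hypothetical helper: reduce an HTML page to its non-empty text lines.
function extract_text_lines($html)
{
    // Drop <script> and <style> blocks entirely, contents included.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    // Turn common block boundaries into newlines so items stay on separate lines.
    $html = preg_replace('#<(br|/p|/div|/li|/tr|/h[1-6])\b[^>]*>#i', "\n", $html);
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    $lines = array();
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $line = trim($line);
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return $lines;
}

// Example: write one line of text per log line instead of the raw page.
// $text = implode("\n", extract_text_lines($requested_page)) . "\n";
```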

    As I alluded to before, the page that this code is on could be run periodically. Data could be passed to the page to set the URL to grab and/or the filename of the file to be written to.
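
    If data is passed to the page, it is safer not to accept a raw URL or filename from the request. A sketch of one way to do that, assuming a fixed list of known sources: the $sources array, the 'source' parameter name, and resolve_source() are all invented here for illustration.

```php
<?php
// Hypothetical whitelist: the only pages this script is willing to grab,
// each with the log file it is appended to.
$sources = array(
    'somedomain' => array(
        'url'  => 'http://www.somedomain.com/',
        'file' => 'somedomain.log',
    ),
);

// Look up a source by key; returns null when the key is missing or unknown.
function resolve_source($params, $sources)
{
    if (!isset($params['source']) || !is_string($params['source'])) {
        return null;
    }
    $key = $params['source'];
    return isset($sources[$key]) ? $sources[$key] : null;
}

// Example, for a request such as grab.php?source=somedomain:
// $job = resolve_source($_GET, $sources);
// if ($job !== null) { /* fetch $job['url'], append to $job['file'] */ }
```

    Keeping the URL and filename on the server side means a visitor can only choose among entries you have listed, not point the script at arbitrary files.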
    - John