
Thread: script for specific url extraction from a url list

  1. #1
    Join Date: Jul 2008
    Posts: 1
    Thanks: 0
    Thanked 0 Times in 0 Posts

    script for specific url extraction from a url list

    I've been looking for a script that will take my URL list, extract a certain type of URL from it, and put those URLs into another list.

    Let's say I had a list of URLs in Notepad or MS Excel like this:


    dogtaining.com/dogfood/order
    beagleobedience.com/beagle.care/products.html
    dogtraining.com
    catfancy.com/reviewcatbook/bestcatbook.html
    kittenbookreview.net
    cattoysforkittens.info/whatdocatslike/toy.html


    The script should take every bare domain (an entry with no path or subdirectory after the domain name) out of this list and put it into another Notepad/Excel document. So in the case above,

    dogtraining.com
    kittenbookreview.net

    would be removed from the original list and then put into a new list.

    This should be a very simple script, but I'm having trouble locating anything like it. Does anyone know where I can find one? I'm on Windows.

    Thanks

  2. #2
    Join Date: Mar 2007
    Location: Currently: New York/Philadelphia
    Posts: 2,735
    Thanks: 3
    Thanked 519 Times in 507 Posts

    You could accomplish that with a simple regular expression (RegEx) in either PHP or ASP. I'm horribly inept with RegEx myself, but knowing the term might help you find a script that suits you. Or just post in the proper forum for this; I'm sure someone can whip something up fairly quickly.

  3. #3
    Join Date: Mar 2006
    Location: Nashville, TN
    Posts: 600
    Thanks: 5
    Thanked 4 Times in 4 Posts

    Try this. From what I understand, this is what you want, man.

    The script reads your list from domains.txt. Just copy and paste the results from your browser, save them as a .txt file, and you've got what you need!
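    PHP Code:
    <?php
    // Read the whole URL list into memory.
    $fp = fopen("domains.txt", "r");
    $temp = fread($fp, filesize("domains.txt"));
    fclose($fp);
    $domainList = explode("\n", $temp);

    foreach ($domainList as $url)
    {
      // Drop stray whitespace such as the "\r" from Windows line endings.
      $url = trim($url);
      if ($url === '') continue; // skip blank lines
      // Everything before the first "/" is the domain.
      $domain = explode('/', $url);
      echo $domain[0] . "<br />\n";
    }
    ?>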
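    Note that this script prints the domain part of every line. If what you want is only the entries that have nothing after the domain, a rough sketch along these lines might work. The function name splitBareDomains and the pattern are my own assumptions (not from any library), and it assumes the list has no http:// prefixes. An entry like mail.google.com would still count as "bare" here, since it has no path:

```php
<?php
// Hypothetical helper (not from the thread): split a URL list into
// bare domains (nothing after the domain) and everything else.
function splitBareDomains(array $urls): array
{
    // Trim whitespace (including Windows "\r") and drop empty lines.
    $urls = array_values(array_filter(array_map('trim', $urls), 'strlen'));

    // Assumed pattern: dot-separated labels made of letters, digits and
    // hyphens, with nothing (no "/" path) after the last label.
    $pattern = '/^[a-z0-9-]+(\.[a-z0-9-]+)+$/i';

    return [
        // Lines that match the pattern are bare domains.
        'bare'  => array_values(preg_grep($pattern, $urls)),
        // PREG_GREP_INVERT returns the lines that do NOT match.
        'other' => array_values(preg_grep($pattern, $urls, PREG_GREP_INVERT)),
    ];
}

// Sample list from the first post.
$result = splitBareDomains([
    'dogtaining.com/dogfood/order',
    'beagleobedience.com/beagle.care/products.html',
    'dogtraining.com',
    'catfancy.com/reviewcatbook/bestcatbook.html',
    'kittenbookreview.net',
    'cattoysforkittens.info/whatdocatslike/toy.html',
]);

echo implode("\n", $result['bare']), "\n";
```

    On the sample list this should give dogtraining.com and kittenbookreview.net as the bare domains. To work from files instead, you could feed it file('domains.txt') and write each returned list out with file_put_contents().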
    Oh, and if you want it to strip out subdomains too, let me know. It would then take mail.google.com and display google.com.
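    A naive sketch of that subdomain stripping could look like the following. The function name stripSubdomains is made up for illustration, and it assumes single-label endings like .com or .net and no http:// prefix. Something like .co.uk would come out wrong, because handling those properly needs a public-suffix lookup that this sketch does not attempt:

```php
<?php
// Hypothetical helper (not from the thread): keep only the last two
// dot-separated labels of the host part of a URL.
// Assumption: single-label TLDs (.com, .net, .info); no scheme prefix.
function stripSubdomains(string $url): string
{
    // The host is everything before the first "/".
    $host = explode('/', trim($url))[0];

    // Split the host into labels and keep the last two (or fewer).
    $labels = explode('.', $host);
    return implode('.', array_slice($labels, -2));
}

echo stripSubdomains('mail.google.com'), "\n";               // google.com
echo stripSubdomains('dogtraining.com/dogfood/order'), "\n"; // dogtraining.com
```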
