Thread: preg_match not matching

  1. #1
    I'm developing a proxy server for a client, who will route all of his workers' internet traffic through it, so I have to parse everything. The only thing I have a problem with is forms. This is my code:
    PHP Code:
    $pageData = preg_replace('/(<form(.*)action="(.*)"(.*)>)(.*)/', '$1<input type="hidden" name="ac_url" value="$3" />$5', $pageData);
    For some reason this is not working. Any help would be much appreciated.
    Thanks
    Last edited by Snookerman; 04-22-2009 at 07:41 AM. Reason: added "Resolved" prefix
    [Jasme Library (Javascript Motion Effects)] My Site
    /\/\@§†ê® §©®¡þ† /\/\@|{ê®
    There are 10 kinds of people in the world, those that understand binary and those that don't.

  2. #2
    Can anyone help?

  3. #3
    For one thing, the pattern is greedy. You can add the U (ungreedy) modifier after the closing delimiter, or follow every .* with a ?.

    .* = go as far as you can
    .*?= stop as soon as you can
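
    To make the difference concrete, here is a small sketch. The sample markup and variable names are made up for illustration; only preg_match() and the U modifier are standard PHP.

    ```php
    <?php
    // Hypothetical sample markup: two tags on one line.
    $html = '<b>one</b> and <b>two</b>';

    // Greedy: .* runs to the LAST </b> it can reach.
    preg_match('/<b>(.*)<\/b>/', $html, $greedy);

    // Lazy: .*? stops at the FIRST </b>.
    preg_match('/<b>(.*?)<\/b>/', $html, $lazy);

    // The U modifier flips the default, so a plain .* behaves lazily.
    preg_match('/<b>(.*)<\/b>/U', $html, $ungreedy);

    echo $greedy[1], "\n";   // one</b> and <b>two
    echo $lazy[1], "\n";     // one
    echo $ungreedy[1], "\n"; // one
    ```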

    Another thing: you might need to be more specific. For example, [^>]* matches any run of characters that are not a >, so it stops at the first > and cannot run past the end of the tag. (With a negated class like this, a trailing ? is harmless but not required.)
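
    Applied to the form pattern from the first post, that advice might look like the sketch below. The sample $pageData markup is invented for illustration; the ac_url field name comes from the original post. It assumes the action attribute is double-quoted, and a regular expression like this is not a full HTML parser.

    ```php
    <?php
    // Hypothetical page with two forms, to show that each match stays inside its own tag.
    $pageData = '<form method="post" action="/login.php" id="a"><input name="u" /></form>'
              . '<form action="/search.php"><input name="q" /></form>';

    // [^>]* cannot cross the closing > of the tag, and [^"]* cannot cross the closing quote.
    $pattern = '/(<form\b[^>]*\baction="([^"]*)"[^>]*>)/i';

    // $1 is the whole opening tag, $2 is the action URL.
    $pageData = preg_replace(
        $pattern,
        '$1<input type="hidden" name="ac_url" value="$2" />',
        $pageData
    );

    echo $pageData, "\n";
    ```

    Each form gets its own hidden field directly after its opening tag, because neither character class can spill into the next tag.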
    --Jas
    function GreatMinds(){ return "Think Like Jas"; }

  4. The Following User Says Thank You to Jas For This Useful Post:

    Master_script_maker (06-13-2008)

  5. #4
    Whenever I write a regular expression that isn't matching, I break it down and rebuild it slowly piece by piece.
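
    One way to do that rebuilding is to test each stage of the pattern against a known input. This is only a sketch: the sample tag and the stage labels are hypothetical.

    ```php
    <?php
    // Hypothetical sample input for testing each stage of the pattern.
    $sample = '<form method="post" action="/login.php">';

    // Grow the pattern one piece at a time; the first stage that fails points at the culprit.
    $stages = array(
        'tag name'     => '/<form/',
        'attributes'   => '/<form[^>]*/',
        'action attr'  => '/<form[^>]*action="/',
        'action value' => '/<form[^>]*action="([^"]*)"/',
        'whole tag'    => '/<form[^>]*action="([^"]*)"[^>]*>/',
    );

    $results = array();
    foreach ($stages as $label => $pattern) {
        // preg_match() returns 1 on a match and 0 otherwise.
        $results[$label] = preg_match($pattern, $sample, $m);
        echo str_pad($label, 14), $results[$label] ? 'matches: ' . $m[0] : 'NO MATCH', "\n";
    }
    ```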

    Writing proxies can be a total b!tch. I made a stab at one for fun once, and it worked well enough that it let me pass through a Webwasher filter and browse blocked sites. I remember it had a few problems I gave up on because I really had no reason to further develop it. If you'd like to take a look, here's the crude class I used:

    PHP Code:
    <?php

    // A proxy class for creating a proxy to bypass a filter 


    class jproxy {
        protected $proxyBaseUrl = NULL;
        protected $currentUrl = NULL;
        protected $pageDomain = NULL;
        protected $fullContentType = NULL;
        protected $contentType = NULL;
        protected $html = NULL;

        function __construct($baseUrl) {
            $this->setBaseUrl($baseUrl);
        }

        function setBaseUrl($url) {
            $this->proxyBaseUrl = $url;
        }

        function setUrl($url) {
            $this->currentUrl = $url;
        }

        function getUrl() {
            return $this->currentUrl;
        }

        function fetchHtml() {
            if(!$this->getUrl()) {
                // no url to grab html for
                return false;
            }

            $ch = curl_init();
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_TIMEOUT, 4);
            curl_setopt($ch, CURLOPT_URL, $this->getUrl());
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_COOKIESESSION, true);
            curl_setopt($ch, CURLOPT_COOKIEJAR, $GLOBALS['cookieFile']);
            curl_setopt($ch, CURLOPT_COOKIEFILE, $GLOBALS['cookieFile']);
            curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);

            if($_POST) {
                // forward the posted fields as a url-encoded body
                curl_setopt($ch, CURLOPT_POST, 1);
                curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($_POST));
            }

            /*foreach($_GET as $key => $val) {
                if($key != 'url') {
                    echo $key . " => " . $val . "<br />";
                }
            }*/

            $this->html = curl_exec($ch);

            // split "type/subtype; charset=..." and keep the bare content type
            $this->fullContentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
            preg_match('@([\w/+]+)(;\s+charset=(\S+))?@i', $this->fullContentType, $matches);

            if(isset($matches[1])) {
                $this->contentType = $matches[1];
            }

            curl_close($ch);
        }
        
        
        function getDomain() {
            if($this->pageDomain) {
                return $this->pageDomain;
            }
            else {
                preg_match('/http:\/\/(www\.)?([^\/]+)/', $this->currentUrl, $matches);
                return "http://".$matches[2]."/";
            }
        }
        
        protected function modifyHref($url) {
            // is a css or favicon file?
            if(stripos($url, '.css') || stripos($url, 'style.php') || strpos($url, 'favicon.ico')) {
                if(substr($url, 0, 7) == 'http://') {
                    $new = "href=\"".$this->proxyBaseUrl."?url=".$url."\"";
                }
                else {
                    $new = "href=\"".$this->proxyBaseUrl."?url=".$this->getDomain().$url."\"";
                }
            }
            else {
                if(substr($url, 0, 7) == 'http://') {
                    $new = "href=\"".$this->proxyBaseUrl."?url=".$url."\"";
                } else {
                    $new = "href=\"".$this->proxyBaseUrl."?url=".$this->getDomain().$url."\"";
                }
            }

            //$new = str_replace('//','/',$new);

            return $new;
        }
        
        protected function modifySrc($text) {
            if(substr($text, 0, 7) == 'http://') {
                $new = "src=\"".$this->proxyBaseUrl."?url=".$text."\"";
            } else {
                $new = "src=\"".$this->proxyBaseUrl."?url=".$this->getDomain()."/".$text."\"";
            }

            //$new = str_replace('//','/',$new);

            return $new;
        }
        
        protected function modifyActions($text) {
            if(substr($text, 0, 7) == 'http://')
            {
                $new = "action=\"".$this->proxyBaseUrl."?url=".$text."\"";
            }
            else
            {
                $new = "action=\"".$this->proxyBaseUrl."?url=".$this->getDomain().$text."\"";
            }

            //$new = str_replace('//','/',$new);
            return $new;
        }
        
        protected function modifyFlash($text) {
            if(substr($text, 0, 7) == 'http://') {
                $new = "<param value=\"".$this->proxyBaseUrl."?url=".$text."\"";
            }
            else {
                // the original never assigned $new in this branch and repeated "?url="
                $new = "<param value=\"".$this->proxyBaseUrl."?url=".$this->getDomain().$text."\"";
            }
            //$new = str_replace("//","/", $new);

            return $new;
        }

        protected function getExtension() {
            preg_match('/\.([^.]+)$/', $this->currentUrl, $matches);
            return $matches[0];
        }

        protected function stripShit() {
            $return = str_replace('/', '', $this->currentUrl);
            $return = str_replace(':', '', $return);
            $return = str_replace('.', '', $return);
            return $return;
        }
        
        // unfinished: never called (its call in modifyLocations is commented out), and $myForm is undefined
        protected function modifyForm($whole, $formAttr, $content) {
            $content = '\n\n<!--INSERTED FORM ELEMENT BY PROXY -->\n';
            $content .= '<input type=\"hidden\" name=\"url_204s52bg\" value=\"\" />';
            $whole = str_replace($content, $myForm.$content, $whole);
            return $whole;
        }

        protected function modifyInlineStyles($props, $content) {
            $content = $this->modifyCssLocations($content);
            return "<style".$props.">".$content."</style>";
        }
        
        protected function modifyLocations() {
            // note: the /e modifier used below was deprecated in PHP 5.5 and removed in PHP 7.0,
            // where preg_replace_callback() is the replacement

            // replace href (links, stylesheets, etc.)
            $this->html = preg_replace('/href=(\'|")(.+?)\1/ie', '$this->modifyHref("$2")', $this->html);

            // replace src (images, etc.)
            $this->html = preg_replace('/src=(\'|")(.+?)\1/ie', '$this->modifySrc("$2")', $this->html);

            // replace form actions
            $this->html = preg_replace('/action=("|\')(.+?)\1/ie', '$this->modifyActions("$2")', $this->html);

            // add form url thing
            //$this->html = preg_replace('/(<form(.?)>(.?)<\/form>)/', '$this->modifyForm("$1","$2","$3")', $this->html);

            // replace flash paths
            $this->html = preg_replace('/<param value=("|\')(.+?)\1/ie', '$this->modifyFlash("$2")', $this->html);

            // replace any @imports in inline stylesheets
            $this->html = preg_replace('/<style([^>]*)>(.+?)<\/style>/ie', '$this->modifyInlineStyles("$1","$2")', $this->html);

            $this->html = stripslashes($this->html);
        }
        
        private function modifyCssUrl($url) {
            if(substr($url, 0, 7) == 'http://') {
                $return = $this->proxyBaseUrl.'?url='.$url;
            }
            else {
                $return = $this->proxyBaseUrl.'?url='.$this->getDomain().$url;
            }

            return 'url('.$return.')';
        }

        private function modifyCssImport($file) {
            // the original tested an undefined $url here instead of $file
            if(substr($file, 0, 7) == 'http://') {
                $return = $this->proxyBaseUrl.'?url='.$file;
            }
            else {
                $return = $this->proxyBaseUrl.'?url='.$this->getDomain().$file;
            }

            return '@import "'.$return.'";';
        }
        
        private function modifyCssLocations($content) {
            $content = preg_replace('/url\((.+?)\)/ie', '$this->modifyCssUrl("$1")', $content);

            // replace imports
            $content = preg_replace('/@import(|\s)(\'|")(.+?)\2;/ie', '$this->modifyCssImport("$3")', $content);

            return $content;
        }

        function getHtml() {
            return $this->html;
        }
        
        function getFormattedPage() {
            if(($this->contentType != NULL) && (stripos($this->contentType,'text/html') === false || stripos($this->contentType,'application/xhtml') != false)) {
                header("Content-Type: ".$this->fullContentType);
                if(stripos($this->contentType,'css')) {
                    // css file, modify that *****
                    $this->html = $this->modifyCssLocations($this->html);
                }
            }
            else {
                // we only want to modify locations if it's a regular old webpage
                $this->modifyLocations();
            }

            //$this->modifyLocations();

            return $this->html;
        }

    }

    ?>
    And I used the class like so:

    PHP Code:
    <?php


    // reuse the visitor's session id if there is one, otherwise make a new one
    // (the original read $_COOKIE['proxSession'] even on a first visit, when it is not set yet)
    if(!empty($_COOKIE['proxSession'])) {
        $session = $_COOKIE['proxSession'];
    }
    else {
        $session = md5('prox'.microtime().$_SERVER['HTTP_USER_AGENT']);
    }

    // set the cookie, or renew its time
    setcookie('proxSession', $session, time()+60*20, '/', '.example.com');

    $cookieFile = "cook_".$session;

    // get the class
    require_once('jproxy.class.php');

    if(!empty($_GET['url'])) {
        $p = new jproxy("http://www.example.com/page.php");
        $p->setUrl($_GET['url']);
        $p->fetchHtml();

        print $p->getFormattedPage();
    }

    ?>
    Looking back at the class, it looks like I used this regular expression to match forms:
    PHP Code:
    // replace form actions
    $this->html = preg_replace('/action=("|\')(.+?)\1/ie', '$this->modifyActions("$2")', $this->html);
    Which used the function:

    PHP Code:
        protected function modifyActions($text) {
            if(substr($text, 0, 7) == 'http://')
            {
                $new = "action=\"".$this->proxyBaseUrl."?url=".$text."\"";
            }
            else
            {
                $new = "action=\"".$this->proxyBaseUrl."?url=".$this->getDomain().$text."\"";
            }

            //$new = str_replace('//','/',$new);
            return $new;
        }
    I hope some of that was some help. Sorry I couldn't help with your expression, but I don't have time to break it down and test it.
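
    For anyone running this on current PHP: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the action rewrite needs preg_replace_callback() instead. Below is a minimal sketch of that, not the class's actual method. The function name rewriteActions and its $proxyBaseUrl and $domain parameters are hypothetical, and the urlencode() call is an addition that the original class did not make.

    ```php
    <?php
    // Hypothetical standalone version of the action rewrite, using a callback
    // instead of the /e modifier. $proxyBaseUrl and $domain are illustrative inputs.
    function rewriteActions($html, $proxyBaseUrl, $domain)
    {
        return preg_replace_callback(
            '/action=("|\')(.+?)\1/i',
            function ($m) use ($proxyBaseUrl, $domain) {
                $target = $m[2];
                // Relative actions are resolved against the page's domain, as in modifyActions().
                if (substr($target, 0, 7) != 'http://') {
                    $target = $domain . $target;
                }
                // urlencode() is an addition here so the target survives as one query value.
                return 'action="' . $proxyBaseUrl . '?url=' . urlencode($target) . '"';
            },
            $html
        );
    }

    $out = rewriteActions(
        '<form action="login.php">',
        'http://www.example.com/page.php',
        'http://example.org/'
    );
    echo $out, "\n";
    ```

    Because the callback receives the raw match, there is no need for the stripslashes() pass that the /e version relied on.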

  6. The Following User Says Thank You to jackbenimble4 For This Useful Post:

    Master_script_maker (06-13-2008)

  7. #5
    Thanks, guys. I didn't find the problem, but I rewrote it and it worked fine.
