
Parse HTML and store the result in an array. Take a look at the comments inside. :)

March 27, 2013 ⁄ General

Summary: This page presents a PHP snippet that parses HTML and stores the result in an array, along with PHP-related know-how, tips, experience, and some PHP source code.


<?php

/*
 * parseHtml.php
 * Author: Carlos Costa Jordao
 * Email: carlosjordao@yahoo.com
 *
 * My notation of variables:
 * i_ = integer, ex: i_count
 * a_ = array,       a_html
 * b_ = boolean,
 * s_ = string
 *
 * What it does:
 * - parses an HTML string and collects its tags
 *   - exceptions: tags without options, like <br> <hr> </a>, etc., are skipped
 * - At the end, the array will look like this:
 *      ["IMG"][0]["SRC"] = "xxx"
 *      ["IMG"][1]["SRC"] = "xxx"
 *      ["IMG"][1]["ALT"] = "xxx"
 *      ["A"][0]["HREF"] = "xxx"
 *
 */

function parseHtml( $s_str )
{
    $i_indicatorL = 0;
    $i_indicatorR = 0;
    $i_arrayCounter = 0;
    $a_html = array();

    // Search for the next tag in the string
    while( is_int(($i_indicatorL = strpos($s_str, "<", $i_indicatorR))) ) {
        // Get everything inside the tag, between "<" and ">"
        $i_indicatorL++;
        $i_indicatorR = strpos($s_str, ">", $i_indicatorL);
        // An unclosed tag ends the scan; otherwise the search would restart
        // at offset 0 and loop forever
        if( $i_indicatorR === false ) {
            break;
        }
        $s_temp = substr($s_str, $i_indicatorL, ($i_indicatorR - $i_indicatorL));
        $a_tag = explode(' ', $s_temp);

        // Here we get the tag's name (each() was removed in PHP 8)
        $s_tagName = strtoupper((string)array_shift($a_tag));

        // Well, I am not interested in <br>, </font> or anything else like that...
        // So, tags without options add nothing to the array.
        // Without this counter, we would mess up the array
        $i_arrayCounter = isset($a_html[$s_tagName]) ? count($a_html[$s_tagName]) : 0;

        // Get the tag options, like src="http://". Here, s_tagTokOption is 'src'
        // and s_tagTokValue is '"http://"' (the quotes are kept)
        foreach( $a_tag as $s_tagOption ) {
            // As in the original loop, an empty token ends the option list
            if( !$s_tagOption ) {
                break;
            }
            $s_tagTokOption = strtoupper((string)strtok($s_tagOption, "="));
            $s_tagTokValue  = trim((string)strtok("="));
            $a_html[$s_tagName][$i_arrayCounter][$s_tagTokOption] = $s_tagTokValue;
        }
    }
    return $a_html;
}

           ?>
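
The function above splits each tag on spaces and on "=", so an attribute value that contains a space or an "=" is cut short, and the surrounding quotes stay in the value. As a point of comparison only, here is a minimal sketch that builds the same ["TAG"][n]["ATTR"] array shape with PHP's bundled DOM extension instead of string scanning. It is not the original author's method. The name parseHtmlDom and the o_ prefix for objects are assumptions made for this illustration, and the sketch assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper, not part of the original script: collects tags that
// have attributes into ["TAG"][n]["ATTR"] = "value" using DOMDocument.
function parseHtmlDom( $s_str )
{
    $a_html = array();

    // loadHTML() rejects an empty string, so return early
    if( trim($s_str) === '' ) {
        return $a_html;
    }

    // Collect libxml warnings about sloppy markup instead of printing them
    $b_previous = libxml_use_internal_errors(true);
    $o_doc = new DOMDocument();
    $o_doc->loadHTML($s_str);
    libxml_clear_errors();
    libxml_use_internal_errors($b_previous);

    // '*' matches every element, in document order
    foreach( $o_doc->getElementsByTagName('*') as $o_node ) {
        // Like the original function, ignore tags without options
        if( !$o_node->hasAttributes() ) {
            continue;
        }
        $a_options = array();
        foreach( $o_node->attributes as $o_attr ) {
            // Values arrive without their surrounding quotes
            $a_options[strtoupper($o_attr->name)] = $o_attr->value;
        }
        $a_html[strtoupper($o_node->tagName)][] = $a_options;
    }
    return $a_html;
}

// Usage: dump the collected tags, then list every link target
$a_result = parseHtmlDom('<p>Hi <a href="a.html?x=1">link</a> <img src="b.png" alt="two words"></p>');
print_r($a_result);

if( isset($a_result['A']) ) {
    foreach( $a_result['A'] as $a_link ) {
        if( isset($a_link['HREF']) ) {
            echo $a_link['HREF'], "\n";
        }
    }
}
```

One difference to keep in mind: this variant stores values without quotes, whereas parseHtml() keeps them, so code written against one result would need a small adjustment to consume the other.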
