Re: [Question] Extracting content from page source with PHP

Author: godspeedlee (妳,我可以)   2011-08-06 17:36:32
※ Quoting SABER101 (None):
: http://www.epinions.com/review/Canon_PowerShot_SX210_IS_Digital_Camera/content_536777166468
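: I want to extract the Review section in the middle of the page at the URL above.
: The review sits where "body text" is shown here:
: <span class=rkr>
: body text
: <br>
: I wrote the expression as <span class=rkr>\s(.*)\s<br>
: and it matches nothing at all.
: With \s(.*)\s<br> alone I get about 90% of the article,
: but the following passage is not captured: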
: Straight photography (Street, documentary, and environmental portraiture) is
: primarily concerned with capturing images of people in uncontrived, naturally
: lit and candid settings that evocatively depict or dramatically reveal some
: aspect of the human condition. In addition to being a first rate general
: purpose digicam, the nifty little SX210 is almost perfect for “straight”
: photography - it is compact, responsive, unobtrusive, features a 14x zoom
: (for a little extra standoff room) and dependably generates first rate images.
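: I did account for the line breaks, so why is this passage, which sits on a single line, not captured?
: The only difference between the first and second patterns is <span class=rkr>,
: and with it everything disappears?????
: This is really frustrating.
: I hope someone can explain.
: Thanks.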
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
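Add the s modifier (PCRE_DOTALL) and it will work. By default "." does not match a newline, so (.*) cannot span the line breaks inside the review.
It is also safer to change the pattern to the following, which is less likely to pick up extra data because the lazy .*? stops at the first <br>: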
<span class=rkr>(.*?)<br>
Illustration:
http://imageshack.us/f/706/pttregexp.png/
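
For anyone who wants to try it, here is a minimal sketch of the s modifier and the lazy pattern in action. The $html string is a made-up stand-in for the fetched page source (not the real epinions markup), and the variable names are only for illustration.

```php
<?php
// Hypothetical fragment standing in for the page source; the real page will differ.
$html = "<div><span class=rkr>\n"
      . "First paragraph of the review.\n"
      . "Second paragraph on a new line.\n"
      . "<br>\nUnrelated footer text<br></div>";

// Without the s modifier, "." does not match a newline,
// so the multi-line body between the tags cannot be matched.
$withoutDotall = preg_match('/<span class=rkr>(.*?)<br>/', $html, $m1);

// With the s modifier (PCRE_DOTALL), "." also matches newlines.
// The lazy ".*?" stops at the first <br> after the span.
$withDotall = preg_match('/<span class=rkr>(.*?)<br>/s', $html, $m2);
$review = ($withDotall === 1) ? trim($m2[1]) : null;

// For comparison: a greedy ".*" runs on to the LAST <br> and drags in extra text.
preg_match('/<span class=rkr>(.*)<br>/s', $html, $m3);
$greedy = trim($m3[1]);

var_dump($withoutDotall); // int(0): no match without the s modifier
echo $review, PHP_EOL;    // the two review lines only
echo $greedy, PHP_EOL;    // also contains "Unrelated footer text"
```

On the real page you would presumably pass the downloaded source in place of $html, and use preg_match_all if more than one such span needs to be captured.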
