John Davidson

php regular expression html - the last occurrence before the end match

Message:


I am trying to analyze a webpage:
https://www.zap.co.il/model.aspx?modelid=593364


(it compares the price of the same product across different stores; each store has a block with its logo, price...)


I want to separate each store into a different array element with preg_match_all.


I also want to skip the advertisers area (the stores at the top; in this example, the first two stores).


As far as I can see, each element starts with:


<div class="StoreLine">

and ends with (only for elements that are not advertisements):


<div class="SmartBuyButtons">

or


<div class="BuyButtons">

I use this expression:


preg_match_all("/<div\sclass=\"StoreLine\">*.*?(<div\sclass=\"SmartBuyButtons\">|<div\sclass=\"BuyButtons\">)/s",$str,$zapElements); 

This works fine; however, I don't know how to skip the advertisers area,
whose elements also begin with:


<div class="StoreLine">

So somehow I need each match to start at the last occurrence of the start tag before the end match.



My Answer:
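$html = file_get_contents('https://www.zap.co.il/model.aspx?modelid=593364');

// Match every store block, starting at its opening StoreLine div
preg_match_all('/<div class="StoreLine">.*?<\/div>/s', $html, $matches);

// Skip the first two matches (the advertiser stores in this example)
$stores = array_slice($matches[0], 2);

print_r($stores);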
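
The array_slice approach depends on knowing how many advertiser stores sit at the top of the page. A possible alternative, which is closer to what the question asks for, is to forbid the match from crossing another StoreLine opening tag, so that every match begins at the last StoreLine before a buy-button div. This is only a sketch: it assumes, as the question describes, that advertiser blocks never contain a SmartBuyButtons or BuyButtons div. The sample markup and the $start and $end variables are made up for illustration and are not taken from the real page.

```php
<?php
// Hypothetical sample markup: two advertiser blocks (no buy-button div)
// followed by two regular store blocks.
$html = <<<'HTML'
<div class="StoreLine">Ad store A</div>
<div class="StoreLine">Ad store B</div>
<div class="StoreLine">Store 1 <div class="BuyButtons">buy</div></div>
<div class="StoreLine">Store 2 <div class="SmartBuyButtons">buy</div></div>
HTML;

// Start and end markers, as given in the question.
$start = '<div\sclass="StoreLine">';
$end   = '<div\sclass="(?:Smart)?BuyButtons">';

// (?:(?!START).)*? consumes one character at a time but refuses to step
// over another START tag. A match attempt that begins at an advertiser
// block therefore fails, and the match begins at the last START before END.
$pattern = '/' . $start . '(?:(?!' . $start . ').)*?' . $end . '/s';

preg_match_all($pattern, $html, $zapElements);

print_r($zapElements[0]);
```

With the sample markup above, only the two regular store blocks are matched, regardless of how many advertiser blocks come first.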


© 2024 Hayatsk.info - Personal Blogs Platform. All Rights Reserved.