I need your help. Can anyone explain why my code doesn't find the "privacy" a-tag on zoho.com?
My code finds the "privacy" link on other sites without problems, just not on zoho.com.
I use the Symfony DomCrawler component: https://symfony.com/doc/current/components/dom_crawler.html
// Imprint Check //
use Symfony\Component\DomCrawler\Crawler;

function findPrivacy($domain) {
    $ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13';
    $curl = curl_init($domain);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($curl, CURLOPT_USERAGENT, $ua);
    $data = curl_exec($curl);
    if ($data === false) {
        // The request failed, so there is nothing to parse
        return [];
    }
    $crawler = new Crawler($data);
    // One boolean per <a> element: true if its href contains "privacy"
    $nodeValues = $crawler->filter('a')->each(function ($node) {
        // attr() returns null when the link has no href attribute.
        // Checking for 'privacy' also covers 'privacy-policy'.
        return str_contains((string) $node->attr('href'), 'privacy');
    });
    return $nodeValues;
}
If you look at the page source of zoho.com, you will see that the footer is empty. In the browser, however, the footer is not empty once you scroll down.
How can I find the Privacy link in this case?
My Answer:
I tried the following code to find the link to the privacy policy page on zoho.com, but it doesn't find the a-tag with the text "Privacy". Here is my code:
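<?php
$url = 'https://www.zoho.com/';
$html = file_get_contents($url);
// Collect HTML5 parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    // Exact, case-sensitive comparison against the trimmed link text
    if (trim($link->nodeValue) === 'Privacy') {
        echo $link->getAttribute('href');
        break;
    }
}
?>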
One possible issue is that the text "Privacy" is not an exact match for the a-tag text on zoho.com. You can try a different approach to find the privacy policy link, such as searching for an href attribute that contains the word "privacy" or using a regular expression to match the text more flexibly. Keep in mind that if the footer is empty in the page source and only filled in by JavaScript in the browser, the link is not part of the HTML that cURL or file_get_contents receives, so no selector can match it there.
You can also use an XPath query to search for the a-tag with the text "Privacy". Here is an example using XPath:
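As a rough illustration of the more flexible matching described above, here is a minimal sketch that accepts a link when either its href or its visible text mentions "privacy", ignoring case. The function name findPrivacyLinks and the sample markup are made up for this example and are not part of any library. It works on an HTML string, so it can be tried without a network request.

```php
<?php
// Hypothetical helper: returns the href of every <a> element whose URL
// or visible text mentions "privacy" (case-insensitive).
function findPrivacyLinks(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early
    if (trim($html) === '') {
        return [];
    }

    // Collect parse warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matches = [];
    foreach ($dom->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href'); // '' when the attribute is missing
        $text = trim($link->textContent);
        if (stripos($href, 'privacy') !== false
            || preg_match('/\bprivacy\b/i', $text) === 1) {
            $matches[] = $href;
        }
    }
    return $matches;
}

// Sample markup invented for this example
$sample = '<div><a href="/privacy.html">Privacy Policy</a> '
        . '<a href="/terms.html">Terms</a> '
        . '<a href="/legal">PRIVACY</a></div>';
print_r(findPrivacyLinks($sample));
```

For the sample markup, the function returns the hrefs /privacy.html and /legal. It can only report links that are present in the HTML string it is given.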
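<?php
$url = 'https://www.zoho.com/';
$html = file_get_contents($url);
// Collect HTML5 parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// "." is the full text of the link, including text inside nested elements
$links = $xpath->query('//a[contains(., "Privacy")]');
foreach ($links as $link) {
    echo $link->getAttribute('href'), PHP_EOL;
}
?>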
This code snippet uses XPath to select every a-tag whose text contains "Privacy". Note that contains() is case-sensitive, and the query can only match links that are present in the HTML the server returns.
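If the capitalisation of the link text or URL is not known in advance, the XPath 1.0 translate() function can lower-case both before comparing. The following is only a sketch under that assumption. The function name findPrivacyLinksXPath and the sample markup are invented for illustration.

```php
<?php
// Hypothetical helper: case-insensitive XPath search for "privacy" in
// the link text or in the href attribute.
function findPrivacyLinksXPath(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early
    if (trim($html) === '') {
        return [];
    }

    // Collect parse warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath 1.0 has no lower-case(), so translate() maps A-Z to a-z
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';
    $query = sprintf(
        '//a[contains(translate(., "%1$s", "%2$s"), "privacy")'
        . ' or contains(translate(@href, "%1$s", "%2$s"), "privacy")]',
        $upper,
        $lower
    );

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    foreach ($xpath->query($query) as $link) {
        $hrefs[] = $link->getAttribute('href');
    }
    return $hrefs;
}

// Sample markup invented for this example
$sampleXPath = '<div><a href="/Privacy-Policy">Legal</a>'
             . '<a href="/x"><span>privacy</span> notice</a>'
             . '<a href="/about">About</a></div>';
print_r(findPrivacyLinksXPath($sampleXPath));
```

For the sample markup, the first two links match (one through its href, one through its nested text) and the third does not.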