John Davidson

php - html2pdf: Prevent the Google robot from crawling

Message:


I use the following script:


https://www.html2pdf.fr/en/home


This script converts my PHP file to a PDF file.


For example, the URL mywebsite.com/pdf/url.php?id=8 will generate a PDF file.
Another example: https://github.com/spipu/html2pdf/blob/master/examples/example01.php


I don't want the Google robot to index these pages.


I added the code below to my .htaccess file, but it doesn't prevent Google from crawling the page because the page is generated by PHP:
# Block indexing of Word and PDF files
<Files ~ "\.(doc|docx|pdf)$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>


I can't block it. How do I do it?



My Answer:
To prevent Google from crawling and indexing the PDF files generated by the html2pdf script, you can add a robots meta tag to the page that generates them.

You can add the following meta tag to the head section of your PHP file before converting it to a PDF:

html
<meta name="robots" content="noindex, nofollow">

This meta tag tells search engines not to index or follow the links on the page.
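
For instance, here is a minimal sketch of that idea. The surrounding markup and the $id handling are placeholders rather than anything from the original question; only the robots meta tag itself is the point:

php
<?php
// Hypothetical url.php fragment: put a robots meta tag in the <head>
// of the HTML this page produces before it is converted to a PDF.
$id = (int) ($_GET['id'] ?? 0); // placeholder for your real lookup

echo '<html><head>';
echo '<meta name="robots" content="noindex, nofollow">';
echo '<title>Document ' . $id . '</title>';
echo '</head><body>';
echo '<p>Document content goes here.</p>';
echo '</body></html>';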

Alternatively, you can send the X-Robots-Tag header directly from your PHP file before generating the PDF:

php
header("X-Robots-Tag: noindex, nofollow");


This sends the X-Robots-Tag header with the response that serves the generated PDF, instructing search engines not to index the page. Note that header() must be called before any output is produced.
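
Putting this together with the library linked in the question, a minimal sketch of url.php might look like the following. The Html2Pdf class and the writeHTML() and output() calls follow the library's own example01.php; the autoload path, the template include and the output file name are assumptions standing in for whatever your script actually does:

php
<?php
// Hypothetical sketch: send the X-Robots-Tag header, then generate
// the PDF with spipu/html2pdf and stream it to the browser.
require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoloader

use Spipu\Html2Pdf\Html2Pdf;

// header() must run before any output is sent.
header('X-Robots-Tag: noindex, nofollow');

// Build the HTML to convert (placeholder template).
ob_start();
include __DIR__ . '/templates/document.php'; // hypothetical template
$content = ob_get_clean();

$html2pdf = new Html2Pdf('P', 'A4', 'en');
$html2pdf->writeHTML($content);
$html2pdf->output('document.pdf'); // sends the PDF in the response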

By adding either of these to your PHP file before converting it to a PDF, you can prevent Google from indexing the generated PDF files.
