I use the following script:
https://www.html2pdf.fr/en/home
This script converts my PHP file to a PDF file.
For example, the URL mywebsite.com/pdf/url.php?id=8 generates a PDF file.
Another example: https://github.com/spipu/html2pdf/blob/master/examples/example01.php
I don't want Googlebot to index these pages.
I added the code below to my .htaccess file, but it has no effect on these pages because their URLs end in .php, not .pdf:
# Block indexing of Word and PDF files
<Files ~ "\.(doc|docx|pdf)$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
I can't get it to block these pages. How do I do it?
My Answer:
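To keep Google from indexing the PDF files generated by the html2pdf script, the noindex instruction has to be delivered by the PHP file that produces them.
For a page that is served as HTML, you can add the following meta tag to the head section:
html
<meta name="robots" content="noindex, nofollow">
This meta tag tells search engines not to index the page or follow its links. It only works in HTML responses, though: once the output has been converted to a PDF, the crawler no longer sees the tag.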
For the PDF response itself, you can send the X-Robots-Tag header directly from your PHP file, before any output is generated:
php
header("X-Robots-Tag: noindex, nofollow");
This sends the X-Robots-Tag header with the response that carries the PDF, instructing search engines not to index it. The call must come before any output, because PHP cannot add headers once the body has started.
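To show where the header call sits relative to the conversion, here is a minimal sketch of what a script like url.php could look like. It assumes html2pdf was installed through Composer; the HTML content, the file name, and the way the id parameter is used are hypothetical placeholders, not the asker's actual code.

```php
<?php
// Hypothetical url.php: only the header() call comes from the answer above,
// the rest is an assumed setup for illustration.
require __DIR__ . '/vendor/autoload.php';

use Spipu\Html2Pdf\Html2Pdf;

// Send the robots directive first, before any output reaches the client.
header('X-Robots-Tag: noindex, nofollow');

// Assumed: the id from the query string selects the content to render.
$id = (int) ($_GET['id'] ?? 0);

// Placeholder content; a real script would build this from its own data.
$html = '<page><h1>Document ' . $id . '</h1></page>';

$html2pdf = new Html2Pdf('P', 'A4', 'en');
$html2pdf->writeHTML($html);

// Streams the PDF to the client; the header set above travels with it.
$html2pdf->output('document-' . $id . '.pdf');
```

You can check the result by requesting the URL with a tool that prints response headers and looking for the X-Robots-Tag line.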
With this header sent by the PHP file that generates the PDF, Google can still fetch the URL, but it will not index the generated file.