I'm using the following code to successfully download a webpage:
header('Content-Type: text/html');
header('Content-Disposition: attachment; filename="'.$title.'"'); // needed for save-as dialog popup
print $content;
However, I get a single static page with no images or other resources. What I hoped for is the full folder download you get when you use a browser's save-as feature.
The code above triggers the browser's save-as popup.
I have found a number of posts suggesting exec("wget url"), but I found that this just downloads entire directories into my own file structure, duplicating multiple files and causing a lot of confusion. Curl did not yield the desired results either.
Neither wget nor curl presented the popup window, which is essential so that a user can rename the download as they see fit.
My Answer:
To achieve the functionality of saving a webpage with all its resources (images, CSS, etc.) and triggering the save-as dialog for the user to choose the download location and filename, you can use PHP to create a zip archive of the webpage and its resources.
Here's an example code snippet to achieve this:
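<?php
// URL of the webpage to download
$url = 'https://example.com/page-to-download';

// Fetch a URL with cURL; returns the response body, or false on failure
function fetchUrl($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

// Function to download a file and add it to the zip archive
function addFileToZip($url, $zip) {
    $file_content = fetchUrl($url);
    if ($file_content === false) {
        return; // skip resources that could not be downloaded (e.g. relative URLs)
    }
    // Name the entry after the URL path so query strings stay out of the file name
    $file_name = basename((string) parse_url($url, PHP_URL_PATH));
    if ($file_name !== '') {
        $zip->addFromString($file_name, $file_content);
    }
}

// Download the webpage content first, so nothing is sent if it fails
$content = fetchUrl($url);
if ($content === false || $content === '') {
    exit('Failed to download the webpage');
}

// Create a new zip archive in a temporary file so parallel requests do not collide
$zip = new ZipArchive();
$zip_name = 'download.zip';                      // name suggested in the save-as dialog
$zip_path = tempnam(sys_get_temp_dir(), 'zip');  // where the archive is built on disk
if ($zip->open($zip_path, ZipArchive::CREATE | ZipArchive::OVERWRITE) === TRUE) {
    // Add the webpage content to the zip archive
    $zip->addFromString('index.html', $content);

    // Parse the webpage content to find and download all resources (only images here)
    $doc = new DOMDocument();
    @$doc->loadHTML($content); // @ hides warnings caused by imperfect real-world HTML
    foreach ($doc->getElementsByTagName('img') as $tag) {
        $img_url = $tag->getAttribute('src');
        if ($img_url !== '') {
            addFileToZip($img_url, $zip);
        }
    }

    // Close the zip archive (this is when it is actually written to disk)
    $zip->close();

    // Trigger the download of the zip archive
    header('Content-Type: application/zip');
    header('Content-Disposition: attachment; filename="'.$zip_name.'"');
    header('Content-Length: '.filesize($zip_path));
    readfile($zip_path);

    // Delete the zip archive after download
    unlink($zip_path);
} else {
    echo 'Failed to create zip archive';
}
?>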
This code snippet uses the ZipArchive class in PHP to create a zip archive containing the webpage content and all its resources. It uses cURL to download the webpage content and resources, and then adds them to the zip archive. Finally, it triggers the download of the zip archive with the save-as dialog for the user to choose the download location and filename.
Please note that you may need to adjust the code to handle different types of resources (e.g., CSS, JavaScript) and handle relative URLs properly.
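As a rough illustration of that adjustment, the sketch below resolves relative URLs against the page URL and collects stylesheet and script references in addition to images. The helper names resolveUrl and collectResourceUrls are hypothetical (they are not part of the code above), and the resolver is deliberately simplified: it covers the common cases (absolute, protocol-relative, root-relative, and path-relative URLs with "." and ".." segments) and is not a full RFC 3986 implementation.

```php
<?php
// Hypothetical helper: turn a URL found in the page into an absolute URL.
// Simplified sketch, not a complete RFC 3986 resolver.
function resolveUrl(string $base, string $relative): string
{
    // Anything that already has a scheme (https:, data:, ...) is returned unchanged
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $relative)) {
        return $relative;
    }
    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'https';
    // Protocol-relative URL: //host/path
    if (strpos($relative, '//') === 0) {
        return $scheme . ':' . $relative;
    }
    $origin = $scheme . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }
    if (strpos($relative, '/') === 0) {
        // Root-relative: replaces the whole base path
        $path = $relative;
    } else {
        // Path-relative: append to the directory of the base path
        $basePath = $parts['path'] ?? '/';
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $relative;
    }
    // Keep any query string or fragment out of the segment handling
    $cut    = strcspn($path, '?#');
    $suffix = substr($path, $cut);
    $path   = substr($path, 0, $cut);
    // Collapse "." and ".." segments
    $out = [];
    foreach (explode('/', ltrim($path, '/')) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }
    return $origin . '/' . implode('/', $out) . $suffix;
}

// Hypothetical helper: list absolute URLs of images, scripts and stylesheets.
function collectResourceUrls(string $html, string $baseUrl): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // @ hides warnings caused by imperfect real-world HTML
    // tag name => attribute that holds the resource URL
    $sources = ['img' => 'src', 'script' => 'src', 'link' => 'href'];
    $urls = [];
    foreach ($sources as $tagName => $attribute) {
        foreach ($doc->getElementsByTagName($tagName) as $node) {
            // Among <link> elements, only stylesheets are resources to download
            if ($tagName === 'link' && strtolower($node->getAttribute('rel')) !== 'stylesheet') {
                continue;
            }
            $value = trim($node->getAttribute($attribute));
            if ($value !== '') {
                $urls[] = resolveUrl($baseUrl, $value);
            }
        }
    }
    return array_values(array_unique($urls));
}
```

Each URL returned by collectResourceUrls($content, $url) could then be passed to addFileToZip in place of the raw src values. Note that index.html would still point at the original locations, so for a fully offline copy the references inside the HTML would also need to be rewritten to the file names stored in the archive.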