I'm coding an image uploader in PHP. It will allow the user to upload JPG and PNG images on a website. Next will be MP4 videos (as in the picture linked). Most importantly, my aim is to make this uploader as secure as possible.
( As a side note if you're interested, the uploader currently:
- Checks the file type
- Rewrites the original filename completely
- Limits the file size
- Uses getimagesize() to check if the file is an image
- Removes execution rights of the file from all users )
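
For reference, the checks listed above could be sketched roughly as follows. This is a minimal sketch, not my actual code: the function name `validate_image_upload` and the 2 MB default limit are made up for illustration, and it assumes the `fileinfo` extension is enabled.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (name and limits are assumptions, not from a library).
 * Returns a freshly generated safe filename, or null if the file is rejected.
 */
function validate_image_upload(string $tmpPath, int $maxBytes = 2000000): ?string
{
    // MIME types we accept, mapped to the extension we assign ourselves.
    $allowed = ['image/jpeg' => 'jpg', 'image/png' => 'png'];

    // Limit the file size.
    $size = @filesize($tmpPath);
    if ($size === false || $size === 0 || $size > $maxBytes) {
        return null;
    }

    // Check the file type from its content, not from the client-supplied name.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime  = $finfo->file($tmpPath);
    if ($mime === false || !isset($allowed[$mime])) {
        return null;
    }

    // getimagesize() must also recognise the file and agree on the type.
    $info = @getimagesize($tmpPath);
    if ($info === false || $info['mime'] !== $mime) {
        return null;
    }

    // Rewrite the filename completely: random name plus our own extension.
    return bin2hex(random_bytes(16)) . '.' . $allowed[$mime];
}
```

After a successful check, the file would be moved with `move_uploaded_file()` to the returned name and its permissions set with something like `chmod($target, 0644)` so that no user has execute rights.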
File content checking:
For instance, it's clear that inserting malicious PHP or JavaScript code into a .JPG or any other file is very easy. Because of this, I've also prepared my uploader to remove all tags like '<?php', '<style...' or '<script...' from the contents of each file.
That seems to fix one problem, but does it create another? For instance, this media file (please see the linked picture) contains characters like '<?ph'. This totally harmless, non-functional '<?ph' was obviously generated programmatically without ill will. So were the several '?>' sequences that can be found in the same media file. I mention this only to lead you to my real question:
Does something prevent JPG, PNG and MP4 encoders or other related programs from generating full <?php, <style..., <script... and other tags into the files? We got close without trying, so I think it's fair to ask.
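
One way to test this for a single format is the sketch below. It takes a tiny sample PNG, adds a standard `tEXt` metadata chunk holding a full PHP tag, and checks whether PHP still treats the result as an image. The helper name `png_with_text_chunk` is made up, and the fixed offset of 33 bytes assumes the usual layout of an 8-byte signature followed by a 25-byte IHDR chunk.

```php
<?php
declare(strict_types=1);

// A 1x1 PNG used purely as sample data.
$png = base64_decode(
    'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg=='
);

/**
 * Hypothetical helper: inserts a tEXt chunk right after the IHDR chunk.
 * Chunk layout: 4-byte length, 4-byte type, data, CRC-32 over type + data.
 */
function png_with_text_chunk(string $png, string $text): string
{
    $data  = "Comment\0" . $text; // keyword, NUL separator, free text
    $chunk = pack('N', strlen($data)) . 'tEXt' . $data
           . pack('N', crc32('tEXt' . $data));

    // 8-byte signature + 25-byte IHDR chunk = 33 bytes.
    return substr($png, 0, 33) . $chunk . substr($png, 33);
}

$crafted = png_with_text_chunk($png, '<?php echo "hi"; ?>');
$info    = getimagesizefromstring($crafted);

var_dump($info !== false);                     // is it still recognised as an image?
var_dump(strpos($crafted, '<?php') !== false); // does it contain the full tag?
```

This only covers a metadata chunk in PNG. It says nothing about how likely an encoder is to produce the same bytes by chance inside the compressed data.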
If nothing is preventing that, then I should find better methods to deal with malicious code in media files. And even if my remover worked, I'm still interested in the "right" ways of doing it.
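
One approach I have seen suggested instead of string removal is to decode the upload and write a brand-new image from the pixel data, so that metadata and trailing bytes are not carried over. Below is a minimal sketch, assuming the GD extension is available. The function name `reencode_image` and the JPEG quality of 90 are my own choices, and I am not claiming this stops every crafted file.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: re-encodes a JPG or PNG through GD.
 * Returns true on success, false if GD is missing or the file cannot be decoded.
 */
function reencode_image(string $srcPath, string $dstPath): bool
{
    if (!function_exists('imagecreatefromstring')) {
        return false; // GD is not installed
    }

    $bytes = file_get_contents($srcPath);
    if ($bytes === false) {
        return false;
    }

    // Decode to raw pixels; anything GD cannot parse is rejected.
    $img = @imagecreatefromstring($bytes);
    if ($img === false) {
        return false;
    }

    $info = getimagesizefromstring($bytes);
    if ($info !== false && $info['mime'] === 'image/png') {
        imagesavealpha($img, true); // keep transparency in the new PNG
        return imagepng($img, $dstPath);
    }

    // Everything else is written out as JPEG.
    return imagejpeg($img, $dstPath, 90);
}
```

The re-encoded file, not the original upload, would then be the one that gets stored and served.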
I hope my question wasn't too broad as I mentioned multiple file types. Any help is highly appreciated. Many thanks.
Bonus question: What about PDF, WEBM, FLV and other common media files: can they natively contain such full tags?
My Answer:
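No, nothing prevents it. JPG, PNG and MP4 are binary formats whose compressed data can contain any byte sequence, and their metadata fields (such as JPEG comments or PNG text chunks) can hold arbitrary text, so a complete <?php, <style or <script tag can appear in a perfectly valid file. Stripping those strings from the file contents can therefore corrupt legitimate media, and it is not a reliable defence on its own.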