I have not heard of any clever solutions to this problem, nor have I run into any in my own experience. All the options are fairly obvious:
- limit the total amount of data uploaded per user over a period of time. The limit can be quite generous; 1 GB per hour, in my opinion, is more than enough in your case.
- limit the number of files uploaded per user per unit of time.
- set a TTL (lifetime) on "free" files and delete them once it expires. Base it on how long filling out your form could plausibly take, with every possible delay, although I would set the TTL to at least a day in any case.
- The last option is very custom and, in my opinion, is only needed if you really do have this problem. Based on your application's logic, write code that tracks how many "free" files each user produces and how quickly (taking into account whether they complete the form or not), and once a threshold is reached, start returning an upload error. Block for an hour, for example; again, this depends on how strict the limit is. If users are allowed 5 files per 10 minutes and no more than 100 MB, block for 10 minutes; if 20 files per hour and up to 200 MB, block for an hour. A rough sketch of such a limiter follows this list.
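A minimal sketch of the quota and blocking options above, assuming a single-process application that can keep the counters in memory; the thresholds, the `user_id` key and the function name `check_upload_allowed` are all illustrative, not something from the original question:

```python
# Rough illustration of the per-user limits described above: bytes per hour,
# files per hour, and a temporary block once either limit is exceeded.
# All names and thresholds here are made up for the example.
import time
from collections import defaultdict, deque

MAX_BYTES_PER_HOUR = 1 * 1024 ** 3   # e.g. 1 GB per user per hour
MAX_FILES_PER_HOUR = 20              # e.g. 20 files per user per hour
WINDOW_SECONDS = 3600
BLOCK_SECONDS = 3600                 # how long to reject uploads after abuse

# user_id -> deque of (timestamp, size) for uploads within the window
_uploads = defaultdict(deque)
# user_id -> timestamp until which the user is blocked
_blocked_until = {}

def check_upload_allowed(user_id, size, now=None):
    """Return True if the upload may proceed, False if the user is over the limit."""
    now = now or time.time()

    # Still serving an earlier block?
    if _blocked_until.get(user_id, 0) > now:
        return False

    history = _uploads[user_id]
    # Drop entries that fell out of the sliding window.
    while history and history[0][0] < now - WINDOW_SECONDS:
        history.popleft()

    total_bytes = sum(s for _, s in history)
    if len(history) + 1 > MAX_FILES_PER_HOUR or total_bytes + size > MAX_BYTES_PER_HOUR:
        _blocked_until[user_id] = now + BLOCK_SECONDS
        return False

    history.append((now, size))
    return True
```

In a real application the counters would live in Redis or the database rather than in process memory, and since these files arrive before any entity exists, "user" would in practice mean a session ID or an IP address.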
The most important thing, first and foremost, is to understand how acute the problem really is. In my experience, excessive "forethought" only wastes time and complicates the project. If there isn't even a hint of this problem yet, it is better not to build any filters at all. Just make sure the files are cleaned up once a day and add logging for the criteria you care about - the number of files, their total size, and so on. Also, for a serious project it is essential to hook up a monitoring system with metrics and notifications (such as Zabbix) and set up alerts on the parameters of interest when they rise sharply. That way, if the number of files suddenly shoots up, you won't miss it; you can take action and think about how to filter in the future.
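For the "log the interesting criteria once a day" part, a tiny sketch that could run from cron and record the file count and total size of the upload directory; the paths here are placeholders:

```python
# Minimal daily metrics: count and total size of files in the upload directory.
# UPLOAD_DIR and the log destination are placeholders for this example.
import logging
from pathlib import Path

UPLOAD_DIR = Path("/var/www/app/uploads")

logging.basicConfig(filename="/var/log/app/upload_metrics.log",
                    format="%(asctime)s %(message)s", level=logging.INFO)

def log_upload_metrics():
    files = [p for p in UPLOAD_DIR.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    logging.info("upload files=%d total_mb=%.1f", len(files), total_bytes / 1024 / 1024)

if __name__ == "__main__":
    log_upload_metrics()
```

The same two numbers are exactly what you would then feed into Zabbix (or whatever monitoring you use) and alert on when they jump.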
I have used a similar pattern in several recent projects. Those projects were closed, internal ones, so abusers were not a concern there; in your case the problem does not seem acute yet either, and only preventive measures are needed. So: every uploaded file gets a record in the database. When the form is saved, records linking those files to the form's entities (or whatever else) appear in the database. Once a day a cron script walks the file system and checks, for each file, whether it has a corresponding database record and whether that record has a useful link to anything. If there is no link, the file is garbage and gets deleted. Note that walking the file system puts a heavy load on the disk and also generates a lot of database queries, so all of this happens in the dead of night and the sweep is done in parts. That is, a full sweep is spread over a fixed number of days, say 10, and each day only 1/10 of all files are checked. At the end of each run the current position is saved, and the next run continues from that point.
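A rough sketch of that nightly sweep, assuming the files sit in one directory, upload records live in an `uploads` table with a nullable link to the owning form, and the resume position is kept in a small state file; the schema, the minimum age and the 1/10 fraction are assumptions for the example:

```python
# Nightly partial sweep: check roughly 1/N of all files per run, delete files
# that have no useful database link, and remember where to continue next time.
# Table/column names, the TTL and the fraction are assumptions for this sketch.
import json
import sqlite3
import time
from pathlib import Path

UPLOAD_DIR = Path("/var/www/app/uploads")
STATE_FILE = Path("/var/lib/app/cleanup_state.json")
DB_PATH = "/var/www/app/app.db"
FULL_SWEEP_DAYS = 10                 # a full pass over all files takes this many runs
MIN_AGE_SECONDS = 24 * 3600          # never touch files younger than the form TTL

def sweep():
    all_files = sorted(p for p in UPLOAD_DIR.rglob("*") if p.is_file())
    start = 0
    if STATE_FILE.exists():
        start = json.loads(STATE_FILE.read_text()).get("position", 0)
    batch = max(1, len(all_files) // FULL_SWEEP_DAYS)
    chunk = all_files[start:start + batch]

    conn = sqlite3.connect(DB_PATH)
    now = time.time()
    for path in chunk:
        if now - path.stat().st_mtime < MIN_AGE_SECONDS:
            continue  # still within the "free file" lifetime, leave it alone
        row = conn.execute(
            "SELECT form_id FROM uploads WHERE filename = ?", (path.name,)
        ).fetchone()
        # No record at all, or a record that was never linked to a form: garbage.
        if row is None or row[0] is None:
            path.unlink()
    conn.close()

    # Save the resume position; wrap around once the end is reached.
    next_pos = start + batch
    if next_pos >= len(all_files):
        next_pos = 0
    STATE_FILE.write_text(json.dumps({"position": next_pos}))

if __name__ == "__main__":
    sweep()
```

Run it from cron in the small hours, for example `0 3 * * * python3 /opt/app/cleanup.py`, so the disk and database load lands outside peak traffic.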
P.S. Having given my opinion, I nevertheless join the author's question and will be very glad if someone describes an elegant way to solve this.