
PHP-CLI tool for regularly triggering softcron independently of the number of paste creations

Open fboender opened this issue 7 years ago • 7 comments

From the configuration file, it seems like expired pastes are only removed when a new paste is created:

[purge]
; minimum time limit between two purgings of expired pastes, it is only
; triggered when pastes are created
; Set this to 0 to run a purge every time a paste is created.
limit = 300

In the installation instructions, I found no reference to having to create a cronjob or something, so I assume the above is true. That is problematic in scenarios where PrivateBin is not used often. If a new paste is not created for weeks, it can take weeks before an expired paste is removed, leaving the paste vulnerable in the meantime.

Is there a proper way to purge expired pastes automatically? Say a cronjob that calls a script or something?

fboender avatar Jan 29 '18 08:01 fboender

Indeed, we use a "softcron" approach (see https://github.com/PrivateBin/PrivateBin/issues/3). See also the documentation for this in the wiki.

So do you think a cronjob is necessary for that? It could be offered as an additional feature...

rugk avatar Jan 29 '18 10:01 rugk

it is only triggered when pastes are created

r4sas avatar Jan 29 '18 10:01 r4sas

Hi, thanks for your quick reply!

This is my current understanding of how and when pastes are deleted:

  • Random expired pastes are deleted (with a limit) when creating a new paste.
  • When a paste is accessed, but it is expired, it is deleted and the paste is not served to the requester.

So it seems possible that pastes that are not accessed directly may linger well past their expiration date. If the application becomes defunct (for example, because the virtualhost is disabled), the expired pastes are never cleaned up. If an attacker obtains a URL and gains privileged access to the system, they can access pastes that have already expired.

Another potential problem I spotted (though I haven't dived deep enough into this yet) is that it seems that empty directories (e.g. "data/6b/8b") aren't removed. Given the current algorithm in "_getExpiredPastes", that seems to indicate that it becomes less and less likely for expired pastes to be deleted as the number of pastes, including deleted ones, increases. Especially if the number of new pastes is low. There's a check for empty directories, which are skipped on the next scan, but the check is performed on the number of files in the second level (i.e. the number of entries in "6b"). Since those directories ("8b") are not cleaned up, more and more entries will accumulate, making it less likely that the random pick actually removes an expired paste. A system cronjob would also solve this problem.
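To gauge how far this accumulation has progressed on a given installation, something like the following could be used. It is only a diagnostic sketch: it assumes the two-level layout from the example path above ("data/6b/8b"), the function names are made up for illustration, and it relies on the GNU/BSD find extensions -mindepth, -maxdepth and -empty.

```shell
#!/bin/sh
# Diagnostic sketch (function names are hypothetical).
# Assumes second-level directories look like <data>/6b/8b.

# Print the number of second-level directories below the data root.
count_second_level_dirs() {
    find "$1" -mindepth 2 -maxdepth 2 -type d | wc -l | tr -d ' '
}

# Print how many of those second-level directories are empty.
count_empty_second_level_dirs() {
    find "$1" -mindepth 2 -maxdepth 2 -type d -empty | wc -l | tr -d ' '
}
```

Calling both functions with the data directory as the argument and comparing the two numbers gives a rough idea of how many leftover directories remain after their pastes were removed.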

I'd prefer a solution that guarantees that expired pastes are removed within a certain period. However, since the application offers never-expiring pastes, that might prove difficult to implement efficiently without all kinds of crazy caching schemes. In which case users should probably just switch to the database-backed storage.

I suggest the following changes:

  • Remove empty dirs if the last paste / dir is removed from that dir.
  • A (php-cli?) system cron job that implements the same algorithm as the current soft cronjob. It's not perfect, but it should alleviate my concerns outlined above.
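The first suggestion can be sketched in shell. This is not PrivateBin code: the function name and its arguments are hypothetical, and the only assumption taken from this thread is the "data/6b/8b" directory layout. It deletes one paste file and then removes each parent directory that became empty, stopping at the data root.

```shell
#!/bin/sh
# Hypothetical helper, not part of PrivateBin.
# $1: data root, without a trailing slash (e.g. /var/www/domain/data)
# $2: full path of the paste file to delete
remove_paste_and_prune() {
    data_root=$1
    paste_file=$2
    rm -f -- "$paste_file" || return 1
    dir=$(dirname -- "$paste_file")
    # rmdir only succeeds on empty directories, so the loop stops at the
    # first directory that still contains other entries, and it never
    # touches the data root itself.
    while [ "$dir" != "$data_root" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        rmdir -- "$dir" 2>/dev/null || break
        dir=$(dirname -- "$dir")
    done
}
```

Using rmdir rather than a recursive delete means a directory that still holds another paste is left alone.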

fboender avatar Jan 29 '18 12:01 fboender

You understand it correctly. And when using the database storage the problem with empty dirs is non-existent.

I think your suggestions are good ideas, I'll just open a new issue for the first point, as it is relatively separate from the previous one. -> https://github.com/PrivateBin/PrivateBin/issues/277

As for the other point, yeah, why not? Just a small PHP CLI tool that executes the existing softcron. This would make sure that pastes are deleted in a "reasonable" time after they expire. One could even add an option to disable the internal softcron mechanism then, in order to e.g. speed up the paste creation.

rugk avatar Jan 29 '18 12:01 rugk

My 2c: this is a really good idea for low-traffic installations, which are likely to be in the majority. Is it not simply a matter of creating a CLI entry point to the existing Model::purge() function?

macropin avatar Mar 08 '18 05:03 macropin

Has anyone figured this out or implemented it yet? It would be awesome to have something I could use with a cron job to purge expired pastes on a fairly regular basis for a low-traffic site.

mbressman avatar Apr 23 '20 04:04 mbressman

An alternative for the empty folders.

Delete empty folders from /var/www/domain/data/:

sudo crontab -e

# Every day at 06:00
0 6 * * * find /var/www/domain/data -type d -empty -delete

Waiting for a better solution.
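One caveat with the command above: if the data directory ever becomes completely empty, the start path itself matches and is deleted too. A slightly more defensive variant, assuming a find that supports -mindepth (GNU and BSD find do), could look like this. The function name is made up for illustration.

```shell
#!/bin/sh
# Prune empty subdirectories below the data directory without ever
# removing the data directory itself (-mindepth 1 excludes the start path).
# -delete implies depth-first traversal, so a parent that becomes empty
# after its children are removed is pruned in the same run.
prune_empty_dirs() {
    find "$1" -mindepth 1 -type d -empty -delete
}
```

The same option can be added directly to the crontab entry: 0 6 * * * find /var/www/domain/data -mindepth 1 -type d -empty -delete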

ZerooCool avatar Jun 24 '20 09:06 ZerooCool