filegator icon indicating copy to clipboard operation
filegator copied to clipboard

Bug report: Zipping 2

Open tobiasgruber1995 opened this issue 3 years ago • 24 comments

Describe the bug

When trying to zip a specific folder I get the error noted in the logs below. On other folders it works fine, so I assume it must have something to do with the folder I'm trying to zip, but I can not figure out what it is? Did you encounter that problem before or have an idea what might cause it?

Expected behavior

Zipping the folder as it works on other folders.

Logs

First various hundreds/thousands of entries like this are created:

default.NOTICE: E_USER_NOTICE: Invalid size {"code":1024,"message":"Invalid size","file":"/path/to/filegator/vendor/phpseclib/phpseclib/phpseclib/Net/SSH2.php","line":3422} []
default.NOTICE: E_USER_NOTICE: Connection closed by server {"code":1024,"message":"Connection closed by server","file":"/path/to/filegator/vendor/phpseclib/phpseclib/phpseclib/Net/SSH2.php","line":3757} []

At some point it changes to errors like this:

default.NOTICE: E_USER_NOTICE: Connection closed prematurely {"code":1024,"message":"Connection closed prematurely","file":"/path/to/filegator/vendor/phpseclib/phpseclib/phpseclib/Net/SSH2.php","line":3387} []
default.NOTICE: E_USER_NOTICE: Connection closed by server {"code":1024,"message":"Connection closed by server","file":"/path/to/filegator/vendor/phpseclib/phpseclib/phpseclib/Net/SSH2.php","line":3757} []

And then it eventually crashes with a timeout:

default.ALERT: Fatal Error (E_ERROR): Maximum execution time of 60 seconds exceeded {"code":1,"message":"Maximum execution time of 600 seconds exceeded","file":"/path/to/filegator/vendor/monolog/monolog/src/Monolog/Logger.php","line":324,"trace":null} []
default.ALERT: Fatal Error (E_ERROR): Maximum execution time of 60 seconds exceeded {"code":1,"message":"Maximum execution time of 600 seconds exceeded","file":"/path/to/filegator/vendor/league/flysystem-ziparchive/src/ZipArchiveAdapter.php","line":85,"trace":null} []

Environment

  • FileGator Version: 7.5.2
  • Server: Apache 2.4.41 on Debian Linux
  • PHP Version: PHP 7.4
  • Browser: Chrome, Firefox

tobiasgruber1995 avatar Jul 01 '21 11:07 tobiasgruber1995

Quick question: why are you using PHP's ZIP extension? Any specific reason? Why not delegate to the zip binary directly (or Powershell's Compress-Archive cmdlet), through a pipe using 'popen' (or something similar, like the Symfony Process component)?

On production servers, PHP's zip/unzip functionality can become a problem if the file is larger than 40MB, for example. I'm just asking before we start patching this code for our production servers. If you are interested in our patch, we can certainly share it with you once it is ready.

Many thanks for Filegator! It's such a great project!

Cheers!

andrewscaya avatar Jul 08 '21 23:07 andrewscaya

Hi,

Thanks for your question. The main reason is simplicity and compatibility. You can share your patch so others can try your approach. We can then see how it goes and maybe replace the native extension.

alcalbg avatar Jul 09 '21 07:07 alcalbg

I do think that might be exactly the solution I need, because the errors only appear for bigger files! So yes, please share your patch, I would really like to try it!

tobiasgruber1995 avatar Jul 09 '21 07:07 tobiasgruber1995

Hey @alcalbg and @tobiasgruber1995 !

Many thanks for your answers!

Since you seem to say that this way of zipping files might be truly useful to the project, let's promote this request from a patch to a feature upgrade. We'll work on a solution over the next two weeks in order to guarantee cross-platform compatibility (Windows, Mac/Unix), with all of the appropriate unit tests. We'll be developing the solution on the https://github.com/linuxforphp/filegator/tree/dev-zip-proc-feature branch, and you can follow our progress there. Once done, we'll issue a pull request. Any form of participation is always welcome, of course!

Cheers!

andrewscaya avatar Jul 09 '21 20:07 andrewscaya

Hi @alcalbg ,

I am now experiencing similar issues when deleting bigger folders. Only here I do not even get an error, I just see the loading circle for a short while and it disappears again - without having done anything to the folder that was supposed to be deleted. Should I open another issue for that?

tobiasgruber1995 avatar Jul 14 '21 05:07 tobiasgruber1995

Hi,

Is there anything in the logs? Can filegator write to each of those folders? If one folder has a wrong permissions then everything could fail. It can also happens if some file or folder name is encoded as non-utf8 or include some weird symbols. Also, check if there is enough memory for php scripts and enough execution time to complete the task.

alcalbg avatar Jul 14 '21 07:07 alcalbg

Hi,

I did some more investigating and found the problem for the deleting. It's not related to the size though, therefore I will open another issue.

tobiasgruber1995 avatar Jul 14 '21 12:07 tobiasgruber1995

Hi @alcalbg and @tobiasgruber1995,

We were thinking of adding a config parameter in Filegator's configuration to allow users to choose if they want to continue using Filegator's native Archiver object, or if they want to delegate zipping to a third-party process. What do you guys think?

Also, if we want the users to be able to delegate all of the disk I/O operations (move, copy, delete, zip) to third-party binaries through the app's config, maybe we should develop a PHP library that would do all of these basic operations, by "extending" the Symfony Process class, and use that library in Filegator (more modular solution). Thoughts?

andrewscaya avatar Jul 14 '21 16:07 andrewscaya

@alcalbg @tobiasgruber1995 --

Please see the proof of concept commit #f5828ae2269 https://github.com/linuxforphp/filegator/commit/f5828ae2269d121ce6a830eaf5162f51f3daa555). To use the new methods, one only has to change the configuration files accordingly.

I haven't created the service yet (or the corresponding unit tests), nor have I done any of the OS specific code, because I prefer to validate with you, before I continue working on this solution any further.

Also, the possibility of putting this new code in a separate module in order to use it for the other disk I/O commands still stands.

Please let me know what you think.

Cheers!

andrewscaya avatar Jul 14 '21 22:07 andrewscaya

Hi,

Thanks for this.

Process Control Extensions (Process) is not very widely used php extension, many people don't have this installed and they have a good reason. Further more, as noted in the docs, this extension is not available on Windows platforms.

However, I like the idea of external modules and filegator already has an interface for this - ArchiverInterface.

If you could implement this interface and add additional Zip adapter that would be a way to go.

alcalbg avatar Jul 15 '21 07:07 alcalbg

Oh wait, this is a Symfony component, my bad. Let me actually try this...

alcalbg avatar Jul 15 '21 10:07 alcalbg

Yup, this works great, but again, it should be implemented using the ArchiverInterface interface.

Something like:

configuration.php:

        'Filegator\Services\Archiver\ArchiverInterface' => [
            'handler' => '\Filegator\Services\Archiver\Adapters\SymfonyProcessArchiver',
            'config' => [
                'bin' => '/usr/bin/zip',
            ],
        ],

backend/Services/Archiver/Adapters/SymfonyProcessArchiver.php:

namespace Filegator\Services\Archiver\Adapters;

use Symfony\Component\Process\Process;
                                                                                                                                                                           
class SymfonyProcessArchiver implements Service, ArchiverInterface
{
    // implementation details
}

If implemented this way, both zipping and batch downloading can use this adapter seamlessly.

alcalbg avatar Jul 15 '21 10:07 alcalbg

The only real catch here it how to make this work with remote filesystems, s3 or ftp. The current version uses tmp files to solve this problem.

alcalbg avatar Jul 15 '21 15:07 alcalbg

@alcalbg OK. Yes, Symfony Process is a very powerful and useful wrapper class for popen, fread, fwrite, and pclose. No extensions involved.

I can definitely abandon the idea of making this a service factory that would return a Process object, with a requested wrapped command that suits the OS on which a user might be running this code. Instead, I can indeed implement this with the ArchiverInterface, making the interface a contract to a proxied object (strange, but feasible). The only thing is that some elements of the contract will be useless placeholder methods. Here is the current signature of the interface:

interface ArchiverInterface
{
    public function createArchive(Filesystem $storage): string;

    public function uncompress(string $source, string $destination, Filesystem $storage);

    public function addDirectoryFromStorage(string $path);

    public function addFileFromStorage(string $path);

    public function closeArchive();

    public function storeArchive($destination, $name);
}

For example, if we create an unnamed pipe to the zip binary, we can't actually "closeArchive()", and adding a directory or a file to the list of arguments doesn't require two different methods, because the zip binary doesn't distinguish between these two elements of the filesystem. Moreover, the interface requires the presence of a 'Filegator\Services\Storage\Filesystem' object, which is totally useless to a Symfony Process per se. I could make it work anyway, but the validity of the interface's contract would be, at best, made relative.

Please let me know what you think. I'll be working on the code today.

andrewscaya avatar Jul 15 '21 17:07 andrewscaya

@alcalbg --

I've finished working on the new zip process feature. I've tried using the ArchiverInterface, but the interface's contract was just getting in the way, and making things messy. Moreover, the interface was no longer serving its purpose, since most of the method signatures were no longer going to be useful to the developers anyway. Of course, you can decide otherwise, and try to make everything fit as much as possible into the interface signature. It'll be up to you.

You can have a look at the completed patch (commit #f5828ae2269 https://github.com/linuxforphp/filegator/commit/f5828ae2269d121ce6a830eaf5162f51f3daa555). Please let me know if you want me to send you the PR. If so, I'll remove the 'Archiver' namespace (class and interface). Also, the unit tests now have a code coverage of 99.88%.

Please let me know if there's anything else I can do, ok?

Also, if you merge, when will you expect to do the next release of FileGator? Just asking because we will need to upgrade our Cloud servers.

Many thanks in advance!

Cheers! :)

andrewscaya avatar Jul 16 '21 01:07 andrewscaya

Hi @alcalbg @tobiasgruber1995 --

I just made a last minute change. The new commit is #f5828ae2269 https://github.com/linuxforphp/filegator/commit/f5828ae2269d121ce6a830eaf5162f51f3daa555).

Cheers! :)

andrewscaya avatar Jul 16 '21 02:07 andrewscaya

Hi,

Thank you for this, I appreciate your hard work.

Unfortunately, I cannot accept this for several reasons:

  • Zip feature doesn't work for users with a custom home folder, it only works if a single repository path is used for everyone.
  • Zip and batch download will only work for local filesystem, remote filesystems like S3 or FTP are not supported.
  • Configuration file has been moved/changed, all existing installations will fail to load. The upgrade path for older versions must be documented.
  • I'm not sure why, but php tests are failing here: Tests\Unit\SymfonyProcessFactoryTest::testCreatingASymfonyProcessWithFailure

I've also tried to make this work using the ArchiverInterface but it got messy as well, and I eventually gave up. It seems that this is a hard one to solve without breaking something along the way.

Cheers!

alcalbg avatar Jul 16 '21 08:07 alcalbg

Hey @alcalbg --

OK, I've got a solution for you. We keep the current Archiver (with the underlying filesystem abstraction layer for S3 and SFTP), and we add a configuration parameter that works only for a local filesystem and that allows to turn on 'Fast Zipping' with Symfony Process. This way, the users won't see any difference for the S3 and FTP repos, and the local repos CAN use the new fast zipping feature. Also, a user that successfully mounts an S3 bucket onto his local filesystem COULD turn on fast zipping, and get the best of both worlds.

Here are my answers to your objections:

  • Zip feature now works with custom home folders (sorry, I missed that when I coded the original solution yesterday),

  • S3 and FTP issue solved (see above),

  • I had to change the configuration files, because there are some elements in the configuration array that are references to other parts of the array. The new setup respects AOP principles, by merging configuration while respecting key hierarchies. I think that if this added feature is merged, you could release it as FileGator 8.0, which would warn the users of the possible BC breaks caused by changing the location of the files. Honestly, taking the router configuration file out of the 'Controller' namespace is a good thing,

  • The failing test was too environment-specific and I changed it by making more general (added abstraction).

The new commit can be found here: commit #f5828ae2269 https://github.com/linuxforphp/filegator/commit/f5828ae2269d121ce6a830eaf5162f51f3daa555.

Please let me know if you need anything else. I think the FileGator users will be happy when they realize that they can now zip 40MB+ folders in seconds instead of minutes!

Cheers!

/cc @tobiasgruber1995

andrewscaya avatar Jul 16 '21 23:07 andrewscaya

Hi,

Thanks!

However, I am still with the idea of a custom zip adapter and ArchiverInterface. I've tried again and I think I have something that could actually work. I've used your approach with Symfony's Process and wrapped everything in a single file.

Please try it out and let me know your thoughts: https://github.com/filegator/process_archiver

Cheers, Milos

alcalbg avatar Jul 17 '21 12:07 alcalbg

Hey @alcalbg --

I just tried your solution on one of our production servers, and got this: "Fatal error: Uncaught League\Flysystem\NotSupportedException: Links are not supported, encountered link at [folder]...". I suspect that there will be many other problems like these if we go down this path. The file system is a complex piece of hardware/software in a modern computer, and it has been proven and hardened over the years. I would not replace it with software that was built only recently. This is only my two cents, of course...

I really appreciate you trying to make it work with the ArchiverInterface, but my solution fixes many other problems, like symlinks (both soft and hard), sparse files, and a lot of other stuff that one can find on a production server's file system. In fact, I thought we were still exploring the possibility of replacing ALL file system commands (move, copy, delete, etc.) with Symfony Process commands, thus the Service Factory that I have in my solution.

Please let me know where you want to go from here, because the ArchiverInterface path doesn't solve the problem for us.

Many thanks, my friend!

/cc @tobiasgruber1995

andrewscaya avatar Jul 17 '21 17:07 andrewscaya

Hi,

Filegator is built on top of flysystem which does not support symlinks:

https://flysystem.thephpleague.com/v1/docs/adapter/local/

If you need more speed, low-level access to files, and symlink support, then maybe this is not the best tool for the job. You could replace everything with raw Linux commands and make it work, but perhaps you can find much better software for your specific use case. If you ask me, I would look for something written in Rust or Go.

Cheers!

alcalbg avatar Jul 17 '21 19:07 alcalbg

Hi @alcalbg --

That's too bad. When a software solution reveals itself to be unfit for scalable and high-end solutions, we usually try to change the software, not the end-user's needs.

At our company, we build entire operating systems from scratch (yes, indeed!), and we can tell you that Rust and Go are NOT the best solutions for Web hosting (ex. very bad garbage collection, unstable modules, etc.).

No sweat, my friend! We'll fork the FileGator project, and start a new one.

Many thanks anyway, and good luck!

Cheers!

/cc @tobiasgruber1995

andrewscaya avatar Jul 17 '21 19:07 andrewscaya

Hi,

Filegator is built on top of flysystem which does not support symlinks:

https://flysystem.thephpleague.com/v1/docs/adapter/local/

If you need more speed, low-level access to files, and symlink support, then maybe this is not the best tool for the job. You could replace everything with raw Linux commands and make it work, but perhaps you can find much better software for your specific use case. If you ask me, I would look for something written in Rust or Go.

Cheers!

just a note... why throwing a exception instead of just skipping links as said in documentation?

If you are using filegator to browse and there is a link in the destination folder, isn't better to just skip the file/link not showing it instead of throwing an exception and disallow browsing this folder?

I noticed just modifying code:

  • vendor/league/flysystem/src/Adapter/Local.php (line 74)

public function __construct($root, $writeFlags = LOCK_EX, $linkHandling = self::SKIP_LINKS, array $permissions = [])

and that's enough!

Just can browse directories while links are just "hidden"...

I think it would be nice to be the default for filegator.

feanorknd avatar Dec 28 '22 20:12 feanorknd

@feanorknd you should be able to configure that directly in your configuration.php file:

        'Filegator\Services\Storage\Filesystem' => [
            'handler' => '\Filegator\Services\Storage\Filesystem',
            'config' => [
                'separator' => '/',
                'config' => [],
                'adapter' => function () {
                    return new \League\Flysystem\Adapter\Local(
                        __DIR__.'/repository',
                        LOCK_EX,
                        \League\Flysystem\Adapter\Local::SKIP_LINKS,
                    );
                },
            ],
        ],

alcalbg avatar Dec 29 '22 08:12 alcalbg