papermark icon indicating copy to clipboard operation
papermark copied to clipboard

Preserve Folder Hierarchy in Multi-file Uploads (#937) Completed

Open SubhPB opened this issue 1 year ago • 7 comments

In this PR, i enabled the user to upload the folder where folder hierarchy will be preserved (does not matter how deeply nested the folder is) and all the corresponding documents are correctly uploaded in the right folders.

After reading this code you will find that the code that i written is fully consistent with existing application's code and logic. Upon the drop event of folder new folders are correctly created at the backend along with their document.

I tested this code over multiple browsers and it is working correctly. I had faced some compatibility issues in firefox which are been addressed (with full explanation as comments in the code file).

After the completion of uploading files and folders, i made sure to use mutate to display the most recent data.

Video : Please check through attached video (#937 )

https://github.com/user-attachments/assets/cc8ec03c-e6ac-40c6-abf3-ba83493aa09f

More explanation and reasoning can be found in code comments.

@mfts Please feel free to ask questions if you have Byimaan

SubhPB avatar Oct 18 '24 23:10 SubhPB

@SubhPB is attempting to deploy a commit to the mftsio Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Oct 18 '24 23:10 vercel[bot]

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

github-actions[bot] avatar Oct 18 '24 23:10 github-actions[bot]

@SubhPB also please sign the CLA as indicated above :)

mfts avatar Oct 19 '24 11:10 mfts

@mfts Great to hear that you liked this implementation. I have totally understood your concerns. I would be happy to work on things that you have mentioned.

SubhPB avatar Oct 19 '24 15:10 SubhPB

In my recent commit (49f359a) #937 , i worked on the functionalities you told me. I solved all the issues plus refactored existing code.

https://github.com/user-attachments/assets/4e7b96ec-3579-4f1d-b507-6cff9d6d2a2d

Here is how i got to the solution and lessons i learned

**Section A: Finding a latent bug **

        // BUG: in the existing code
        const items  = event.dataTransfer.items;

        for(let i = 0; i < items.length; i++){
            let item: FileSystemEntry | null = 
                items[0]?.webkitGetAsEntry() 
                //@ts-ignore
                ?? items[0]?.getAsEntry() 
                ?? null;
            
            if (item){
              filesToBePassedToOnDrop.push(
                ...(await traverseFolder(item))
              )
            }
        }

You see there is items[0] but not items[i], thought i got its solution but this was not the real problem, i was curious to know why this bug was not causing expected error behavior. Imagine if user uploads two folders so according to bug the first folder should have uploaded twice right! (because items[0] ran two times). In actual only the first selected item was getting upload only once . There is a very interesting reason behind it which is covered in next section. Conclusion of this section: if you even set items[0] as items[i] you will still get the same outcome.

** Section B : Debugging process** During debugging i found that for loop was running only once for the first iteration even if items.lengthis shown as 2 or any larger number which made this problem more interesting. After hours of intense debugging i found that when we usefor` loop to iterate over items this what happens.

let say items.length = 4 means four items (files or folders are being dropped) and we expect for loop to run four times.

` when i = 0 (First selected item) we called traverseFolder which internally calls the API methods such as ".createReader()" , ".readEntries" etc. It turns out that after execution of those methods. items.length was getting set to zero.

when i = 1 and then item.length is found to be 0 that means this iteration wouldn't be run `

** Section C Finding appropriate solution**

So, using for loop was not getting me anywhere near the solution. then tried many other ways which even worked. But simplest and effective method was to use Promisse.all or Promise.allSettled where i used Promise.all in the code

Finally, everything is working very good and all this explanation would make more sense after going through the actual code.

@mfts If you have any question or other concern please let me know and i left comments in code to explain changes for the lines that i refactored

happy coding

  • Byimaan

SubhPB avatar Oct 20 '24 18:10 SubhPB

I have read the CLA Document and I hereby sign the CLA

SubhPB avatar Oct 20 '24 18:10 SubhPB

@mfts Please let me know if there is any modification or new feature implementation is required on top of current solution. Thanks

SubhPB avatar Oct 21 '24 19:10 SubhPB

Wow @SubhPB this is incredible work!

Thank you so much for your effort and detailed explanation! I was able to full understand the situation from your explanation.

I think the PR is ready for merging. One small thing I realized, that might be better solved in a separate PR though:

  1. when a user uploads a document to a dataroom, we create a document and a dataroomDocument. So the document will appear in the dataroom and in the /documents list. This works as intended.
  2. when a user uploads a folder with documents (and perhaps subfolders with documents) to a dataroom, then the folder structure is preserved only in the dataroom, however in the /documents list the documents do not have a corresponding folder.

Do you think we can handle case 2 so that the folder structure is preserved in /documents list even though the user uploads directly in the dataroom?

Let me know your thoughts and if you think it makes sense to add to this PR or open a new one after this is merged.

mfts avatar Oct 22 '24 07:10 mfts

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
papermark ✅ Ready (Inspect) Visit Preview 💬 Add feedback Oct 22, 2024 7:48am

vercel[bot] avatar Oct 22 '24 07:10 vercel[bot]

I'm feeling great that you find my code helpful, and i was able to make it readable, even though it was my first PR that i ever made.

Case 2 I understand that this is going a bit outside of the scope of our issue topic. It is totally up to you and whatever seems easy to you. I'm able to do this in any way either by separate PR or by sticking to this PR. First i would need to go deeper into the existing code of datarooms in order to come up with the best solution. Currently i'm not much aware of code written at datarooms side which unable me to take this decision about Case 2.

Question At datarooms user uploads a deeply nested folder named FolderA by which at /documents do you want to create FolderA holding all the documents in a flat way without any sub-folder. is this the outcome you need? or it is different

@mfts Either you want to create a new issue, or you want to make it done in this PR. i do not have any problem with any decision. But a few input-outcome examples will help me to come up the best solution.

Meanwhile, the time i'm waiting for your response i will go through the datarooms code. Thanks very much for making this functionality an open issue, i enjoyed it a lot to solve it.

  • Byimaan

SubhPB avatar Oct 22 '24 16:10 SubhPB

Great! i got the problem statement so you do not need to give any input-output example. i just have read the code and have seen the behavior what happens if user uploads something in datarooms and its corresponding outcomes at /documents we just want to preserve the folder hierarchy at both endpoints.

@mfts i will soon start to work on this feature and will wait for your decision about where to publish the solution

  • Byimaan

SubhPB avatar Oct 22 '24 19:10 SubhPB

From an overview i can see but not really sure that there is nothing much to do in upload-zone.tsx but it is more with dealing api-logic or backend (At backend how we are handling the datarooms upload ). But I will see and find the solution that works best for us once getting into the code.

SubhPB avatar Oct 22 '24 19:10 SubhPB

What our goals?

  1. We want if user uploads folder or file at /documents it will not impact on /datarooms (as it works now)

  2. But if user uploads folder (perhaps with subfolders) at /datarooms we want to preserve the folder hierarchy also at '/documents'

Questions

Q1 If user deletes a folder at /datarooms and after deletion, do we still need to keep to its symmetric folder at /documents?

  • (According to my project observation, I guess your answer would be 'NO' and you do not want keep folder at /documents if it is deleted at /datarooms. Excuse me if my guess is wrong)?

Q2 Same with other CRUD operations too, what are your plans to maintain consistency so that plan does not contradict with existing code?

Q3 We should get ready for multiple edge cases! - Imagine at /documents there is one folder with the name of D-1 consists of file-1 and file-2.Currently, It is independent from /datarooms. Now let's say user goes at /datarooms and try to create a folder with same name D-1 then we can easily create this as DataroomFolder but when we try to create Folder entity to make them symmetric then it would not accepted with this name because it already exists on Folder and already having documents in it. - If im able to find this serious edge case just by having the first glance of code. My intuition says that there must be even more edge cases should exist of which we are not aware of yet. - So the point is, we need to make a good solution strategy, so that an unknown edge case will not lead us to change our strategy, and that strategy should be adaptable and will not bring new problems in the future.

Here is my observation and how i see this problem after code analysis:

  • There is a small bug that i just found in our application during code analysis and thought i should let you know, and sorry that I'm bit lazy to raise an issue and here is how you can repeat . if you try creating new folder with a name which already exists then folder will not be created but it shows an empty error message to the user rather than displaying something like "A folder with this name already exists"
  • What is root cause that brings inconsistency? I think, in our schema DataroomDocument points to Document which can be linked to DocumentFolder or Folder, now our desired outcomes demand us to create something like a bridge in between them.

@mfts So this feature is not that simple to implement perfectly, requires more negotiation, clear outcome statements plus proper plan before doing any single line of code.

  • Byimaan

SubhPB avatar Oct 22 '24 23:10 SubhPB

@SubhPB these are excellent points.

I think it's best to create a new PR for this preserving folder structure in /documents when uploading a folder to /datarooms

Just to answer your questions:

If user deletes a folder at /datarooms and after deletion, do we still need to keep to its symmetric folder at /documents?

The folder in dataroom is a DataroomFolder, which can be safely deleted alongside DataroomDocument. We do not want to delete Document and Folder records in this case.

Same with other CRUD operations too

We don't care about other CRUD operations. Only the initial upload / create operation because that's when we create the instance for Document, Folder, DataroomDocument, and DataroomFolder. After that they can be managed independently.

if you try creating new folder with a name which already exists then folder will not be created but it shows an empty error message to the user rather than displaying something like "A folder with this name already exists"

That's a good observation and yes, there's currently a unique path required for a folder on a team


I will go ahead and merge this PR because it is better than the current implementation.

mfts avatar Oct 23 '24 03:10 mfts

Awarding SubhPB: 750 points 🕹️ Well done! Check out your new contribution on oss.gg/SubhPB

oss-gg[bot] avatar Oct 23 '24 03:10 oss-gg[bot]