markitdown icon indicating copy to clipboard operation
markitdown copied to clipboard

Add AsyncMarkItDown as a wrapper

Open 0xRaduan opened this issue 1 year ago • 9 comments

My take on how this code should support async, given all underlying libraries are not supporting async.

For more details/context, see: https://github.com/microsoft/markitdown/issues/13#issuecomment-2543834157

0xRaduan avatar Dec 15 '24 11:12 0xRaduan

@microsoft-github-policy-service agree

0xRaduan avatar Dec 15 '24 11:12 0xRaduan

@0xRaduan -- thanks for the PR.

@afourney @jackgerrits @0xRaduan, I love this PR, but I am worried about the code duplication because of asynchronous tests. What if we stuck with sync implementation, but added an FAQ showing how a user could create an asynchronous wrapper similar to your approach.

This was users know how to create an asynchronous wrapper, but we avoid the code/test duplication? Please lmk your thoughts.

gagb avatar Dec 16 '24 22:12 gagb

Hey @gagb - good feedback, I agree.

In general, this async code is just a wrapper over sync code [1], so technically you don't need to have a full suite of async tests, as all of the functionality should be tested in normal tests.

I would suggest keeping just one async test to verify that the wrapper works, and remove everything else. Initially I've migrated all tests to just verify that it works identically for my own's sake.

0xRaduan avatar Dec 17 '24 06:12 0xRaduan

Hey @gagb - good feedback, I agree.

In general, this async code is just a wrapper over sync code [1], so technically you don't need to have a full suite of async tests, as all of the functionality should be tested in normal tests.

I would suggest keeping just one async test to verify that the wrapper works, and remove everything else. Initially I've migrated all tests to just verify that it works identically for my own's sake.

I like your framing. Would love @jackgerrits feedback on the strategy.

gagb avatar Dec 17 '24 19:12 gagb

@0xRaduan can you fix the precommit and test errors

gagb avatar Dec 17 '24 21:12 gagb

@gagb - fyi, updated the code, fixed linter, removed all tests and just kept one async test to make sure wrapper works in async environment.

0xRaduan avatar Dec 18 '24 09:12 0xRaduan

Can we keep this moving (sorry to sound like a PM hahahaha)

markthepixel avatar Jan 14 '25 05:01 markthepixel

@markthepixel - let me take a look tomorrow, sorry for the wait!

(actually needs this myself, so will take a look asap)

0xRaduan avatar Jan 14 '25 20:01 0xRaduan

I gatta be honest I thought this was my issue but it turns out when you feed a string in instead of a path/link it just gives a vague error 😆

markthepixel avatar Jan 14 '25 20:01 markthepixel