office-text-extractor
Yet another library to extract text from MS Office and PDF files
### Description When pulling text from a spreadsheet, the current extractor does not return the sheet names in the text. It would be GREAT if there were an option to...
### Description An error occurred when reading a .doc file. ``` Error: text-extractor: could not find a method to handle application/x-cfb ``` I looked into the code and the type...
### Description When reading a PDF that contains Arabic text, the library cannot read it. It outputs garbled text such as ` ̜̺͙ͯ̀ ͳ̮ /` ### Library version 3.0.2 ### Node version...
### Description It was working perfectly before, but recently I'm seeing undefined responses. No error is logged in the console. I was testing the same file/url that was working before,...
### Description It would be a nice feature if we could override the built-in extractors. This could be achieved as simply as changing https://github.com/gamemaker1/office-text-extractor/blob/main/source/lib.ts#L59 from `this.methods.find` to `this.methods.findLast` or an alternative....
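
To illustrate the request above, here is a minimal sketch of why searching from the end of the method list would let a later-registered extractor take precedence over a built-in one. The `ExtractionMethod` interface, the method names, and the `findLastMethod` helper are hypothetical and exist only for this illustration; the library's actual types and lookup code may differ. The helper searches backwards by hand, which has the same effect as `Array.prototype.findLast` (ES2023) without requiring that library target.

```typescript
// Hypothetical shape of an extraction method (an assumption, not the library's real interface).
interface ExtractionMethod {
  name: string
  mimes: string[]
}

// Search from the end of the list, so the most recently added match wins.
// Same effect as Array.prototype.findLast, written out for older compile targets.
function findLastMethod(
  methods: ExtractionMethod[],
  mime: string,
): ExtractionMethod | undefined {
  for (let i = methods.length - 1; i >= 0; i--) {
    const method = methods[i]
    if (method && method.mimes.indexOf(mime) !== -1) return method
  }
  return undefined
}

// A built-in method registered first, then a user-supplied one for the same MIME type.
const methods: ExtractionMethod[] = [
  { name: 'builtin-pdf', mimes: ['application/pdf'] },
  { name: 'custom-pdf', mimes: ['application/pdf'] },
]

// `find` returns the first match, so the built-in method is always chosen.
const firstMatch = methods.find((m) => m.mimes.indexOf('application/pdf') !== -1)

// Searching from the end returns the user-supplied method instead.
const lastMatch = findLastMethod(methods, 'application/pdf')

console.log(firstMatch?.name, lastMatch?.name)
```

With a first-match lookup, a custom extractor for an already-supported MIME type is never reached; with a last-match lookup, it overrides the built-in one while the built-in still serves as a fallback for other types.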
### Description I'm getting the following error when trying to use the library with Next.js v14 with turbo mode enabled: ```shell https://nextjs.org/docs/messages/module-not-found ✓ Compiled / in 4.2s ⨯ ModuleBuildError: ./node_modules/keyv/src/index.js:22:15...
### Description `Warning: Ran out of space in font private use area.` `(node:439521) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or...