
PDF support

Open oliver021 opened this issue 3 years ago • 5 comments

Hello friends, I was wondering whether it would be a good idea to include metadata extractors for other types of files, such as PDFs, Excel spreadsheets, and Word documents. These files contain a lot of metadata as well, and I have not seen any metadata extraction library that covers them. It would be a good fit, since the name of this library is not limited to the metadata of multimedia files.

oliver021 avatar Mar 17 '21 18:03 oliver021

The library is open to the addition of support for other kinds of data, with the following guidelines:

  • No dependencies on external libraries (we have only one exception to this for XMP processing)
  • Metadata must be representable using the directory/tag structure we use throughout

Support for PDF is being tracked in the sibling Java library in https://github.com/drewnoakes/metadata-extractor/issues/327. I have no issue with supporting other document types as you suggest.

drewnoakes avatar Mar 18 '21 00:03 drewnoakes

Okay, I'll make a pull request now. Thanks for responding!

oliver021 avatar Mar 18 '21 03:03 oliver021

@oliver021 fantastic, thanks.

drewnoakes avatar Mar 18 '21 03:03 drewnoakes

Hello, is there any update on this? I can't find the mentioned pull request.

VincentMarnier avatar Jun 22 '21 07:06 VincentMarnier

Hello, I have not been able to do anything about it. I had a drastic change of plans in my schedule, and I find myself with very little time.

oliver021 avatar Jun 22 '21 18:06 oliver021