komga
Parse PDF Title, Author, Creation date and tags for metadata
Describe your suggested feature
PDFs contain metadata such as Title, Author, CreationDate and tags. It could be useful to parse these when analyzing the file and populate the book information with them.
Other details
I see that the code uses org.apache.pdfbox.pdmodel.PDDocument to parse PDF documents. Its getDocumentInformation() method can retrieve this information.
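To illustrate, here is a minimal sketch of reading these fields with PDFBox. It assumes PDFBox 2.x, where PDDocument.load(File) is available (3.x loads through Loader.loadPDF instead). The PdfMetadata class and the readPdfMetadata and splitList helpers are hypothetical names made up for this example, not existing Komga code. Mapping "tags" to the PDF Keywords field, and splitting it on commas and semicolons, is also an assumption.

```kotlin
import org.apache.pdfbox.pdmodel.PDDocument
import java.io.File
import java.time.Instant

// Hypothetical holder for the parsed values; not part of Komga or PDFBox.
data class PdfMetadata(
    val title: String?,
    val authors: List<String>,
    val tags: List<String>,
    val creationDate: Instant?,
)

// Assumption: multi-valued fields are separated by ',' or ';'.
// Blank entries are dropped, so a missing or empty field yields an empty list.
fun splitList(raw: String?): List<String> =
    raw.orEmpty().split(',', ';').map { it.trim() }.filter { it.isNotEmpty() }

fun readPdfMetadata(file: File): PdfMetadata =
    // PDFBox 2.x API; `use` closes the document when the block exits.
    PDDocument.load(file).use { doc ->
        val info = doc.documentInformation // getDocumentInformation()
        PdfMetadata(
            // Every getter may return null when the field is absent.
            title = info.title?.trim()?.takeIf { it.isNotEmpty() },
            authors = splitList(info.author),
            tags = splitList(info.keywords),
            creationDate = info.creationDate?.toInstant(),
        )
    }

fun main(args: Array<String>) {
    // Usage: pass the path to a PDF file as the first argument.
    println(readPdfMetadata(File(args[0])))
}
```

Since any of these fields can be missing or empty, the sketch keeps them nullable (or as empty lists) rather than substituting placeholder values.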
Acknowledgements
- [X] I have searched the existing issues and this is a new ticket, NOT a duplicate or related to another open issue.
- [X] I have written a short but informative title.
- [X] I have updated the app to the latest version.
- [X] I will fill out all of the requested information in this form.
 
Duplicate of #277
The biggest hurdle is that most PDFs contain really crappy metadata.
Sorry for the duplicate, I didn't go far enough in the list of issues.
My understanding is that the crappy metadata comes from random documents (user manuals, specs). I think Komga is intended for reading books, and generally the authors of those books try to keep this information clean so that their signature stays in it (at least for Title, Author and ModDate).
I think it would be good to parse this information, and if it is bad we can always edit it inside Komga.