file-type
Support for iWork files (.pages, .numbers, .key)
Description
Common file types from the macOS-native Pages, Numbers and Keynote apps.
Like Microsoft Office documents, these are zip archives. They can be detected by the presence of the Index/Document.iwa file, which contains Snappy-compressed protobuf messages. Keynote files can additionally be recognized by the Index/MasterSlide*.iwa and Index/Slide*.iwa files they contain, whereas telling Pages and Numbers files apart probably requires decompressing and parsing Index/Document.iwa. I currently have no time for this; for anyone else who wants to tackle it, here is some more information:
- I found this tool quite helpful: https://sheetjs.com/tools/iwa-inspector/. Its source code for decoding .iwa files can be found here.
- Also, this description of IWA files (especially "Determining File Type" at the bottom), as well as this one.
- Lastly, the Apache Tika toolkit also uses this method to detect iWork files (source); however, it has not implemented proper support for Numbers and Pages files yet either: https://issues.apache.org/jira/browse/TIKA-4464
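
To make the idea concrete, here is a rough sketch of the entry-name check described above. It assumes the zip entry names have already been read from the archive; `guessIWorkType` and the `IWorkGuess` type are hypothetical names for illustration and not part of file-type's API. It only separates Keynote from "Pages or Numbers" and does not parse Index/Document.iwa.

```typescript
// Hypothetical helper for illustration; not part of file-type's API.
type IWorkGuess = 'keynote' | 'pages-or-numbers' | undefined;

function guessIWorkType(entryNames: readonly string[]): IWorkGuess {
  // Per the description above, iWork archives contain Index/Document.iwa.
  const hasDocument = entryNames.some(name => name === 'Index/Document.iwa');
  if (!hasDocument) {
    return undefined; // Not recognized as an iWork file by this check.
  }

  // Keynote archives also contain Index/MasterSlide*.iwa or Index/Slide*.iwa.
  const slidePattern = /^Index\/(?:Master)?Slide.*\.iwa$/;
  if (entryNames.some(name => slidePattern.test(name))) {
    return 'keynote';
  }

  // Telling Pages and Numbers apart would need Index/Document.iwa to be
  // decompressed and parsed, which this sketch does not attempt.
  return 'pages-or-numbers';
}
```

For example, `guessIWorkType(['Index/Document.iwa', 'Index/Slide-1.iwa'])` would yield `'keynote'`, while an archive with only Index/Document.iwa stays in the undetermined `'pages-or-numbers'` bucket.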
Existing Issue Check
- [x] I have searched the existing issues and could not find any related to my problem.
File-Type Scope Acknowledgment
- [x] I understand that file-type detects binary file types and not text or other formats.