gitbook-convert
35-character limit on image names creates duplicate file names and overwrites other images
Hi all,
While importing images from a document and inspecting the raw XML, I noticed that the docx converter names each image after its description (alt text) rather than its file name. Many images, especially ones taken from the internet whose description is a long file path, are therefore truncated to the 35-character limit and end up with identical names. Each duplicate then overwrites an image that was already imported.
// imgExporter exports inline images to the assets folder and apply src attribute to HTML correctly
var imgExporter = mammoth.images.inline(function(element) {
    return element.read().then(function(imageBuffer) {
        // Set image file name
        var imgFilename;
        // Use altText for image name
        if (!!element.altText) {
            imgFilename = element.altText;
            // Remove extension in altText if is equal to contentType
            var contentType = 'image/'+path.extname(imgFilename).slice(1);
            if (element.contentType === contentType) {
                imgFilename = imgFilename.split('.').slice(0, -1).join('.');
            }
            // Shorten if too long
            imgFilename = imgFilename.slice(0, 35).trim();
        }
        // Normalize filename
        imgFilename = normall.filename(imgFilename);
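To make the collision concrete, here is a minimal sketch of just the shortening step applied to two made-up alt texts. The `shorten` helper and the example URLs are hypothetical and not part of gitbook-convert; only the `slice(0, 35).trim()` call comes from the code above.

```javascript
// Hypothetical helper that mirrors only the shortening step quoted above.
function shorten(altText) {
    return altText.slice(0, 35).trim();
}

// Made-up alt texts: they differ only after the 35th character.
var a = 'https://example.com/assets/uploads/2016/diagram-one.png';
var b = 'https://example.com/assets/uploads/2016/diagram-two.png';

// Both alt texts are cut to the same string, so both images get the same name.
console.log(shorten(a) === shorten(b)); // true
```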
The shortening step is what causes the overwrites: the truncated name is used without checking whether an image with that name has already been written. The default name selection seems to account for this by appending an incrementing counter to the name.
// Or use default name -> img-NN.ext
if (!imgFilename) {
    imgFilename = 'img-'+imgCounter;
    imgCounter++;
}
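If the limit has to stay, one possible approach is to apply the same incrementing idea to alt-text names by tracking which names have already been used. This is only a sketch under that assumption: `uniqueName` and `usedNames` are hypothetical names that do not exist in gitbook-convert.

```javascript
// Hypothetical helper: returns `name` if it is unused, otherwise the first
// free variant of the form `name-1`, `name-2`, ...
// `usedNames` is a plain object acting as the set of names already handed out.
function uniqueName(name, usedNames) {
    var candidate = name;
    var suffix = 1;
    while (Object.prototype.hasOwnProperty.call(usedNames, candidate)) {
        candidate = name + '-' + suffix;
        suffix++;
    }
    usedNames[candidate] = true;
    return candidate;
}

var usedNames = {};
var first = uniqueName('long-image-description', usedNames);  // 'long-image-description'
var second = uniqueName('long-image-description', usedNames); // 'long-image-description-1'
```

In the converter, something like this could presumably run on `imgFilename` after it has been shortened and normalized, so that a second image with the same truncated name gets a suffix instead of replacing the first.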
Is the 35-character limit imposed by mammoth.js and reused here for consistency, or is it something that could be removed?