libreoffice-lambda-layer

error: Fontconfig error: Cannot load default config file

Open d3netxer opened this issue 5 years ago • 20 comments

I am getting this error logged in my lambda function:

Fontconfig error: Cannot load default config file

Many of my PDF-to-DOCX conversions look bad, although a very basic PDF with just a few words on it converts fine. Could this be caused by missing fonts?

[Image: docx_output_from_pdf_conversion]

d3netxer avatar Jan 17 '20 23:01 d3netxer

I'm having the exact same problem, did you manage to work this out?

viict avatar Feb 12 '20 03:02 viict

My guess is that this LibreOffice build is not installed or compiled as correctly as it should be (or at least as it was in the previous version), so it is having a tougher time converting PDFs to Word documents. I eagerly await a new version of libreoffice-lambda-layer, because I am not skilled enough to compile LibreOffice for Lambda myself.

d3netxer avatar Feb 12 '20 03:02 d3netxer

I'm willing to give a self-compiled version a shot this week. If it works, I'll update here.

viict avatar Feb 12 '20 04:02 viict

Has anyone found a solution for this problem as yet?

skale1 avatar Apr 03 '20 01:04 skale1

+1 - to @skale1 - anyone found a solution yet?

AdamAH avatar Apr 13 '20 22:04 AdamAH

I found that if you use fonts other than the LibreOffice default fonts, you need to add them to the layer manually. You can either create another layer or rebuild the current one. The response you get is normal, though: what you see in the log is just the raw stream of PDF bytes. Once you convert it to a base64-encoded string, you can get the real PDF content. References: https://github.com/shelfio/aws-lambda-libreoffice/issues/43#issuecomment-472721536 and https://ask.libreoffice.org/en/question/7127/calibri-font-not-rendering-correctly/
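A minimal sketch of the base64 step mentioned above, assuming the Lambda sits behind an API Gateway proxy integration with binary media types enabled. `toPdfResponse` is a hypothetical helper name; it is not part of the layer or of @shelf/aws-lambda-libreoffice.

```javascript
// Hypothetical helper: wraps raw PDF bytes in an API Gateway proxy response.
// Assumes the converted file has already been read into a Buffer.
function toPdfResponse(pdfBuffer) {
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/pdf" },
    isBase64Encoded: true,
    body: pdfBuffer.toString("base64"),
  };
}

// Usage: decoding the body gives back the original bytes.
const sample = Buffer.from("%PDF-1.4 example bytes");
const response = toPdfResponse(sample);
const roundTrip = Buffer.from(response.body, "base64");
console.log(roundTrip.equals(sample));
```

Logging the raw buffer instead of the encoded string is what produces the unreadable stream of bytes in the log.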

jimmdd avatar Apr 16 '20 17:04 jimmdd

@jimmdd How did you solve adding fonts?

I made a solution, but not sure if this is the most effective approach.

  • Create a Lambda layer with fonts. This is available at /opt/fonts at runtime.
  • Then, before the handler, add the following, which copies each font to /tmp/instdir...:
const fs = require("fs");

// Fonts layer contents (read-only) and the LibreOffice font folder under /tmp
let filesFolder = `/opt/fonts`;
let targetFolder = "/tmp/instdir/share/fonts/truetype";

// Copy the fonts once per container: skipped when the target folder already exists
if (!fs.existsSync(targetFolder)) {
  fs.mkdirSync(targetFolder, { recursive: true });
  fs.readdirSync(filesFolder).forEach((file) => {
    let fileToCopy = `${filesFolder}/${file}`;
    let fileOutput = `${targetFolder}/${file}`;
    fs.copyFileSync(fileToCopy, fileOutput);
  });
}
module.exports.handler = ...

I found that I also needed to change unpack.js in the @shelf/aws-lambda-libreoffice dependency, because unpack.js checks for 'tmp/instdir' and does not unpack if this directory exists (and it does, because it's created when I add the fonts to tmp/...). So around line 28 I added/modified:

let newOutput = '/tmp/instdir/program'; // added line
if (_fs.default.existsSync(newOutput) === true) { // changed: output -> newOutput
  return resolve(output);
}

This way I don't need to mess with the tar file and re-compress/upload it each time I add a font. I can just add a font.ttf to my fonts layer and run the code. But if there are better solutions out there, please share :-) Edit: Comments in code
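One possible variation on the copy step, as a sketch only: checking each file rather than the target folder means the fonts are still copied when something else (for example the LibreOffice unpack) has already created the folder. `copyFonts` is a hypothetical helper name, and this does not by itself remove the need for the unpack.js change described above.

```javascript
const fs = require("fs");
const path = require("path");

// Copies every regular file from srcDir into destDir, skipping files that are
// already present. Returns the names of the files it copied.
// The check is per file (not "does destDir exist"), so the copy still happens
// when destDir was created earlier by another step.
function copyFonts(srcDir, destDir) {
  fs.mkdirSync(destDir, { recursive: true });
  const copied = [];
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    if (!entry.isFile()) continue; // ignore sub-directories
    const target = path.join(destDir, entry.name);
    if (fs.existsSync(target)) continue; // already copied in a warm container
    fs.copyFileSync(path.join(srcDir, entry.name), target);
    copied.push(entry.name);
  }
  return copied;
}

// Usage in a Lambda (paths taken from the thread):
// copyFonts("/opt/fonts", "/tmp/instdir/share/fonts/truetype");
```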

simon-dk avatar May 01 '20 09:05 simon-dk

@Simon-SDK @jimmdd I've been trying your approach, although I switched the ordering so that it unpacks LibreOffice first and then adds the fonts, but for me it just crashes. If anyone can explain what the fontconfig path ought to be, what filename ought to be in there, and what that file should contain, that would be great. I am just guessing in the dark right now. Do I need to set an ENV variable? Thanks.

peterwake avatar May 25 '20 09:05 peterwake

Hi @peterwake, I use the canBeConvertedToPDF() and convertTo() functions (which is also the reason why I add the fonts first, as I can't add fonts after a conversion).

The code below is activated by an s3 event (created object) which downloads the file, converts it and uploads the converted file to the same bucket.


const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const fs = require("fs");
const {convertTo, canBeConvertedToPDF} = require("@shelf/aws-lambda-libreoffice");

/* Changes have been made to the @shelf dependency.
In ./node_modules/@shelf/aws-lambda-libreoffice/lib/unpack.js
- add a new line "let newOutput = '/tmp/instdir/program'"
- change the if at line 27 to "if (_fs.default.existsSync(newOutput)..." */

// Copies files from the fonts layer to LibreOffice's tmp folder
let filesFolder = `/opt/fonts`;
let targetFolder = "/tmp/instdir/share/fonts/truetype";

if (!fs.existsSync(targetFolder)) {
  fs.mkdirSync(targetFolder, { recursive: true });
  fs.readdirSync(filesFolder).forEach((file) => {
    let fileToCopy = `${filesFolder}/${file}`;
    let fileOutput = `${targetFolder}/${file}`;
    fs.copyFileSync(fileToCopy, fileOutput);
  });
}

module.exports.handler = async (event) => {
  let bucket = event.Records[0].s3.bucket.name;
  let key = decodeURIComponent(
    event.Records[0].s3.object.key.replace(/\+/g, " ")
  );

  //takes the key and matches the file name at the end, e.g. "123/321/filename.docx" becomes "filename.docx"
  let tempPath = key.match(/(\w+\.\w+)$/)[1];

  //saves the file to tempPath
  await s3download(bucket, key, tempPath);
  
  //Checks if file can be converted
  if (!canBeConvertedToPDF(tempPath)) return 'File cannot be converted';
  
  //Create a target key.
  let targetKey = "myOutputFile.pdf";

  //returns the path of the converted file in /tmp
  let convertedFile = await convertTo(tempPath, "pdf");

  //Uploads the converted file again.
  let upload = await s3upload(bucket, targetKey, convertedFile);

  return upload;
};


async function s3download(bucket, key, tempPath) {
  let params = { Bucket: bucket, Key: key };

  return new Promise((resolve, reject) =>
    s3
      .getObject(params)
      .createReadStream()
      .on("error", reject) // S3 read errors are not forwarded by pipe()
      .pipe(fs.createWriteStream(`/tmp/${tempPath}`))
      .on("error", reject)
      .on("finish", resolve)
  );
}

async function s3upload(bucket, key, filePath) {
  let readStream = fs.createReadStream(filePath);
  let params = { Bucket: bucket, Key: `${key}`, Body: readStream };
  let res = await s3.upload(params).promise();
  return res.Location;
}

Hope that helps :) Edit: forgot to insert it as code.
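The file-name regex in the handler above only keeps the trailing run of word characters, so a key such as "my-report.docx" is shortened to "report.docx". A possible alternative, sketched under the assumption that the key arrives URL-encoded as in S3 event records; `keyToFileName` is a hypothetical helper name:

```javascript
const path = require("path");

// Hypothetical helper: derives a local file name from an S3 event object key.
// S3 event keys are URL-encoded, with spaces sent as "+".
function keyToFileName(key) {
  const decoded = decodeURIComponent(key.replace(/\+/g, " "));
  // basename keeps hyphens, spaces and dots that a \w+ match would drop
  return path.posix.basename(decoded);
}

console.log(keyToFileName("123/321/filename.docx"));
console.log(keyToFileName("reports/2020/my+annual-report.docx"));
```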

simon-dk avatar May 25 '20 11:05 simon-dk

@Simon-SDK thank you. I'm trying to do something very similar but with presigned AWS urls for the upload and download (which I think works, it's the libreoffice conversion that's broken). I'll give this a try later. Could I ask:

  • which AWS region you are using?
  • do you have any ENV variables set?

Best regards Peter

peterwake avatar May 26 '20 07:05 peterwake

@peterwake Sounds like it should work as long as the file to be converted is in tmp. I'm in eu-central-1 and have no env variables set.

simon-dk avatar May 26 '20 10:05 simon-dk

Hi @Simon-SDK, thanks so much for your code and advice.

For anyone else wrestling with this issue..

There are two ENV variables you can create to try to fix the problem, FONTCONFIG_FILE and FONTCONFIG_PATH. FONTCONFIG_FILE is the one to go for (I think).

I ended up creating a file fonts.conf in a fonts folder inside my Lambda package. What I hadn't realised is that __dirname in the running function is /var/task, so the ENV variable FONTCONFIG_FILE needs to be /var/task/fonts/fonts.conf. You can read more on this here: https://www.freedesktop.org/software/fontconfig/fontconfig-user.html

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <dir>/tmp/instdir/share/fonts/truetype</dir>
  <cachedir>/tmp/fonts-cache/</cachedir>
  <config></config>
</fontconfig>

Using this plus Simon's code above, I finally got something to work. Once I have tidied up I may post a longer code snippet.
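As an alternative to shipping fonts.conf in the package, the same file could be generated at cold start. This is only a sketch: `buildFontsConf` and `installFontsConf` are hypothetical names, and whether LibreOffice picks up fonts directly from /opt/fonts this way has not been confirmed in this thread.

```javascript
const fs = require("fs");
const path = require("path");

// Builds a minimal fonts.conf listing the given font directories.
// Paths are inserted as-is, so they must not contain XML special characters.
function buildFontsConf(fontDirs, cacheDir) {
  const dirs = fontDirs.map((d) => `  <dir>${d}</dir>`).join("\n");
  return [
    '<?xml version="1.0"?>',
    '<!DOCTYPE fontconfig SYSTEM "fonts.dtd">',
    "<fontconfig>",
    dirs,
    `  <cachedir>${cacheDir}</cachedir>`,
    "</fontconfig>",
    "",
  ].join("\n");
}

// Writes the config and points fontconfig at it via FONTCONFIG_FILE.
// Child processes started afterwards inherit the variable from process.env.
function installFontsConf(confPath, fontDirs, cacheDir) {
  fs.mkdirSync(path.dirname(confPath), { recursive: true });
  fs.writeFileSync(confPath, buildFontsConf(fontDirs, cacheDir));
  process.env.FONTCONFIG_FILE = confPath;
  return confPath;
}

// Usage (paths from the thread; /opt/fonts assumes a separate fonts layer):
// installFontsConf("/tmp/fonts/fonts.conf",
//   ["/tmp/instdir/share/fonts/truetype", "/opt/fonts"], "/tmp/fonts-cache/");
```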

peterwake avatar May 27 '20 15:05 peterwake

That’s good to hear! Good find regarding the env variable. Does that mean LibreOffice just gets the fonts from the /opt/fonts layer without needing to copy them to tmp? That would make it so much simpler and faster, and would make it possible to load more fonts before running into space limitations.

simon-dk avatar May 27 '20 16:05 simon-dk

Erm, well I just copied your code :), but I guess if you didn't want the bundled libreoffice fonts you could! I am feeling exhausted by the whole process but if I get a chance I will have a try.

peterwake avatar May 27 '20 17:05 peterwake

Does anyone know if I need to worry about this error? "Fontconfig error: Cannot load default config file". I'm able to generate PDFs; however, it produces this error every time I run the Lambda. It doesn't seem to be an issue. I'm assuming it's using some default font. Is there a way to suppress the error?

garyholiday13 avatar Jun 01 '20 15:06 garyholiday13

@garyholiday13 It just means your LibreOffice cannot load any extra font files. It should be okay.

I have created a repo that shows how to use it here: https://github.com/viezel/lambda-office-to-pdf-processor

viezel avatar Dec 16 '20 14:12 viezel

Just add 2 env vars:

  • FONTCONFIG_FILE: /tmp/instdir/share/fonts/truetype/fc_local.conf

  • FONTCONFIG_PATH: /tmp/instdir/share/fonts/truetype
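Since both paths above live under /tmp, they only exist after something has written them there. A small diagnostic along these lines could be logged at cold start to confirm the variables point at real paths. `checkFontconfigEnv` is a hypothetical helper, not part of any library mentioned here.

```javascript
const fs = require("fs");

// Hypothetical diagnostic: returns a list of human-readable problems with the
// fontconfig environment variables, or an empty list if both look usable.
// env and exists are injectable so the check can be exercised without a Lambda.
function checkFontconfigEnv(env = process.env, exists = fs.existsSync) {
  const problems = [];
  for (const name of ["FONTCONFIG_FILE", "FONTCONFIG_PATH"]) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is not set`);
    } else if (!exists(value)) {
      problems.push(`${name} points to a missing path: ${value}`);
    }
  }
  return problems;
}

// Usage: log the result once, before the first conversion.
console.log(checkFontconfigEnv());
```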

QS03 avatar Apr 07 '21 19:04 QS03

@d3netxer have you managed to get this working in the end?

I spent the whole day trying to figure out why I can convert a PDF to a DOC locally with LO 6.4.0.1 totally fine, even with FONTCONFIG_FILE and FONTCONFIG_PATH pointed at a directory containing only the Arial font, but in AWS it doesn't work: the result is similar to the one posted by OP at the top of this issue.

This is the command I use: soffice --headless --invisible --nodefault --view --nolockcheck --nologo --norestore --nofirststartwizard --infilter=writer_pdf_import --convert-to doc <file>
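For reference, a sketch of how that command could be run from Node while capturing stderr, which is where fontconfig messages would show up. The flags are the ones quoted above plus --outdir; `buildConvertArgs`, `convertPdfToDoc` and the `sofficePath` parameter are assumptions for illustration, so pass wherever the binary lives in your setup.

```javascript
const { execFile } = require("child_process");

// Builds the argument list for the soffice call quoted above.
// --outdir is added here so the output location is explicit.
function buildConvertArgs(inputPath, outDir) {
  return [
    "--headless", "--invisible", "--nodefault", "--view", "--nolockcheck",
    "--nologo", "--norestore", "--nofirststartwizard",
    "--infilter=writer_pdf_import",
    "--convert-to", "doc",
    "--outdir", outDir,
    inputPath,
  ];
}

// Runs soffice and resolves with its stdout and stderr so both can be logged.
// execFile takes the arguments as an array, so no shell quoting is involved.
function convertPdfToDoc(sofficePath, inputPath, outDir) {
  return new Promise((resolve, reject) => {
    const args = buildConvertArgs(inputPath, outDir);
    execFile(sofficePath, args, (err, stdout, stderr) => {
      if (err) return reject(err);
      resolve({ stdout, stderr });
    });
  });
}
```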

There is no error in the logs:

START RequestId: b8c5f311-4efe-4c38-9df7-016d47e36206 Version: $LATEST
--
2022-03-08T17:31:41.142Z	b8c5f311-4efe-4c38-9df7-016d47e36206	INFO	convert /tmp/93585b4f-1cf6-41f9-b684-6bae576d167c -> /tmp/93585b4f-1cf6-41f9-b684-6bae576d167c.doc using filter : MS Word 97
---
END RequestId: b8c5f311-4efe-4c38-9df7-016d47e36206

I am using Serverless Framework and load the fonts via:

package:
  include:
    - fonts/**

And these env variables:

FONTCONFIG_FILE: /var/task/fonts/fonts.conf
FONTCONFIG_PATH: /var/task/fonts

Can someone please point me to something else I could try? Thank you in advance!

iuliancmarcu avatar Mar 08 '22 17:03 iuliancmarcu

In the past, after various tests and looking online, I concluded that LibreOffice in headless mode is not capable of converting PDFs to DOCs. If you are able to do it locally, then you have gotten further than me.

I have had some success converting PDFs to DOCs using the pdf2docx python library. My plan is to make a lambda that uses pdf2docx, and another lambda that uses libreoffice-lambda-layer and route the request to the appropriate lambda using API Gateway. But I haven't gotten around to implementing this yet.
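The routing decision in that plan could look roughly like the sketch below. It is purely illustrative: `pickConverter` and the two target names are placeholders, and the actual routing would be configured in API Gateway rather than in code like this.

```javascript
const path = require("path");

// Hypothetical routing rule: PDF input that should become a .docx goes to the
// pdf2docx function, everything else to the LibreOffice one.
// The two returned names are placeholders, not real function names.
function pickConverter(fileName, targetFormat) {
  const ext = path.extname(fileName).toLowerCase();
  const target = targetFormat.toLowerCase();
  if (ext === ".pdf" && target === "docx") {
    return "pdf2docx-converter";
  }
  return "libreoffice-converter";
}

console.log(pickConverter("scan.PDF", "docx"));
console.log(pickConverter("report.docx", "pdf"));
```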

d3netxer avatar Mar 08 '22 18:03 d3netxer

I was getting the same issue. Resolved via this link: https://medium.com/creditorwatch/aws-lambda-fonts-libfontconfig-5e837281a4ce

birishabgupta avatar Jun 01 '22 05:06 birishabgupta