PDF password removal

Open MihaiSprinceana opened this issue 1 year ago • 9 comments

I'm using the code below to remove a password from a PDF file, but the integration with convertapi can do more. Prerequisite: a secret key from convertapi.

/**
 * If the file is password protected, then remove the password
 */
function createFileInsideFolderUsingRuleSettings(rule, attachment, folder) {
  if (rule.password) {
    var obj = {
      secretkey: "testKey", // Your secret key.
      pass: rule.password, // Password of the protected PDF file.
      blob: attachment.getAs(MimeType.PDF), // PDF blob.
    };
    var blob = decrypting(obj);
    return folder.createFile(blob);
  } else {
    return folder.createFile(attachment);
  }
}

/**
 * Decrypts the PDF file
 */
function decrypting(obj) {
  var url = "https://v2.convertapi.com/convert/pdf/to/decrypt?Secret=" + obj.secretkey;
  var options = {
    method: "post",
    payload: {File: obj.blob, Password: obj.pass},
  }
  var res = UrlFetchApp.fetch(url, options);
  res = JSON.parse(res.getContentText());
  var blob = res.Files.map(function(e) {return Utilities.newBlob(Utilities.base64Decode(e.FileData), MimeType.PDF, e.FileName)});
  return blob[0];
}

Note: this is based on the Gmail2Drive project and requires an account on the convertapi website.
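The request building and response handling in decrypting above could be hardened along these lines. This is only a sketch: buildDecryptRequest, firstFileFromResponse and decryptViaConvertApi are hypothetical helper names, and the response shape (Files[].FileName, Files[].FileData) is assumed to be exactly what the code above already relies on. muteHttpExceptions is a standard UrlFetchApp option that returns error responses instead of throwing.

```javascript
// Hypothetical helpers; not part of ConvertAPI or Gmail2Drive.

// Builds the URL and UrlFetchApp options for the decrypt call.
function buildDecryptRequest(secret, blob, password) {
  return {
    url: "https://v2.convertapi.com/convert/pdf/to/decrypt?Secret=" + encodeURIComponent(secret),
    options: {
      method: "post",
      payload: { File: blob, Password: password },
      muteHttpExceptions: true, // inspect the status code ourselves instead of letting fetch throw
    },
  };
}

// Validates the response and returns the first file entry.
function firstFileFromResponse(statusCode, bodyText) {
  if (statusCode !== 200) {
    throw new Error("Decrypt request failed with HTTP " + statusCode + ": " + bodyText);
  }
  var res = JSON.parse(bodyText);
  if (!res.Files || res.Files.length === 0) {
    throw new Error("Decrypt response contained no files");
  }
  return res.Files[0]; // assumed shape: { FileName, FileData (base64) }
}

// Apps Script only: wires the helpers to UrlFetchApp and Utilities.
function decryptViaConvertApi(obj) {
  var req = buildDecryptRequest(obj.secretkey, obj.blob, obj.pass);
  var resp = UrlFetchApp.fetch(req.url, req.options);
  var file = firstFileFromResponse(resp.getResponseCode(), resp.getContentText());
  return Utilities.newBlob(Utilities.base64Decode(file.FileData), MimeType.PDF, file.FileName);
}
```

Keeping the two pure helpers separate from the UrlFetchApp call makes a wrong password or an expired secret show up as a readable error rather than a failure inside JSON parsing or the map call.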

MihaiSprinceana avatar May 14 '24 17:05 MihaiSprinceana

@MihaiSprinceana that's an interesting use case which should be possible with the next feature release. I'm currently working on introducing custom actions, which will make it possible to trigger any automation one can imagine.

ahochsteger avatar May 15 '24 05:05 ahochsteger

@MihaiSprinceana I've just released version 2.11.0 which adds support for custom actions. Have a look at this example to see how it can be used. It would be great to get some feedback, if you are able to migrate to Gmail Processor with this new feature.

ahochsteger avatar May 25 '24 14:05 ahochsteger

I was forced to work with encrypted pdfs as well. I managed to use pdf-lib in apps script to decrypt such pdfs and was looking for a way to 'trigger' it directly from gmail-processor. Custom actions sound promising.

For reference, to use pdf-lib you need to take the @cantoo/pdf-lib fork because it allows decryption, get the minified file and copy-paste it into another .gs file. Then remove the exports at the end of it.

Then

const file = DriveApp.getFileById(...);
const fileName = file.getName();
const newName = getNewName(fileName);

const fileBase64 = Utilities.base64Encode(file.getBlob().getBytes());

const pdfDoc = await PDFDocument.load(fileBase64, { password: getPassword(fileName), ignoreEncryption: true });
const unencrypted = await pdfDoc.save();

const bb = Utilities.newBlob(unencrypted, 'application/pdf', newName);
DriveApp.createFile(bb);

@ahochsteger Would be great to get some docs on what ctx has and how to get the ref to the attachment being processed.

MikeDabrowski avatar Jul 12 '24 12:07 MikeDabrowski

@MikeDabrowski you can have a look at the types AttachmentInfo and AttachmentContext in Context.ts that define the context for attachment processing. Use ctx.attachment.object to get the currently processed attachment that is of type GmailAttachment. Example:

(ctx, args) =>{
  ...
  const fileBase64 = Utilities.base64Encode(ctx.attachment.object.getBytes());
  ...
}

If you need more help or have a working example let me know, maybe we can even integrate it into the core functionality.

ahochsteger avatar Jul 12 '24 16:07 ahochsteger

Great! I did not find the location where this file is stored. Am I just blind and it is in fact provided in the ctx somewhere?

If it is not provided, I can use the location string from the config - is there any util that would transform it into a folder?

In the end I need a folder object to be able to call folder.createFile

MikeDabrowski avatar Jul 14 '24 10:07 MikeDabrowski

@MikeDabrowski you may use ctx.proc.gdriveAdapter to get access to GDriveAdapter which provides the function createFile(location, fileData, conflictStrategy). Have a look at the source code of the linked file for the details of the function and its parameters. The fileData object encapsulates the relevant file content like the blob data you want to create. I hope that helps - if not don't hesitate to ask. I'll reopen this issue, since there's now more activity. Regards, Andreas

ahochsteger avatar Jul 14 '24 15:07 ahochsteger

Thank you. Getting closer :) Is there a way to handle async in custom actions?

Btw, I could create a PR with a pwd removal fn, but it requires the pdf-lib fork (quite a lengthy script) to be copied over OR converted to a GAS library. I tried to peek at their code, hoping to extract only the pwd removal bit, but that would take much more time. Unless npm install would work in this project and the lib would be included?

MikeDabrowski avatar Jul 14 '24 19:07 MikeDabrowski

const customActions = [{ name: 'decryptPdf', action: decryptPdfAction }];

...

function decryptPdfAction(ctx, args) {
  getDecryptedPdf(ctx.attachment.object, args.pwd)
    .catch(e => {
      console.error('Error while decrypting pdf', e);
      throw e; // rethrow so the .then() below does not run with an undefined blob
    })
    .then(decryptedPdf => {
      console.info("Successfully decrypted pdf", decryptedPdf);
      console.info(`Attempting to save "${decryptedPdf.getName()}" to location "${locationFormatForDecryptedPdfs}"`);

      const {file, actionMeta} = ctx.proc.gdriveAdapter.createFileFromAction(
          ctx,
          locationFormatForDecryptedPdfs,
          decryptedPdf,
          'skip',
          '',
          'decrypted PDF',
          'custom',
          'custom.decryptPdf',
        );

      console.info(`Successfully stored decrypted file at: "${actionMeta['custom.stored.location'].value}"`, file)
    })
    .catch(e => {
      console.error('Error while saving decrypted pdf', e);
      throw e;
    });
}

async function getDecryptedPdf(processedFileObject, password) {
  const bytes = processedFileObject.getBytes();
  const fileBase64 = Utilities.base64Encode(bytes);
  const pdfDoc = await PDFDocument.load(fileBase64, { password, ignoreEncryption: true});
  const unencrypted = await pdfDoc.save();
  return Utilities.newBlob(unencrypted, 'application/pdf', 'temp_decrypted_name.pdf');
}

Here is how you might decrypt PDFs.

I had to use createFileFromAction because it can parse the location expression.
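For completeness, this is roughly how the action above might be referenced from a config. Treat it as a sketch under assumptions: the nesting (threads, messages, attachments, actions), the attachment match on name, the Gmail query, and the GmailProcessorLib.run(config, "dry-run", customActions) call are my reading of the Gmail Processor examples and may need adjusting. The only parts taken from the code above are the 'custom.decryptPdf' action name and the pwd argument. The password value is a placeholder.

```javascript
// Placeholder standing in for the decryptPdfAction defined above.
function decryptPdfAction(ctx, args) {}

const customActions = [{ name: "decryptPdf", action: decryptPdfAction }];

// Assumed config layout; the query and the regex are hypothetical examples.
const config = {
  threads: [
    {
      match: { query: "has:attachment filename:pdf" },
      messages: [
        {
          attachments: [
            {
              match: { name: "\\.pdf$" },
              actions: [
                // Custom actions are referenced with the "custom." prefix.
                { name: "custom.decryptPdf", args: { pwd: "example-password" } },
              ],
            },
          ],
        },
      ],
    },
  ],
};

// Only callable inside Apps Script with the library added as GmailProcessorLib.
function run() {
  return GmailProcessorLib.run(config, "dry-run", customActions);
}
```

Starting in dry-run mode lets you check that the action is picked up for the right attachments before anything is written to Drive.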

MikeDabrowski avatar Jul 15 '24 09:07 MikeDabrowski

@MikeDabrowski nice to see that you're making progress :-).

Concerning async in custom actions: unfortunately, Google Apps Script is fully synchronous and does not support asynchronous processing, which is why custom actions cannot be asynchronous either. There are some hacky workarounds available like this, but I fear they lead to more complexity and instability down the road.
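To illustrate why a promise started inside a synchronous action is a problem, here is a plain JavaScript sketch (not Gmail Processor code; fakeDecrypt and syncAction are hypothetical names). The action returns before the .then() callback has run, so the caller cannot wait for the result or see a failure.

```javascript
const log = [];

// Stands in for any promise-based API such as an async PDF decryption call.
async function fakeDecrypt() {
  return "decrypted.pdf";
}

// A synchronous action that starts async work without waiting for it.
function syncAction() {
  fakeDecrypt().then(name => log.push("saved " + name));
  log.push("action returned");
}

syncAction();

// Snapshot taken right after the call: the .then() callback has not run yet,
// so only the synchronous entry is present at this point.
const logAfterCall = log.slice();
```

From the caller's point of view the action has already finished at the snapshot, which is why errors thrown later in the promise chain cannot fail the action itself.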

Concerning the integration of pdf-lib: I suggest adding @cantoo/pdf-lib to the dependencies and integrating it as a proper built-in function. This way it is a fully supported dependency, benefits from rollup's tree-shaking and is automatically updated by renovate.

Writing custom functions with new dependencies might be too complicated.

ahochsteger avatar Jul 15 '24 09:07 ahochsteger