Best way to crop around a specified focal point

Open MaximilianLloyd opened this issue 3 years ago • 4 comments

I am trying to resize an image around a face. I have used another package to extract the bounding boxes of faces within the image. How could I use that position to correctly resize the image around it?

The only way I can see to set it up with sharp is to write a series of if statements that map the face position to one of the following alignment strategies. [image]

MaximilianLloyd avatar Jun 02 '21 12:06 MaximilianLloyd

Hi, did you see the extract operation? https://sharp.pixelplumbing.com/api-resize#extract

lovell avatar Jun 02 '21 19:06 lovell

Thanks for the fast response :). Yes, but I'm not sure how I would go about using that in the use case I outlined. The specific reason I ask is that I am cropping one input image into several output images with different aspect ratios. The face should fit as well as possible inside every output image. One could be 4:3, another 16:9. How could I go about using the extract operation to do this?

MaximilianLloyd avatar Jun 02 '21 22:06 MaximilianLloyd

@MaximilianLloyd Were you able to make any progress with this? If not, please can you provide sample code and images to help explain your question.

lovell avatar Jul 10 '21 20:07 lovell

Hi Lovell. No I wasn't able to make any progress with this. What I opted for was to just use the alignment strategies based on where the bounding box for the face was. But that solution is not optimal and I haven't implemented anything for the y axis yet.

The user uploads an image that can be output in any number of crop formats, for example 5. I run facial recognition on the uploaded image and get the bounding boxes of the face. Based on that, I want to make sure that the face is centered as much as possible in each of the different formats. Below you can find the relevant code and an example of the output.

Each entry in config.formats contains an image, which is a PNG overlay that is composited onto the resized result.

router.js

import express from 'express';
import multer from 'multer';
import { fileURLToPath } from 'url';
import path from 'path';
import Sharp from 'sharp';
import { v4 as uuidv4 } from 'uuid';
import fs from 'fs';
import archiver from 'archiver';
import tf from '@tensorflow/tfjs-node';
import * as faceapi from '@vladmandic/face-api';
import * as canvas from 'canvas';
import makeDir from 'make-dir';
import slugify from 'slugify';

import { storage, imageFilter } from '../config/multer.js';
import setConnectionTimeout from '../middleware/timeout.js';
import getAlignmentStrategy from '../helpers/getAlignmentStrategy.js';
import getConfigFromSlug from '../helpers/getConfigFromSlug.js';

const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

const upload = multer({ storage: storage, fileFilter: imageFilter });
const router = express.Router();

export async function init() {
  const { Canvas, Image, ImageData } = canvas;
  faceapi.env.monkeyPatch({ Canvas, Image, ImageData });

  await faceapi.nets.ssdMobilenetv1.loadFromDisk(
    path.resolve(__dirname, '../assets/models')
  );
}

async function processImage(file, config) {
  const { buffer } = file;
  const sharpImage = Sharp(buffer);
  const imageMetaData = await sharpImage.metadata();
  console.log(config);
  console.log('fit: ', config.useFacialRecognition);

  const { useFacialRecognition } = config;

  let detection;

  if (useFacialRecognition) {
    const tfBuffer = await sharpImage.jpeg().toBuffer();
    const img = tf.node.decodeImage(tfBuffer);
    const [_detection] = await faceapi.detectAllFaces(img);
    detection = _detection;
  }

  const imagesToReturn = config.formats.map(async (format, formatIndex) => {
    const overlay = format.image;

    const overlayMetaData = await overlay.metadata();
    const { width: overlayWidth, height: overlayHeight } = overlayMetaData;

    const clonedImage = sharpImage.clone();

    let alignmentStrategy = config.position;

    if (useFacialRecognition === true) {
      alignmentStrategy = getAlignmentStrategy(
        detection,
        imageMetaData,
        overlayMetaData
      );
    }

    clonedImage
    .rotate()
    .resize({
      height: overlayHeight,
      width: overlayWidth,
      fit: config.fit,
      position: alignmentStrategy,
    });

    clonedImage.composite([
      {
        input: await overlay.toBuffer(),
      },
    ]);

    return clonedImage.jpeg({
      quality: 95,
      chromaSubsampling: '4:4:4',
    });
  });

  return await Promise.all(imagesToReturn);
}

router.post(
  '/api/upload',
  setConnectionTimeout('1h'),
  upload.array('images'),
  async (req, res) => {
    const { configSlug } = req.body;

    const fetchedConfig = await getConfigFromSlug(configSlug);

    const imagesPerOverlay = await Promise.all(
      req.files.map((file) => processImage(file, fetchedConfig))
    );

    const imageNames = req.files.map((file) => file.originalname);
    const zipFileName = `${uuidv4()}.zip`;

    const output = fs.createWriteStream(
      path.resolve(__dirname, `../uploads/${zipFileName}`)
    );
    const archive = archiver('zip', {
      zlib: { level: 9 },
    });

    archive.pipe(output);

    for (const [formatIndex, images] of imagesPerOverlay.entries()) {
      const imageName = imageNames[formatIndex];

      for (const [index, image] of images.entries()) {
        const format = fetchedConfig.formats[index];
        const bufferData = await image.toBuffer();

        const folderPath = format.title;

        archive.append(bufferData, {
          name: `${fetchedConfig.title}/${folderPath}/${folderPath}_${imageName}`,
        });
      }
    }

    archive.finalize();

    res.status(200).send(zipFileName);
  }
);


router.post(
  '/api/upload-single',
  setConnectionTimeout('1h'),
  upload.single('image'),
  async (req, res) => {
    const { configSlug } = req.body;

    const fetchedConfig = await getConfigFromSlug(configSlug);
    const sharpImages = await processImage(req.file, fetchedConfig);

    const generatedDirectoryName = uuidv4();

    const rootPath = path.resolve(__dirname, `../uploads/${generatedDirectoryName}`);
    await makeDir(rootPath);

    for (const [index, image] of sharpImages.entries()) {
      const format = fetchedConfig.formats[index].title;

      const fileName = `${slugify(format)}.jpg`
      await image.toFile(`${rootPath}/${fileName}`);
    }

    res.status(200).send(generatedDirectoryName);
  }
);

export default router;

getAlignmentStrategy.js

import Sharp from 'sharp';

function getAlignmentStrategy(detection, imageMetaData, overlayMetaData) {
  if (!detection) {
    console.log('No detection');
    return Sharp.strategy.attention;
  }

  const { width: overlayWidth, height: overlayHeight } = overlayMetaData;
  const { width: imageWidth, height: imageHeight } = imageMetaData;

  // Scale factors from source-image coordinates to overlay coordinates
  const xScalar = overlayWidth / imageWidth;
  const yScalar = overlayHeight / imageHeight;

  // scaled to overlay
  const x = detection.box._x * xScalar;
  const y = detection.box._y * yScalar;

  const imageCenter = overlayWidth / 2;
  const oneThirdOfImageWidth = overlayWidth / 3;

  if (Math.abs(imageCenter - x) < oneThirdOfImageWidth) {
    return 'center';
  }

  if (x < imageCenter) {
    return 'left'
  }

  if (x > imageCenter) {
    return 'right'
  }
}

export default getAlignmentStrategy;
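
The helper above only looks at the x axis. As a rough sketch of how both axes might be covered, the function below buckets the center of the face box into thirds and returns one of sharp's position strings. The name positionFromBox and the box shape { x, y, width, height } (in source-image pixels) are assumptions made for illustration, not part of sharp or face-api.

```javascript
// Hypothetical helper (not part of sharp or face-api): maps a face bounding box
// to one of sharp's position strings, such as 'left top' or 'right'.
// Assumes box = { x, y, width, height } in source-image pixel coordinates.
function positionFromBox(box, imageWidth, imageHeight) {
  // Center of the box, normalized to the range 0..1 on each axis
  const cx = (box.x + box.width / 2) / imageWidth;
  const cy = (box.y + box.height / 2) / imageHeight;

  // Bucket each axis into thirds; the middle third contributes nothing
  const horizontal = cx < 1 / 3 ? 'left' : cx > 2 / 3 ? 'right' : '';
  const vertical = cy < 1 / 3 ? 'top' : cy > 2 / 3 ? 'bottom' : '';

  // sharp's combined position strings are written horizontal first, e.g. 'right top'
  const parts = [horizontal, vertical].filter(Boolean);
  return parts.length > 0 ? parts.join(' ') : 'centre';
}
```

The returned string could be passed as the position option of resize when fit is cover. This is still a coarse nine-way choice, so a face near a bucket boundary may not end up well centered.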

[Image: example output, 2021-07-28-12-28-demo batcher io]

MaximilianLloyd avatar Jul 28 '21 10:07 MaximilianLloyd

Closing as this logic is custom to a particular use case, and sharp already provides enough API to achieve what's required.

lovell avatar Aug 24 '22 16:08 lovell

Just dropping by to say that I also would love to see inbuilt focal point support for image resizing and I think it's a common use case.

For anyone interested in how this could be done right now, this is how I would probably do it:

[Image: raster graphic illustrating the steps below]

  1. Scale (with fit=cover) the original image (green) to the needed size (yellow) by factor x so that it covers the requested dimensions (red dashed)
  2. Also scale the focal point coordinates by factor x.
  3. Calculate the corner coordinates of the result rectangle if it would be aligned to the scaled focal point at its center
  4. The calculated coordinates may be out of bounds from the scaled image. So you have to shift them a minimal amount (differences of edge coordinates) until all of them are valid again.
  5. Use the extract operation with the shifted corner coordinates on the scaled image.
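
The five steps above can be sketched as a small geometry helper. This is only an illustration under stated assumptions: focalCropRegion is a hypothetical name, the focal point is given in pixels of the original image, and the result still has to be fed to sharp's resize and extract operations by the caller.

```javascript
// Hypothetical helper (not part of sharp): computes the resize dimensions and the
// extract region that keep a focal point as close to the crop center as possible.
// focalX / focalY are pixel coordinates in the original image.
function focalCropRegion(imageWidth, imageHeight, targetWidth, targetHeight, focalX, focalY) {
  // Step 1: "cover" scale factor, so the scaled image is at least as large as the target
  const scale = Math.max(targetWidth / imageWidth, targetHeight / imageHeight);
  const resizeWidth = Math.max(targetWidth, Math.round(imageWidth * scale));
  const resizeHeight = Math.max(targetHeight, Math.round(imageHeight * scale));

  // Step 2: scale the focal point by the same factor
  const scaledFocalX = focalX * scale;
  const scaledFocalY = focalY * scale;

  // Step 3: top-left corner of a target-sized rectangle centered on the focal point
  const idealLeft = Math.round(scaledFocalX - targetWidth / 2);
  const idealTop = Math.round(scaledFocalY - targetHeight / 2);

  // Step 4: shift the rectangle the minimal amount needed to stay inside the scaled image
  const clamp = (value, min, max) => Math.min(Math.max(value, min), max);
  const left = clamp(idealLeft, 0, resizeWidth - targetWidth);
  const top = clamp(idealTop, 0, resizeHeight - targetHeight);

  // Step 5: the caller passes `extract` to sharp's extract operation after resizing
  return {
    resizeWidth,
    resizeHeight,
    extract: { left, top, width: targetWidth, height: targetHeight },
  };
}
```

With sharp, the result would presumably be used along the lines of sharp(input).resize(resizeWidth, resizeHeight).extract(extract). If the image is auto-rotated first, the width, height and focal point should all refer to the rotated orientation.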

lukas-schaetzle avatar Oct 14 '22 22:10 lukas-schaetzle

@Tummerhore Thanks for this. Have you by any chance had the occasion to write the code and calculations for this to work?

JeanLucEsser avatar Apr 12 '23 13:04 JeanLucEsser

Sorry for the late reply @JeanLucEsser. I don't have any code for it right now, but it should be straightforward if you follow my previous post. Maybe I'll find some time on the weekend to put something together.

lukas-schaetzle avatar Apr 17 '23 21:04 lukas-schaetzle

This is not a custom use case. It's SOP on most social networks to have a focal point for fitting images to a certain aspect ratio. It's necessary when automatically cropping photos with faces or logos.

Some prior art: https://docs.joinmastodon.org/api/guidelines/#focal-points

This feature is blocking consumers: https://www.youtube.com/watch?v=viURaw3oiBA&lc=Ugy6NHJiuZGEXlOsRZp4AaABAg.9qVP7KZROnw9qbR_gtdjIo

Typically an alignment value is provided:

Alignment {
  x = number(min: -1, max: 1),
  y = number(min: -1, max: 1);
  
  .center => (x: 0, y: 0);
  .topLeft => (x: -1, y: -1);
  .bottomRight => (x: 1, y: 1);
}

And the current implementation in sharp is equivalent to Alignment.center

Being able to provide a custom alignment would resolve this issue.
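
As a rough sketch of how such an alignment value could be turned into a crop today, the function below maps x and y in the range -1 to 1 onto the left/top offset of an extract region. alignmentToExtract is a hypothetical name, and the sketch assumes the image has already been resized so that it covers the target dimensions, with (-1, -1) meaning top left as in the definition above.

```javascript
// Hypothetical helper (not part of sharp): converts an alignment value with
// x and y in the range -1..1 into an extract region on an already-resized image.
// Assumes resizedWidth >= targetWidth and resizedHeight >= targetHeight.
function alignmentToExtract(alignment, resizedWidth, resizedHeight, targetWidth, targetHeight) {
  const clampUnit = (value) => Math.min(Math.max(value, -1), 1);

  // Pixels left over on each axis once the target rectangle is placed
  const excessX = Math.max(0, resizedWidth - targetWidth);
  const excessY = Math.max(0, resizedHeight - targetHeight);

  // Map -1..1 to 0..1, then to an offset within the leftover pixels
  const left = Math.round(((clampUnit(alignment.x) + 1) / 2) * excessX);
  const top = Math.round(((clampUnit(alignment.y) + 1) / 2) * excessY);

  return { left, top, width: targetWidth, height: targetHeight };
}
```

An alignment of (0, 0) splits the leftover pixels evenly, which corresponds to a centered crop, while (-1, -1) and (1, 1) pin the crop to the top-left and bottom-right corners.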

lukepighetti avatar Jun 06 '23 15:06 lukepighetti

+1 for adding focal point or custom alignment options

JessChowdhury avatar Sep 12 '23 14:09 JessChowdhury

+1 as a lot of other image services already provide this feature by now

thomas4Bitcraft avatar Jan 17 '24 22:01 thomas4Bitcraft

If anyone would like to work on a PR for this, please open a new enhancement issue first with a proposed API as well as a list of edge cases that will need to be catered for (and tested).

lovell avatar Jan 18 '24 07:01 lovell

Prior work:

  • Imgix: https://docs.imgix.com/apis/rendering/focalpoint-crop
  • CraftCMS: https://craftcms.com/docs/4.x/assets.html#focal-points

leevigraham avatar Feb 14 '24 08:02 leevigraham