
[Pinterest] Downloaded pin image has low resolution

Open iambtshft opened this issue 7 months ago • 3 comments

Brief

When I try to download an image from Pinterest, the returned result sometimes has low resolution. Example: https://www.pinterest.com/pin/70437489341156/

Technical analysis

Here are the relevant parser lines: https://github.com/imputnet/cobalt/blob/4b9644ebdfbfe7bc6f7ec2d476692e3619cb59bd/api/src/processing/services/pinterest.js#L34-L38

The service takes the first image with a suitable extension matched by the regex. However, for the example pin the first match is not the best quality; see the output:

[0]  {src="https://i.pinimg.com/236x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"}  
[1]  {src="https://i.pinimg.com/736x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"}  
[2]  {src="https://i.pinimg.com/75x75_RS/9e/

Potential solution

Option 1 - Look up a better resolution

I'm not an expert in how Pinterest structures its data, but judging by the names, it looks possible to take the image identifier from the first image (7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg) and look for an image with the same id but a better resolution prefix ({vvv}x).
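
Option 1 could be sketched roughly as follows. This is a minimal illustration, not cobalt's actual code: `pickLargestVariant` and the `PINIMG` pattern are hypothetical names, and it assumes that the first matched URL belongs to the pin itself and that variants of one image differ only in the `{width}x` path segment.

```javascript
// Hypothetical pattern: captures the width prefix and the image id (hash path + extension).
const PINIMG = /^https:\/\/i\.pinimg\.com\/(\d+)x\/([0-9a-f/]+\.\w+)$/;

// Hypothetical helper: returns the widest variant of the image the first URL points to.
function pickLargestVariant(urls) {
    const parsed = urls
        .map((url) => {
            const m = PINIMG.exec(url);
            return m ? { url, width: Number(m[1]), id: m[2] } : null;
        })
        .filter(Boolean); // drop URLs that don't follow the {width}x layout

    if (parsed.length === 0) return undefined;

    // Assumption: the first match is the pin's own image.
    const mainId = parsed[0].id;
    return parsed
        .filter((p) => p.id === mainId)
        .reduce((best, p) => (p.width > best.width ? p : best)).url;
}

const urls = [
    "https://i.pinimg.com/236x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg",
    "https://i.pinimg.com/736x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg",
];
console.log(pickLargestVariant(urls));
```

For the two URLs from the output above, this returns the 736x variant. It can only choose among sizes that already appear in the page, so it would not discover an "originals" URL on its own.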

Option 2 - Parse images from json

While investigating the page content, I found that besides the images provided as src=<something>, there is JSON-structured pin data. It has much more information, such as the original image URL (which is not present in the src=<> pattern):

<script data-relay-response="true" type="application/json">
      {
      <OMITTED>
                "imageSpec_236x": {
                  "height": 295,
                  "width": 236,
                  "url": "https://i.pinimg.com/236x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"
                },
                "imageSpec_orig": {
                  "url": "https://i.pinimg.com/originals/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"
                },
    <OMITTED>

Again, I'm not sure whether this data is available for every pin, but it looks like a more robust solution, and src parsing could be used as a fallback.
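
A rough sketch of Option 2, working on the raw HTML string rather than a DOM. The names `RELAY_SCRIPT`, `findOrigUrl` and `extractOriginalImage` are hypothetical, and the code assumes the script tag carries exactly the attributes shown above and that `imageSpec_orig` may sit at any depth in the JSON. It is an illustration under those assumptions, not cobalt's implementation.

```javascript
// Assumption: the tag looks exactly like the snippet quoted in this issue.
const RELAY_SCRIPT =
    /<script data-relay-response="true" type="application\/json">([\s\S]*?)<\/script>/g;

// Depth-first search for the first `imageSpec_orig.url` in a parsed JSON value.
function findOrigUrl(node) {
    if (node === null || typeof node !== "object") return undefined;
    const url = node.imageSpec_orig?.url;
    if (typeof url === "string") return url;
    for (const child of Object.values(node)) {
        const found = findOrigUrl(child);
        if (found) return found;
    }
    return undefined;
}

// Returns the original image URL, or undefined so the caller can fall back to src parsing.
function extractOriginalImage(html) {
    for (const [, body] of html.matchAll(RELAY_SCRIPT)) {
        let data;
        try {
            data = JSON.parse(body);
        } catch {
            continue; // not valid JSON, try the next script tag
        }
        const url = findOrigUrl(data);
        if (url) return url;
    }
    return undefined;
}
```

Returning `undefined` instead of throwing keeps the existing src-based matching available as the fallback path when the JSON is missing or has a different shape.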

reproduction steps

  1. Go to cobalt.tools
  2. Insert https://www.pinterest.com/pin/70437489341156/
  3. Hit download

Actual result: Image has low quality.
Expected result: Image has the same quality as on the Pinterest page.

screenshots

links

https://www.pinterest.com/pin/70437489341156/

platform information

additional context

iambtshft avatar May 28 '25 16:05 iambtshft

+1, reproduced accidentally with https://pinterest.com/pin/333618284916219545

Downloaded image was 236x236, original image is 736x736

agvantibo-again avatar Jul 10 '25 12:07 agvantibo-again

After further digging and testing, it seems that every pin page has a script tag with the id "__PWS_INITIAL_PROPS__" that contains a list of image sizes, including the original.

https://regex101.com/r/IAmYqE/1

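// Run in the browser console on a pin page; the first run of digits in the URL is taken as the pin id.
const matchdigits = /(\d+)/gm;
JSON.parse(document.getElementById("__PWS_INITIAL_PROPS__").innerText).initialReduxState.pins[document.URL.match(matchdigits)[0]].images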

Note that, as far as I've tested, this only works when you're signed in; otherwise the "pins" object is empty.
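
Because "pins" can be empty, a caller would probably want a guarded lookup rather than the bare property chain. A minimal sketch, assuming `props` is the already-parsed `__PWS_INITIAL_PROPS__` JSON and that the pin id is the digits after `/pin/` in the URL; `getPinImages` is a hypothetical name:

```javascript
// Hypothetical helper: returns the pin's `images` entry, or null when it is
// unavailable (for example when signed out and `pins` is empty).
function getPinImages(props, pageUrl) {
    const pinId = /\/pin\/(\d+)/.exec(pageUrl)?.[1];
    if (!pinId) return null;
    return props?.initialReduxState?.pins?.[pinId]?.images ?? null;
}
```

A `null` result signals that another source, such as the src URLs in the page, has to be used instead.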

potatolover68 avatar Aug 12 '25 21:08 potatolover68

After even more digging and testing, I found that when you're not signed in, you can use the following regex:

// Dots in the host name are escaped so they match literally.
let p = /https:\/\/i\.pinimg\.com\/(\d{3}x)\/[0-9a-f/]{41}\.jpg/gm;
[...new Set(document.body.innerHTML.match(p))];

to match all the image URLs.

pitfalls
  • This is time-sensitive, so it's best to run it right after the page has loaded; otherwise, it can't differentiate between the endless-scroll content and the main content.

  • Note that the first image in the list is always the main content; perhaps this could be used to filter the list.

potatolover68 avatar Aug 13 '25 03:08 potatolover68