
Integrity check fails in ActiveStorage::AnalyzeJob

Open yeslad opened this issue 5 years ago • 5 comments

ActiveStorage schedules an AnalyzeJob when an image is uploaded. Since moving to Cloudinary the job fails with an ActiveStorage::IntegrityError because the original checksum saved when uploading the image does not match the downloaded image checksum. Does the CloudinaryService#download method not download the original file that was uploaded?

Environment and Libraries

Cloudinary Ruby SDK version - 1.14.0
Ruby Version - 2.5.3
Rails Version - 6.0.2.1
Other Libraries - ActiveStorage 6.0.2.1

yeslad, May 21 '20 08:05

Hi @yeslad, Can you please share the code responsible for the error and the full stack trace so we can debug this further?

roeeba, May 21 '20 10:05

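The error is raised in activestorage/lib/active_storage/downloader.rb, which the image analyzer (activestorage/lib/active_storage/analyzer/image_analyzer.rb) reaches when it downloads the blob: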

module ActiveStorage
  class Downloader #:nodoc:
    attr_reader :service

    def initialize(service)
      @service = service
    end

    def open(key, checksum:, name: "ActiveStorage-", tmpdir: nil)
      open_tempfile(name, tmpdir) do |file|
        download key, file
        verify_integrity_of file, checksum: checksum
        yield file
      end
    end

    private
      def open_tempfile(name, tmpdir = nil)
        file = Tempfile.open(name, tmpdir)

        begin
          yield file
        ensure
          file.close!
        end
      end

      def download(key, file)
        file.binmode
        service.download(key) { |chunk| file.write(chunk) }
        file.flush
        file.rewind
      end

      def verify_integrity_of(file, checksum:)
        unless Digest::MD5.file(file).base64digest == checksum
          raise ActiveStorage::IntegrityError
        end
      end
  end
end

The line service.download(key) { |chunk| file.write(chunk) } is calling ActiveStorage::Service::CloudinaryService#download.

Stack trace:

/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/downloader.rb:39:in `verify_integrity_of'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/downloader.rb:14:in `block in open'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/downloader.rb:24:in `open_tempfile'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/downloader.rb:12:in `open'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/service.rb:86:in `open'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/app/models/active_storage/blob.rb:219:in `open'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/analyzer.rb:27:in `download_blob_to_tempfile'
/root/shared/bundled_gems/ruby/2.5.0/gems/activestorage-6.0.2.1/lib/active_storage/analyzer/image_analyzer.rb

yeslad, May 21 '20 10:05

Hi @yeslad. Based on our logs, your uploads use 'f_auto,q_auto' as an incoming transformation, which is applied to the image before it is stored in your account. The stored file is therefore not the file you uploaded, which is why the checksums differ. Could you try removing this incoming transformation from the uploads and let us know whether the job then succeeds?

You can also verify the checksums manually rather than waiting for the job if you upload a sample image without the f_auto,q_auto incoming transformation and view the image in your Cloudinary account. You'll have access to the Checksum which will be added as Context metadata (via the Manage page of the asset) and you can use the UI to download the image and calculate it locally to compare.
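For the local side of that comparison, here is a minimal sketch of the digest ActiveStorage checks, based on the verify_integrity_of method quoted earlier in this thread (a base64-encoded MD5 of the file). The helper names active_storage_checksum and checksums_match? are hypothetical, and the format of the checksum shown in the Cloudinary UI is not confirmed here, so it may need converting before the two values can be compared.

```ruby
require "digest"

# Hypothetical helper: computes the same digest that ActiveStorage's
# Downloader#verify_integrity_of compares against the stored checksum.
def active_storage_checksum(path)
  Digest::MD5.file(path).base64digest
end

# Hypothetical helper: true when the file at `path` hashes to `expected`,
# where `expected` is a base64-encoded MD5 (for example, blob.checksum).
def checksums_match?(path, expected)
  active_storage_checksum(path) == expected
end
```

In a Rails console, the expected value would typically be the checksum column of the blob, so checksums_match?("downloaded.jpg", blob.checksum) returning false would reproduce the mismatch that raises ActiveStorage::IntegrityError.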

aleksandar-cloudinary, May 22 '20 16:05

@aleksandar-cloudinary Your theory would appear to be correct. I uploaded an image manually and the checksums matched. My question is therefore the following: the incoming transformations are being applied because I have specified them in my storage.yml. I want them to be applied when fetching all images. I don't need them applied as incoming transformations. I couldn't find anything in the documentation about a way to distinguish between parameters for uploading and downloading in storage.yml (except for eager transformations, which I don't think is relevant here). Is there? Alternatively, is it possible for ActiveStorage to store the checksum of the image post the incoming transformation? Or, in other words, can ActiveStorage be made to work with incoming transformations?

yeslad, May 25 '20 07:05

@yeslad Apologies for the delay. The most common workflow is to upload high-quality/large originals and optimise/transform them on delivery by specifying any transformation options. Please see the following section in the documentation that describes ways/methods of requesting different versions of the assets (such as f_auto,q_auto) on delivery - https://cloudinary.com/documentation/rails_image_manipulation. The SDKs have helper methods that you can feed different transformation parameters to and they'll build the URLs to those assets for you. In addition, I'll also include our ActiveStorage service documentation page - https://cloudinary.com/documentation/rails_activestorage. To answer your second question - there isn't a way to configure the Cloudinary ActiveStorage service to record the post-upload checksum as it's implemented solely as a standard/general ActiveStorage service similar to Azure or S3 services.
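To illustrate what "transform on delivery" means, here is a rough sketch that assembles a delivery URL by hand, with the transformation placed in the URL rather than applied at upload time. The helper name delivery_url and the values "demo" and "sample.jpg" are placeholders for illustration only. In a real application the SDK helpers described in the linked documentation should build these URLs instead.

```ruby
# Hypothetical helper, for illustration only: places the transformation
# string between the delivery type and the public ID, so the stored
# original stays untouched and its checksum stays valid.
def delivery_url(cloud_name, public_id, transformations = %w[f_auto q_auto])
  "https://res.cloudinary.com/#{cloud_name}/image/upload/" \
    "#{transformations.join(',')}/#{public_id}"
end

# "demo" and "sample.jpg" are placeholder values.
puts delivery_url("demo", "sample.jpg")
```

Because the transformation is requested per URL, the original file is stored as uploaded and the ActiveStorage integrity check compares like with like.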

aleksandar-cloudinary, Jun 04 '20 19:06

Getting the same error here. Is it still an ongoing bug?

adriennhem, Jan 18 '23 19:01

@adriennhem This doesn't appear to be an issue that can be resolved on the Cloudinary side. Please see Aleksandar's message above.

rnamba-cloudinary, Jan 19 '23 00:01

On my side, the above answer does not work. I ended up writing a background job that removes any ActiveStorage::AnalyzeJob with a retry count > 0. Works like a charm.

adriennhem, Jan 19 '23 14:01

@adriennhem Glad you were able to get this working and thank you for posting this for others to find!

rnamba-cloudinary, Jan 19 '23 20:01

We are also experiencing an issue similar to the one described above. We have implemented the following to try to address it:

MAX_ANALYZE_RETRIES = 1

# Re-run analysis when the integrity check fails, giving up once the
# retry limit is exceeded.
def safe_analyze(retries: 0)
  return if retries > MAX_ANALYZE_RETRIES

  banner.analyze
rescue ActiveStorage::IntegrityError
  safe_analyze(retries: retries + 1)
end
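The bounded-retry idea can be tried outside Rails with the sketch below. IntegrityError is a local stand-in for ActiveStorage::IntegrityError, and the block-based safe_analyze signature and its max_retries keyword are assumptions made for illustration, not part of the code posted above. Note that retrying can only help when the mismatch is transient. If an incoming transformation changed the stored bytes, as described earlier in this thread, every attempt will fail the same way.

```ruby
# Local stand-in so the sketch runs without Rails; in an application this
# would be ActiveStorage::IntegrityError.
class IntegrityError < StandardError; end

# Runs the block, retrying on IntegrityError up to `max_retries` times.
# Returns true if an attempt succeeded, false if all attempts failed.
def safe_analyze(max_retries: 1)
  attempts = 0
  begin
    attempts += 1
    yield
    true
  rescue IntegrityError
    # `retry` re-runs the begin block; `attempts` persists across retries.
    retry if attempts <= max_retries
    false
  end
end
```

In an application, the call would look something like safe_analyze { banner.analyze }, where banner is the attachment from the snippet above. Returning false rather than re-raising keeps a persistent mismatch from failing the caller.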

jeremylynch, Dec 30 '23 02:12

Hey @jeremylynch,

Did this fix resolve the issue?

rnamba-cloudinary, Jan 01 '24 18:01