cbor-ruby icon indicating copy to clipboard operation
cbor-ruby copied to clipboard

Catching invalid format for CBOR.decode

Open grzuy opened this issue 5 years ago • 6 comments

CBOR.decode can raise several error types when fed with invalid cbor input.

require "cbor"
require 'securerandom'

errors = {}

1_000_000.times do |i|
  begin
    CBOR.decode(SecureRandom.random_bytes(64))
  rescue => ex
    errors[ex.class.name] = ex
  end
end

pp errors

returns

{"CBOR::MalformedFormatError"=>
  #<CBOR::MalformedFormatError: extra bytes follow after a deserialized object>,
 "EOFError"=>#<EOFError: end of buffer reached>,
 "TypeError"=>#<TypeError: can't convert CBOR::Simple into an exact number>,
 "URI::InvalidURIError"=>#<URI::InvalidURIError: bad URI(is not URI?): 2>,
 "FloatDomainError"=>#<FloatDomainError: NaN>,
 "RegexpError"=>
  #<RegexpError: invalid multibyte character: /\x1Er\x13w\xBCx0N:n\xD3aw{\x98\xB1Q\x1D\xC0\u06FC5\xF0/>}

Do you think it's possible for the CBOR decoder to either provide some sort fo CBOR.valid?(input) method, or for CBOR.decode(input) to always return the same exception when input is detected to be invalid CBOR?

This was raised in https://github.com/cedarcode/cose-ruby/issues/40.

Thank you in advance!

grzuy avatar Oct 03 '19 19:10 grzuy

or for CBOR.decode(input) to always return the same exception when input is detected to be invalid CBOR?

I guess already used CBOR::MalformedFormatError could be a possible candidate?

grzuy avatar Oct 04 '19 21:10 grzuy

For what is worth, these are the occurrences...

{"EOFError"=>100332,
 "CBOR::MalformedFormatError"=>895855,
 "TypeError"=>3734,
 "URI::InvalidURIError"=>29,
 "FloatDomainError"=>1,
 "RegexpError"=>3}

Script:

require "cbor"
require 'securerandom'

errors = {}

1_000_000.times do |i|
  begin
    CBOR.decode(SecureRandom.random_bytes(64))
  rescue => ex
    errors[ex.class.name] ||= 0
    errors[ex.class.name] += 1
  end
end

pp errors

grzuy avatar Oct 04 '19 21:10 grzuy

I believe it is useful to know whether the CBOR item ended prematurely (EOFError), some other kind of malformedness occurred, or a type-specific error occurred (obviously on a Tag, e.g., URI::InvalidURIError). What is your motivation for wanting to turn this into Kernighan's car?

cabo avatar Oct 04 '19 21:10 cabo

BTW, errors = Hash.new(0) saves having to do the errors[ex.class.name] ||= 0.

cabo avatar Oct 04 '19 21:10 cabo

But the number of exceptions you found actually points to a genuine bug: An invalid tag (e.g., a URI tag with an invalid URI) is not presented to the application as such, but causes an exception that denies the application access to the entire piece of CBOR. This is great for purists, but not the best strategy for interoperability.

cabo avatar Oct 05 '19 07:10 cabo

What is your motivation

Hi @cabo,

Just trying to understand what could be the best way to gracefully handle wrong/corrupt cbor data on at the application level when data fed by the user agent.

I guess that by doing

begin
  CBOR.decode(input)
rescue CBOR::MalformedFormatError, EOFError, TypeError
  # handle error
end

one would already catch > 99% of the cases, so that may be enough.

The thing is that from the perspective of the caller, EOFError and TypeError are quite generic and they don't even let you know if the error is coming from CBOR or from someplace else.

It crossed my mind that it might be desirable from the perspective of the caller to be able to do something like:

begin
  CBOR.decode(input)
rescue CBOR::MalformedFormatError
  # handle error
end

or

begin
  CBOR.decode(input)
rescue CBOR::DecodingError
  # handle error
end

or in case it makes more sense to have several different possible decoding errors, all those could inherit from a CBOR-generic CBOR::Error and you could just:

begin
  CBOR.decode(input)
rescue CBOR::Error
  # handle error
end

grzuy avatar Oct 07 '19 19:10 grzuy