psych icon indicating copy to clipboard operation
psych copied to clipboard

YAML.load_file returns false for empty files

Open robinboening opened this issue 11 years ago • 10 comments

When loading an empty yml file with YAML.load_file it returns false. This behavior feels a bit strange to me and I would expect to get an empty Hash instead.

I often notice myself writing code like this and adding a comment to remember why I did this:

YAML.load_file(file) || {} if File.exists?(file) # YAML.load_file returns false if file is empty.

Is there a certain reason for returning false? How big would be the impact when changing its behavior?

cheers! Robin

robinboening avatar Jun 25 '13 21:06 robinboening

Hi! Honestly, I think it would make more sense for it to return nil rather than false (given that a YAML document like --- represents nil), but false is what Syck returned. Why would you think a hash?

We could change the return value, but it would have to be a major version.

tenderlove avatar Jul 05 '13 19:07 tenderlove

Ran into this strange behaviour this morning. I'd love to have nil instead of false (as it makes more sense), but my ideal solution would be to have some sort of fallback option:

YAML.load_file(File.join("i-am", "empty.yml"), Hash.new).fetch(my_key)

If the file is there but empty, we presently get a NoMethodError on FalseClass, which makes me :crying_cat_face:. With some sort of fallback solution (à la Hash#fetch), one could get a desired default value back instead of nil or false.

What do you think?

parkr avatar Aug 05 '13 09:08 parkr

Like this? https://github.com/tuexss/psych/commit/870fcd76b567a485462c5dd829dd69ffe8dcfdfb

tuexss avatar Jan 06 '16 21:01 tuexss

@tuexss I think adding a default like that would work well, but the patch might need to be a little different. I think currently it will raise an ENOENT exception in the case that the file doesn't exist where your patch will return false. The case @parkr is talking about is when there is an empty, existing file.

tenderlove avatar Jan 07 '16 21:01 tenderlove

@tenderlove ah, got it. so more like this: https://github.com/tenderlove/psych/commit/f5bad0151d733aff5b4affddaa9630b37ce470f2

I guess I should also add some tests for this

tuexss avatar Jan 07 '16 23:01 tuexss

@tuexss (I think) that won't work either because you could have a non-empty yaml file that legitimately represents nil or false.

For example:

irb(main):001:0> require 'psych'
=> true
irb(main):002:0> File.write 'x.yml', Psych.dump(false)
=> 14
irb(main):003:0> Psych.load_file 'x.yml'
=> false
irb(main):004:0> File.write 'x.yml', Psych.dump(nil)
=> 9
irb(main):005:0> Psych.load_file 'x.yml'
=> nil
irb(main):006:0> File.write 'x.yml', Psych.dump(false)
=> 14
irb(main):007:0> Psych.load_file 'x.yml', Object.new
=> false
irb(main):008:0> File.write 'x.yml', Psych.dump(nil)
=> 9
irb(main):009:0> Psych.load_file 'x.yml', Object.new
=> nil
irb(main):010:0> File.write 'x.yml', ''             
=> 0
irb(main):011:0> Psych.load_file 'x.yml'            
=> false
irb(main):012:0> Psych.load_file 'x.yml', Object.new
=> #<Object:0x007fb4c3002dd0>
irb(main):013:0>

I think this patch covers all the bases, but I'm not sure if I really love it:

diff --git a/lib/psych.rb b/lib/psych.rb
index 8ba6ef6..fbc20e7 100644
--- a/lib/psych.rb
+++ b/lib/psych.rb
@@ -228,6 +228,8 @@ module Psych
   # The version of libyaml Psych is using
   LIBYAML_VERSION = Psych.libyaml_version.join '.'

+  DEFAULT       = Struct.new :to_ruby # :nodoc:
+
   ###
   # Load +yaml+ in to a Ruby data structure.  If multiple documents are
   # provided, the object contained in the first document will be returned.
@@ -247,8 +249,8 @@ module Psych
   #     ex.file    # => 'file.txt'
   #     ex.message # => "(file.txt): found character that cannot start any token"
   #   end
-  def self.load yaml, filename = nil
-    result = parse(yaml, filename)
+  def self.load yaml, filename = nil, default = false
+    result = parse(yaml, filename, default)
     result ? result.to_ruby : result
   end

@@ -320,11 +322,11 @@ module Psych
   #   end
   #
   # See Psych::Nodes for more information about YAML AST.
-  def self.parse yaml, filename = nil
+  def self.parse yaml, filename = nil, default = false
     parse_stream(yaml, filename) do |node|
       return node
     end
-    false
+    default
   end

   ###
@@ -466,8 +468,10 @@ module Psych
   ###
   # Load the document contained in +filename+.  Returns the yaml contained in
   # +filename+ as a Ruby object
-  def self.load_file filename
-    File.open(filename, 'r:bom|utf-8') { |f| self.load f, filename }
+  def self.load_file filename, default = false
+    File.open(filename, 'r:bom|utf-8') { |f|
+      self.load f, filename, DEFAULT.new(default)
+    }
   end

   # :stopdoc:

I'd accept it though if there's tests and better variable names (I don't like DEFAULT). :smile:

tenderlove avatar Jan 07 '16 23:01 tenderlove

Actually I am happy with the patch I posted. If you add tests I'd accept it.

tenderlove avatar Jan 07 '16 23:01 tenderlove

But this would still return false for an empty yaml. Even though it is overridable now. I thought the issue was, that it should be returning an nil for empty yamls?

tuexss avatar Jan 08 '16 14:01 tuexss

Let's move the discussion to the PR :)

tuexss avatar Jan 08 '16 15:01 tuexss

Can this be closed?

patbl avatar Aug 21 '18 20:08 patbl