mongodb-erlang
dynamically created atoms in soundrop/bson-erlang
Hi, I'm trying to write an Erlang application that connects to MongoDB and I looked into using this client driver.
I notice that soundrop/bson-erlang (the rebar dependency) maps document field names to Erlang atoms using binary_to_atom. This is visible in the bson parser code (bson_binary:get_fields) and also mentioned in the blog post, http://blog.mongodb.org/post/7270427645/design-of-the-erlang-driver-for-mongodb .
I think it's very dangerous to use binary_to_atom like that: atoms are not garbage collected, so a collection whose documents contain too many distinct field names can exhaust the atom table or memory on the Erlang node. I'd consider this a critical bug (in my application I can't control what's in the documents and might even get malicious ones), so I can't use the driver as it stands. I'd urge a refactoring that avoids creating atoms from untrusted input, a practice the Erlang documentation advises against.
I thought about just modifying the driver and sending you a patch, but since it would be an API change, I decided I better ask your advice about it first. It's of course also possible that I'm missing something, in which case enlightenment is always appreciated. Please let me know your thoughts.
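To make the idea concrete, here is a rough sketch of one possible direction, not a patch against the actual driver. The module name safe_field_name and the function to_label/1 are hypothetical and do not exist in bson-erlang. The sketch only reuses atoms that are already in the atom table and otherwise keeps the field name as a binary, so untrusted documents cannot create new atoms.

```erlang
-module(safe_field_name).
-export([to_label/1]).

%% Hypothetical helper, not part of bson-erlang.
%% Returns an atom only if it already exists in the atom table;
%% otherwise the field name is returned unchanged as a binary,
%% so decoding untrusted input never grows the atom table.
-spec to_label(binary()) -> atom() | binary().
to_label(Name) when is_binary(Name) ->
    try
        binary_to_existing_atom(Name, utf8)
    catch
        %% badarg: no such atom yet (or the name is not a valid atom)
        error:badarg -> Name
    end.
```

With this, safe_field_name:to_label(<<"ok">>) gives back the existing atom ok, while a name that has never been seen as an atom stays a binary. The mixed return type is the API change I mentioned: callers would have to accept both atoms and binaries as labels. Always returning binaries would be simpler still, at the cost of a bigger break.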
(Note: I tried emailing this but it bounced).
Thanks
--Paul
I second that. To reproduce the behavior, just execute this code and watch beam.smp's memory consumption:
[list_to_atom(integer_to_list(X)) || X <- lists:seq(1,1000000)].
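If you would rather see the growth as a number than watch the process in top, something like the following should work on OTP 20 or later, where erlang:system_info(atom_count) is available. The module name atom_growth and the atom prefix are made up for this illustration.

```erlang
-module(atom_growth).
-export([measure/1]).

%% Returns {Before, After}: the size of the atom table before and
%% after creating N atoms with a demo prefix (hypothetical names).
-spec measure(pos_integer()) -> {non_neg_integer(), non_neg_integer()}.
measure(N) ->
    Before = erlang:system_info(atom_count),
    _ = [list_to_atom("atom_growth_demo_" ++ integer_to_list(X))
         || X <- lists:seq(1, N)],
    After = erlang:system_info(atom_count),
    {Before, After}.
```

Calling atom_growth:measure(1000) in a fresh node should show the count rising by about 1000, and it never goes back down. erlang:system_info(atom_limit) reports the ceiling (1,048,576 by default); once the count reaches it, the whole VM terminates, so keep N small when trying this.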