
[ruby] Add Support for ERB files

Open AndreiDreyer opened this issue 8 months ago • 3 comments

  • Added support for ERB files
  • Added templateOutRaw (<%== %>) and templateOutEscape (<%= %>) operator calls
  • Added RETURN for the joern__buffer which holds all of the appended strings from the ERB lowering
  • Added .html.erb files as config files
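To make the bullets above concrete, here is a hand-written Ruby sketch of what a lowered template could look like. It is an illustration only, not the frontend's actual output: the method name `render_lowered` and the sample template are assumptions, and `ERB::Util.html_escape` merely stands in for the escaping that a `templateOutEscape` call represents.

```ruby
require "erb"

# Hypothetical template being lowered:
#   <h1><%= title %></h1>
#   <%== raw_html %>
#
# render_lowered is an illustrative name, not something the frontend emits.
def render_lowered(title, raw_html)
  joern__buffer = +""                               # mutable string collecting the output
  joern__buffer << "<h1>"                           # static template text
  joern__buffer << ERB::Util.html_escape(title)     # <%= %>  : escaped output (templateOutEscape)
  joern__buffer << "</h1>\n"
  joern__buffer << raw_html.to_s                    # <%== %> : raw output (templateOutRaw)
  joern__buffer                                     # the buffer is the return value
end

puts render_lowered("a<b", "<i>x</i>")
```

The point of returning the buffer is that every appended string, escaped or raw, ends up reachable from a single value.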

AndreiDreyer avatar Apr 23 '25 09:04 AndreiDreyer

I'm still missing a lot of .erb files. E.g. for https://github.com/chatwoot/chatwoot:

ocular> cpg.file.name(".*.html.erb").size
val res18: Int = 40

ocular> cpg.configFile.name(".*.html.erb").size
val res19: Int = 88

The frontend emits some warnings for the project, e.g. "Yield expression outside of method scope: yield(:head)" and "Type 'String' is considered a 'core' type, not a 'Kernel-contained' type", but not enough of them to explain the 48 missing files.

maltek avatar Apr 24 '25 09:04 maltek

@maltek these are the new results on chatwoot with all of the changes made for ERB handling.

joern> cpg.configFile.name(".*.html.erb").l.map(_.name).l.diff(cpg.file.name(".*.html.erb").map(_.name).l)
val res1: List[String] = List()

joern> cpg.file.name(".*.html.erb").l.size
val res2: Int = 88

joern> cpg.configFile.name(".*.html.erb").l.size
val res3: Int = 88

AndreiDreyer avatar May 14 '25 12:05 AndreiDreyer

All the Ruby code in https://github.com/forem/forem/blob/0772f2d49b18d94f3b982b39420ea31235c1c8aa/app/views/layouts/application.html.erb#L88-L92 exists in the CPG only as Literal nodes, instead of the actual calls and control structures. Also, the line numbers are off by one (code from line 90 has line 91 in the CPG).

maltek avatar May 23 '25 14:05 maltek

for the raw call here I'm getting the wrong line number (8 instead of 12): https://github.com/OWASP/railsgoat/blob/c1e8ff1e3b24a1c48fcfc9fbee0f65dc296b49d9/app/views/layouts/application.html.erb#L12

and the render call here has line 35 instead of 56: https://github.com/forem/forem/blob/0772f2d49b18d94f3b982b39420ea31235c1c8aa/app/views/admin/badge_achievements/index.html.erb#L56
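One way to cross-check reported line numbers like these is to compute the expected line of every ERB tag directly from the template text. The sketch below is a rough helper under stated assumptions: `erb_tag_lines` is a hypothetical name, the regex only approximates ERB tag syntax (it skips `<%%` literals but does not treat comments specially), and it is unrelated to how the frontend tracks positions.

```ruby
# Returns [[line, code], ...] for each ERB tag, with 1-based line numbers
# taken from the position where the tag opens.
def erb_tag_lines(template)
  tags = []
  template.scan(/<%(?!%)[=-]{0,2}(.*?)-?%>/m) do
    match = Regexp.last_match
    # Count the newlines before the opening "<%" to get the tag's line.
    line = template[0...match.begin(0)].count("\n") + 1
    tags << [line, match[1].strip]
  end
  tags
end

sample = "<html>\n<%= a %>\n\n<%== raw(b) %>\n"
erb_tag_lines(sample).each { |line, code| puts "#{line}: #{code}" }
```

Comparing this list with the line numbers of the corresponding call nodes in the CPG would show which tags drift and by how much.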

maltek avatar Jun 16 '25 14:06 maltek

the rails_lambda_0 calls are also problematic:

  • ~~the joern__inner_buffer variable is missing a closure capture binding from the outer scope~~ I missed that the lambda call is inside of a joern__buffer << rails_lambda_0.call() call
  • the self parameter is not captured or passed as an argument from the outer scope. E.g. at https://github.com/forem/forem/blob/4fee5c0bd36049a35b7966a0fad0f9edfa1aeaa5/app/views/users/_sidebar.html.erb#L56, @user is accessed from the outer scope, but there's no way to find a dataflow from the <module> method's self parameter to the self.@user within that lambda
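The Ruby semantics behind the second bullet can be seen in a small standalone sketch. The class `SidebarView` and its `render` method are hypothetical and much simpler than the real lowering; only the names `rails_lambda_0`, `joern__buffer` and `joern__inner_buffer` come from the discussion. The lambda body reads `@user` through the `self` of the enclosing method, which is why the CPG needs some capture or argument edge for `self`.

```ruby
# Hypothetical stand-in for a lowered partial; not the frontend's real output.
class SidebarView
  def initialize(user)
    @user = user
  end

  def render
    joern__buffer = +""
    rails_lambda_0 = lambda do
      joern__inner_buffer = +""
      # self inside the lambda is the enclosing SidebarView instance,
      # so @user resolves against the outer object.
      joern__inner_buffer << @user.to_s
      joern__inner_buffer
    end
    joern__buffer << rails_lambda_0.call
    joern__buffer
  end
end

puts SidebarView.new("alice").render
```

At runtime the value flows from the outer object into the lambda without being passed explicitly, so a dataflow engine only sees it if `self` is modelled as captured.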

maltek avatar Jun 17 '25 17:06 maltek

current problem:

class UsersController < ActionController::Base
  def show
    respond_to do |format|
      format.json { render partial: "foo" }
    end
  end
end

some self identifiers refer to multiple locals:

cpg.identifier.name("self").filter(_.refsTo.size != 1).l

Identifiers should only ever have REF edges to the single variable of that name in the exact same method.

maltek avatar Aug 08 '25 15:08 maltek

@TNSelahle the problem with the self variable is fixed with your last change. But there is still a related problem: the format identifiers are not referencing the format MethodParameterIn node.

maltek avatar Sep 01 '25 11:09 maltek

@maltek thanks for the review! I've made and pushed up the changes. Integration tests on CS master are all green.

TNSelahle avatar Oct 17 '25 14:10 TNSelahle

https://github.com/joernio/joern/pull/5447#discussion_r2440321788

@maltek the snippet you highlighted applies to astForFieldAccess for the self.joernBufferAppend call. The special case you identified in the astForFieldAccess method was for field accesses on self.joernBuffer and self.joernInnerBuffer.

I'll rename isErbCall to isErbBufferApppendCall to make it clearer.

TNSelahle avatar Oct 20 '25 07:10 TNSelahle

@TNSelahle The only field accesses in the AST for <operator>.joernBufferAppend calls that we have in the CPG are field accesses for self.joern__buffer. But now I see that the location I pointed to creates the "receiver ast" and then just discards it in the joernBufferAppend case. Please change this, so that the nodes for the receiver are only created when they are going to be used. (A frontend that creates AST nodes but then fails to actually add them to the AST is a somewhat common source of runtime errors.)

maltek avatar Oct 20 '25 11:10 maltek