codeql-go icon indicating copy to clipboard operation
codeql-go copied to clipboard

how can i taint a field from fields set?

Open Cosydays opened this issue 3 years ago • 21 comments

Assumption A is a struct with fields {birthday, name, id}, i got instanceA from unmarshal input data. I want to TaintTracking the field "instanceA.birthday", How do I write the ql to location the source node?

Cosydays avatar Aug 03 '22 08:08 Cosydays

Could you write a code example showing the taint flow you want to identify?

smowton avatar Aug 03 '22 08:08 smowton

@smowton func (s *AccessControlServiceImpl) TestCodeQL(ctx context.Context, req *access_control.TestCodeQLRequest) (resp *access_control.TestCodeQLResponse, err error) { resp = &access_control.TestCodeQLResponse{} res, err := method.AnotherFunc(ctx, req.Birthday) _, err = method.AnotherFunc2(ctx, res) return resp, nil }

type TestCodeQLRequest struct { Birthday string thrift:"Birthday,1,required" json:"Birthday" Name string thrift:"Name,2,required" json:"Name" ID string thrift:"ID,3,required" json:"ID" Base *base.Base thrift:"Base,255" json:"Base,omitempty" }

just like this, input "req" is a struct, i just care about the field "Birthday" in req, how do i write the ql to location the source node

Cosydays avatar Aug 03 '22 09:08 Cosydays

The simplest thing: if we can assume TestCodeQLRequest is only ever used to carry user-controlled (tainted) data, then we can consider as a source any read of the Birthday field. For example, here's a snippet from ElazarlGoproxy.qll that identifies reads of the ProxyCtx.UserData field:

  private class UserControlledRequestData extends UntrustedFlowSource::Range {
    UserControlledRequestData() {
      exists(DataFlow::FieldReadNode frn | this = frn |
        // liberally consider ProxyCtx.UserData to be untrusted; it's a data field set by a request handler
        frn.getField().hasQualifiedName(packagePath(), "ProxyCtx", "UserData")
      )
    }
  }

Is that good enough for your use case, or do you need to ensure that the struct arrived through the req parameter of a request-handler method too (i.e., is the struct sometimes used to carry non-user-controlled data?)

smowton avatar Aug 03 '22 10:08 smowton

@smowton thanks! But i have another question how can i get the packagePath() ? I write like this, but it does not work.

   class KitexParam extends UntrustedFlowSource::Range {
      KitexParam() {
          exists(DataFlow::FieldReadNode fieldReadNode | 
          this = fieldReadNode
          and
          fieldReadNode.getField().hasQualifiedName("code.byted.org/kitex/rpc_demo_liwenyuan_unu/kitex_gen/tiktok/cmpl_demo/rpc_liwenyuan", "SetDataRequest", "Birthday")
          )
      }
  }
image

Full ql is blow:

import go
import DataFlow::PathGraph

module KitexCommon {
    class FunctionModels extends TaintTracking::FunctionModel {
        FunctionInput functionInput;
        FunctionOutput functionOutput;
    
        FunctionModels() {
          (functionInput.isParameter(0) or functionInput.isParameter(1) or functionInput.isParameter(2) or functionInput.isParameter(3)  or functionInput.isParameter(4) or functionInput.isParameter(5)  or functionInput.isParameter(6) or functionInput.isParameter(7))
          and 
          (functionOutput.isResult(0) or functionOutput.isResult(1) or functionOutput.isResult(2) or functionOutput.isResult(3) or functionOutput.isResult(4) or functionOutput.isResult(5) or functionOutput.isResult(6) or functionOutput.isResult(7) )
        }
    
        override predicate hasTaintFlow(FunctionInput input, FunctionOutput output) {
          input = functionInput and output = functionOutput
        }
    }

    class KitexParam extends UntrustedFlowSource::Range {
        KitexParam() {
            exists(DataFlow::FieldReadNode fieldReadNode | 
            this = fieldReadNode
            and
            fieldReadNode.getField().hasQualifiedName("code.byted.org/kitex/rpc_demo_liwenyuan_unu/kitex_gen/tiktok/cmpl_demo/rpc_liwenyuan", "SetDataRequest", "Birthday")
            )
        }
    }
}

private class RedisSink extends DataFlow::Node {
    RedisSink() {
      exists(DataFlow::MethodCallNode callNode | 
        callNode.getAnArgument() = this
        and 
        callNode.getTarget().getQualifiedName().regexpMatch(".*Set.*")
      )  
    }
  
    override string toString() { 
      result = this.(DataFlow::ArgumentNode).getCall().(DataFlow::MethodCallNode).getReceiver().asExpr().(SelectorExpr).getSelector().toString()
    }
  }

class Configuration extends TaintTracking::Configuration {
    Configuration() { this = "test" }
    override predicate isSource(DataFlow::Node source) { source instanceof KitexCommon::KitexParam  }
    override predicate isSink(DataFlow::Node sink) { sink instanceof RedisSink }
    override predicate isAdditionalTaintStep(DataFlow::Node fromNode, DataFlow::Node toNode) {
      any(DataFlow::Write write).writesComponent(toNode.(DataFlow::PostUpdateNode).getPreUpdateNode(),  fromNode)
    }
  }

Cosydays avatar Aug 03 '22 12:08 Cosydays

Try

from DataFlow::FieldReadNode frn, string pname
where frn.getField().hasQualifiedName(pname, "SetDataRequest", "Birthday")
select pname

What actual package name is returned?

smowton avatar Aug 03 '22 12:08 smowton

@smowton thanks! The actual package name is returned, can it convert into an expression? Input (string tp, string f), Output string pname

Cosydays avatar Aug 03 '22 13:08 Cosydays

If you mean arguments to the package predicate, then either split it according to the point where any semantic versioning can occur (e.g. if we have x/v1/y and x/v2/y then use package("x", "y")), or if the package doesn't use semantic versioning you can simply use a string literal instead of the package predicate.

smowton avatar Aug 03 '22 18:08 smowton

@smowton I mean covert this ql to predicate with result, Is that okay?

from DataFlow::FieldReadNode frn, string pname
where frn.getField().hasQualifiedName(pname, "SetDataRequest", "Birthday")
select pname

like

string getPkgPath(string tp, string f){
    result = xxx
}

Cosydays avatar Aug 04 '22 03:08 Cosydays

I don't follow what you'd want tp or f to be here. If your select pname query returned x.com/y/z then you could just use frn.getField().hasQualifiedName("x.com/y/z", "SetDataRequest", "Birthday")

smowton avatar Aug 04 '22 08:08 smowton

@smowton If use your method,i need select twice, first select pname, then select the taint flow tp or f is to be SetDataRequest and Birthday

Cosydays avatar Aug 05 '22 02:08 Cosydays

I don't mean you should use my method in your query, I mean you should run my query once to discover the right package name, then thereafter hard-code the answer into your query. Your final query should look like

exists(DataFlow::FieldReadNode fieldReadNode | 
          this = fieldReadNode
          and
          fieldReadNode.getField().hasQualifiedName("insert-the-result-from-my-query-here", "SetDataRequest", "Birthday")
          )

smowton avatar Aug 05 '22 08:08 smowton

@smowton Thanks for your reply! I mean can I get the result I want in one query, that looks like i can only query twice to TaintTracking the field.

Cosydays avatar Aug 05 '22 09:08 Cosydays

My suggestion was just a debugging method, a way to discover a package name that you're not sure of. Your eventual query should be a single query like the one in my comment above. For example if my suggested debugging tool told you the package name is "x.com/y/z", your query should say

exists(DataFlow::FieldReadNode fieldReadNode | 
          this = fieldReadNode
          and
          fieldReadNode.getField().hasQualifiedName("x.com/y/z", "SetDataRequest", "Birthday")
          )

Normally you shouldn't need a second query at all because you know the package name is x.com/y/z, but if you're not sure what the package name should be you can use a second throwaway query as a way to discover it easily. This step only happens when debugging your query, not in routine use.

smowton avatar Aug 05 '22 10:08 smowton

@smowton Thanks, I just want to make this ql a bit more general. I usually query some repository that i am not familiar, so i do not know the package name.

Cosydays avatar Aug 05 '22 10:08 Cosydays

So what do you know about the field you want to target? You don't know the package name; do you know the type name? The field name? The field's type?

smowton avatar Aug 05 '22 10:08 smowton

@smowton I just know the field Birthday is a string

Cosydays avatar Aug 06 '22 07:08 Cosydays

In that case you just have fieldReadNode.getField().getType() instanceof StringType and don't use hasQualifiedName at all (unless you really know the name is "Birthday" in which case getName() = "Birthday"), but obviously this is a very broad query and may be very expensive if you can't restrict the field-reads you're interested in any further.

smowton avatar Aug 08 '22 14:08 smowton

@smowton Can I find any string in the code, like a search in the IDE

Cosydays avatar Aug 09 '22 10:08 Cosydays

To find a string literal simply select any(StringLit sl). If you mean searching the code lexically (e.g. finding the exact char sequence "x int, y int"), then no you can't do that using CodeQL because our database describes an abstract syntax tree; it doesn't contain the literal code (though it does use hasLocationInfo to relate abstract syntax tree nodes back to the source code which could be used to search for a node corresponding to a source location)

smowton avatar Aug 09 '22 10:08 smowton

@smowton Does it have the ability to location configuration files? like ymal?

Cosydays avatar Aug 09 '22 10:08 Cosydays

Not yaml. We do extract some XML/HTML in a few languages where we have queries that can use information stored in those files, and in Java we extract .properties filess.

smowton avatar Aug 09 '22 10:08 smowton

@smowton Can I restrict FieldReadNode to a certain method/function? For example, i just care about FunctionA‘s field.

Cosydays avatar Aug 12 '22 08:08 Cosydays

You mean a read made inside function A? Like func functonA() { return someStruct.someField; }? If so then yes, you can use DataFlow::Node.getRoot() to find out what function, if any, the node occurs within.

smowton avatar Aug 12 '22 15:08 smowton

@smowton Thanks, I'm still confused about using DataFlow::Node.getRoot(). Can you give me some example? I want to a read field made inside function FuncA.

class KitexParam extends UntrustedFlowSource::Range {
    KitexParam() {
        exists(DataFlow::FieldReadNode fieldReadNode | 
        this = fieldReadNode
        and
        fieldReadNode.getField().getName().regexpMatch("(?i).*birthday.*")
        )
    }
}

Cosydays avatar Aug 15 '22 02:08 Cosydays

For example, with Go source:

package xyz

type hasBday struct {

  birthday string;

}

func seeThis(b hasBday) string {

  return b.birthday

}

func dontSeeThis(b hasBday) string {

  return b.birthday

}

If I only want to see the read inside func seeThis, I can adapt your QL like:

class KitexParam extends UntrustedFlowSource::Range {
  KitexParam() {
      exists(DataFlow::FieldReadNode fieldReadNode |
      this = fieldReadNode
      and
      fieldReadNode.getField().getName().regexpMatch("(?i).*birthday.*")
      and fieldReadNode.getRoot().(FuncDef).getName() = "seeThis"
      )
  }
}

from KitexParam p select p

smowton avatar Aug 15 '22 08:08 smowton

@smowton Thank you for your reply! But i have another question: code is here: i want to track userID, FieldReadNode seems does not work, Is there any other way to define this fielduserIDimage image My ql is

class KitexParam extends UntrustedFlowSource::Range {
    KitexParam() {
        exists(DataFlow::FieldReadNode fieldReadNode | 
        this = fieldReadNode
        and
        fieldReadNode.getField().getName().regexpMatch("(?i).*userID.*")
        and
        fieldReadNode.getRoot().(FuncDef).getName() = "UpdateUserPrivate"
        )
    }
}

Cosydays avatar Aug 15 '22 09:08 Cosydays

Your problem is that the actual field read doesn't occur in UpdateUserPrivate, it occurs inside GetUserId. The thing you highlight with a box is not a field but rather a local variable.

smowton avatar Aug 15 '22 10:08 smowton

@smowton Yes, you are right. I want to track the userId in UpdateUserPrivateRequest, is there any way? I think this is a very common scenario in TaintTracking

Cosydays avatar Aug 15 '22 10:08 Cosydays

You can find reads from such local variables using a ReadNode rather than the more specific FieldReadNode.

For example:

from DataFlow::ReadNode rn
where rn.reads(any(LocalVariable lv | lv.getName() = "getsName"))
select rn

This identifies the read from the local variable getsName in the context:

package xyz

func test(p string) { 

  getsName := p
  f(getsName)

}

func f(s string) { }

smowton avatar Aug 15 '22 11:08 smowton

@smowton Thank you for your reply, ReadNode seems to work. But I don't think it's common enough. If I want to track the UpdateUserPrivateRequest.UserId, I first need to determine if it is assigned to a local variable to determine whether to use ReadNode or FieldReadNode . image

Cosydays avatar Aug 15 '22 12:08 Cosydays