Skip to content

CPP: IR inconsistency with global variable and address-of related to constant-folding(?) #21241

@intrigus-lgtm

Description

@intrigus-lgtm

Hey, I was trying to understand why my query didn't find specific alerts.
Basically, I wanted to find uses of &foo which are being checked for whether it is (un)equal to NULL, because &foo should never be NULL in UB-free code (as far as I understand the standard).
However, it failed for if (&global_var == NULL) while it works for if (&global_var) and local variables and parameter variables.

(I thought that maybe CodeQL does some similar optimizations where it knows that &foo is always non-NULL (but only for global variables) and that's why it's not being detected, but that doesn't really make sense, because equivalent formulations were also not detected)
So I looked at the IR graph using this query:

/**
 * @name ff
 * @description aa
 * @kind graph
 * @id aaaaa
 * @tags security
 */

import cpp
import semmle.code.cpp.ir.IR
import semmle.code.cpp.ir.PrintIR
import semmle.code.cpp.Declaration

private class Foo extends PrintIRConfiguration {
  override predicate shouldPrintDeclaration(Declaration decl) {
    decl.(Function).getName() = "test_global_variables"
  }
}

and then running codeql database analyze --format=dot --output=lol.dot PATH_TO_TEST_DB IR_Print.ql which gave me the dot file in this issue.

And importantly, there is a big difference between the same patterns and global_var, local_var (and param_var).
For global_var there is some constant folding being done, but also not for all cases.
I don't know whether the folding is intentional and local and parameter vars just not yet perform this folding?
Anyway, is there a way to "fix" this problem and if so, how?

Version: 2.24.0

Testfile
#define NULL ((void*)0)

int global_var;

// Test global variables
void test_global_variables(int param_var) {
    int local_var;

    if (&global_var == NULL) {}      // $ Missing: Alert
    if (&global_var != NULL) {}      // $ Missing: Alert
    if (&global_var) {}              // $ Alert
    if (!&global_var) {}             // $ Missing: Alert

    if (&local_var == NULL) {}       // $ Alert
    if (&local_var != NULL) {}       // $ Alert
    if (&local_var) {}               // $ Alert
    if (!&local_var) {}              // $ Alert

    if (&param_var == NULL) {}       // $ Alert
    if (&param_var != NULL) {}       // $ Alert
    if (&param_var) {}               // $ Alert
    if (!&param_var) {}              // $ Alert
}
IR Graph dot
digraph {
  compound=true;
  subgraph cluster_0 {
    0[shape=point, width=0];
    "label"="void test_global_variables(int)"; 
    
    subgraph cluster_1 {
      1[shape=point, width=0];
      "label"="Block 0"; 
      
      2["label"="v6_1(void) = EnterFunction : "; ];
      3["label"="m6_2(unknown) = AliasedDefinition : "; ];
      4["label"="m6_3(unknown) = InitializeNonLocal : "; ];
      5["label"="m6_4(unknown) = Chi : total:m6_2, partial:m6_3"; ];
      6["label"="r6_5(glval<int>) = VariableAddress[param_var] : "; ];
      7["label"="m6_6(int) = InitializeParameter[param_var] : &:r6_5"; ];
      8["label"="r7_1(glval<int>) = VariableAddress[local_var] : "; ];
      9["label"="m7_2(int) = Uninitialized[local_var] : &:r7_1"; ];
      10["label"="r9_1(int) = Constant[0] : "; ];
      11["label"="r9_2(int) = Constant[0] : "; ];
      12["label"="r9_3(bool) = CompareNE : r9_1, r9_2"; ];
      13["label"="v9_4(void) = ConditionalBranch : r9_3"; ];
    }
    subgraph cluster_14 {
      14[shape=point, width=0];
      "label"="Block 1"; 
      
      15["label"="r10_1(int) = Constant[1] : "; ];
      16["label"="r10_2(int) = Constant[0] : "; ];
      17["label"="r10_3(bool) = CompareNE : r10_1, r10_2"; ];
      18["label"="v10_4(void) = ConditionalBranch : r10_3"; ];
    }
    subgraph cluster_19 {
      19[shape=point, width=0];
      "label"="Block 2"; 
      
      20["label"="v10_5(void) = NoOp : "; ];
      21["label"="r11_1(glval<int>) = VariableAddress[global_var] : "; ];
      22["label"="r11_2(int *) = CopyValue : r11_1"; ];
      23["label"="r11_3(int *) = Constant[0] : "; ];
      24["label"="r11_4(bool) = CompareNE : r11_2, r11_3"; ];
      25["label"="v11_5(void) = ConditionalBranch : r11_4"; ];
    }
    subgraph cluster_26 {
      26[shape=point, width=0];
      "label"="Block 3"; 
      
      27["label"="v11_6(void) = NoOp : "; ];
    }
    subgraph cluster_28 {
      28[shape=point, width=0];
      "label"="Block 4"; 
      
      29["label"="r12_1(int) = Constant[0] : "; ];
      30["label"="r12_2(int) = Constant[0] : "; ];
      31["label"="r12_3(bool) = CompareNE : r12_1, r12_2"; ];
      32["label"="v12_4(void) = ConditionalBranch : r12_3"; ];
    }
    subgraph cluster_33 {
      33[shape=point, width=0];
      "label"="Block 5"; 
      
      34["label"="r14_1(glval<int>) = VariableAddress[local_var] : "; ];
      35["label"="r14_2(int *) = CopyValue : r14_1"; ];
      36["label"="r14_3(int *) = Constant[0] : "; ];
      37["label"="r14_4(bool) = CompareEQ : r14_2, r14_3"; ];
      38["label"="v14_5(void) = ConditionalBranch : r14_4"; ];
    }
    subgraph cluster_39 {
      39[shape=point, width=0];
      "label"="Block 6"; 
      
      40["label"="v14_6(void) = NoOp : "; ];
    }
    subgraph cluster_41 {
      41[shape=point, width=0];
      "label"="Block 7"; 
      
      42["label"="r15_1(glval<int>) = VariableAddress[local_var] : "; ];
      43["label"="r15_2(int *) = CopyValue : r15_1"; ];
      44["label"="r15_3(int *) = Constant[0] : "; ];
      45["label"="r15_4(bool) = CompareNE : r15_2, r15_3"; ];
      46["label"="v15_5(void) = ConditionalBranch : r15_4"; ];
    }
    subgraph cluster_47 {
      47[shape=point, width=0];
      "label"="Block 8"; 
      
      48["label"="v15_6(void) = NoOp : "; ];
    }
    subgraph cluster_49 {
      49[shape=point, width=0];
      "label"="Block 9"; 
      
      50["label"="r16_1(glval<int>) = VariableAddress[local_var] : "; ];
      51["label"="r16_2(int *) = CopyValue : r16_1"; ];
      52["label"="r16_3(int *) = Constant[0] : "; ];
      53["label"="r16_4(bool) = CompareNE : r16_2, r16_3"; ];
      54["label"="v16_5(void) = ConditionalBranch : r16_4"; ];
    }
    subgraph cluster_55 {
      55[shape=point, width=0];
      "label"="Block 10"; 
      
      56["label"="v16_6(void) = NoOp : "; ];
    }
    subgraph cluster_57 {
      57[shape=point, width=0];
      "label"="Block 11"; 
      
      58["label"="r17_1(glval<int>) = VariableAddress[local_var] : "; ];
      59["label"="r17_2(int *) = CopyValue : r17_1"; ];
      60["label"="r17_3(int *) = Constant[0] : "; ];
      61["label"="r17_4(bool) = CompareEQ : r17_2, r17_3"; ];
      62["label"="v17_5(void) = ConditionalBranch : r17_4"; ];
    }
    subgraph cluster_63 {
      63[shape=point, width=0];
      "label"="Block 12"; 
      
      64["label"="v17_6(void) = NoOp : "; ];
    }
    subgraph cluster_65 {
      65[shape=point, width=0];
      "label"="Block 13"; 
      
      66["label"="r19_1(glval<int>) = VariableAddress[param_var] : "; ];
      67["label"="r19_2(int *) = CopyValue : r19_1"; ];
      68["label"="r19_3(int *) = Constant[0] : "; ];
      69["label"="r19_4(bool) = CompareEQ : r19_2, r19_3"; ];
      70["label"="v19_5(void) = ConditionalBranch : r19_4"; ];
    }
    subgraph cluster_71 {
      71[shape=point, width=0];
      "label"="Block 14"; 
      
      72["label"="v19_6(void) = NoOp : "; ];
    }
    subgraph cluster_73 {
      73[shape=point, width=0];
      "label"="Block 15"; 
      
      74["label"="r20_1(glval<int>) = VariableAddress[param_var] : "; ];
      75["label"="r20_2(int *) = CopyValue : r20_1"; ];
      76["label"="r20_3(int *) = Constant[0] : "; ];
      77["label"="r20_4(bool) = CompareNE : r20_2, r20_3"; ];
      78["label"="v20_5(void) = ConditionalBranch : r20_4"; ];
    }
    subgraph cluster_79 {
      79[shape=point, width=0];
      "label"="Block 16"; 
      
      80["label"="v20_6(void) = NoOp : "; ];
    }
    subgraph cluster_81 {
      81[shape=point, width=0];
      "label"="Block 17"; 
      
      82["label"="r21_1(glval<int>) = VariableAddress[param_var] : "; ];
      83["label"="r21_2(int *) = CopyValue : r21_1"; ];
      84["label"="r21_3(int *) = Constant[0] : "; ];
      85["label"="r21_4(bool) = CompareNE : r21_2, r21_3"; ];
      86["label"="v21_5(void) = ConditionalBranch : r21_4"; ];
    }
    subgraph cluster_87 {
      87[shape=point, width=0];
      "label"="Block 18"; 
      
      88["label"="v21_6(void) = NoOp : "; ];
    }
    subgraph cluster_89 {
      89[shape=point, width=0];
      "label"="Block 19"; 
      
      90["label"="r22_1(glval<int>) = VariableAddress[param_var] : "; ];
      91["label"="r22_2(int *) = CopyValue : r22_1"; ];
      92["label"="r22_3(int *) = Constant[0] : "; ];
      93["label"="r22_4(bool) = CompareEQ : r22_2, r22_3"; ];
      94["label"="v22_5(void) = ConditionalBranch : r22_4"; ];
    }
    subgraph cluster_95 {
      95[shape=point, width=0];
      "label"="Block 20"; 
      
      96["label"="v22_6(void) = NoOp : "; ];
    }
    subgraph cluster_97 {
      97[shape=point, width=0];
      "label"="Block 21"; 
      
      98["label"="v23_1(void) = NoOp : "; ];
      99["label"="v6_7(void) = ReturnVoid : "; ];
      100["label"="v6_8(void) = AliasedUse : m6_3"; ];
      101["label"="v6_9(void) = ExitFunction : "; ];
    }
    subgraph cluster_102 {
      102[shape=point, width=0];
      "label"="Block 22"; 
      
      103["label"="v6_10(void) = Unreached : "; ];
    }
  }
  1 -> 14["label"="False"; "ltail"="cluster_1"; "lhead"="cluster_14"; ];
  19 -> 28["label"="False"; "ltail"="cluster_19"; "lhead"="cluster_28"; ];
  26 -> 28["label"="Goto"; "ltail"="cluster_26"; "lhead"="cluster_28"; ];
  39 -> 41["label"="Goto"; "ltail"="cluster_39"; "lhead"="cluster_41"; ];
  47 -> 49["label"="Goto"; "ltail"="cluster_47"; "lhead"="cluster_49"; ];
  55 -> 57["label"="Goto"; "ltail"="cluster_55"; "lhead"="cluster_57"; ];
  63 -> 65["label"="Goto"; "ltail"="cluster_63"; "lhead"="cluster_65"; ];
  71 -> 73["label"="Goto"; "ltail"="cluster_71"; "lhead"="cluster_73"; ];
  79 -> 81["label"="Goto"; "ltail"="cluster_79"; "lhead"="cluster_81"; ];
  87 -> 89["label"="Goto"; "ltail"="cluster_87"; "lhead"="cluster_89"; ];
  95 -> 97["label"="Goto"; "ltail"="cluster_95"; "lhead"="cluster_97"; ];
  14 -> 102["label"="False"; "ltail"="cluster_14"; "lhead"="cluster_102"; ];
  28 -> 33["label"="False"; "ltail"="cluster_28"; "lhead"="cluster_33"; ];
  33 -> 41["label"="False"; "ltail"="cluster_33"; "lhead"="cluster_41"; ];
  41 -> 49["label"="False"; "ltail"="cluster_41"; "lhead"="cluster_49"; ];
  49 -> 57["label"="False"; "ltail"="cluster_49"; "lhead"="cluster_57"; ];
  57 -> 65["label"="False"; "ltail"="cluster_57"; "lhead"="cluster_65"; ];
  65 -> 73["label"="False"; "ltail"="cluster_65"; "lhead"="cluster_73"; ];
  73 -> 81["label"="False"; "ltail"="cluster_73"; "lhead"="cluster_81"; ];
  81 -> 89["label"="False"; "ltail"="cluster_81"; "lhead"="cluster_89"; ];
  89 -> 97["label"="False"; "ltail"="cluster_89"; "lhead"="cluster_97"; ];
  1 -> 102["label"="True"; "ltail"="cluster_1"; "lhead"="cluster_102"; ];
  19 -> 26["label"="True"; "ltail"="cluster_19"; "lhead"="cluster_26"; ];
  14 -> 19["label"="True"; "ltail"="cluster_14"; "lhead"="cluster_19"; ];
  28 -> 102["label"="True"; "ltail"="cluster_28"; "lhead"="cluster_102"; ];
  33 -> 39["label"="True"; "ltail"="cluster_33"; "lhead"="cluster_39"; ];
  41 -> 47["label"="True"; "ltail"="cluster_41"; "lhead"="cluster_47"; ];
  49 -> 55["label"="True"; "ltail"="cluster_49"; "lhead"="cluster_55"; ];
  57 -> 63["label"="True"; "ltail"="cluster_57"; "lhead"="cluster_63"; ];
  65 -> 71["label"="True"; "ltail"="cluster_65"; "lhead"="cluster_71"; ];
  73 -> 79["label"="True"; "ltail"="cluster_73"; "lhead"="cluster_79"; ];
  81 -> 87["label"="True"; "ltail"="cluster_81"; "lhead"="cluster_87"; ];
  89 -> 95["label"="True"; "ltail"="cluster_89"; "lhead"="cluster_95"; ];
}
Impacted query
/**
 * @name Suspicious check on address-of expression
 * @description Taking the address of a valid object always produces a non-NULL
 *              pointer, so checking the result against NULL is either redundant
 *              or indicates a bug where a different check was intended.
 * @kind problem
 * @problem.severity warning
 * @precision high
 * @id asymmetric-research/suspicious-address-of-null-check
 * @tags reliability
 *       correctness
 *       readability
 */

import cpp
import semmle.code.cpp.ir.dataflow.MustFlow
import semmle.code.cpp.controlflow.Guards

private class MustFlowConfig extends MustFlowConfiguration {
  MustFlowConfig() { this = "SuspiciousAddressOfNullCheckMustFlow" }

  override predicate isSource(Instruction source) {
    source.getUnconvertedResultExpression() instanceof ValidAddressOfExpr
  }

  override predicate isSink(Operand sink) { isNullOrNonNullCheck(sink) }

  override predicate allowInterproceduralFlow() { none() }
  override predicate isBarrier(Instruction instr) {
    instr instanceof InitializeParameterInstruction
  }
}


predicate debug(Instruction instr, string instrType/*, Expr expr, string exprType*/, int startline) {
  instr.getLocation().getStartLine() = startline and
  instr.getLocation().getStartLine() = [28,82, 83] and
  // instr.getLocation().getEndLine() <= 86 and
  // instr.getUnconvertedResultExpression() = expr and
  instrType = instr.getAQlClass() and
  any()
  // exprType = expr.getAQlClass()
}

/**
 * Technically, we'd ignore cases where we do something like &foo->first_field (where foo is a pointer)
 * this is because if foo is NULL, the expression could actually evaluate to NULL.
 * However, it doesn't seem likely that anyone would write code like that intentionally
 * and it's also undefined behavior to dereference a NULL pointer anyway.
 */
class ValidAddressOfExpr extends AddressOfExpr { }

predicate isNullOrNonNullCheck(Operand operand) {
  any(IRGuardCondition gc).comparesEq(operand, 0, _, _)
}

from MustFlowPathNode source, MustFlowPathNode sink, MustFlowConfig cfg
where
  cfg.hasFlowPath(source, sink) and
  not source.getInstruction().getUnconvertedResultExpression().isInMacroExpansion() and
  source.getInstruction().getEnclosingFunction() = sink.getInstruction().getEnclosingFunction()
select sink, "Taking the address of $@ always produces a non-NULL pointer, but it is checked $@.",
  source.getInstruction().getUnconvertedResultExpression().(ValidAddressOfExpr).getOperand(),
  "this operand", sink, "here"

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions