klee_make_symbolic bug?

Discussion:

Mark R. Tuttle

2014-08-13 20:41:25 UTC

Is this a bug in how klee_make_symbolic(adr,size,name) handles the name
string?

I want to modify ktest-tool so complex data structures are easier to read.
I want an array of integers to print in a way that looks like an array of
integers. This leads me to experiment with automating construction of
names like "x[0]" and "x[1]" for klee_make_symbolic to insert into the
*.ktest files.

Consider the following code demo.c

#include <klee/klee.h>

int main(int argc, char* argv[]) {
int array[2];

#if BLOCK==1
array[0] = klee_int("x[0]");
array[1] = klee_int("x[1]");
#endif

#if BLOCK==2
int x;
klee_make_symbolic(&x,sizeof(x),"x[0]");
array[0] = x;
klee_make_symbolic(&x,sizeof(x),"x[1]");
array[1] = x;
#endif

#if BLOCK==3
int x0;
klee_make_symbolic(&x0,sizeof(x0),"x[0]");
array[0] = x0;
int x1;
klee_make_symbolic(&x1,sizeof(x1),"x[1]");
array[1] = x1;
#endif

return 0;
}

Block 1 feels like natural code for inserting symbolic integers into an
array. Block 2 feels like a natural in-lining of the klee_int function
that allocates x on the stack, calls klee_make_symbolic, and returns x.
Block 3 is block 2 without the reuse of x (simulating the result of x on
the stack).

So I expect them all to use "x[0]" and "x[1]" as the names for array[0]
and array[1], but that's not what I get running the latest KLEE with
LLVM/CLANG 3.3:

Block 1

$LLVM/bin/clang -DBLOCK=1 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name

object 0: name: 'x[0]'
object 1: name: 'x[1]'

Block 2

$LLVM/bin/clang -DBLOCK=2 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name

object 0: name: 'x[1]'
object 1: name: 'x[1]'

Block 3

$LLVM/bin/clang -DBLOCK=3 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name

object 0: name: 'x[0]'
object 1: name: 'x[1]'

Is this a bug or am I just doing something stupid?

Thanks,
Mark

Chaoqiang Zhang

2014-08-14 06:18:32 UTC

Permalink

klee will reset the memory object name if you are using the same
variable(the same memory address), see function
SpecialFunctionHandler::handleMakeSymbolic,
So, in this case, your second name overwrite the first one. But the good
thing is that klee also has ObjectState::Array member hold another name, so
if you change the line 1) below inside Executor::getSymbolicSolution into
2). It can output the correct name. But, I am not sure if this is the right
solution.

1) res.push_back(std::make_pair(state.symbolics[i].first->name, values[i]));

====>>>>>

2) res.push_back(std::make_pair(objects[i]->name, values[i]));

Post by Mark R. Tuttle
Is this a bug in how klee_make_symbolic(adr,size,name) handles the name
string?
I want to modify ktest-tool so complex data structures are easier to read.
I want an array of integers to print in a way that looks like an array of
integers. This leads me to experiment with automating construction of
names like "x[0]" and "x[1]" for klee_make_symbolic to insert into the
*.ktest files.
Consider the following code demo.c
#include <klee/klee.h>
int main(int argc, char* argv[]) {
int array[2];
#if BLOCK==1
array[0] = klee_int("x[0]");
array[1] = klee_int("x[1]");
#endif
#if BLOCK==2
int x;
klee_make_symbolic(&x,sizeof(x),"x[0]");
array[0] = x;
klee_make_symbolic(&x,sizeof(x),"x[1]");
array[1] = x;
#endif
#if BLOCK==3
int x0;
klee_make_symbolic(&x0,sizeof(x0),"x[0]");
array[0] = x0;
int x1;
klee_make_symbolic(&x1,sizeof(x1),"x[1]");
array[1] = x1;
#endif
return 0;
}
Block 1 feels like natural code for inserting symbolic integers into an
array. Block 2 feels like a natural in-lining of the klee_int function
that allocates x on the stack, calls klee_make_symbolic, and returns x.
Block 3 is block 2 without the reuse of x (simulating the result of x on
the stack).
So I expect them all to use "x[0]" and "x[1]" as the names for array[0]
and array[1], but that's not what I get running the latest KLEE with
Block 1
$LLVM/bin/clang -DBLOCK=1 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[0]'
object 1: name: 'x[1]'
Block 2
$LLVM/bin/clang -DBLOCK=2 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[1]'
object 1: name: 'x[1]'
Block 3
$LLVM/bin/clang -DBLOCK=3 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[0]'
object 1: name: 'x[1]'
Is this a bug or am I just doing something stupid?
Thanks,
Mark
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Mark R. Tuttle

2014-08-14 15:08:00 UTC

Permalink

Thanks. I believe you.

My reason for writing, though, was that Block 1 should also be using the
same memory address for the two name strings, but it works without
suffering from the overwrite you describe.

Specifically, there is no use of the stack between the two calls to
klee_int, so the memory used on the stack should be the same by the two
invocations to klee_int. I'm guessing the stack is treated differently.

Thanks,
Mark

Post by Chaoqiang Zhang
klee will reset the memory object name if you are using the same
variable(the same memory address), see function SpecialFunctionHandler::handleMakeSymbolic,
So, in this case, your second name overwrite the first one. But the good
thing is that klee also has ObjectState::Array member hold another name, so
if you change the line 1) below inside Executor::getSymbolicSolution into
2). It can output the correct name. But, I am not sure if this is the right
solution.
1) res.push_back(std::make_pair(state.symbolics[i].first->name, values[i]));
====>>>>>
2) res.push_back(std::make_pair(objects[i]->name, values[i]));

Post by Mark R. Tuttle
Is this a bug in how klee_make_symbolic(adr,size,name) handles the name
string?
I want to modify ktest-tool so complex data structures are easier to
read. I want an array of integers to print in a way that looks like an
array of integers. This leads me to experiment with automating
construction of names like "x[0]" and "x[1]" for klee_make_symbolic to
insert into the *.ktest files.
Consider the following code demo.c
#include <klee/klee.h>
int main(int argc, char* argv[]) {
int array[2];
#if BLOCK==1
array[0] = klee_int("x[0]");
array[1] = klee_int("x[1]");
#endif
#if BLOCK==2
int x;
klee_make_symbolic(&x,sizeof(x),"x[0]");
array[0] = x;
klee_make_symbolic(&x,sizeof(x),"x[1]");
array[1] = x;
#endif
#if BLOCK==3
int x0;
klee_make_symbolic(&x0,sizeof(x0),"x[0]");
array[0] = x0;
int x1;
klee_make_symbolic(&x1,sizeof(x1),"x[1]");
array[1] = x1;
#endif
return 0;
}
Block 1 feels like natural code for inserting symbolic integers into an
array. Block 2 feels like a natural in-lining of the klee_int function
that allocates x on the stack, calls klee_make_symbolic, and returns x.
Block 3 is block 2 without the reuse of x (simulating the result of x on
the stack).
So I expect them all to use "x[0]" and "x[1]" as the names for array[0]
and array[1], but that's not what I get running the latest KLEE with
Block 1
$LLVM/bin/clang -DBLOCK=1 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[0]'
object 1: name: 'x[1]'
Block 2
$LLVM/bin/clang -DBLOCK=2 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[1]'
object 1: name: 'x[1]'
Block 3
$LLVM/bin/clang -DBLOCK=3 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[0]'
object 1: name: 'x[1]'
Is this a bug or am I just doing something stupid?
Thanks,
Mark
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Chaoqiang Zhang

2014-08-14 16:52:45 UTC

Permalink

Post by Mark R. Tuttle

Post by Chaoqiang Zhang

Post by Mark R. Tuttle

Post by Mark R. Tuttle
I'm guessing the stack is treated differently.

Yes, good observation. I checked and found in klee_int, the "x" variable
got different addresses for different calls. You may find it by print &x.

Post by Mark R. Tuttle
Thanks. I believe you.
My reason for writing, though, was that Block 1 should also be using the
same memory address for the two name strings, but it works without
suffering from the overwrite you describe.
Specifically, there is no use of the stack between the two calls to
klee_int, so the memory used on the stack should be the same by the two
invocations to klee_int. I'm guessing the stack is treated differently.
Thanks,
Mark
On Thu, Aug 14, 2014 at 2:18 AM, Chaoqiang Zhang <

Post by Chaoqiang Zhang
klee will reset the memory object name if you are using the same
variable(the same memory address), see function SpecialFunctionHandler::handleMakeSymbolic,
So, in this case, your second name overwrite the first one. But the good
thing is that klee also has ObjectState::Array member hold another name, so
if you change the line 1) below inside Executor::getSymbolicSolution
into 2). It can output the correct name. But, I am not sure if this is the
right solution.
1) res.push_back(std::make_pair(state.symbolics[i].first->name, values[i]));
====>>>>>
2) res.push_back(std::make_pair(objects[i]->name, values[i]));

Post by Mark R. Tuttle
Is this a bug in how klee_make_symbolic(adr,size,name) handles the name
string?
I want to modify ktest-tool so complex data structures are easier to
read. I want an array of integers to print in a way that looks like an
array of integers. This leads me to experiment with automating
construction of names like "x[0]" and "x[1]" for klee_make_symbolic to
insert into the *.ktest files.
Consider the following code demo.c
#include <klee/klee.h>
int main(int argc, char* argv[]) {
int array[2];
#if BLOCK==1
array[0] = klee_int("x[0]");
array[1] = klee_int("x[1]");
#endif
#if BLOCK==2
int x;
klee_make_symbolic(&x,sizeof(x),"x[0]");
array[0] = x;
klee_make_symbolic(&x,sizeof(x),"x[1]");
array[1] = x;
#endif
#if BLOCK==3
int x0;
klee_make_symbolic(&x0,sizeof(x0),"x[0]");
array[0] = x0;
int x1;
klee_make_symbolic(&x1,sizeof(x1),"x[1]");
array[1] = x1;
#endif
return 0;
}
Block 1 feels like natural code for inserting symbolic integers into an
array. Block 2 feels like a natural in-lining of the klee_int function
that allocates x on the stack, calls klee_make_symbolic, and returns x.
Block 3 is block 2 without the reuse of x (simulating the result of x on
the stack).
So I expect them all to use "x[0]" and "x[1]" as the names for array[0]
and array[1], but that's not what I get running the latest KLEE with
Block 1
$LLVM/bin/clang -DBLOCK=1 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[0]'
object 1: name: 'x[1]'
Block 2
$LLVM/bin/clang -DBLOCK=2 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[1]'
object 1: name: 'x[1]'
Block 3
$LLVM/bin/clang -DBLOCK=3 -c -emit-llvm -I $KLEE/include -o demo.bc demo.c
$KLEE/bin/klee demo.bc
$KLEE/bin/ktest-tool klee-last/*.ktest | grep name
object 0: name: 'x[0]'
object 1: name: 'x[1]'
Is this a bug or am I just doing something stupid?
Thanks,
Mark
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Daniel Liew

2014-08-14 17:08:16 UTC

Permalink

But the string you pass into klee_int() specifies the string name for
that symbolic and that is exactly what you are observing.

Post by Mark R. Tuttle
Specifically, there is no use of the stack between the two calls to
klee_int, so the memory used on the stack should be the same by the two
invocations to klee_int. I'm guessing the stack is treated differently.

No. klee_int() is part of KLEE's runtime and can be seen at [1]

calling into klee_int() will cause KLEE to enter the klee_int()
function and a variable on klee_int's stack is used as the address of
the symbolic. Even if klee_int() was inlined this would still work
correctly.

[1] https://github.com/klee/klee/blob/master/runtime/Intrinsic/klee_int.c

You might find it informative to look at the LLVM IR that is actually
being executed (look at klee-last/assembly.ll).

Thanks,
Dan.

Chaoqiang Zhang

2014-08-14 17:22:52 UTC

Permalink

Post by Mark R. Tuttle

Post by Daniel Liew
calling into klee_int() will cause KLEE to enter the klee_int()
function and a variable on klee_int's stack is used as the address of
the symbolic. Even if klee_int() was inlined this would still work

correctly
More specifically, the variable "x" in klee_int is actually allocated with
malloc under klee vm instead of automatic memory allocation like standard
machine. see Executor::executeAlloc-> MemoryManager::allocate call chain. I
am also thinking if we can assume klee_int can always give you different
addresses, what if two malloc give you the same address if there are some
free between them. please correct me if I am wrong

Post by Mark R. Tuttle

Post by Mark R. Tuttle
Thanks. I believe you.
My reason for writing, though, was that Block 1 should also be using the
same memory address for the two name strings, but it works without

suffering

Post by Mark R. Tuttle
from the overwrite you describe.

But the string you pass into klee_int() specifies the string name for
that symbolic and that is exactly what you are observing.

No. klee_int() is part of KLEE's runtime and can be seen at [1]
calling into klee_int() will cause KLEE to enter the klee_int()
function and a variable on klee_int's stack is used as the address of
the symbolic. Even if klee_int() was inlined this would still work
correctly.
[1] https://github.com/klee/klee/blob/master/runtime/Intrinsic/klee_int.c
You might find it informative to look at the LLVM IR that is actually
being executed (look at klee-last/assembly.ll).
Thanks,
Dan.