Generate all solutions

Discussion:

Sylvain Gault

2014-08-02 21:34:50 UTC

Hello,

I would like to use klee as a front-end to an SMT solver. Basically,
what I currently do is:
- declare input as symbolic;
- write the computation in C;
- test whether the output is the one I want with an "if";
- make klee find an input that activate the "true" branch of the if.

And it works quite well, it produce a solution.
But then I would like to obtain all the input assignments that would
activate that same path.

Is there an easy way to do this?
I have the feeling that a hack-ish way to do this would be to get the
query to kleaver that activate that path (I think I saw an option for
that) and feed it repetidely to kleaver (or another SMT solver), each
time adding a clause negating the previous solutionS.

I think this is a hackish solution because the solver would probably
make the same computation over and over again. While outputing the
solution and backtracking to find the next one directly would probably
be a lot more efficient. But I may be wrong.

Thanks in advance.

Regards,
Sylvain Gault

Daniel Dunbar

2014-09-12 02:54:10 UTC

Permalink

One "hack" to do this is to simply loop over all of bytes of the symbolic
variable, and test each bit in them (you probably can't let the optimizer
run on this code though, as it will figure out what is happening and remove
it). For example:
--
#include <stdio.h>
#include <klee/klee.h>

signed char concretizeByte(signed char input) {
unsigned char result = 0;
for (int i = 0; i != 8; ++i) {
if (input & (1<<i))
result |= 1<<i;
}
return result;
}

int main() {
signed char x, y;
klee_make_symbolic(&x, sizeof(x), "x");
klee_make_symbolic(&y, sizeof(y), "y");

if (56 * x + 72 * y == 40) {
printf("56 * (%d) + 72 * (%d) == 40\n", concretizeByte(x),
concretizeByte(y));
}

return 0;
}
--
will produce all possible solutions to the Diophantine equation 56 * x + 72
* y = 40.

This probably won't be fast. :)

- Daniel

Post by Sylvain Gault
Hello,
I would like to use klee as a front-end to an SMT solver. Basically,
- declare input as symbolic;
- write the computation in C;
- test whether the output is the one I want with an "if";
- make klee find an input that activate the "true" branch of the if.
And it works quite well, it produce a solution.
But then I would like to obtain all the input assignments that would
activate that same path.
Is there an easy way to do this?
I have the feeling that a hack-ish way to do this would be to get the
query to kleaver that activate that path (I think I saw an option for
that) and feed it repetidely to kleaver (or another SMT solver), each
time adding a clause negating the previous solutionS.
I think this is a hackish solution because the solver would probably
make the same computation over and over again. While outputing the
solution and backtracking to find the next one directly would probably
be a lot more efficient. But I may be wrong.
Thanks in advance.
Regards,
Sylvain Gault
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Sylvain Gault

2014-09-12 04:12:44 UTC

Permalink

I have the feeling this solution is interesting but I don't understand
what happen for klee.

Sylvain Gault

Post by Daniel Dunbar
One "hack" to do this is to simply loop over all of bytes of the symbolic
variable, and test each bit in them (you probably can't let the optimizer run
on this code though, as it will figure out what is happening and remove it).
--
#include <stdio.h>
#include <klee/klee.h>
signed char concretizeByte(signed char input) {
unsigned char result = 0;
for (int i = 0; i != 8; ++i) {
if (input & (1<<i))
result |= 1<<i;
}
return result;
}
int main() {
signed char x, y;
klee_make_symbolic(&x, sizeof(x), "x");
klee_make_symbolic(&y, sizeof(y), "y");
if (56 * x + 72 * y == 40) {
printf("56 * (%d) + 72 * (%d) == 40\n", concretizeByte(x),
concretizeByte(y));
}
return 0;
}
--
will produce all possible solutions to the Diophantine equation 56 * x + 72 * y
= 40.
This probably won't be fast. :)
- Daniel
Hello,
I would like to use klee as a front-end to an SMT solver. Basically,
- declare input as symbolic;
- write the computation in C;
- test whether the output is the one I want with an "if";
- make klee find an input that activate the "true" branch of the if.
And it works quite well, it produce a solution.
But then I would like to obtain all the input assignments that would
activate that same path.
Is there an easy way to do this?
I have the feeling that a hack-ish way to do this would be to get the
query to kleaver that activate that path (I think I saw an option for
that) and feed it repetidely to kleaver (or another SMT solver), each
time adding a clause negating the previous solutionS.
I think this is a hackish solution because the solver would probably
make the same computation over and over again. While outputing the
solution and backtracking to find the next one directly would probably
be a lot more efficient. But I may be wrong.
Thanks in advance.
Regards,
Sylvain Gault
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Kuchta, Tomasz

2014-09-12 07:11:38 UTC

Permalink

Hi Sylvain,

Regarding your question from the first e-mail - as far as I know there is
no easy way to get all possible values, except
for adding inequalities to the set of constraints, but if there exists one
I would be happy to learn about that.
One possible issue related to this problem is when you have non-monotonic
range of values, let¹s say 1..2 and 5..10 are SAT.

If I understand Daniel¹s example (which is very nice) correctly, what
happens is:
1. You mark variables x and y as symbolic

2. In concretizeByte function you iterate over all bits of the input
parameter (which is a symbolic single byte char); this is done in the
loop, by retrieving value of each bit through a bit mask

3. Since the input parameter ³input² to the function concretizeByte is
symbolic, ³if² statement inside the function will cause forking: ³then²
side will be for a given bit being ³1², ³else² side will be for a given
bit being ³0²; in order to fork, we need to call the solver to know
whether the other side of the branch is feasible

4. Since we iterate over all bits of a variable and fork at each bit, what
you get as a result are all possible combinations of 0s and 1s (see point
5)

5. Important part here is that before we call concretizeByte there is an
³if² statement inside ³main² that already puts some constraints on ³x² and
³y², so in concretizeByte not all combinations of bits will be possible

HTH,

Best regards,
Tomek

Post by Sylvain Gault
I have the feeling this solution is interesting but I don't understand
what happen for klee.
Sylvain Gault

_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Sylvain Gault

2014-10-06 02:27:50 UTC

Permalink

Thanks for the explanation, and sorry for the late reply.
It was just not clear to me when the solver is actually called and what
it computes.

So, what I understand is that the solver is called every time there is a
conditional branching, and it just test for the feasability of both
branches. It doesn't provide a value for a symbolic input when the
solution is unique, it doesn't provide additional constraints for the
symbolic values (no constraint propagation).

It may, however, provide an example input when the program terminate or
when an error is encountered.

And the way concretizeByte work is that it may fork for every bit, thus
making a maximum of 256 subprocess, one for every possible value of the
byte. And that's why all the solutions should be printed. And the solver
is called before forking so that it doesn't fork if one process would
explore an unfeasible path. Right?

So it should theoretically only create as many process as the number of
solutions. Right?

Also, I should be able to write x = concretizeByte(x), isn't it?

I'm testing this right now, it seems to work, but is still very slow.
I guess it's the repeated invocations of the solver that takes some time
to do similar computations each time. That being said, compiling with
-O1 helps.

Sylvain

Post by Kuchta, Tomasz
Hi Sylvain,
Regarding your question from the first e-mail - as far as I know there is
no easy way to get all possible values, except
for adding inequalities to the set of constraints, but if there exists one
I would be happy to learn about that.
One possible issue related to this problem is when you have non-monotonic
range of values, let¹s say 1..2 and 5..10 are SAT.
If I understand Daniel¹s example (which is very nice) correctly, what
1. You mark variables x and y as symbolic
2. In concretizeByte function you iterate over all bits of the input
parameter (which is a symbolic single byte char); this is done in the
loop, by retrieving value of each bit through a bit mask
3. Since the input parameter ³input² to the function concretizeByte is
symbolic, ³if² statement inside the function will cause forking: ³then²
side will be for a given bit being ³1², ³else² side will be for a given
bit being ³0²; in order to fork, we need to call the solver to know
whether the other side of the branch is feasible
4. Since we iterate over all bits of a variable and fork at each bit, what
you get as a result are all possible combinations of 0s and 1s (see point
5)
5. Important part here is that before we call concretizeByte there is an
³if² statement inside ³main² that already puts some constraints on ³x² and
³y², so in concretizeByte not all combinations of bits will be possible
HTH,
Best regards,
Tomek

Post by Sylvain Gault
I have the feeling this solution is interesting but I don't understand
what happen for klee.
Sylvain Gault

_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Kuchta, Tomasz

2014-10-06 05:21:11 UTC

Permalink

Hi Sylvain,

Post by Sylvain Gault
Thanks for the explanation, and sorry for the late reply.
It was just not clear to me when the solver is actually called and what
it computes.
So, what I understand is that the solver is called every time there is a
conditional branching, and it just test for the feasability of both
branches. It doesn't provide a value for a symbolic input when the
solution is unique, it doesn't provide additional constraints for the
symbolic values (no constraint propagation).
It may, however, provide an example input when the program terminate or
when an error is encountered.
And the way concretizeByte work is that it may fork for every bit, thus
making a maximum of 256 subprocess, one for every possible value of the
byte. And that's why all the solutions should be printed. And the solver
is called before forking so that it doesn't fork if one process would
explore an unfeasible path. Right?

One note: the term âforkâ in KLEE means that you create new
ExecutionState, which is a representation
of a certain execution path (it contains a âsnapshotâ of program state for
that path). KLEE doesnât spawn
a new processes for each path when forking (new process can, however, be
spawned for a call to the solver, AFAIK).
Yes, as far as I know paths are checked for feasibility to avoid going
into a path
that would not be possible in any concrete execution. I think that whether
the actual solver is called or
not might depend on a certain situation - for example you can have some
sort of results caching that could
prevent from calling the solver or you could have a trivial branch point,
that is uni-directional (so itâs a branch
instruction which only performs a jump).
Also I think that you can have solver calls that are not obvious from the
source code - for example a check for division by zero
has an âifâ statement; solver chain can also be invoked when performing
memory operations.

Post by Sylvain Gault
So it should theoretically only create as many process as the number of
solutions. Right?

Yes, it should explore paths for the possible executions. Please note that
in general for large programs and when
much data is marked as symbolic, exploring all the paths might not be
possible in practice due to the path explosion problem.
You may want to try klee-stats tool that also prints some query statistics.

Post by Sylvain Gault
Also, I should be able to write x = concretizeByte(x), isn't it?

It seems to me that after this call x will stop being symbolic, because it
will be assigned a concrete value in the function.

Post by Sylvain Gault
I'm testing this right now, it seems to work, but is still very slow.
I guess it's the repeated invocations of the solver that takes some time
to do similar computations each time. That being said, compiling with
-O1 helps.
Sylvain

HTH,
Best regards,
Tomek

Post by Sylvain Gault

Post by Kuchta, Tomasz
Hi Sylvain,
Regarding your question from the first e-mail - as far as I know there is
no easy way to get all possible values, except
for adding inequalities to the set of constraints, but if there exists one
I would be happy to learn about that.
One possible issue related to this problem is when you have
non-monotonic
range of values, letÂ¹s say 1..2 and 5..10 are SAT.
If I understand DanielÂ¹s example (which is very nice) correctly, what
1. You mark variables x and y as symbolic
2. In concretizeByte function you iterate over all bits of the input
parameter (which is a symbolic single byte char); this is done in the
loop, by retrieving value of each bit through a bit mask
3. Since the input parameter Â³inputÂ² to the function concretizeByte is
symbolic, Â³ifÂ² statement inside the function will cause forking: Â³thenÂ²
side will be for a given bit being Â³1Â², Â³elseÂ² side will be for a given
bit being Â³0Â²; in order to fork, we need to call the solver to know
whether the other side of the branch is feasible
4. Since we iterate over all bits of a variable and fork at each bit, what
you get as a result are all possible combinations of 0s and 1s (see point
5)
5. Important part here is that before we call concretizeByte there is an
Â³ifÂ² statement inside Â³mainÂ² that already puts some constraints on Â³xÂ²
and
Â³yÂ², so in concretizeByte not all combinations of bits will be possible
HTH,
Best regards,
Tomek

Post by Sylvain Gault
I have the feeling this solution is interesting but I don't understand
what happen for klee.
Sylvain Gault

remove

Post by Sylvain Gault

Post by Daniel Dunbar
it).
--
#include <stdio.h>
#include <klee/klee.h>
signed char concretizeByte(signed char input) {
unsigned char result = 0;
for (int i = 0; i != 8; ++i) {
if (input & (1<<i))
result |= 1<<i;
}
return result;
}
int main() {
signed char x, y;
klee_make_symbolic(&x, sizeof(x), "x");
klee_make_symbolic(&y, sizeof(y), "y");
if (56 * x + 72 * y == 40) {
printf("56 * (%d) + 72 * (%d) == 40\n", concretizeByte(x),
concretizeByte(y));
}
return 0;
}
--
will produce all possible solutions to the Diophantine equation 56 *

Post by Sylvain Gault

Post by Daniel Dunbar
+ 72 * y
= 40.
This probably won't be fast. :)
- Daniel
On Sat, Aug 2, 2014 at 2:34 PM, Sylvain Gault
Hello,
I would like to use klee as a front-end to an SMT solver.

Basically,

Post by Sylvain Gault

Post by Daniel Dunbar
- declare input as symbolic;
- write the computation in C;
- test whether the output is the one I want with an "if";
- make klee find an input that activate the "true" branch of the

if.

Post by Sylvain Gault

Post by Daniel Dunbar
And it works quite well, it produce a solution.
But then I would like to obtain all the input assignments that

would

Post by Sylvain Gault

Post by Daniel Dunbar
activate that same path.
Is there an easy way to do this?
I have the feeling that a hack-ish way to do this would be to get the
query to kleaver that activate that path (I think I saw an option for
that) and feed it repetidely to kleaver (or another SMT solver), each
time adding a clause negating the previous solutionS.
I think this is a hackish solution because the solver would

probably

Post by Sylvain Gault

Post by Daniel Dunbar
make the same computation over and over again. While outputing

the

Post by Sylvain Gault

Post by Daniel Dunbar
solution and backtracking to find the next one directly would probably
be a lot more efficient. But I may be wrong.
Thanks in advance.
Regards,
Sylvain Gault
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Daniel Liew

2014-09-12 10:49:00 UTC

Permalink

Just as a small note for recent versions of LLVM/Clang you could
probably use the "optnone" attribute [1] on the function to prevent
the optimizer from modifying the "concretizeByte()" function.

[1] http://clang.llvm.org/docs/AttributeReference.html#optnone-clang-optnone