Discussion:
memcmp generate too many paths
Sylvain Gault
2014-07-24 00:43:28 UTC
Permalink
Hello,

It seems that when a program use memcmp, klee generate as many path as
the array size since its loop can exit anytime. While all I'm interested
in is whether the arrays are equal or not.

For instance this code would generate 5 paths.

char a1[] = {0, 1, 2, 3};
char a2[sizeof(a1)];

klee_make_symbolic(a2, sizeof(a2), "a2");

if (memcmp(a1, a2, sizeof(a1)))
printf("Winner!\n");
else
printf("Loser!\n");


I understand that during the execution of memcmp, there are different
path depending on the loop iteration that exists. But once the
function exists, most of these path end in the exact same state.

But I'm only interested in the paths in my code. So is there a way to
make klee merge the test cases that differ only in the path through the
libc?

My real case has an array of 168 bytes. So a memcmp generate 168 paths
by itself. Which means that the code after it will be run 168 times for
the exact same result. This is a bit insane.


Sylvain Gault
Peter Collingbourne
2014-07-24 00:55:41 UTC
Permalink
Post by Sylvain Gault
Hello,
It seems that when a program use memcmp, klee generate as many path as
the array size since its loop can exit anytime. While all I'm interested
in is whether the arrays are equal or not.
For instance this code would generate 5 paths.
char a1[] = {0, 1, 2, 3};
char a2[sizeof(a1)];
klee_make_symbolic(a2, sizeof(a2), "a2");
if (memcmp(a1, a2, sizeof(a1)))
printf("Winner!\n");
else
printf("Loser!\n");
I understand that during the execution of memcmp, there are different
path depending on the loop iteration that exists. But once the
function exists, most of these path end in the exact same state.
But I'm only interested in the paths in my code. So is there a way to
make klee merge the test cases that differ only in the path through the
libc?
My real case has an array of 168 bytes. So a memcmp generate 168 paths
by itself. Which means that the code after it will be run 168 times for
the exact same result. This is a bit insane.
You can write your own version of memcmp which is data-independent, and
therefore cannot branch on symbolic data. Something like:

bool memequal(char *c1, char *c2, size_t len) {
bool result = true;
while (len) {
result &= *c1 == *c2;
c1++;
c2++;
len--;
}
return result;
}

Thanks,
--
Peter
Sylvain Gault
2014-07-24 02:14:31 UTC
Permalink
Post by Peter Collingbourne
Post by Sylvain Gault
Hello,
It seems that when a program use memcmp, klee generate as many path as
the array size since its loop can exit anytime. While all I'm interested
in is whether the arrays are equal or not.
For instance this code would generate 5 paths.
char a1[] = {0, 1, 2, 3};
char a2[sizeof(a1)];
klee_make_symbolic(a2, sizeof(a2), "a2");
if (memcmp(a1, a2, sizeof(a1)))
printf("Winner!\n");
else
printf("Loser!\n");
I understand that during the execution of memcmp, there are different
path depending on the loop iteration that exists. But once the
function exists, most of these path end in the exact same state.
But I'm only interested in the paths in my code. So is there a way to
make klee merge the test cases that differ only in the path through the
libc?
My real case has an array of 168 bytes. So a memcmp generate 168 paths
by itself. Which means that the code after it will be run 168 times for
the exact same result. This is a bit insane.
You can write your own version of memcmp which is data-independent, and
bool memequal(char *c1, char *c2, size_t len) {
bool result = true;
while (len) {
result &= *c1 == *c2;
c1++;
c2++;
len--;
}
return result;
}
Thanks. That's more or less what I found while waiting for a less
hackish solution. :)

Thanks.

Sylvain Gault
Chaoqiang Zhang
2014-07-24 01:00:02 UTC
Permalink
state merging is one of the techniques for addressing this problem, there
are several papers on it, one of them is:

http://dslab.epfl.ch/pubs/stateMerging.pdf
Post by Sylvain Gault
Hello,
It seems that when a program use memcmp, klee generate as many path as
the array size since its loop can exit anytime. While all I'm interested
in is whether the arrays are equal or not.
For instance this code would generate 5 paths.
char a1[] = {0, 1, 2, 3};
char a2[sizeof(a1)];
klee_make_symbolic(a2, sizeof(a2), "a2");
if (memcmp(a1, a2, sizeof(a1)))
printf("Winner!\n");
else
printf("Loser!\n");
I understand that during the execution of memcmp, there are different
path depending on the loop iteration that exists. But once the
function exists, most of these path end in the exact same state.
But I'm only interested in the paths in my code. So is there a way to
make klee merge the test cases that differ only in the path through the
libc?
My real case has an array of 168 bytes. So a memcmp generate 168 paths
by itself. Which means that the code after it will be run 168 times for
the exact same result. This is a bit insane.
Sylvain Gault
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Sylvain Gault
2014-07-24 02:17:03 UTC
Permalink
I don't know if this is implemented in klee. But anyway, it would just
be half of the solution since it would still generate 168 paths and thus
call the solver as many times.

Sylvain Gault
state merging is one of the techniques for addressing this problem, there are
http://dslab.epfl.ch/pubs/stateMerging.pdf 
Hello,
It seems that when a program use memcmp, klee generate as many path as
the array size since its loop can exit anytime. While all I'm interested
in is whether the arrays are equal or not.
For instance this code would generate 5 paths.
char a1[] = {0, 1, 2, 3};
char a2[sizeof(a1)];
klee_make_symbolic(a2, sizeof(a2), "a2");
if (memcmp(a1, a2, sizeof(a1)))
        printf("Winner!\n");
else
        printf("Loser!\n");
I understand that during the execution of memcmp, there are different
path depending on the loop iteration that exists. But once the
function exists, most of these path end in the exact same state.
But I'm only interested in the paths in my code. So is there a way to
make klee merge the test cases that differ only in the path through the
libc?
My real case has an array of 168 bytes. So a memcmp generate 168 paths
by itself. Which means that the code after it will be run 168 times for
the exact same result. This is a bit insane.
Sylvain Gault
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
_______________________________________________
klee-dev mailing list
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Loading...