Hongxu Chen
2013-11-12 16:10:39 UTC
Dear all,
We are doing some experiments concerning determinism in KLEE.
I found that there were at least two earlier threads about this:
1. Non-determinism in Klee (http://keeda.stanford.edu/pipermail/klee-dev/2010-September/000470.html)
2. computing the coverage (http://keeda.stanford.edu/pipermail/klee-dev/2010-March/000251.html)
Unfortunately, I could not fully understand them.
So here is what we've done:
(1) We basically followed the options on the "coreutils experiments" page:
klee \
--simplify-sym-indices --write-cvcs --write-cov --output-module \
--max-memory=1000 --disable-inlining --optimize --use-forked-solver \
--use-cex-cache --libc=uclibc --posix-runtime \
--allow-external-sym-calls --only-output-states-covering-new \
--environ=test.env --run-in=/tmp/sandbox \
--max-sym-array-size=4096 --max-instruction-time=30. --max-time=3600. \
--watchdog --max-memory-inhibit=false --max-static-fork-pct=1 \
--max-static-solve-pct=1 --max-static-cpfork-pct=1 --switch-type=internal \
--randomize-fork --search=random-path --search=nurs:covnew \
--use-batching-search --batch-instructions=10000 \
./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout
First, we changed the search strategy to DFS, i.e.
--search=dfs
But when testing with a slightly modified rm case, we found HUGE differences
in running time: in one run KLEE found the error after about 2400s, but when
we reran it about one day later it found the exact same error within only 30-50s!
So is this a normal result?
The only potential difference I can think of is that the machine I ran KLEE on
may have been used by other CPU-bound jobs while KLEE took 2400s to find the
bug (but since I do not have the privileges to inspect the machine, I cannot be sure).
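If this happens again, one thing I plan to try (just a sketch; it assumes
util-linux's taskset and GNU time are installed, and it reuses a subset of the
options from above) is to pin the run to a single core and log CPU time next
to wall-clock time:

# pin the whole run to core 0 and write user/system/elapsed time to a
# separate file; contention from other jobs then shows up as elapsed
# time being much larger than user+system time
taskset -c 0 /usr/bin/time -v -o rm-timing.log \
  klee --search=dfs --libc=uclibc --posix-runtime \
  ./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout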
(2) Later, in order to make the results more deterministic, we also
1. discarded "--randomize-fork"
2. discarded "--use-batching-search --batch-instructions=10000"
So the final options we are using are:
klee \
--simplify-sym-indices --write-cvcs --write-cov --output-module \
--max-memory=1000 --disable-inlining --optimize --use-forked-solver \
--use-cex-cache --libc=uclibc --posix-runtime \
--allow-external-sym-calls --only-output-states-covering-new \
--environ=test.env --run-in=/tmp/sandbox \
--max-sym-array-size=4096 --max-instruction-time=30. --max-time=3600. \
--watchdog --max-memory-inhibit=false --max-static-fork-pct=1 \
--max-static-solve-pct=1 --max-static-cpfork-pct=1 --switch-type=internal \
--search=dfs \
./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout
However, even on a SINGLE machine there are still some differences in running
time between runs; the time is still unstable, and from what we observed the
longest run can be more than 10% longer than the shortest one.
And for two machines that have almost the same computing power and system
configuration, the running time difference is even bigger.
The counterexample path conditions also show several differences for a simple
test case (I only diffed the xxx.pc files and noticed several changes, but I
did not look deeper into their semantics).
Is this reasonable?
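For concreteness, the kind of comparison I have in mind is a sketch like the
one below; it assumes two back-to-back runs with identical options whose
outputs ended up in klee-out-0 and klee-out-1, and that --write-pcs was given
so the .pc files exist (ktest-tool ships with KLEE):

# diff the path condition of the same numbered test case from two runs
diff klee-out-0/test000001.pc klee-out-1/test000001.pc

# diff the concrete inputs generated in the two runs
for f in klee-out-0/*.ktest; do ktest-tool "$f"; done > run0.txt
for f in klee-out-1/*.ktest; do ktest-tool "$f"; done > run1.txt
diff run0.txt run1.txt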
(3) I also tested with a script running a simple case.
The case is taken from a previous issue on GitHub:
https://github.com/ccadar/klee/issues/50
Only the signature of "main" has been changed to the two-argument version.
#include <assert.h>
#include <klee/klee.h>

const char *const errmsg[2] = {0, };

const char *get_error_message(int err) {
  char const *x = errmsg[err];
  return x;
}

int main(int argc, char** argv) {
  int err;
  klee_make_symbolic(&err, sizeof(err), "err");
  get_error_message(err);
}
I ran it in a loop with a script like the one below:
while [ 1 ]
do
    klee --search=dfs test.bc
    sleep 10
done
From the 306 runs KLEE executed, the longest time is 76.88s (50.15%) and the
shortest is 41.89s (TSolver: 48.22%).
So is this common?
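For reference, a simple way to collect such numbers over many runs (only a
sketch; klee-stats ships with KLEE, and its exact columns may differ between
versions) is:

# run the case repeatedly; each run creates a new klee-out-N directory
for i in $(seq 1 306); do
    klee --search=dfs test.bc
    sleep 10
done

# print one row of statistics per output directory,
# including the total time and the share of time spent in the solver
klee-stats klee-out-*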
I also notice that when using a zero-argument version of "main", the time is
much lower; is that because of the function call "stack" or the environment
(even though no POSIX runtime is used here)?
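In case it helps to reproduce this, the two variants can be built and compared
roughly as follows (only a sketch; the include path and the .c file names are
placeholders, and I assume a clean working directory so the outputs land in
klee-out-0 and klee-out-1):

# compile both variants to LLVM bitcode with debug info
clang -I /path/to/klee/include -emit-llvm -c -g test_two_args.c -o test_two_args.bc
clang -I /path/to/klee/include -emit-llvm -c -g test_zero_args.c -o test_zero_args.bc

# run each with the DFS searcher, then compare instruction counts and
# solver share to see whether the extra time comes from executing more
# instructions or from harder solver queries
klee --search=dfs test_two_args.bc
klee --search=dfs test_zero_args.bc
klee-stats klee-out-0 klee-out-1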
Best regards,
Hongxu Chen