Discussion:
using klee with Siemens Benchmarks
Urmas Repinski
2013-03-06 12:31:34 UTC
Permalink
Hello.

My name is Urmas Repinski, a PhD student at Tallinn University of Technology, Estonia.

I am trying to generate inputs for Siemens Benchmarks with KLEE.

The Siemens Benchmarks are C designs with various erroneous versions, intended for testing error localization and error correction in C designs.
They can be located and downloaded from http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/

I am testing KLEE with the tcas design; it is the smallest and simplest one.

The Siemens benchmarks provide their own inputs: the tcas/script/runall.sh script contains 1607 invocations of the design, so it is easy to extract 1607 distinct inputs.
But I have no idea how those inputs were generated; I am attaching the Siemens benchmark inputs to this letter (file INPUTS_tcas_Siemens_arg).

I want to compare the Siemens inputs with the inputs generated by KLEE.

I installed KLEE as described in the documentation - http://klee.llvm.org/GetStarted.html - with uClibc support.

I installed llvm-gcc-4.2 and llvm-gcc-4.5; the toolchain originally provided by Linux Mint 12 uses too new a version of gcc, which produced an error when llvm-gcc was used, but that error is now solved.

When I take the tcas/v1/tcas.c design and add the corresponding modification to generate inputs with KLEE (tcas_original.c and tcas_modified.c are attached to this letter), I get 45 inputs generated; after converting the KLEE output to a suitable format I get the KLEE inputs (file KLEE_OUTPUT_arg).

urmas-PBL21 src # llvm-gcc --emit-llvm -c -g tcas.c
urmas-PBL21 src # klee --libc=uclibc --posix-runtime tcas.o
KLEE: NOTE: Using model: /home/urmas/forensic/thirdparty-install/klee/Release+Asserts/lib/libkleeRuntimePOSIX.bca
KLEE: output directory = "klee-out-143"
KLEE: WARNING: undefined reference to function: fwrite
KLEE: WARNING ONCE: calling external: syscall(16, 0, 21505, 54549392)
KLEE: WARNING ONCE: calling __user_main with extra arguments.

KLEE: done: total instructions = 14790
KLEE: done: completed paths = 45
KLEE: done: generated tests = 45


These inputs have about the same coverage as the Siemens inputs, but most of the errors in the erroneous designs are simply not detected, while the Siemens inputs detect all of them.

KLEE inputs: coverage 92.5532%, total inputs 45, detected erroneous designs 8/41.
Siemens benchmark inputs: coverage 93.0851%, total inputs 1607, detected erroneous designs 41/41.

Maybe something is wrong with the arguments I pass when I execute KLEE; can somebody help me with the right KLEE invocation?
I have tested KLEE with various options, but I still get 45 generated inputs. Is it possible to somehow increase the number of inputs KLEE generates?
Maybe the warning "KLEE: WARNING: undefined reference to function: fwrite" aborts execution somewhere, but there is no fwrite call in the tcas design.

Urmas Repinski
Paul Marinescu
2013-03-06 21:19:54 UTC
Permalink
Hello Urmas,
It's not entirely clear what you're trying to do, but I assume you want to use the inputs generated by KLEE on one program version to test the 41 variants. I'm afraid this might not be that easy.

KLEE actually gets nearly 100% statement and branch coverage on the original program ('nearly' because there is some unreachable code). You should run it without the POSIX runtime and uClibc, since you're not using them anyway. What you get is:

--------------------------------------------------------------------------
| Path | Instrs | Time(s) | ICov(%) | BCov(%) | ICount | Solver(%) |
--------------------------------------------------------------------------
| klee-out-0 | 5951 | 0.26 | 99.36 | 94.44 | 311 | 91.92 |
--------------------------------------------------------------------------

From what I can tell, the path coverage is also 100%, counting just feasible paths, so there are no other inputs to generate as far as KLEE is concerned.

Paul

_______________________________________________
klee-dev mailing list
klee-dev-AQ/***@public.gmane.org
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Urmas Repinski
2013-03-07 05:06:48 UTC
Permalink
Hello.

Yes.

So KLEE takes into account only coverage - whether all paths are covered - but does not take the entire design structure into account.

The errors provided by the Siemens benchmarks are in operators (+ -> -, > -> <, etc.) and in constants (2 -> 3, 70 -> 80), and with the KLEE inputs, even when the corresponding nodes in the design's model representation are activated, the error is not propagated to the output.

It's a pity; if it were possible to test the entire structure with KLEE inputs (activate different bits of the variables through the inputs, confirm that the operator values influence the output), then KLEE-generated inputs would be more useful for practical implementation.

It should be possible to investigate the error types found in the Siemens benchmarks and improve test generation with KLEE.

I am writing an article about the errors in the designs at the moment; it will be published in September, and I will then send it to the list too.

Maybe this will make it possible to improve input generation and make KLEE more usable in industry.

Urmas Repinski

Paul Marinescu
2013-03-07 10:52:56 UTC
Permalink
1. How do you define 'design structure'?
2. You might be criticising an apple for not being a good orange. KLEE is designed to test the program it is executed on; if one also wants to test 41 variants, he/she will run KLEE again on the variants. I'm not sure what you would expect, even in a simple scenario like

if (input == 100) {
// something
}

You would need to generate 2^(8 * sizeof(input)) inputs to take into account all possible mutations of the literal 100 using regression testing.

Paul


Urmas Repinski
2013-03-07 12:34:23 UTC
Permalink
Yes, Thank You.

1. How do you define 'design structure'?

This depends on the tool I am using for test generation. KLEE uses symbolic states and branches; I also have a design model - a parsed design representation using a hammock graph structure. Any tool that generates inputs has its own design representation, and the generated tests depend on that representation's structure; a more detailed model generates more complete inputs.

2. You might be criticising an apple for not being a good orange. KLEE is designed to test the program it's being executed on, and if one also wants to test 41 variants, he/she will run KLEE again on the variants. I'm not sure what you would expect, even in a simple scenario like ....


Regression testing requires two versions of the design - a correct one and a faulty one - and generates inputs based on detecting behavioral differences between the two versions, if I am not mistaken.

I, on the other hand, want to use only one version of the design and generate inputs that are as complete as possible.

KLEE is pretty fast and reliable in terms of coverage, but I need inputs that detect various states of the program - various errors.

According to http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/, regression testing was used for the Siemens benchmarks: all faulty versions of the design were processed, and a complete set of inputs detecting the already-known errors was generated.

This works well if I have a set of faulty versions, but what if I don't - if I have only one version of the design and want to generate complete inputs for it?

Any help will be appreciated,
Urmas Repinski

Urmas Repinski
2013-03-07 14:14:50 UTC
Permalink
Hi, David.

(1) You appear to be investigating the effectiveness of test case generation tools (i.e. fault detection effectiveness). Is this a correct conclusion?

The thing is that I was working on error localization and error correction in the DIAMOND FP7 project (http://fp7-diamond.eu/).
Papers will be ready soon: Model-Based Error Localization and Mutation-Based Error Correction.
But I have much of my PhD studies left, and my supervisor asked me to investigate KLEE; I think I can also investigate error generation possibilities using the FORENSIC tool, which was developed within the project.
Tool - http://informatik.uni-bremen.de/agra/eng/forensic.php

(2) You also appear to be attempting to identify a minimal set of regression tests (when you have multiple instances in the history of the program). Is comparing programs as you intend even feasible (i.e. time O(number of differences), O(polynomial-expression(number of differences)), or O(exponential-expression(number of differences)))?

No, I have a set of designs - the Siemens benchmarks - and I am planning to experiment only with them. They are all very different (text processing, logical-mathematical operations, compound C data structures, file processing) and surely complicated enough and suitable for experiments with error localization, error correction, and possible test generation algorithms.
I do not want to go deeper into code complexity; that is a somewhat different direction, I think.


(3) You also appear to be investigating fault isolation strategies ("...but i need to make inputs detect various states on the program, various errors."). That is a much harder problem.

Yes, this is what I want to investigate and implement with a test generation tool.

So I now have two possibilities: I can install the KLEE sources on my machine and try to extend it so that it can generate at least something close to MC/DC test complexity, or I can start implementing test generation from scratch on the FORENSIC tool, re-implementing known algorithms and running experiments.

The first approach seems more reasonable; SMT-based systems are widely used for test generation.

So at the moment I am planning to go deeper into KLEE. I think it is possible to add paths that differ not only in which components they activate - which yields high coverage - but also in the values of the activated variables; this would bring the inputs closer to MC/DC.

Please tell me what you think about this.
Urmas Repinski.

From: david.lightstone-***@public.gmane.org
To: urrimus-***@public.gmane.org
CC: klee-dev-bounces-AQ/***@public.gmane.org
Subject: RE: [klee-dev] using klee with Siemens Benchmarks
Date: Thu, 7 Mar 2013 08:35:15 -0500

Urmas Repinski,

As I previously indicated, KLEE is not capable of achieving MC/DC coverage. That should be enough of an evaluation for test-effectiveness purposes.

(2) You also appear to be attempting to identify a minimal set of regression tests (when you have multiple instances in the history of the program). Is comparing programs as you intend even feasible (i.e. time O(number of differences), O(polynomial-expression(number of differences)), or O(exponential-expression(number of differences)))?

(3) You also appear to be investigating fault isolation strategies ("...but i need to make inputs detect various states on the program, various errors."). That is a much harder problem.

Dave Lightstone


From: klee-dev-bounces-AQ/***@public.gmane.org [mailto:klee-dev-***@imperial.ac.uk] On Behalf Of Urmas Repinski
Sent: Thursday, March 07, 2013 7:34 AM
To: klee
Subject: Re: [klee-dev] using klee with Siemens Benchmarks Yes, Thank You.

1. How do you define 'design structure'?

This depends on tool what i am using for test generation. KLEE uses symbolic states and branches, i have also design's model - parsed design representation using hammock graph structure, any tool what generates inputs will have own design representation, and generated tests will depend on the design's representation's structure, more detailed model will generate more complete inputs.2. You might be criticising an apple for not being a good orange. KLEE is designed to tests the program it's being executed on, and if one also wants to test 41 variants, he/she will run KLEE again on the variants. I'm not sure what you would expect, even in a simple scenario like ....
Regression testing requires 2 versions of the design - correct one and faulty one, and generates inputs, based on detecting behavioral differences between two versions of the design, if i am not mistaken.

I am, from the other hand, want to use only one version of the design, and to generate as complete inputs as possible.

KLEE is pretty fast and reliable in terms of coverage, but i need to make inputs detect various states on the program, various errors.

According to http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/ in Siemens Benchmarks Regression testing were used, all faulty versions of the design were processed and complete set of inputs, that detect already known errors were generated.

This works well if i have set of faulty versions, but what if not, if i have only one version of the design and want to generate complete inputs for it?

Any help will be appreciated,
Urmas RepinskiSubject: Re: [klee-dev] using klee with Siemens Benchmarks
From: paul.marinescu-AQ/***@public.gmane.org
Date: Thu, 7 Mar 2013 10:52:56 +0000
CC: klee-dev-AQ/***@public.gmane.org
To: urrimus-***@public.gmane.org

1. How do you define 'design structure'?2. You might be criticising an apple for not being a good orange. KLEE is designed to tests the program it's being executed on, and if one also wants to test 41 variants, he/she will run KLEE again on the variants. I'm not sure what you would expect, even in a simple scenario like if (input = 100) { //something} You would need to generate 2^sizeof(input) inputs to take into account all possible mutations to the literal 100 using regression testing. Paul On 7 Mar 2013, at 05:06, Urmas Repinski <urrimus-***@public.gmane.org> wrote: Hello.

Yes.

So KLEE takes into account only coverage: once all paths are covered, it does not take the rest of the design structure into account.

The errors provided by the Siemens benchmarks are in operators (+ -> -, > -> <, etc.) and in numbers (2 -> 3, 70 -> 80), and with the KLEE inputs, even when the corresponding nodes in the design's model representation are activated, the error is not propagated to the output.
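A small illustration of this covered-but-not-propagated effect (the function and constants below are made up for illustration, not taken from tcas): a mutated constant can be executed by coverage-directed inputs without ever changing the output.

```c
#include <assert.h>

/* Hypothetical example: the faulty version mutates the constant 10
   into 11. Inputs that merely execute the statement (e.g. x = 0 or
   x = 100) produce identical outputs in both versions; only the
   boundary region 40..43 propagates the fault to the output. */
int classify_correct(int x) {
    int t = x / 10;
    return t > 3;          /* true for x >= 40 */
}

int classify_faulty(int x) {
    int t = x / 11;        /* mutated constant */
    return t > 3;          /* true for x >= 44 */
}
```

Inputs 0 and 100 cover both branches of both versions yet agree on the result; x = 40 is one of the few inputs that distinguishes them.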

It's a pity; if it were possible to test the entire structure with KLEE inputs (activate different bits of the variables, confirm that the operator values influence the output), then KLEE-generated inputs would be more useful in practice.

It should be possible to investigate the error types found in the Siemens benchmarks and improve test generation with KLEE accordingly.

I am currently writing an article about the errors in the designs; it will be published in September, and I will then send it to the list as well.

Maybe this will make it possible to improve input generation and make KLEE more usable in industry.

Urmas Repinski

Subject: Re: [klee-dev] using klee with Siemens Benchmarks
From: paul.marinescu-AQ/***@public.gmane.org
Date: Wed, 6 Mar 2013 21:19:54 +0000
CC: klee-dev-AQ/***@public.gmane.org
To: urrimus-***@public.gmane.org

Hello Urmas,

It's not that clear what you're trying to do, but I assume you want to use the inputs generated by KLEE on one program version to test 41 variants. I'm afraid this might not be that easy. KLEE actually gets nearly 100% statement and branch coverage on the original program ('nearly' because there's some unreachable code). You should run it without the POSIX runtime and uclibc because you're not using them anyway. What you get is:

--------------------------------------------------------------------------
| Path       | Instrs | Time(s) | ICov(%) | BCov(%) | ICount | Solver(%) |
--------------------------------------------------------------------------
| klee-out-0 | 5951   | 0.26    | 99.36   | 94.44   | 311    | 91.92     |
--------------------------------------------------------------------------

From what I can tell, the path coverage is also 100%, counting just feasible paths, so there are no other inputs to generate as far as KLEE is concerned.

Paul

On 6 Mar 2013, at 12:31, Urmas Repinski <urrimus-***@public.gmane.org> wrote:

Hello.

My name is Urmas Repinski, PhD student in Tallinn University of Technology, Estonia.

I am trying to generate inputs for Siemens Benchmarks with KLEE.

The Siemens Benchmarks are C designs with various erroneous versions, for testing error localization and error correction in C designs.
They can be located and downloaded from http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/

I am testing KLEE with the tcas design; it is the smallest and simplest.

The Siemens benchmarks provide their own inputs: the tcas/script/runall.sh script contains 1607 executions of the design, so it is easy to extract 1607 distinct inputs for the design.
But I have no idea how these inputs were generated; I am attaching the Siemens benchmark inputs to this letter (file INPUTS_tcas_Siemens_arg).

I want to compare the Siemens inputs with the inputs generated by KLEE.

I installed KLEE as described in the documentation (http://klee.llvm.org/GetStarted.html), with uclibc support.

I installed llvm-gcc-4.2 and llvm-gcc-4.5; the gcc originally provided by Linux Mint 12 is too new a version, which produced an error when using llvm-gcc, but that error is solved.

When I take the tcas/v1/tcas.c design and add the corresponding modifications to generate inputs with KLEE (tcas_original.c and tcas_modified.c are attached to this letter), I get 45 generated inputs, and after converting the KLEE output to a suitable format I get the KLEE outputs (file KLEE_OUTPUT_arg).

urmas-PBL21 src # llvm-gcc --emit-llvm -c -g tcas.c
urmas-PBL21 src # klee --libc=uclibc --posix-runtime tcas.o
KLEE: NOTE: Using model: /home/urmas/forensic/thirdparty-install/klee/Release+Asserts/lib/libkleeRuntimePOSIX.bca
KLEE: output directory = "klee-out-143"
KLEE: WARNING: undefined reference to function: fwrite
KLEE: WARNING ONCE: calling external: syscall(16, 0, 21505, 54549392)
KLEE: WARNING ONCE: calling __user_main with extra arguments.

KLEE: done: total instructions = 14790
KLEE: done: completed paths = 45
KLEE: done: generated tests = 45


These outputs have the same coverage as the Siemens inputs, but most of the errors in the erroneous designs are simply not detected, while the Siemens inputs detect all of them.

KLEE inputs: coverage 92.5532%, total inputs 45, detected erroneous designs 8/41.
Siemens Benchmark inputs: coverage 93.0851%, total inputs 1607, detected erroneous designs 41/41.

Maybe I have something wrong in the KLEE arguments; can somebody help me with the right KLEE invocation?
I have tested KLEE with various options, but I still get 45 generated inputs. Is it possible to somehow increase the number of inputs KLEE generates?
Maybe the warning 'KLEE: WARNING: undefined reference to function: fwrite' aborts execution somewhere, but there is no fwrite call in the tcas design.

Urmas Repinski
<INPUTS_tcas_Siemens_arg><tcas_original.c><tcas_modified.c><KLEE_OUTPUT_arg>
_______________________________________________
klee-dev mailing list
klee-dev-AQ/***@public.gmane.org
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Urmas Repinski
2013-03-07 15:48:10 UTC
Permalink
Hi David.

The GIMPLE plug-in for the gcc compiler provides an abstract syntax tree with no more than 3 operands per statement, which was used for parsing the design into the FORENSIC model.
http://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html

It works perfectly: the structure obtained by the GIMPLE plug-in of the gcc compiler is simply copied into the model, and voila, the parsed representation of the design is ready (not quite that simple of course, but that is the principle). So I do not need CIL as a front-end parser; it is implemented perfectly using the GIMPLE plug-in. You can refer to the manual at http://informatik.uni-bremen.de/agra/eng/forensic.php.
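For readers unfamiliar with GIMPLE's three-operand form, here is a hand-written sketch of the kind of lowering the gimplifier performs (illustrative, not actual gcc output): a compound expression is split into statements with at most three operands by introducing temporaries, which is what makes the parsed representation easy to copy into a model.

```c
#include <assert.h>

/* GIMPLE-style lowering sketch: d = a + b * c is decomposed into
   three-operand statements via a compiler-introduced temporary. */
int lowered(int a, int b, int c) {
    int t1 = b * c;    /* t1 = b * c  */
    int d  = a + t1;   /* d  = a + t1 */
    return d;
}
```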

It was implemented in Germany, I must say, and very well.

KLEE as a back-end: yes, this is what I intend to achieve, a test generation back-end.

You say that KLEE does not support floating point, but it supports strings (at least as the design's processed arguments). I can make a symbolic string and later convert it to a float in my design; KLEE should find suitable values, I hope.

And one more thing: maybe it is easier to work with strings. I can pass all arguments as file inputs and program arguments, make them symbolic, and try to evaluate the design using KLEE; I would not have to care about anything else if this worked.

But it does not: I tried to make the arguments symbolic and execute the design using KLEE; KLEE computed for a day with no result, and when I set a time limit I got garbage instead of arguments.

Maybe using klee_assume() it is possible to state that an argument is an integer, or a float, and then simply process all the code with KLEE?

The result would then be integer and float numbers.

Is there any function like klee_assume_int(char *) or klee_assume_float(char *)?

This could be the solution in my case, and maybe MC/DC inputs could then be generated from string arguments with KLEE?

What do you think about this idea?

It would only be necessary to add those two functions and to assume that klee_assume_int generates values over the whole integer range (from min int to max int), and the same for floats.

I do not care about the computation time if the result is what I need.
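As a side note: klee_assume_int/klee_assume_float do not exist in KLEE, but the effect can be approximated with the existing klee_assume() by constraining each character of the symbolic buffer to a digit (the KLEE tutorials recommend `&` rather than `&&` inside the assumption, to avoid forking). The sketch below states that constraint as a plain, natively runnable C predicate; under KLEE one would assert the same conditions via klee_assume(). The function name is made up for illustration.

```c
#include <stddef.h>

/* Predicate form of the constraint a hypothetical klee_assume_int
   would impose on an n-character buffer: an optional leading '-'
   followed by decimal digits only. Under KLEE the equivalent per
   position is roughly: klee_assume(s[i] >= '0' & s[i] <= '9'). */
int is_decimal_int(const char *s, size_t n) {
    size_t i = 0;
    if (n == 0) return 0;
    if (s[0] == '-') i = 1;       /* optional sign */
    if (i == n) return 0;         /* "-" alone is not a number */
    for (; i < n; i++)
        if (s[i] < '0' || s[i] > '9')
            return 0;
    return 1;
}
```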

Urmas Repinski.


From: david.lightstone-***@public.gmane.org
To: urrimus-***@public.gmane.org
CC: klee-dev-bounces-AQ/***@public.gmane.org
Subject: RE: [klee-dev] using klee with Siemens Benchmarks
Date: Thu, 7 Mar 2013 10:16:00 -0500

Urmas,

I am not in the academic world, so my perspective on things should be checked with your thesis advisor. There are occasions when participating in the development of a tool is appropriate; there are occasions when integrating it with other tools is more appropriate. There is a significant effort needed to understand KLEE well enough to modify it. When you are done with the modifications you will at best have something that is appropriate only for integers (KLEE does not support floating point). For your purposes, transforming the procedures into something more appropriately analyzed by KLEE will be easier. That is why I indicated the existence of CIL ( http://www.eecs.berkeley.edu/~necula/cil/ ). Somewhere in my collection of experiments I have done enough to convince myself that the transformation strategy is feasible (i.e. for a trivial problem, MCDC coverage was achieved). Not being sufficiently familiar with the limitations upon CIL, I cannot conclude that it will be an effective strategy. In the event that it fails, the fallback transformation strategy will be Rose ( http://rosecompiler.org/ ). This, like KLEE, is a work in progress.

(1) With respect to MCDC: the FAA (here in the US) has blessed MCDC as being adequate. Read the CAST-10 paper at http://www.faa.gov/aircraft/air_cert/design_approvals/air_software/cast/cast_papers/ and decide for yourself whether it is or is not adequate. Consider 2 procedures each implementing the same function. Answer for yourself the question: will they have the same MCDC test cases? Consider the trivial function int f(x) = x. I'm sure you will be able to find an implementation containing no conditionals, and one contrived to contain conditionals. That should tell you everything you possibly need to know about MCDC as an evaluation criterion. It really (strictly opinion) is not intended to find bugs in the implementation; rather it is an implementation sanity check, intended to establish that the compiler did not introduce any errors.
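The identity-function point above can be made concrete; both implementations below are contrived purely for illustration.

```c
/* Two implementations of int f(x) = x. The first contains no
   conditionals, so MC/DC imposes essentially no test obligations;
   the second adds a decision, and MC/DC then demands tests that
   drive it both ways, even though the observable behaviour is
   identical in every case. */
int id_plain(int x) {
    return x;
}

int id_contrived(int x) {
    if (x >= 0)
        return x;      /* non-negative branch */
    return x;          /* negative branch: same result */
}
```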
Myself, I would use CIL as a front end to Forensic, and KLEE as another backend.

Dave Lightstone

From: klee-dev-bounces-AQ/***@public.gmane.org [mailto:klee-dev-bounces-AQ/***@public.gmane.org] On Behalf Of Urmas Repinski
Sent: Thursday, March 07, 2013 9:15 AM
To: klee
Subject: Re: [klee-dev] using klee with Siemens Benchmarks

Hi, David.

(1) You appear to be investigating the effectiveness of test case generation tools (i.e. fault detection effectiveness). Is this a correct conclusion?
The problem is that I was working on Error Localization and Error Correction in the DIAMOND FP7 project (http://fp7-diamond.eu/).
The papers will be ready soon: Model-Based Error Localization and Mutation-Based Error Correction.
But I have plenty of time left in my PhD studies, and my supervisor asked me to investigate KLEE; I think I can also investigate error generation possibilities using the FORENSIC tool that was developed within the project.
Tool: http://informatik.uni-bremen.de/agra/eng/forensic.php

(2) You also appear to be attempting to identify a minimal set of regression tests (when you have multiple instances in the history of the program).
Is comparing programs as you intend even feasible (i.e. time O(number of differences), O(polynomial(number of differences)) or O(exponential(number of differences)))?
No, I have a set of designs, the Siemens benchmarks, and I plan to experiment only with them; they are all very different (text processing, logical-mathematical operations, compound C data structures, file processing) and surely complicated enough and suitable for experiments with error localization, error correction and test generation algorithms.
I do not want to go deeper into code complexity; that is a slightly different direction, I think.

(3) You also appear to be investigating fault isolation strategies ("..but i need to make inputs detect various states on the program, various errors.")
That is a much harder problem.

Yes, this is what I want to investigate and implement with a test generation tool. So I now have two possibilities: I can install the KLEE sources on my machine and try to extend it, to make it possible to generate something at least close to MC/DC test complexity, or
I can start implementing test generation from scratch in the FORENSIC tool, re-implement known algorithms and run experiments.

The first approach seems more reasonable; SMT-based systems are widely used for test generation.

So at the moment I plan to go deeper into KLEE, and I think it is possible to add paths that differ not only in their activated components, which yields high coverage, but also in the values of the activated variables; this would bring the inputs closer to MC/DC.

Please tell me what you think about this.
Urmas Repinski.

From: david.lightstone-***@public.gmane.org
To: urrimus-***@public.gmane.org
CC: klee-dev-bounces-AQ/***@public.gmane.org
Subject: RE: [klee-dev] using klee with Siemens Benchmarks
Date: Thu, 7 Mar 2013 08:35:15 -0500

Urmas Repinski,

As I previously indicated, KLEE is not capable of achieving MCDC coverage. That should be enough of an evaluation for test-effectiveness purposes.

(2) You also appear to be attempting to identify a minimal set of regression tests (when you have multiple instances in the history of the program).
Is comparing programs as you intend even feasible (i.e. time O(number of differences), O(polynomial(number of differences)) or O(exponential(number of differences)))?

(3) You also appear to be investigating fault isolation strategies ("..but i need to make inputs detect various states on the program, various errors.")
That is a much harder problem.

Dave Lightstone

Urmas Repinski
2013-03-07 16:06:31 UTC
Permalink
Hi David.

No, no, let's leave regression testing aside for now.

Adding klee_assume_int(char*) and klee_assume_float(char*) to KLEE could really produce some impressive results: I would use only C string arguments, and probably a much higher number of test cases would be generated.

When I made the design's arguments symbolic for KLEE, I got a lot of garbage in the inputs, but there were some integer numbers as well.

Later on in my design I use the atoi(...) C function, obtain integers, and process them as ordinary variables.

KLEE would generate the integers as strings, and possibly the result would be what I need.
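One caveat with this plan: atoi() has no error reporting, so the "garbage" strings KLEE produces silently become numbers (0, or the value of a leading digit prefix), which can make the generated inputs misleading unless the strings are constrained first. This is standard C behaviour; the examples below are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

/* atoi() parses an optional sign and leading digits and ignores the
   rest; on input with no leading digits it simply returns 0. */
int demo(void) {
    assert(atoi("abc") == 0);    /* garbage collapses to 0 */
    assert(atoi("42x") == 42);   /* trailing garbage is ignored */
    assert(atoi("-7") == -7);    /* sign is handled */
    return 0;
}
```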

I just didn't find klee_assume_int(char *) or klee_assume_float(char *) in the KLEE library, but I can actually create them myself and try to evaluate KLEE with the design's symbolic string arguments.

I think this is a good idea; I will try it now.

Thank You very much David,
Urmas Repinski

From: david.lightstone-***@public.gmane.org
To: urrimus-***@public.gmane.org
Subject: RE: [klee-dev] using klee with Siemens Benchmarks
Date: Thu, 7 Mar 2013 10:54:13 -0500

Urmas,

With respect to (2) you did not understand my intent. By a minimal set of regression tests I meant the following. Consider a test suite that was used to test a program. The program has gone into distribution. Subsequently, someone asks for a change to that program. Consider the questions:
(1) Which tests do not need to be re-run?
(2) Which tests need to be re-run? (complement of the first question)
(3) Which tests will fail because of the change?
(4) Which tests need to be added?

The collection of tests which need to be re-run, added or corrected is the minimal regression suite (in some sense). My guess is that these questions are equivalent to Error Localization: that is, if there are errors created as a consequence of the change, how will they manifest themselves?

Dave Lightstone
Sent: Thursday, March 07, 2013 9:15 AM
To: klee
Subject: Re: [klee-dev] using klee with Siemens Benchmarks Hi, David.(1) You appear to be investigating the effectiveness of test case generation tools. (ie fault detection effectiveness)Is this a correct conclusion?
The problem is that i were working with Error Localization and Error Corrcetion of DIAMOND FP7 project (http://fp7-diamond.eu/).
Papers will be soon ready, Model-Based Error Localization and Mutation-Based Error Correction.
But i have much time of my PhD studies left, and the superviser asked me to investigate KLEE and i think i can investigate also Error Generation possibilities using the FORENSIC tool, that were developed within contents of the project.
Tool - http://informatik.uni-bremen.de/agra/eng/forensic.php(2) You also appear to be attempting to identify a minimal set of regression test (when you have multiple instances in the history of the program)
Is comparing programs as you intend even feasible (ie time O(number of differences), O( polynomial-expression(number of differences) ) or O(exponential-expression(number of differences) )?
No, i have set of designs - siemens benchmarks, and planning to make experiments only with them, they are all very different (text processing, logical-mathematical operations, Compound C data structures, file processing) and surely complicated enough and suitable for experiments with Error Localization, Error Correction and possible Test generation algorithms.
I do not want to go deeper into code complexity, this is a little different direction i think.

(3) You also appear to be investigation fault isolation strategies (“..but i need to make inputs detect various states on the program, various errors.”)
That is a much harder problemYes, this is what i want to investigate and implements with Test Generation tool. So i have now two possibilities - i can install KLEE source on my machine and try to extend it, to make possible to generate at least something close to MC/DC test complexity, or
I can start implementing test generation from scratch on the FORENSIC tool, to re-implement known algorithms and to make experiments.

First approach seems more reasonable, SMT-based systems are widely used for Test Generation.

So at the moment i am planning to go deeper in klee and i think it possible to add paths, that differ not only by their activated components - that result high coverage, but by the entire values of the activated variables also - this will make inputs closer to MC/DC.

Please tell me what do you think about this?
Urmas Repinski. From: david.lightstone-***@public.gmane.org
To: urrimus-***@public.gmane.org
CC: klee-dev-bounces-AQ/***@public.gmane.org
Subject: RE: [klee-dev] using klee with Siemens Benchmarks
Date: Thu, 7 Mar 2013 08:35:15 -0500Urmas Repinski As I previously indicated KLEE is not capable of achieving MCDC coverage. That should be enough of an valuation for purposes to test effectiveness (2) You also appear to be attempting to identify a minimal set of regression test (when you have multiple instances in the history of the program)
Is comparing programs as you intend even feasible (ie time O(number of differences), O( polynomial-expression(number of differences) ) or O(exponential-expression(number of differences) )? (3) You also appear to be investigation fault isolation strategies (“..but i need to make inputs detect various states on the program, various errors.”)
That is a much harder problem Dave Lightstone

From: klee-dev-bounces-AQ/***@public.gmane.org [mailto:klee-dev-***@imperial.ac.uk] On Behalf Of Urmas Repinski
Sent: Thursday, March 07, 2013 7:34 AM
To: klee
Subject: Re: [klee-dev] using klee with Siemens Benchmarks Yes, Thank You.

1. How do you define 'design structure'?

This depends on tool what i am using for test generation. KLEE uses symbolic states and branches, i have also design's model - parsed design representation using hammock graph structure, any tool what generates inputs will have own design representation, and generated tests will depend on the design's representation's structure, more detailed model will generate more complete inputs.2. You might be criticising an apple for not being a good orange. KLEE is designed to tests the program it's being executed on, and if one also wants to test 41 variants, he/she will run KLEE again on the variants. I'm not sure what you would expect, even in a simple scenario like ....
Regression testing requires 2 versions of the design - correct one and faulty one, and generates inputs, based on detecting behavioral differences between two versions of the design, if i am not mistaken.

I am, from the other hand, want to use only one version of the design, and to generate as complete inputs as possible.

KLEE is pretty fast and reliable in terms of coverage, but i need to make inputs detect various states on the program, various errors.

According to http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/ in Siemens Benchmarks Regression testing were used, all faulty versions of the design were processed and complete set of inputs, that detect already known errors were generated.

This works well if i have set of faulty versions, but what if not, if i have only one version of the design and want to generate complete inputs for it?

Any help will be appreciated,
Urmas RepinskiSubject: Re: [klee-dev] using klee with Siemens Benchmarks
From: paul.marinescu-AQ/***@public.gmane.org
Date: Thu, 7 Mar 2013 10:52:56 +0000
CC: klee-dev-AQ/***@public.gmane.org
To: urrimus-***@public.gmane.org

1. How do you define 'design structure'?2. You might be criticising an apple for not being a good orange. KLEE is designed to tests the program it's being executed on, and if one also wants to test 41 variants, he/she will run KLEE again on the variants. I'm not sure what you would expect, even in a simple scenario like if (input = 100) { //something} You would need to generate 2^sizeof(input) inputs to take into account all possible mutations to the literal 100 using regression testing. Paul On 7 Mar 2013, at 05:06, Urmas Repinski <urrimus-***@public.gmane.org> wrote: Hello.

Yes.

So KLEE takes into account only the coverage, if all paths are covered, but does not take into account entire design structure.

The errors provided by siemens benchmarks are in the operators (+ -> -, > -> <, etc) and in numbers (2 -> 3, 70 -> 80), and with klee inputs even if corresponding nodes in the design's model representation are activated, error is not propagated to the output.

Its a pity, if it is possible to test the entire structure with klee inputs (activate different bits in variables with inputs, confirm that operators values take influence on the output) then klee generated inputs will be more useful for practical imlementation.

It is possible to investigate error types, found in siemens benchmarks, and improve test generation with klee.

I am writing one article about the errors in the designs at the moment, it will be published in the september, and then i will send it to the list too.

Maybe this will make possible to improve input generation and make klee more usable in actual industry.

Urmas RepinskiSubject: Re: [klee-dev] using klee with Siemens Benchmarks
From: paul.marinescu-AQ/***@public.gmane.org
Date: Wed, 6 Mar 2013 21:19:54 +0000
CC: klee-dev-AQ/***@public.gmane.org
To: urrimus-***@public.gmane.org

Hello Urmas,It's not that clear what you're trying to do but I assume you want to use the inputs generated by KLEE on one program version to test 41 variants. I'm afraid this might not be that easy. KLEE actually gets nearly 100% statement and branch coverage on the original program ('nearly' because there's some unreachable code). You should run it without the POSIX runtime and uclibc because you're not using them anyway. What you get is: --------------------------------------------------------------------------| Path | Instrs | Time(s) | ICov(%) | BCov(%) | ICount | Solver(%) |--------------------------------------------------------------------------| klee-out-0 | 5951 | 0.26 | 99.36 | 94.44 | 311 | 91.92 |-------------------------------------------------------------------------- From what I can tell, the path coverage is also 100%, counting just feasible paths, so there are no other inputs to generate as far as KLEE is concerned. Paul On 6 Mar 2013, at 12:31, Urmas Repinski <urrimus-***@public.gmane.org> wrote: Hello.

My name is Urmas Repinski, PhD student in Tallinn University of Technology, Estonia.

I am trying to generate inputs for Siemens Benchmarks with KLEE.

Siemens Benchmarks - C designs with various erroneous versions for testing error localization and error correction in C designs.
They can be located and downloaded from http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/

I am testing KLEE with tcas design, it is smallest and simplest.

Siemens benchmarks provide their inputs, tcas/script/runall.sh script has 1607 lines of execution of the design, and it is easy extract 1607 various inputs for the design.
But i have no idea how the inputs were generated, attaching siemens benchmarks inputs to the letter (file INPUTS_tcas_Siemens_arg).

I want to compare Siemens inputs with inputs, generated by klee.

I had installed klee as it is written in documentation - http://klee.llvm.org/GetStarted.html - with uclibc support.

Installed llvm-gcc-4.2, llvm-gcc-4.5, originally provided by Linux Mint 12 uses too new version of gcc, and this generated error then used llvm-gcc, but ok, this error solved.

When i take tcas/v1/tcas.c design, adding corresponding modification to generate inputs with klee (tcas_original.c and tcas_modified.c are attached to the letter), i get 45 inputs generated, and after modifying klee output to suitable format i get klee outputs (file KLEE_OUTPUT_arg).

urmas-PBL21 src # llvm-gcc --emit-llvm -c -g tcas.c
urmas-PBL21 src # klee --libc=uclibc --posix-runtime tcas.o
KLEE: NOTE: Using model: /home/urmas/forensic/thirdparty-install/klee/Release+Asserts/lib/libkleeRuntimePOSIX.bca
KLEE: output directory = "klee-out-143"
KLEE: WARNING: undefined reference to function: fwrite
KLEE: WARNING ONCE: calling external: syscall(16, 0, 21505, 54549392)
KLEE: WARNING ONCE: calling __user_main with extra arguments.

KLEE: done: total instructions = 14790
KLEE: done: completed paths = 45
KLEE: done: generated tests = 45


These outputs have the same coverage as the Siemens inputs, but most of the errors in the erroneous designs are simply not detected, while the Siemens inputs detect all errors.

KLEE inputs: coverage 92.5532%, total inputs 45, detected erroneous designs 8/41
Siemens benchmark inputs: coverage 93.0851%, total inputs 1607, detected erroneous designs 41/41

Maybe I have something wrong with the KLEE arguments; can somebody help me with the right KLEE invocation?
I have tested KLEE with various options, but I still get 45 generated inputs. Is it possible to somehow increase the number of inputs KLEE generates?
Maybe the warning "KLEE: WARNING: undefined reference to function: fwrite" aborts execution somewhere, but there is no fwrite call in the tcas design.

Urmas Repinski

<INPUTS_tcas_Siemens_arg><tcas_original.c><tcas_modified.c><KLEE_OUTPUT_arg>
_______________________________________________
klee-dev mailing list
klee-dev-AQ/***@public.gmane.org
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Michael Stone
2013-03-07 12:44:21 UTC
Permalink
On Mar 7, 2013 5:53 AM, "Paul Marinescu"
<paul.marinescu-AQ/***@public.gmane.org<javascript:_e({}, 'cvml',
'paul.marinescu-AQ/***@public.gmane.org');>>
wrote:
>
> 1. How do you define 'design structure'?
> 2. You might be criticising an apple for not being a good orange. KLEE is
designed to test the program it's being executed on, and if one also wants
to test 41 variants, he/she will run KLEE again on the variants. I'm not
sure what you would expect, even in a simple scenario like
>
> if (input == 100) {
>     //something
> }
>
> You would need to generate 2^(8*sizeof(input)) inputs to take into account
all possible mutations to the literal 100 using regression testing.
>

Urmas,

To expand a bit on Paul's first question to try to improve my own
understanding...

As I understand things, the point of the MC/DC coverage criterion as
typically applied to the certification of life-critical airborne systems is
that, given a good spec, one can check the adequacy of a set of test cases
for a given implementation by checking that the test cases achieve 100%
coverage of both the requirements and the implementation logic and state
vector.

In this world, MC/DC is helpful because it finds many bugs with a number of
test cases that grows linearly, rather than exponentially, in the number
of, e.g., conditional branches and bits of program state.

The work that you are examining seems, from casual inspection, to be about
assessing the claim that MC/DC (and related adequacy criteria) really are
adequate. (Urmas: did I get this right?)

As a result, as Paul somewhat tersely replied: what's the executable spec
-- i.e., the "true set of requirements" from which MC/DC or other test
cases might be generated?

Regards,

Michael

P.S. - also, have you played with using Klee to check equivalence or
bisimulation of two programs?

(Here's an old post that I wrote up to quickly illustrate one approach:

http://mstone.info/posts/klee_quickcheck/

)


> Paul
Urmas Repinski
2013-03-07 15:02:40 UTC
Permalink
Hi Michael.

The work that you are examining seems, from casual inspection, to be
about assessing the claim that MC/DC (and related adequacy criteria)
really are adequate. (Urmas: did I get this right?)

I want to be able to generate inputs that detect the errors at least in the Siemens benchmarks, without additionally using already-known faulty versions and regression testing.
So yes, I think MC/DC can be a good start, and later it should be possible to reduce the set of generated inputs.

As a result, as Paul somewhat tersely replied: what's the executable
spec -- i.e., the "true set of requirements" from which MC/DC or
other test cases might be generated?

I do not need a spec for test generation; I have one design and want to get as many different inputs from it as possible, and the MC/DC criterion, as I already said, is a good start.
Using these inputs, all other errors in my experimental designs should be detected.
I see no need for a specification at the test-generation stage; a specification is, however, necessary for error localization and error correction.

P.S. - also, have you played with using Klee to check equivalence or bisimulation of two programs?

This is something from formal verification (equivalence checking); again, this is a slightly different direction from the one I intend to pursue.

The problem is that if I do not keep the task concrete (MC/DC inputs for one design, with KLEE or without), I can waste a lot of time with no experimental result, and this is what I want to avoid.

With regards,
Urmas Repinski

DAVID LIGHTSTONE
2013-03-07 05:25:23 UTC
Permalink
At one time I was playing around with KLEE. The goal was to see if MC/DC coverage was achievable.
It was not.

By coincidence I discovered CIL, a C-language program-transformation tool.
Try:
(1) transform the source using CIL
(2) generate test cases by running KLEE on the transformed source
(3) run the test cases against the original source
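Dave's three steps might look roughly like the following command sketch. This is an assumption-heavy outline, not a tested recipe: the CIL flags, file names, and library path are placeholders, and it presumes `cilly`, `llvm-gcc`, `klee`, and KLEE's `libkleeRuntest` replay library are installed and on PATH.

```shell
# (1) transform the source with CIL (e.g. simplify to three-address form);
#     --dosimplify and the output name are assumptions for illustration
cilly --save-temps --dosimplify -c tcas.c

# (2) generate test cases with KLEE on the transformed source
llvm-gcc --emit-llvm -c -g tcas.cil.c -o tcas_cil.bc
klee tcas_cil.bc

# (3) replay the generated .ktest files against the ORIGINAL source,
#     using KLEE's libkleeRuntest replay mechanism (path is a placeholder)
gcc tcas.c -L /path/to/klee/lib -lkleeRuntest -o tcas_replay
for t in klee-last/*.ktest; do
  KTEST_FILE="$t" ./tcas_replay
done
```

The point of the transformation is that KLEE's coverage-driven search then operates on the simpler, more fine-grained branches CIL exposes, while the tests are still judged against the untransformed program.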



Dave Lightstone