Static program analysis explained
In computer science, static program analysis (also known as static analysis or static simulation) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution in the integrated environment.[1] [2]
The term is usually applied to analysis performed by an automated tool; analysis performed by humans is typically called program understanding, program comprehension, or code review. In the last of these, software inspection and software walkthroughs are also used. In most cases the analysis is performed on some version of a program's source code, and, in other cases, on some form of its object code.
Rationale
The sophistication of the analysis performed by tools varies from those that only consider the behaviour of individual statements and declarations,[3] to those that include the complete source code of a program in their analysis. The uses of the information obtained from the analysis vary from highlighting possible coding errors (e.g., the lint tool) to formal methods that mathematically prove properties about a given program (e.g., its behaviour matches that of its specification).
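As an illustration of the simplest end of this spectrum, the following is a minimal sketch of a lint-style check written with Python's standard `ast` module. It looks at one comparison at a time and flags any expression compared with itself (such as `a == a`), which is usually a coding error. The function name `find_self_comparisons` and the `SAMPLE` program are hypothetical and chosen only for this example; the sketch is not taken from the lint tool.

```python
import ast


def find_self_comparisons(source: str) -> list[int]:
    """Return the line numbers where an expression is compared with itself."""
    tree = ast.parse(source)  # the program is parsed, never executed
    findings = []
    for node in ast.walk(tree):
        # Only simple two-operand comparisons are considered in this sketch.
        if isinstance(node, ast.Compare) and len(node.comparators) == 1:
            # ast.dump omits position information by default, so two
            # structurally identical operands produce the same string.
            if ast.dump(node.left) == ast.dump(node.comparators[0]):
                findings.append(node.lineno)
    return sorted(findings)


# Hypothetical input program used only for illustration.
SAMPLE = """\
def check(a, b):
    if a == a:
        return b
    return a != b
"""

print(find_self_comparisons(SAMPLE))
```

The check reports a possible error rather than a proven one: a comparison such as `x != x` is occasionally written on purpose (for example to detect NaN values), which is why tools of this kind produce warnings that a developer still has to review.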
Software metrics and reverse engineering can be described as forms of static analysis. Deriving software metrics and static analysis are increasingly deployed together, especially in creation of embedded systems, by defining so-called software quality objectives.[4]
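To make the link between software metrics and static analysis concrete, the sketch below derives a simplified cyclomatic-complexity figure for each function directly from the syntax tree, again without running the code. The set of node types counted as decision points is an assumption made for this example, and real metric tools differ in what they count; the names `cyclomatic_complexity` and `SAMPLE` are hypothetical.

```python
import ast

# Node types treated as decision points in this simplified metric (an assumption).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> dict[str, int]:
    """Map each function name to 1 plus the number of decision points it contains."""
    tree = ast.parse(source)
    result = {}
    for func in ast.walk(tree):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # one path through a function with no branches
            # Note: decision points of nested functions are also counted here.
            for node in ast.walk(func):
                if isinstance(node, _BRANCH_NODES):
                    score += 1
                elif isinstance(node, ast.BoolOp):
                    # `a and b and c` adds two extra short-circuit paths.
                    score += len(node.values) - 1
            result[func.name] = score
    return result


# Hypothetical input program used only for illustration.
SAMPLE = """\
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for i in range(n):
        if i == 3:
            return "has three"
    return "other"
"""

print(cyclomatic_complexity(SAMPLE))
```

A quality objective could then be phrased as a threshold on such a figure, for instance by requiring review of any function whose score exceeds a project-specific limit.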
A growing commercial use of static analysis is in the verification of properties of software used in safety-critical computer systems and locating potentially vulnerable code.[5] For example, the following industries have identified the use of static code analysis as a means of improving the quality of increasingly sophisticated and complex software:
- Medical software: The US Food and Drug Administration (FDA) has identified the use of static analysis for medical devices.[6]
- Nuclear software: In the UK the Office for Nuclear Regulation (ONR) recommends the use of static analysis on reactor protection systems.[7]
- Aviation software (in combination with dynamic analysis).[8]
- Automotive & Machines (functional safety features form an integral part of each automotive product development phase, ISO 26262, section 8).
A study in 2012 by VDC Research reported that 28.7% of the embedded software engineers surveyed use static analysis tools and 39.7% expect to use them within 2 years.[9] A study from 2010 found that 60% of the interviewed developers in European research projects used at least the basic static analyzers built into their IDE. However, only about 10% employed an additional (and perhaps more advanced) analysis tool.[10]
In the application security industry the name static application security testing (SAST) is also used. SAST is an important part of Security Development Lifecycles (SDLs) such as the SDL defined by Microsoft[11] and a common practice in software companies.[12]
Tool types
The OMG (Object Management Group) published a study regarding the types of software analysis required for software quality measurement and assessment. This document on "How to Deliver Resilient, Secure, Efficient, and Easily Changed IT Systems in Line with CISQ Recommendations" describes three levels of software analysis.[13]
- Unit Level: Analysis that takes place within a specific program or subroutine, without connecting to the context of that program.
- Technology Level: Analysis that takes into account interactions between unit programs to get a more holistic and semantic view of the overall program in order to find issues and avoid obvious false positives.
- System Level: Analysis that takes into account the interactions between unit programs, but without being limited to one specific technology or programming language.
A further level of software analysis can be defined.
- Mission/Business Level: Analysis that takes into account the business/mission layer terms, rules and processes that are implemented within the software system for its operation as part of enterprise or program/mission layer activities. These elements are implemented without being limited to one specific technology or programming language and in many cases are distributed across multiple languages, but are statically extracted and analyzed for system understanding for mission assurance.
Formal methods
See main article: Formal methods.
Formal methods is the term applied to the analysis of software (and computer hardware) whose results are obtained purely through the use of rigorous mathematical methods. The mathematical techniques used include denotational semantics, axiomatic semantics, operational semantics, and abstract interpretation.
By a straightforward reduction to the halting problem, it is possible to prove that (for any Turing complete language), finding all possible run-time errors in an arbitrary program (or more generally any kind of violation of a specification on the final result of a program) is undecidable: there is no mechanical method that can always answer truthfully whether an arbitrary program may or may not exhibit runtime errors. This result dates from the works of Church, Gödel and Turing in the 1930s (see: Halting problem and Rice's theorem). As with many undecidable questions, one can still attempt to give useful approximate solutions.
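The reduction can be sketched in Python. The idea is to wrap an arbitrary program so that the wrapped version raises an error exactly when the original terminates; an analyzer that always answered the run-time error question correctly could then be used to decide halting, which is impossible. All names below are hypothetical, and `reports_runtime_error` stands for an exact oracle that cannot actually exist. The wrapper is also simplified: it assumes the program text can safely be indented (for example, that it contains no multi-line string literals).

```python
import textwrap


def reduce_halting_to_error_detection(program_source: str) -> str:
    """Build a program that raises ZeroDivisionError if and only if
    `program_source` terminates (sketch; see the assumptions above)."""
    body = textwrap.indent(program_source, "    ")
    return (
        "try:\n"
        f"{body}\n"
        "except BaseException:\n"
        "    pass  # errors raised by the original program are deliberately ignored\n"
        "1 / 0  # reached only if the original program terminated\n"
    )


def halts(program_source: str, reports_runtime_error) -> bool:
    """Decide halting, given a (hypothetical) exact run-time error oracle."""
    wrapped = reduce_halting_to_error_detection(program_source)
    return reports_runtime_error(wrapped)


print(reduce_halting_to_error_detection("x = 1 + 1"))
```

Since no correct `halts` function can exist, no exact `reports_runtime_error` can exist either. Practical tools therefore settle for approximations that may report spurious errors, miss real ones, or fail to answer.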
Some of the implementation techniques of formal static analysis include:[14]
- Abstract interpretation, to model the effect that every statement has on the state of an abstract machine (i.e., it 'executes' the software based on the mathematical properties of each statement and declaration). This abstract machine over-approximates the behaviours of the system: the abstract system is thus made simpler to analyze, at the expense of incompleteness (not every property true of the original system is true of the abstract system). If properly done, though, abstract interpretation is sound (every property true of the abstract system can be mapped to a true property of the original system).[15]
- Data-flow analysis, a lattice-based technique for gathering information about the possible set of values computed at various points in a program.
- Hoare logic, a formal system with a set of logical rules for reasoning rigorously about the correctness of computer programs. Tool support exists for some programming languages, e.g., the SPARK programming language (a subset of Ada), the Java Modeling Language (JML) using ESC/Java and ESC/Java2, and the Frama-C WP (weakest precondition) plugin for the C language extended with ACSL (ANSI/ISO C Specification Language).
- Model checking, which considers systems that have finite state or may be reduced to finite state by abstraction.
- Symbolic execution, used to derive mathematical expressions representing the value of mutated variables at particular points in the code.
- Nullable reference analysis
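The over-approximation described for abstract interpretation in the list above can be illustrated with a small interval domain. In this sketch each variable is abstracted by a range of possible values, and an arithmetic expression is "executed" over ranges instead of concrete numbers. The `Interval` class and the `abstract_eval` function are hypothetical names introduced for this example, and only constants, variables, and the operators `+`, `-` and `*` are supported.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every concrete value v with lo <= v <= hi."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))


def abstract_eval(expr: str, env: dict[str, Interval]) -> Interval:
    """Evaluate an arithmetic expression over intervals instead of numbers."""
    def visit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return Interval(node.value, node.value)
        if isinstance(node, ast.Name):
            return env[node.id]
        if isinstance(node, ast.BinOp):
            left, right = visit(node.left), visit(node.right)
            if isinstance(node.op, ast.Add):
                return left + right
            if isinstance(node.op, ast.Sub):
                return left - right
            if isinstance(node.op, ast.Mult):
                return left * right
        raise NotImplementedError(ast.dump(node))

    return visit(ast.parse(expr, mode="eval").body)


# Sound: the result contains every value the expression can really take.
print(abstract_eval("x * x + 1", {"x": Interval(-2, 3)}))
# Imprecise: x - x is always 0, but the abstraction forgets that both operands are the same x.
print(abstract_eval("x - x", {"x": Interval(0, 1)}))
```

For `x` in [-2, 3] the expression `x * x + 1` really ranges over [1, 10], while the analysis reports [-5, 10]. The reported interval safely contains the true one (soundness), but the analysis cannot establish the true property that the result is positive (incompleteness).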
Data-driven static analysis
Data-driven static analysis leverages extensive codebases to infer coding rules and improve the accuracy of the analysis.[16] [17] For instance, one can use all Java open-source packages available on GitHub to learn good analysis strategies. The rule inference can use machine learning techniques.[18] It is also possible to learn from a large amount of past fixes and warnings.[16]
Remediation
Static analyzers produce warnings. For certain types of warnings, it is possible to design and implement automated remediation techniques. For example, Logozzo and Ball have proposed automated remediations for C# cccheck.[19]
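As a minimal sketch of the general idea (and not the verified repair technique of Logozzo and Ball, which targets C#), the following Python code pairs a warning with an automatic fix: comparisons written as `== None` or `!= None` are detected in the syntax tree and rewritten to `is None` or `is not None`. The names `NoneComparisonFixer` and `remediate` are hypothetical. The sketch assumes Python 3.9 or later for `ast.unparse`, and the regenerated source loses comments and original formatting.

```python
import ast


class NoneComparisonFixer(ast.NodeTransformer):
    """Rewrite `x == None` / `x != None` into `x is None` / `x is not None`."""

    def __init__(self):
        self.fixed_lines = []  # line numbers of the comparisons that were repaired

    def visit_Compare(self, node):
        self.generic_visit(node)  # repair nested comparisons first
        for i, (op, right) in enumerate(zip(node.ops, node.comparators)):
            if isinstance(right, ast.Constant) and right.value is None:
                if isinstance(op, ast.Eq):
                    node.ops[i] = ast.Is()
                    self.fixed_lines.append(node.lineno)
                elif isinstance(op, ast.NotEq):
                    node.ops[i] = ast.IsNot()
                    self.fixed_lines.append(node.lineno)
        return node


def remediate(source: str) -> tuple[str, list[int]]:
    """Return the repaired source text and the line numbers that were changed."""
    fixer = NoneComparisonFixer()
    tree = fixer.visit(ast.parse(source))
    return ast.unparse(tree), fixer.fixed_lines


fixed_source, changed = remediate("if x == None:\n    y = 1\n")
print(fixed_source)
print(changed)
```

Even this small fix is not always behaviour-preserving: a class may overload `==` so that comparing with `None` has a special meaning. This is one reason why automated remediation is practical only for certain types of warnings.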
Further reading
- Ayewah, Nathaniel; Hovemeyer, David; Morgenthaler, J. David; Penix, John; Pugh, William (2008). "Using Static Analysis to Find Bugs". IEEE Software. 25 (5): 22–29. doi:10.1109/MS.2008.130. CiteSeerX 10.1.1.187.8985. S2CID 20646690.
- Brian Chess, Jacob West (Fortify Software) (2007). Secure Programming with Static Analysis. Addison-Wesley. ISBN 978-0-321-42477-8.
- Flemming Nielson; Hanne R. Nielson; Chris Hankin (2004-12-10) [1999 (corrected 2004)]. Principles of Program Analysis. Springer. ISBN 978-3-540-65410-0.
- "Abstract interpretation and static analysis," International Winter School on Semantics and Applications 2003, by David A. Schmidt
Notes and References
- Wichmann, B. A.; Canning, A. A.; Clutterbuck, D. L.; Winsbarrow, L. A.; Ward, N. J.; Marsh, D. W. R. (Mar 1995). "Industrial Perspective on Static Analysis". Software Engineering Journal. 10 (2): 69–75. doi:10.1049/sej.1995.0010. Archived 2011-09-27: https://web.archive.org/web/20110927010304/http://www.ida.liu.se/~TDDC90/papers/industrial95.pdf
- Egele, Manuel; Scholte, Theodoor; Kirda, Engin; Kruegel, Christopher (2008-03-05). "A survey on automated dynamic malware-analysis techniques and tools". ACM Computing Surveys. 44 (2): 6:1–6:42. doi:10.1145/2089125.2089126. S2CID 1863333. ISSN 0360-0300.
- Khatiwada, Saket; Tushev, Miroslav; Mahmoud, Anas (2018-01-01). "Just enough semantics: An information theoretic approach for IR-based software bug localization". Information and Software Technology. 93: 45–57. doi:10.1016/j.infsof.2017.08.012.
- "Software Quality Objectives for Source Code". http://web1.see.asso.fr/erts2010/Site/0ANDGY78/Fichier/PAPIERS%20ERTS%202010/ERTS2010_0035_final.pdf
- "Improving Software Security with Precise Static and Runtime Analysis". http://research.microsoft.com/en-us/um/people/livshits/papers/pdf/thesis.pdf
- "Infusion Pump Software Safety Research at FDA". Food and Drug Administration. 2010-09-08. Archived 2010-09-01: https://web.archive.org/web/20100901084658/https://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/GeneralHospitalDevicesandSupplies/InfusionPumps/ucm202511.htm . Retrieved 2010-09-09.
- "Computer based safety systems: technical guidance for assessing software aspects of digital computer based protection systems". Archived from the original (dead link) on January 4, 2013: http://webarchive.nationalarchives.gov.uk/20130104193206/http://www.hse.gov.uk/nuclear/operational/tech_asst_guides/tast046.pdf . Retrieved May 15, 2013.
- "Position Paper CAST-9. Considerations for Evaluating Safety Engineering Approaches to Software Assurance". http://www.faa.gov/aircraft/air_cert/design_approvals/air_software/cast/cast_papers/media/cast-9.pdf
- "Automated Defect Prevention for Embedded Software Quality". VDC Research. 2012-02-01. Archived 2012-04-11: https://web.archive.org/web/20120411211422/http://alm.parasoft.com/embedded-software-vdc-report/ . Retrieved 2012-04-10.
- Prause, Christian R.; Reiners, René; Dencheva, Silviya (2010). "Empirical study of tool support in highly distributed research projects". Global Software Engineering (ICGSE), 2010 5th IEEE International Conference on. IEEE. https://ieeexplore.ieee.org/Xplore/login.jsp?url=%2Fielx5%2F5581168%2F5581493%2F05581551.pdf&authDecision=-203
- Howard, M.; Lipner, S. (2006). The Security Development Lifecycle: SDL: A Process for Developing Demonstrably More Secure Software. Microsoft Press.
- Brucker, Achim D.; Sodan, Uwe (2014). "Deploying Static Application Security Testing on a Large Scale". GI Sicherheit 2014. Lecture Notes in Informatics, 228. GI. pp. 91–101.
- "OMG Whitepaper | CISQ - Consortium for Information & Software Quality". Archived 2013-12-28: https://web.archive.org/web/20131228132152/http://www.omg.org/CISQ_compliant_IT_Systemsv.4-3.pdf . Retrieved 2013-10-18.
- D’Silva, Vijay; et al. (2008). "A Survey of Automated Techniques for Formal Software Verification". Transactions On CAD. Archived 2016-03-04: https://web.archive.org/web/20160304074248/http://www.kroening.com/papers/tcad-sw-2008.pdf . Retrieved 2015-05-11.
- Jones, Paul (2010-02-09). "A Formal Methods-based verification approach to medical device software analysis". Embedded Systems Design. Archived from the original (dead link) on July 10, 2011: https://web.archive.org/web/20110710185427/http://embeddeddsp.embedded.com/design/opensource/222700533 . Retrieved 2010-09-09.
- "Learning from other's mistakes: Data-driven code analysis". www.slideshare.net. 13 April 2015.
- Söderberg, Emma; Church, Luke; Höst, Martin (2021-06-21). "Open Data-driven Usability Improvements of Static Code Analysis and its Challenges". Evaluation and Assessment in Software Engineering. EASE '21. New York, NY, USA: Association for Computing Machinery. pp. 272–277. doi:10.1145/3463274.3463808. ISBN 978-1-4503-9053-8.
- Oh, Hakjoo; Yang, Hongseok; Yi, Kwangkeun (2015). "Learning a strategy for adapting a program analysis via bayesian optimisation". Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA 2015. pp. 572–588. doi:10.1145/2814270.2814309. ISBN 9781450336895. S2CID 13940725.
- Logozzo, Francesco; Ball, Thomas (2012-11-15). "Modular and verified automatic program repair". ACM SIGPLAN Notices. 47 (10): 133–146. doi:10.1145/2398857.2384626. ISSN 0362-1340.