Oink Manifest

For those who may wish to hack on Oink, I attempt here to provide an organized top-down view of the aspects of Oink and the purpose of each file.

Many of the files and even directories mentioned here are generated, so you will not see them until you have built and tested Oink.

NOTE: This is out of date, but mostly by omission. That is, there are files not mentioned here, but the ones that are should be documented correctly.


Text file for pointing people to Doc/index.html.

Describes the copyright and terms of use of Oink.

Documentation directory.

Primary documentation.

This file: a list of all the files in Oink and their purpose.

Requirements for those hacking on Oink: design and syntax considerations.


Converts its command-line flags into name/value pairs. In each file named FILE.in, every '@name@' is replaced by the corresponding 'value' to produce FILE. Also generates config.status, which actually performs the substitution.
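The '@name@' substitution can be illustrated with a minimal C++ sketch. This is not the code of configure.pl or config.status; the function name substituteAtNames is hypothetical, and the choice to leave unknown names untouched is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replace each '@name@' in text with vars[name].
// Assumption: a name that is not in vars is left in the output unchanged.
std::string substituteAtNames(const std::string &text,
                              const std::map<std::string, std::string> &vars) {
  std::string out;
  std::size_t pos = 0;
  while (pos < text.size()) {
    std::size_t open = text.find('@', pos);
    std::size_t close =
        (open == std::string::npos) ? std::string::npos : text.find('@', open + 1);
    if (close == std::string::npos) {
      // No complete '@...@' pair remains: copy the rest verbatim.
      out.append(text, pos, std::string::npos);
      break;
    }
    out.append(text, pos, open - pos);
    auto it = vars.find(text.substr(open + 1, close - open - 1));
    if (it != vars.end()) {
      out += it->second;  // known name: emit its value
      pos = close + 1;
    } else {
      out += '@';         // unknown name: keep the '@' and rescan after it
      pos = open + 1;
    }
  }
  return out;
}
```

For instance, with the pair CXX/g++, the line "CXX := @CXX@" from a FILE.in would become "CXX := g++" in FILE.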

Shell script for calling configure.pl.

Actually do the substitution of names for values in files to be configured.

Print out the name/value pairs chosen at configuration time.

Top level makefiles

A makefile for giving an Oink demo.

Makefile Makefile.in
The top level driver makefile; implements all of the features usually provided by a makefile, such as building and testing. It includes (possibly indirectly) many makefiles suffixed with '.incl.mk'.

Build.incl.mk Build.incl.mk.in
Included makefile for building everything.


Directory containing tests and temporary files generated during testing.

An included makefile for testing everything.

A script for doing a controlled experiment test: generating two very similar test inputs from one annotated input such that the difference between them is minimal and yet should make the difference in whatever feature is being tested.


Elsa is Scott McPeak's C++ front-end which Oink uses. Oink builds its own version of elsa, so its build process is subtle: we re-use much of elsa as-is but modify some of it slightly. We rebuild elsa by compiling most of the elsa/*.cc files in oink, so you will notice lots of .o and .d files in oink that correspond to .cc files in elsa rather than in oink.

A generated directory which contains softlinks to the header files in Elsa that we need. Our build process also builds our own version of a modified elsa and we use -IElsaHeaders instead of -I../elsa on the compile line in order to carefully exclude certain files in elsa.

One of many .tok files that are the single-source of information about the set of lexing tokens. Many files are generated from the collection of .tok files.

cc_tokens.cc cc_tokens.h cc_tokens.ids
Files generated from the collection of .tok files for implementing various features of language tokens.

cc.ast.gen.cc cc.ast.gen.h
The C++ implementation of the Elsa/Oink/Cqual++ AST which is generated from a collection of .ast files by Scott McPeak's astgen tool (which lives in the ../ast directory).

cc_type.h cc_type.h.cpatch
The base C++ type system from elsa.
Note that oink/FILE.cpatch is used as a kind of diff file by ../smbase/codepatch.pl to generate oink/FILE from elsa/FILE in a similar way to the utility 'patch'.

cc_print.cc cc_print.cc.cpatch cc_print.h cc_print.h.cpatch
A module for pretty-printing C++ AST and Types. It is extended by the value_print module to print abstract values and further by qual_value_print to print qualifier annotations.
Note that oink/FILE.cpatch is used as a kind of diff file by ../smbase/codepatch.pl to generate oink/FILE from elsa/FILE in a similar way to the utility 'patch'.

The version of elsa that we build, minus the main() function module. It isn't really necessary that we build an archive; it just makes the build process a little cleaner.

archive_srz_test.cc elsa_tests.gen.incl.mk elsa_tests_get elsa_tests_omit.files
Filters out the tests from elsa/regrtest that we want to use in oink. That is, elsa_tests_get generates elsa_tests.gen.incl.mk from ../elsa/regrtest by subtracting files in elsa_tests_omit.files. This mechanism is so that oink can automatically use elsa tests as they are added to elsa unless one of them is breaking oink. This generation step is not done automatically anywhere; it is just for someone to run when they feel like going through all the errors that may come up. The resulting elsa_tests.gen.incl.mk is checked in.

One of the elsa debugging mechanisms is that a class may request to have each object get a serial number by inheriting from class SerialBase. The generated file .serialno keeps these unique (up to integer rollover) even across multiple calls to Elsa/Oink. This is handy for debugging serialized files as the state persists.
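A minimal sketch of the serial-number idea follows. It is not Elsa's SerialBase: the class name SerialCounted, the loadCounter/saveCounter functions, and the one-integer text format are all hypothetical, chosen only to show how a counter that is saved and reloaded keeps numbers unique across runs.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the idea behind SerialBase; not Elsa's code.
class SerialCounted {
public:
  SerialCounted() : serialNumber(nextSerial++) {}
  // A copy is a distinct object, so it takes a fresh number.
  SerialCounted(const SerialCounted &) : serialNumber(nextSerial++) {}
  // Assignment keeps the target's own number.
  SerialCounted &operator=(const SerialCounted &) { return *this; }

  int serial() const { return serialNumber; }

  // Assumed persistence format: a single integer, the next number to hand out.
  static void loadCounter(std::istream &in) {
    int n;
    if (in >> n) nextSerial = n;
  }
  static void saveCounter(std::ostream &out) { out << nextSerial << '\n'; }

private:
  const int serialNumber;
  static int nextSerial;
};

int SerialCounted::nextSerial = 1;
```

A run would call loadCounter on the saved state at startup and saveCounter at exit, so that objects created in the next run continue numbering where this one stopped.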


Oink provides an empty-analysis that just builds from the infrastructure but does nothing.

oink_cmd.cc oink_cmd.h oink_cmd_util.h
Processes command-line arguments for the oink empty-analysis proper. Other tools inherit from it and re-use the command-line flags and their processing.

oink_file.cc oink_file.h
Implement handling input files for analysis.

oink_control.cc oink_control.h
Implement oink control files. Examples are in Test/*.ctl

oink_global.cc oink_global.h
All of Oink's global variables in one place.

Some basic utilities for reporting user errors with line numbers and other standard functionality.


Extensions oink makes to the token set. This file is currently empty as the oink empty analysis proper need make no extension to the token set; this file exists just for completeness.

Extensions oink makes to the lexer. This file is currently empty as the oink empty analysis proper need make no extension to the lexing rules; this file exists just for completeness.

Generated lexer file for lexing the oink extensions to elsa.

Oink lexer generated by flex.

oink.gr oink.gr.gen.cc oink.gr.gen.h oink.gr.gen.out
oink.gr is a grammar specification file for the modifications that oink makes to the elsa grammar. The other files are generated from the collection of .gr files by elkhound.

An AST specification file for the modifications that oink makes to Elsa's AST.

Implementation of functions declared in the oink.ast file.

oink_type.cc oink_type.h oink_type_children.cc oink_type_children.h oink_var.cc oink_var.h
Extensions/modifications made to the oink typesystem. Note that oink_type.h is a strange header file that inserts a class into the middle of the Elsa Type hierarchy; this simple preprocessor trick prevents the need for multiple inheritance; do not include this header file directly, instead include cc_type.h. You will get a compile-time error if you neglect to do this.

oink_tcheck.cc oink_tcheck_env.cc oink_tcheck_env.h oink_tcheck_env_clbk.cc oink_tcheck_env_clbk.h
Extensions and modifications by oink to the typechecker; for processing the nodes added by oink.ast.

oink_integrity.cc oink_integrity.h
Check the integrity of the AST after parsing, typechecking, and elaboration. Not to be confused with elsa/integrity.h.

oink.h oink.cc
The primary functionality that oink proper (not including dataflow or qual) adds to elsa. Provides implementation of each stage of processing, but leaves it to a main() function to call those methods. Much here can be re-used by other analyses by inheritance.

The main() function for the oink empty-analysis proper.

The oink executable. Uses the infrastructure and implements the empty analysis. Useful for testing that Oink's modifications to elsa haven't broken it.

Included makefile for testing the oink empty analysis proper which exhibits the basic functionality shared among all the analysis tools.

[Abstract] Value

Any analysis that concerns itself with dataflow needs a notion of an Abstract Value. Values are isomorphic to the "constructed" subset of the Elsa typesystem. These are the types that are transparent in the sense that two are "the same type" exactly when they have the same internal structure. They are also the types that can take a const or volatile qualifier. Values differ from Types in that two int expressions have the same Type but different [Abstract] Values: each Value annotates exactly one expression, whereas Types are re-used.
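The relationship between shared Types and per-expression Values can be pictured with a toy model. The names ToyType, ToyAbstractValue, and ToyExpr are hypothetical and far simpler than the real classes in cc_type.h and value.h; the sketch only shows the sharing pattern.

```cpp
#include <cassert>
#include <string>

// Hypothetical toy types; not the Elsa Type or Oink Value classes.
struct ToyType {
  std::string name;
};

struct ToyAbstractValue {
  const ToyType *type;  // the shape this value mirrors
};

struct ToyExpr {
  std::string text;
  const ToyType *type;     // shared: many expressions point at one Type
  ToyAbstractValue value;  // private: exactly one Value per expression

  ToyExpr(const std::string &t, const ToyType *ty)
      : text(t), type(ty), value{ty} {}
};
```

Two expressions built from the same ToyType compare equal by type pointer, yet each holds its own ToyAbstractValue, which is what lets an analysis attach different facts to each expression.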

Extensions to the AST to allow for Value annotations to the AST.

value.h value.cc
An Abstract Value class.

value_ast_visitor.cc value_ast_visitor.h
An AST visitor that annotates a typechecked AST with Values.

value_print.cc value_print.h
A class for printing Values; inherits from elsa class TypePrinter.

XML serialization

Elsa provides XML serialization functionality (thanks to me); we augment it here so that Oink and Qual do as well.

xml_ast.gen.tokens xml_ast_reader_0decl.gen.h xml_ast_reader_1defn.gen.cc xml_ast_reader_2ctrc.gen.cc xml_ast_reader_3regc.gen.cc
Files generated by astgen from the collection of .ast files that specify how to lex (.tokens file) and parse (the rest of the files) XML.

Tokens used in the XML serialization language for the Value classes.

xml_enum_1.gen.h xml_lex_1.gen.lex xml_name_1.gen.cc
Files generated from the collection of .tokens files (mostly in elsa) by ../elsa/token.pl for use in XML lexing and parsing.

xml_lex.gen.lex xml_lex.gen.yy.cc
Respectively the generated lex file and then the XML lexer generated from that by flex.

xml_value_reader.cc xml_value_reader.h
Specialization of the ReadXml framework that reads in XML for serialized Values.

xml_value_writer.cc xml_value_writer.h
Implements a simple DFS XML writer for Values. Writing is always easier than reading.

Script for canonicalizing XML output. Implements an equivalence relation such that an xml file should be xml_canonicalize-equivalent to the file you get by reading it in and serializing it out again. That is, reading then writing XML should be idempotent on the set of XML files modulo xml_canonicalize.


Classes for serializing and deserializing groups of files into an archive.

archive_srz.cc archive_srz.h



The primary value that Oink provides over Elsa is a whole-program dataflow analysis.

dataflow_visitor.cc dataflow_visitor.h dataflow_cpdinit_clbk.cc dataflow_cpdinit_clbk.h
An AST Visitor that computes an instance-sensitive, polymorphic, non-flow-sensitive, non-path-sensitive, expression-granularity dataflow graph on a Translation Unit AST. When a pair of expressions is found between which data flows, the pair is handed off to the dataflow_ex module below.

dataflow_ex.cc dataflow_ex.h
Insert a single expression-level edge by driving a call with the appropriate arguments on the underlying class DataFlowTy in the dataflow_ty module below.

dataflow_ty.cc dataflow_ty.h
Given two top-level Value trees between which data should flow, makes the dataflow calls between individual [Abstract] Value nodes.
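As a rough picture of turning one tree-level flow into node-level flows, here is a simplified sketch. ToyValue, FlowEdge, and flowTrees are hypothetical names; the sketch assumes both trees have the same shape and pairs nodes purely by position, ignoring any per-node-kind rules the real dataflow_ty module may apply.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, much-simplified abstract value: a label plus child values
// (for example, the pointee of a pointer). Not the Oink Value class.
struct ToyValue {
  std::string label;
  std::vector<std::unique_ptr<ToyValue>> children;
};

using FlowEdge = std::pair<const ToyValue *, const ToyValue *>;

// Record an edge src -> dst, then recurse into corresponding children.
// Assumption: src and dst have the same tree shape.
void flowTrees(const ToyValue &src, const ToyValue &dst,
               std::vector<FlowEdge> &edges) {
  assert(src.children.size() == dst.children.size());
  edges.emplace_back(&src, &dst);
  for (std::size_t i = 0; i < src.children.size(); ++i) {
    flowTrees(*src.children[i], *dst.children[i], edges);
  }
}
```

Given a pointer value p with pointee *p and a pointer value q with pointee *q, a single call flowTrees(p, q, edges) records the two node-level edges p -> q and *p -> *q.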

Lib: utility library

A library for utilities, such as useful non-standard container classes.

Hashable-based implementation of the union find algorithm.
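For readers unfamiliar with the algorithm, a hash-based union-find can be sketched as below. This is a generic illustration built on std::unordered_map, not the Lib implementation; the class name HashUnionFind and its member names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical hash-based union-find; names are illustrative, not Lib's API.
template <typename T>
class HashUnionFind {
public:
  // Return the representative of x's class, creating a singleton class
  // the first time x is seen.
  T find(const T &x) {
    auto it = parent.find(x);
    if (it == parent.end()) {
      parent.emplace(x, x);
      return x;
    }
    if (it->second == x) return x;
    T next = it->second;  // copy before recursing
    T root = find(next);
    parent[x] = root;     // path compression
    return root;
  }

  // Merge the classes of a and b; a's representative absorbs b's.
  void unite(const T &a, const T &b) {
    T ra = find(a);
    T rb = find(b);
    if (!(ra == rb)) parent[rb] = ra;
  }

  bool equivalent(const T &a, const T &b) { return find(a) == find(b); }

private:
  std::unordered_map<T, T> parent;  // element -> parent; a root maps to itself
};
```

Any key type with a std::hash specialization and operator== works, which is the sense of "hashable" assumed here.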

A data-structure I have never had a need for before nor have I ever seen in any book or container library:

The invariant is that the map is well-defined when the domain is modded out by the equivalence-relation. That is, the pull-back of anything in the range of the map is always in the same equivalence class in the relation.
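One way to read this invariant is as a map whose keys are equivalence classes rather than individual elements. The sketch below is a guess at that shape, not the union_find_map code: EquivClassMap and its members are hypothetical, and the policy used when two classes that both carry a value are merged (the surviving representative's value wins) is an assumption.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical map keyed by equivalence classes; not Lib's union_find_map.
template <typename K, typename V>
class EquivClassMap {
public:
  // Representative of k's class; an unseen key represents itself.
  K find(const K &k) {
    auto it = parent.find(k);
    if (it == parent.end() || it->second == k) return k;
    K next = it->second;
    K root = find(next);
    parent[k] = root;  // path compression
    return root;
  }

  // Store v for the whole class of k.
  void put(const K &k, const V &v) { values[find(k)] = v; }

  // Value of k's class, or nullptr if the class has none.
  const V *get(const K &k) {
    auto it = values.find(find(k));
    return it == values.end() ? nullptr : &it->second;
  }

  // Merge the classes of a and b. Assumed policy: if both classes have a
  // value, the one stored under a's representative is kept.
  void unite(const K &a, const K &b) {
    K ra = find(a);
    K rb = find(b);
    if (ra == rb) return;
    parent[rb] = ra;
    auto lost = values.find(rb);
    if (lost != values.end()) {
      V moved = lost->second;
      values.erase(lost);
      values.emplace(ra, moved);  // no effect if ra already has a value
    }
  }

private:
  std::unordered_map<K, K> parent;  // non-root element -> parent
  std::unordered_map<K, V> values;  // representative -> value
};
```

Because values are stored only under representatives, every member of a class sees the same value, which is the well-definedness property described above.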

Lib/union_find_test Lib/union_find_test.cc Lib/LibTest.incl.mk
Code and makefile for testing union_find. The union_find_map module is only tested by its usage in oink.

LibCpdInit: compound initializers library

AST-agnostic and Typesystem-agnostic library for computing the dataflow resulting from a C99 compound initializer expression. Unfortunately, it works neither in C++ nor in instance-sensitive mode. Has been tested against the CIL implementation by George Necula. If you are writing some other C++ front-end, this module is designed to be re-usable in it, which is why it is a separate directory.

Readme just for LibCpdInit.

A script that announces there is no configuration to be done.

For testing purposes, build a vacuous LibCpdInit against its own cpdinit_lib.h which specifies what the compiler writer would have to provide.

Documentation in the syntax of an elkhound grammar for the subset of the AST that LibCpdInit knows about; again, this is just documentation. Note that LibCpdInit will work with any grammar if you provide the right interface.

LibCpdInit/cpdinit.cc LibCpdInit/cpdinit.h
Process one Value/compound-initializer pair, generating dataflow edges.

LibCpdInit/member_type_iter.cc LibCpdInit/member_type_iter.h
An iterator used by the cpdinit module to keep track of where it is in the compound initializer.

A vacuous header file that shows what needs to be provided by a compiler writer to use LibCpdInit. Using the Makefile, LibCpdInit will build against this header file.

Compound initializers testing

The Oink headers that LibCpdInit needs to build.

cpdinit_test_visitor.cc cpdinit_test_visitor.h cpdinit_test_clbk.cc cpdinit_test_clbk.h cpdinit_test.cc cpdinit_test.h cpdinit_test_main.cc cpdinit_test
Visitor, callback, and testing code for generating the cpdinit_test executable which tests LibCpdInit.

Included makefile for testing LibCpdInit.


Static Printer: A sample Oink tool that shows how to query the AST and typesystem; for now it just prints the inheritance graph. If there is something you always wanted a tool to tell you about your raw program, implement it as a feature here and send it to me.

staticprint_cmd.cc staticprint_cmd.h
Process command-line arguments.

staticprint.cc staticprint.h
Primary functionality.

staticprint_global.cc staticprint_global.h
All the globals for staticprint in one place.

staticprint staticprint_main.cc
Just the main() function.

Included makefile for testing.


Control Flow Graph Printer: A sample Oink tool that shows you how to access the intRA-procedural control flow graph (provided by elsa). For now we just print it out.

cfgprint_cmd.cc cfgprint_cmd.h
Process command-line arguments.

cfgprint.cc cfgprint.h
Primary functionality.

cfgprint_global.cc cfgprint_global.h
All the globals for cfgprint in one place.

cfgprint cfgprint_main.cc
Just the main() function.

Included makefile for testing.


Data Flow Graph Printer: A sample Oink tool that shows you how to access the intER-procedural data flow graph, provided by the dataflow_* modules. For now we just print it out.

dfgprint_cmd.cc dfgprint_cmd.h
Process command-line arguments.

dfgprint.cc dfgprint.h
Primary functionality.

dfgprint_global.cc dfgprint_global.h
All the globals for dfgprint in one place.

dfgprint dfgprint_main.cc
Just the main() function.

Included makefile for testing.


Qualifier analysis: The flagship Oink tool that hooks Scott McPeak's C++ front-end elsa, the oink dataflow functionality, and Rob Johnson's libqual backend together into a polymorphic qualifier analysis for C and C++.

qual_cmd.cc qual_cmd.h
Process command-line arguments.

qual_global.cc qual_global.h
All globals in one place.

Qual extensions to the lexing tokens.

Qual extensions to the lexer.

Generated lexer file for lexing the qual extensions to elsa.

Qual lexer generated by flex.

qual.gr qual.gr.gen.cc qual.gr.gen.h qual.gr.gen.out
qual.gr is a grammar specification file for the modifications that qual makes to the elsa grammar (beyond those of oink). The other files are generated from the collection of .gr files by elkhound.

An AST specification file for the modifications that qual makes to Elsa's AST (beyond those of oink).

qual_ast_aux.cc qual_ast_aux.h
Implementation of functions declared in the qual.ast file.

qual_literal.cc qual_literal.h
Implements 1) representing the syntax and 2) providing the semantics of qualifier literals such as $tainted.

qual_funky.cc qual_funky.h
Implementation of the polymorphic ("funky") qualifier literals syntax. (The word "polymorphic" is seriously overloaded in the world of programming languages.) An example of the syntax:

   char $_1 * strchr(const char $_1 * s, int c);

qual_value_children.cc qual_value_children.h
Qual versions of the Value classes, such as this:

   class CVAtomicValue_Q : public CVAtomicValue

Qual extensions to the oink variable system. These functions would go on a "qual annot variable" class except that there is no extra state so no such extra class is necessary.

qual_annot.cc qual_annot.h
Annotations onto the *Value_Q types that hold the libqual backend annotations. The idea is to keep the libqual backend implementation of the oink/qual_* modules somewhat separated from oink/qual_* itself.

qual_dataflow_visitor.cc qual_dataflow_visitor.h qual_dataflow_ex.cc qual_dataflow_ex.h qual_dataflow_ty.cc qual_dataflow_ty.h
Qual subclasses of the respective dataflow modules; these classes inherit from the generic dataflow and modify its behavior for purposes of the qual analysis.

qual_value_print.cc qual_value_print.h
Modifications to the Type/Value pretty-printing process to implement the -fq-print-trans-qual functionality. A 'type printer' for oink [Abstract] Values that will also print the conclusions of the backend as to the qualifiers that apply at each point.

qual_xml_value_reader.cc/.h qual_xml_value_writer.cc/.h
Subclasses of xml_value_reader.h and xml_value_writer.h respectively that serialize the *Value_Q and Variable_Q classes instead of their Oink versions. These qual classes do not actually have additional data, so the serialization just renames the tags and re-uses the oink serialization.

qual_libqual_iface.cc qual_libqual_iface.h
An interface to the libqual polymorphic qualifier backend.

qual.cc qual.h
The primary functionality that qual adds to oink. Provides implementation of each stage of processing, but leaves it to a main() function to call those methods.

Just the main() function for the qual analysis proper.

The qual executable.

Qual lattice file that implements a $tainted analysis that survives through casts.

File generated by the -fq-print-quals-graph flag by the libqual backend.

Filter the quals.dot output of the backend so that the resulting graph consists only of the connected component containing a single given root node.
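The filtering step amounts to a graph search from the root. The sketch below works on an in-memory edge list rather than on a .dot file and is not the actual filter script; Edge and componentOf are hypothetical names, and treating edges as undirected when deciding what is connected is an assumption.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>;

// Hypothetical helper: keep only the edges that lie in the connected
// component of root. Assumption: edges count as undirected for reachability.
std::vector<Edge> componentOf(const std::vector<Edge> &edges,
                              const std::string &root) {
  // Build an undirected adjacency list.
  std::map<std::string, std::vector<std::string>> adj;
  for (const Edge &e : edges) {
    adj[e.first].push_back(e.second);
    adj[e.second].push_back(e.first);
  }

  // Breadth-first search from root.
  std::set<std::string> seen{root};
  std::queue<std::string> work;
  work.push(root);
  while (!work.empty()) {
    std::string node = work.front();
    work.pop();
    for (const std::string &next : adj[node]) {
      if (seen.insert(next).second) work.push(next);
    }
  }

  // An edge is in the component exactly when one endpoint was reached,
  // because the other endpoint is then reached as well.
  std::vector<Edge> kept;
  for (const Edge &e : edges) {
    if (seen.count(e.first)) kept.push_back(e);
  }
  return kept;
}
```

The kept edges preserve their original order and direction, so the result can be printed back out in the same form as the input.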

qual_parse_test.incl.mk qual_result_test.incl.mk qual_serialization_test.incl.mk qual_test.incl.mk
Included makefiles for testing the qual tool.