Features of Elsa

The elsa documentation is extensive; I simply point out a few useful features that users of Oink/Cqual++ should know about.

Features You Need to Ask elsa/ccparse For Directly

The elsa frontend is oink-stack/elsa/ccparse. The feature of printing the typechecked AST is provided by Elsa and can be used directly.

  oink-stack/elsa/ccparse -tr printTypedAST foo.cc

In general, many "-tr" ("tracing") flags that work for elsa also work for the oink and qual tools; be sure to provide them separately, as "-tr blort -tr gronk", rather than in the comma-separated form "-tr blort,gronk", which only Elsa accepts.
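
If you drive the tools from a script, the difference between the two forms is easy to paper over. Here is a minimal sketch in Python; the helper name split_tracing_flags is my own invention for illustration and is not part of Elsa or Oink.

```python
def split_tracing_flags(spec):
    """Turn an Elsa-style 'a,b' tracing spec into separate '-tr' arguments."""
    args = []
    for flag in spec.split(","):
        flag = flag.strip()
        if flag:  # ignore empty pieces such as a trailing comma
            args.extend(["-tr", flag])
    return args


# The comma form that only Elsa accepts, rewritten in the separate form.
print(split_tracing_flags("blort,gronk"))  # ['-tr', 'blort', '-tr', 'gronk']
```

The resulting list can be spliced into an argument vector for ./oink or ./qual.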

Note, however, that flags implemented in elsa/main.cc are not honored by oink or qual: those tools have their own main() function and so replace that module with their own. The following introspection flags are implemented by Elsa only; oink has no equivalent command-line flags, and, as just mentioned, passing them as -tr flags to ./oink does not work either.

  -tr parseTree; print the parse tree; a bit obscure to read
  -tr printHierarchies; print the class hierarchies

Reading and Writing the AST and Typesystem as XML

I implemented reading and writing of the Elsa AST and Typesystem so that we could have serialization in Cqual++. You can get an XML view of the AST and optionally the Typesystem.

  -tr xmlPrintAST; print the AST as XML
  -tr xmlPrintAST-types; include type annotations in the XML
  -tr xmlPrintAST-indent; indent the output (slower)

The input file oink-stack/elsa/in/t0001.cc:

    // very simple

    int x;

Below is the command, followed by the start of its 9258 lines of XML output. You can see the line length map for the file and the beginnings of the declaration of an int.

    ./ccparse -tr c_lang -tr xmlPrintAST,xmlPrintAST-types,xmlPrintAST-indent in/t0001.cc
...
    <File _id="FI0x83DBE20"
     name="in/t0001.cc"
     numChars="23"
     numLines="4"
     lineLengths="FI0x83DE408">
     <LineLengths _id="FI0x83DE408">
    14 0 6 0
     </LineLengths>
    </File>
    <TranslationUnit _id="AST0x83E6CB8"
     topForms="AL0x83E6CB8">
     <List_TranslationUnit_topForms _id="AL0x83E6CB8">
      <_List_Item item="AST0x83E6D40">
       <TF_decl _id="AST0x83E6D40"
        loc="in/t0001.cc:3:1"
        decl="AST0x83E6CE0">
        <Declaration _id="AST0x83E6CE0"
         dflags="(extern "C")"
         spec="AST0x83E6CC8"
         decllist="FL0x83E6D18">
         <TS_simple _id="AST0x83E6CC8"
          loc="in/t0001.cc:3:1"
          cv=""
          id="int">
         </TS_simple>
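
The numbers in the File element can be checked by hand. The sketch below is Python, not Elsa's code, and it assumes that in/t0001.cc contains exactly the three lines shown above, unindented and with a trailing newline; under that assumption it reproduces the numChars, numLines, and LineLengths values in the output.

```python
def line_lengths(text):
    """Return the length of each line, not counting the newline itself."""
    return [len(line) for line in text.split("\n")]


# Assumed contents of in/t0001.cc (the indentation above is only layout).
source = "// very simple\n\nint x;\n"
lengths = line_lengths(source)

# numChars, numLines, and the line length map, in that order.
print(len(source), len(lengths), lengths)  # 23 4 [14, 0, 6, 0]
```

Note that the text after the final newline counts as a fourth, empty line, which is why the map ends in 0.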

Elsa will parse it back in again too.

  -tr parseXml; parse XML written in the same schema we output

Oink/Cqual++ makes use of this to do serialization and de-serialization, so if you analyze and serialize a file foo.i using Cqual++ and look in the resulting foo.qdir/value directory you will see the same XML augmented with serialization for the Abstract Value annotations as well. I just didn't make a command line flag to print it to standard out.

XML makes a handy debugging output format since it is reasonably readable and is guaranteed to be complete since we use it for serialization. When following object ids as links in the XML you may find it handy to use the emacs C-s C-w feature for searching for whatever is in front of the cursor instead of attempting to type in those ids manually.
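
If you are not in emacs, a few lines of script can follow an id for you. This is a minimal sketch in Python; the function name find_id is hypothetical, and it searches the text line by line rather than using an XML parser, on the assumption that the indented output puts one attribute on each line as in the sample above.

```python
def find_id(xml_text, object_id):
    """Return (definition_lines, reference_lines) for object_id, 1-based."""
    quoted = '="' + object_id + '"'
    definitions, references = [], []
    for number, line in enumerate(xml_text.splitlines(), start=1):
        if "_id" + quoted in line:
            definitions.append(number)  # the element that owns this id
        elif quoted in line:
            references.append(number)   # an attribute pointing at the id
    return definitions, references


# A few lines taken from the output shown earlier.
sample = '''<TF_decl _id="AST0x83E6D40"
 loc="in/t0001.cc:3:1"
 decl="AST0x83E6CE0">
 <Declaration _id="AST0x83E6CE0"
  spec="AST0x83E6CC8"
  decllist="FL0x83E6D18">
'''
print(find_id(sample, "AST0x83E6CE0"))  # ([4], [3])
```

Here the decl attribute on line 3 is the link, and the Declaration element on line 4 is its target.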

The lowered view of the AST is basically the AST after templates have been instantiated, implicit syntax made explicit, and overloading resolved; it is the version that you want to look at if you are analyzing or compiling the code. See the elsa documentation in oink-stack/elsa/doc for more details.

  -tr xmlPrintAST-lowered; give a "lowered" view of the AST

Hashline directives

If there are # 123 "blort.c" ("hashline") directives in your blort.i file (which there will be by default), each one means that the line following it in blort.i came from line 123 of blort.c. The elsa source location module parses these directives, builds an internal map back to the original .c locations, and uses this map to report locations relative to the original file's line numbers instead of the line numbers in blort.i.

foo.c:

    #define BOINK b
    int main() {
      BOINK;
    }
    $ gcc -E foo.c > foo.i

foo.i:

    # 1 "foo.c"
    # 1 "<built-in>"
    # 1 "<command line>"
    # 1 "foo.c"

    int main() {
      b;
    }

Note the change in the file name and line number reported for the error.

    $ ./oink foo.i
    foo.c:3:3: error: there is no variable called `b'
    . . .
    $ ./oink -tr nohashline foo.i
    foo.i:7:3: error: there is no variable called `b'
    . . .

Use this flag to turn that behavior off and get the locations reported relative to the raw input file. It works on both elsa and oink.

  -tr nohashline; don't use the preprocessor hashline directives
      to map back to the original file
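
To make the hashline mapping concrete, here is a minimal sketch of the idea in Python. It is not Elsa's implementation: the function build_hashline_map and the regular expression are my own, and it only handles the simple directive form shown in foo.i above.

```python
import re

# Matches directives such as:  # 123 "blort.c"  (trailing flags are ignored)
HASHLINE = re.compile(r'^#\s*(?:line\s+)?(\d+)\s+"([^"]*)"')


def build_hashline_map(preprocessed_text, raw_name):
    """Map each 1-based line of the .i file to (original file, original line)."""
    mapping = {}
    cur_file, cur_line = raw_name, 1
    lines = preprocessed_text.splitlines()
    for raw_line_no, line in enumerate(lines, start=1):
        match = HASHLINE.match(line)
        if match:
            # The directive gives the location of the line that follows it.
            cur_line, cur_file = int(match.group(1)), match.group(2)
            continue
        mapping[raw_line_no] = (cur_file, cur_line)
        cur_line += 1
    return mapping


foo_i = '''# 1 "foo.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "foo.c"

int main() {
  b;
}
'''
location = build_hashline_map(foo_i, "foo.i")
print(location[7])  # ('foo.c', 3)
```

Line 7 of foo.i maps back to line 3 of foo.c, which matches the two error messages above; skipping the lookup corresponds to what -tr nohashline does.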

Lexer output

If you just want to see the token stream from the lexer, use this:

    ./tlexer -tr tokens in/t0001.cc
    %%% progress: 0ms: making Lexer
    %%% progress: 1ms: lexing in/t0001.cc...
    in/t0001.cc:3:1: int
    in/t0001.cc:3:5: <name>: x
    in/t0001.cc:3:6: ;
    %%% progress: 7ms: done lexing (4 ms)

Using the ubiquitous 'gdb()' method

AST nodes, types, and variables have a gdb() method that dumps a human-readable representation of the object, which is helpful when traversing the object graph in gdb. Just say 'print x->gdb()' at the gdb command prompt.

A .gdbinit file such as the following is a good way to set this up.

    file ./oink
    set args ../elsa/in/t0001.cc
    break main
    break breaker
    run

It works on AST nodes.

    Breakpoint 3, ValueASTVisitor::visitDeclarator(Declarator*) (this=0xbffff650,
        decltor=0x8379958) at value_ast_visitor.cc:33
    (gdb) p decltor->gdb()
    tree = Declarator:
      var: (global) (definition) int x
      context = DC_TF_DECL
      decl = D_name:
        loc = ../elsa/in/t0001.cc:3:5
        name = PQ_name:
          loc = ../elsa/in/t0001.cc:3:5
          name = "x"
      init is null
      ctorStatement is null
      dtorStatement is null
    $1 = void

It works on Variables.

    (gdb) p decltor->var->gdb()
    int x
    $2 = void

It works on Types.

    (gdb) p decltor->var->type->gdb()
    int
    $3 = void

Serial numbers

Tracking dependencies through the heap can be a big pain: Sometimes you have

When built with USE_SERIAL_NUMBERS, any object whose class inherits from class SerialBase will get an 'int serialNumber' field; these are currently: (elsa) Scope, the hierarchy of Types, Variable, and (oink) Value. You can use the serial number mechanism to go "backward in time" in gdb as follows:

To do this for two objects, just create two breakpoints at the same place in serialno.cc and condition each breakpoint on its respective serial number.

One instance of the Cqual++ tool can create objects and serialize them to a file, and then another process instance can read them back in for further analysis. Therefore it is possible to find a corrupted object that was created in a process that no longer even exists! Elsa therefore consults a file .serialno in the local directory to find out where to start the first serial number. To debug multi-process sequences such as the one just described, just do as follows.