Repair/RepairCompiler/compile.plan

   1
   2 new issues - it is not always possible to precalculate (or even
   3 calculate) an offset for a particular field... this would be the case if
   4 a field relied on a particular structure's fields and that fields'
   5 offset relied upon itself or something beyond it in its own
   6 structure...
   7
   8 compilation plan xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   9
  10 1. determine preconditions (required sets/relations), post conditions
  11    (generated sets), and measures of difficult in checking constraints
  12
  13 2. find best ordering of rule building to minimize the cost of space
  14    and time.
  15
  16 3. build ordering and generate code.
  17
  18
  19
  20 set traversal xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  21
  22 1. when accessing set traversal costs assume that sets do not have
  23    cycles and that set traversal can occur in constant time if necessary.
  24
  25 2. on first run, do a linear space traversal to determine a time bound
  26    on which the set traversal should finish. from then on use a constant
  27    space traversal with a dynamic time bound.
  28
  29
  30
  31 accessing ease of relation building xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  32
  33 1. rank ease of relation building by whether the relation is a field
  34 or a constant time countable function of the fields.
  35
  36
  37
  38 changing lattices xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  39
  40 1. examine how relations are built. for instance, if we can guarantee
  41    that duplicates shouldn't occur, and we might be able to if we use a
  42    hash just in case.
  43
  44 2. if a relation is only used for counting and we are guaranteed not
  45    to double count than just keep the count around (saves massive space)
  46
  47
  48
  49 plan of action xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  50
  51 1. determine rules for generating pre/post conditions and cost measures
  52    for rule building.
  53
  54 2. devise an algorithm to schedule these rule building and constraint
  55    checking mechanisms.
  56
  57 3. be warned about tight coupling of set building and consistency
  58    checking. it may be better to decouple them and set up pre/post
  59    conditions on the consistency checks (per quantifier)
  60
  61 4. test scheduler against file example and compare to hand coded
  62
  63
  64
  65 plan of action revised xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  66
  67 1. perform topological sort of all rules, TS[nodes]
  68
  69 2. mark sets that need to be built in linear space if
  70    a. it appears inside a guard expression or inclusion expr
  71    b. it is marked "build" in the spec (see redundant)
  72    c. if it appears as the range in a relation that is not
  73       statically type safe
  74
  75 3. mark relations that need to be built in linear space if
  76    a. they appear in constraints and are difficult to build
  77    b. a cycle has been found
  78
  79 4. detect singletons and replace with constants and wrap traversals with
  80    'if-else' checks. upon failure define sets that the quantifier
  81    appears in as empty and check as usual
  82
  83 5. for all sets that appear in constraints and relations that need to be
  84    built, sort them topologically by TS and the generate each one in order.
  85
  86 6. pile up set building on top of previous set traversals unless a set
  87    is marked 'build'. then build it and each traversal dependent on build
  88    should begin afresh.
  89
  90 7. at each interface point, assess what can be checked and check it. the
  91    idea is that since we are guaranteeing that everything that needs to
  92    traversed is traversed if we check whenever we can, what we can then
  93    we should end up doing a complete check.
  94
  95
  96
  97 hacks xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  98
  99 1. statically find singleton sets and optimize them out of quantifiers
 100
 101
 102
 103 issues xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 104
 105 1. domain type problems with relations during fast traversal
 106
 107    S(node*);
 108    R(S->S);
 109    [], true => root in S
 110    [forall s in S], s.next != null => <s, s.next> in R
 111    [forall s in S], s.next != null AND s.flag = 1 => s.next in S
 112
 113    the problem here is that if you have a simple linked list,
 114    root(1)->element(0) where the parantheses mark the value of .flag
 115    then the above system will add <root, element> to R but element is
 116    not in S because flag = 0. In brian's system the relations undergo a
 117    post process to determine if any elements exist don't exist in the
 118    domain/range of the relation.
 119
 120    the problem appears when the relation is being quickly traversed (via
 121    pattern matching). now it is possible for a relation to be returned
 122    that is not truly a match. the solution we have determined at this
 123    point is to assume that most specs won't have this problem (its
 124    arguably a problem in the spec!) and to basically do perform these
 125    two static tests:
 126
 127    if [forall s in S], s.next != null => <s, s.next> in R
 128       AND  s.next != null => s.next in S
 129
 130    then R is safe and no fast traversal should return non-elements,
 131    regardless of the state of the data structure. If it can not be
 132    determined whether a relation is type safe, then a linear space
 133    storage of S must be done so that the relations can be
 134    existence-checked against the stored set
 135
 136 2. redundancy in sets may result in slowdown
 137
 138    assume we have a set of file references (set R) which reference files
 139    (set F). These files can contain a large amount of block
 140    references (set B). Now assume that we have one file but thousands of
 141    references to that file. If there a test on the blocks of a file than
 142    there can be a slow down caused by a constant space traversal of the
 143    sets that is disproportiate to the gain seen.
 144
 145    clarification. lets assume that we test that blocks are marked as
 146    used. if sets R and F and B were generated in linear space than the
 147    number of checks done to test the constraint on B would be O(B).
 148
 149    now assume that we traverse R, F, and B in constant space. Because
 150    each reference points to the same file and we have no way of knowing
 151    in constant space whether we are rechecking each file over and over,
 152    we will check the constraints on B O(R*B) times. this is much worse
 153    and represents a jump in the time complexity.
 154
 155    static analysis of the specs will not be able to determine such a
 156    problem as a broken data structure is a perfectly viable input to the
 157    system. therefore it is necessary to have the program be given hints
 158    about the redundancy of structures. If a structure is highly
 159    redundant than that structure should be stored in linear space to
 160    fight the problem of quadratic blow up.
 161
 162    the hint could be in the form of a keyword addition to the space file
 163    language. another option is to allow the checker to dynamically
 164    detect redundancy (how? maybe hash table? maybe sparse linear time
 165    checks) and change the way its builds sets realtime. for simplicity,
 166    the spec hint will be used.
 167
 168 3. how do page accesses translate to incremental model checks?
 169
 170    TBD.
 171
 172