Repair/RepairCompiler/compile.plan

   1
   2 MARTIN's IDEA ************************************************* 7/9/03
   3 use type safe language and check only on reads (when data is accessed)
   4 check consistency properties only when the porperty needs to be
   5 consistent... only when it needs the property to be true
   6 **********************************************************************
   7
   8 issue: what if you are building the model and you are adding elements to
   9 a relation and the sets that compose those relations don't exist yet?
  10 hmm... maybe the question boils down to a question of whether the items
  11 being added to a relation are implied in the addition... that is if we
  12 have:
  13
  14 R: S->S
  15 condition(a) => <a, f(a)> in R
  16
  17 then if condition(a) => b in S then we never need to check for that
  18 relations (R)'s type... we only need to check if condition(a) => b in S
  19 hold ups (statically).
  20
  21 if not, in this case the question is whether its better to do
  22 a post exhaustive step or a per addition check ... in the per addition
  23 step we will obviously need to ensure that the sets that compose the
  24 domain and range of a relation are built prior to the rules that add
  25 elements to relations (i'm thinking that its better to do a per addition
  26 check because the post exhaustive step seems just as expensive and
  27 messier!)
  28
  29 question still remains though: do we need to build the sets prior?
  30 well... of course we do unless we want the worklist to also trigger
  31 rules based upon the additions to the sets of the respective
  32 relations... of course we would only be interested if a particular
  33 relation addition was blocked because of such a check...
  34
  35 what do we do now? well we don't even properly include required
  36 constraints in the required symbol list... that is being changed right
  37 now ... we'll have to compare this with the older graph
  38
  39 ok... the answer is that the graph changes once we make this change
  40 ... i'm checking out if cycles appear... i'm also going to consider the
  41 problem of not having a well defined relation domain/range
  42 (i.e. integers rather than a strictly typed set)
  43
  44 ------------------------------------------------------------------------
  45
  46 1. check at compile time how relations are used. if they are used in
  47    both directions (meaning inverse and otherwise) generate both
  48    simultaneously so that checks are fast .... otherwise only generate
  49    one of them that is used... code for generation should automatically
  50    handle the necessary switches and double-adding...
  51
  52
  53 new issues - it is not always possible to precalculate (or even
  54 calculate) an offset for a particular field... this would be the case if
  55 a field relied on a particular structure's fields and that fields'
  56 offset relied upon itself or something beyond it in its own
  57 structure...
  58
  59 compilation plan xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  60
  61 1. determine preconditions (required sets/relations), post conditions
  62    (generated sets), and measures of difficult in checking constraints
  63
  64 2. find best ordering of rule building to minimize the cost of space
  65    and time.
  66
  67 3. build ordering and generate code.
  68
  69
  70
  71 set traversal xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  72
  73 1. when accessing set traversal costs assume that sets do not have
  74    cycles and that set traversal can occur in constant time if necessary.
  75
  76 2. on first run, do a linear space traversal to determine a time bound
  77    on which the set traversal should finish. from then on use a constant
  78    space traversal with a dynamic time bound.
  79
  80
  81
  82 accessing ease of relation building xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  83
  84 1. rank ease of relation building by whether the relation is a field
  85 or a constant time countable function of the fields.
  86
  87
  88
  89 changing lattices xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  90
  91 1. examine how relations are built. for instance, if we can guarantee
  92    that duplicates shouldn't occur, and we might be able to if we use a
  93    hash just in case.
  94
  95 2. if a relation is only used for counting and we are guaranteed not
  96    to double count than just keep the count around (saves massive space)
  97
  98
  99
 100 plan of action xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 101
 102 1. determine rules for generating pre/post conditions and cost measures
 103    for rule building.
 104
 105 2. devise an algorithm to schedule these rule building and constraint
 106    checking mechanisms.
 107
 108 3. be warned about tight coupling of set building and consistency
 109    checking. it may be better to decouple them and set up pre/post
 110    conditions on the consistency checks (per quantifier)
 111
 112 4. test scheduler against file example and compare to hand coded
 113
 114
 115
 116 plan of action revised xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 117
 118 1. perform topological sort of all rules, TS[nodes]
 119
 120 2. mark sets that need to be built in linear space if
 121    a. it appears inside a guard expression or inclusion expr
 122    b. it is marked "build" in the spec (see redundant)
 123    c. if it appears as the range in a relation that is not
 124       statically type safe
 125
 126 3. mark relations that need to be built in linear space if
 127    a. they appear in constraints and are difficult to build
 128    b. a cycle has been found
 129
 130 4. detect singletons and replace with constants and wrap traversals with
 131    'if-else' checks. upon failure define sets that the quantifier
 132    appears in as empty and check as usual
 133
 134 5. for all sets that appear in constraints and relations that need to be
 135    built, sort them topologically by TS and the generate each one in order.
 136
 137 6. pile up set building on top of previous set traversals unless a set
 138    is marked 'build'. then build it and each traversal dependent on build
 139    should begin afresh.
 140
 141 7. at each interface point, assess what can be checked and check it. the
 142    idea is that since we are guaranteeing that everything that needs to
 143    traversed is traversed if we check whenever we can, what we can then
 144    we should end up doing a complete check.
 145
 146
 147
 148 hacks xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 149
 150 1. statically find singleton sets and optimize them out of quantifiers
 151
 152
 153
 154 issues xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 155
 156 1. domain type problems with relations during fast traversal
 157
 158    S(node*);
 159    R(S->S);
 160    [], true => root in S
 161    [forall s in S], s.next != null => <s, s.next> in R
 162    [forall s in S], s.next != null AND s.flag = 1 => s.next in S
 163
 164    the problem here is that if you have a simple linked list,
 165    root(1)->element(0) where the parantheses mark the value of .flag
 166    then the above system will add <root, element> to R but element is
 167    not in S because flag = 0. In brian's system the relations undergo a
 168    post process to determine if any elements exist don't exist in the
 169    domain/range of the relation.
 170
 171    the problem appears when the relation is being quickly traversed (via
 172    pattern matching). now it is possible for a relation to be returned
 173    that is not truly a match. the solution we have determined at this
 174    point is to assume that most specs won't have this problem (its
 175    arguably a problem in the spec!) and to basically do perform these
 176    two static tests:
 177
 178    if [forall s in S], s.next != null => <s, s.next> in R
 179       AND  s.next != null => s.next in S
 180
 181    then R is safe and no fast traversal should return non-elements,
 182    regardless of the state of the data structure. If it can not be
 183    determined whether a relation is type safe, then a linear space
 184    storage of S must be done so that the relations can be
 185    existence-checked against the stored set
 186
 187 2. redundancy in sets may result in slowdown
 188
 189    assume we have a set of file references (set R) which reference files
 190    (set F). These files can contain a large amount of block
 191    references (set B). Now assume that we have one file but thousands of
 192    references to that file. If there a test on the blocks of a file than
 193    there can be a slow down caused by a constant space traversal of the
 194    sets that is disproportiate to the gain seen.
 195
 196    clarification. lets assume that we test that blocks are marked as
 197    used. if sets R and F and B were generated in linear space than the
 198    number of checks done to test the constraint on B would be O(B).
 199
 200    now assume that we traverse R, F, and B in constant space. Because
 201    each reference points to the same file and we have no way of knowing
 202    in constant space whether we are rechecking each file over and over,
 203    we will check the constraints on B O(R*B) times. this is much worse
 204    and represents a jump in the time complexity.
 205
 206    static analysis of the specs will not be able to determine such a
 207    problem as a broken data structure is a perfectly viable input to the
 208    system. therefore it is necessary to have the program be given hints
 209    about the redundancy of structures. If a structure is highly
 210    redundant than that structure should be stored in linear space to
 211    fight the problem of quadratic blow up.
 212
 213    the hint could be in the form of a keyword addition to the space file
 214    language. another option is to allow the checker to dynamically
 215    detect redundancy (how? maybe hash table? maybe sparse linear time
 216    checks) and change the way its builds sets realtime. for simplicity,
 217    the spec hint will be used.
 218
 219 3. how do page accesses translate to incremental model checks?
 220
 221    TBD.
 222
 223