Modification to the current specifications.

I. Order

Sequential order (SO): Some total order that is consistent with the union of the happens-before and SC relations.

II. State

1. Global state: We allow users to specify a single global state so that when we execute the sequential replay (i.e., execute the method calls in the sequential order), the whole process resembles executing a sequential program. Such a global state is similar to the internal state of a sequential data structure. We also had this in our old version (the PLDI'16 rejection). As an analogy to the cache-memory model, the global state defined here is similar to main memory: there is no real total order over all memory accesses, but to some degree (with the non-deterministic specifications) we can maintain the illusion that such a total order exists.

2. Local state: Besides the single global state (the one we had in the old approach), we also allow each method call to have a local state. This local state is the accumulated side effects of the subset of method calls that happen before the current method call. As an analogy to the cache-memory model, the local state defined here is similar to a cache: a state is local in the sense that the current method call must have seen those effects. The main goal of having local states is to support tighter non-deterministic specifications. To evaluate the local state of a method call, an obvious approach is to execute, from the initial state, the subset of methods that happen before the current method in the sequential order. One optimization is to start evaluating the state from the most recent dividing node, i.e., a node such that every other node in that subset is either hb-before or hb-after it. Also, since local states are not always required by specifications, a local state is only evaluated when needed.

III.
Specifications

Our specification language supports the following primitives for accessing both the global state and local states, so that users can write specifications with different levels of tightness. To support tighter specifications, we introduce the concept of the concurrent set of method calls: a specific method call can see the effects of two categories of method calls --- those that happen before it, and those that execute concurrently with it. It is worth noting that when two method calls execute concurrently, there are in general two possible outcomes: 1) the concurrent methods can happen in either order and the final result remains the same --- a concurrent FIFO is an example, in which concurrent enqueue and dequeue methods can happen in an arbitrary order; and 2) the order of the concurrent methods affects the final result --- C/C++11 atomics are an example, in which concurrent stores to the same location executing in different orders leave different final values.

1. CONCURRENT: This primitive extracts all the methods that execute "concurrently" with the current method --- neither hb/SC before nor after the current method --- and returns them as a set. Note that concurrent method calls cannot be used for calculating the local state, only in assertions.

2. PREV: This primitive extracts all the methods that execute right before the current method in the execution graph --- the most recent method calls that are hb/SC before the current method call --- and returns them as a set. For each method in this set, the current method's specification can access its local state.

3. NEXT: This primitive extracts all the methods that execute right after the current method in the execution graph, and returns them as a set. For each method in this set, the current method's specification cannot access its local state.

4.
LOCAL: This primitive allows users to access the local state of a method call. Note that in order to calculate the local state of a method call, one can only access the set of method calls that happen before it. Our specifications allow two ways of calculating local states: a default way and a user-customized way. The default approach is to execute the set of method calls that happen before the current method call in the sequential order; the user-customized approach lets users calculate the local state by using the PREV primitive to access the local states of previous method calls.

5. COPY: This is a function that users provide to deep-copy the state. We require users to provide such a primitive because each local state should have its own copy.

6. FINALLY: This is a function that allows users to specify a final check on the state. Initially, this check will only access the global state. However, for expressiveness, we also allow users to access the initial state, meaning that users can effectively access the whole graph and the local state of each method call. This enables users to use the graph model directly (so relaxed atomics can be specified), although doing so can get quite complex.

IV. Examples

// Global specification
@DeclareState: // Declare the state structure
@InitState: // How we initialize the state
@CopyState: // A function on how to copy an existing state
@Commutativity: Method1 <-> Method2 (Guard) // Guard can only access the return
    // value and the arguments of the two method calls

// Interface specification
@Interface: InterfaceName // Required; a label to represent the interface
@LocalState: // Optional; to calculate the accumulative local state before
    // this method call in a customized fashion.
    // If not specified here, the local state defaults to the result of
    // executing the subset of method calls that happen before this call
    // in the sequential order
@PreCondition: // Optional; checking code
@LocalSideEffect: // Optional; to calculate, in a customized fashion, the side
    // effect this method call has on the local state. If this field is
    // omitted, it means we don't care about it.
@SideEffect: // Optional; to calculate the side effect on the global state.
    // When the "@LocalSideEffect" specification is omitted, we also impose
    // the same side effect on the set of method calls that happen before
    // this method call in the sequential order.
@PostCondition: // Optional; checking code

// Ordering point specification
@OPDefine: condition // If the specified condition is satisfied, the atomic
    // operation right before is an ordering point
@PotentialOP(Label): condition // If the specified condition is satisfied, the
    // atomic operation right before is a potential ordering point, and we
    // label it with a tag
@OPCheck(Label): condition // If the specified condition is satisfied, the
    // potential ordering point defined earlier with the same tag becomes an
    // ordering point
@OPClear: condition // If the specified condition is satisfied, all ordering
    // points and potential ordering points are cleared
@OPClearDefine: condition // If the specified condition is satisfied, all
    // ordering points and potential ordering points are cleared, and the
    // atomic operation right before becomes an ordering point. This is
    // syntactic sugar for the combination of an "OPClear" and an "OPDefine"
    // statement

1. The register examples: Basically, we can think of registers as the cache in a memory system.
The effect of a load or store tells us the current value in the cache line, and a load can read from values that can potentially be in the cache --- either one of the concurrent stores updates the cache, or the load inherits one of the previous local states in the execution graph.

---------- Interface ----------
void init(atomic_int &loc, int initial);
int load(atomic_int &loc);
void store(atomic_int &loc, int val);
---------- Interface ----------

a. The SC atomics --- the classic linearizability approach

b. The RA (release/acquire) C/C++ atomics

// For RA atomics, a load must read its value from a store that happens before
// it.
---------- Specification ----------
@DeclareVar: int x;
@InitVar: x = 0;

@Interface: Store
@SideEffect: LOCAL(x) = val;
void store(int *loc, int val);

@Interface: Load
@PreCondition:
    Subset(PREV, subset, LOCAL(x) == RET);
    return Size(subset) > 0;
@SideEffect: LOCAL(x) = RET;
int load(int *loc);

c. The C/C++ atomics (a slightly loose specification)

// Actually, any concurrent data structure that relies on modification order to
// be correct would not have a precisely tight specification under our model,
// and C/C++ relaxed atomics are an example. See the following read-read
// coherence example.

// T1          // T2
x = 1;         x = 2;

// T3
r1 = x; // r1 == 1
r2 = x; // r2 == 2
r3 = x; // r3 == 1

Our model cannot prevent such a case from happening. However, we can still have a slightly loose specification which states that a load can read from any store that either immediately happens before it or executes concurrently with it.
// We define a struct called MethodCall to represent the data we collect
// and communicate between the real execution and the checking process
typedef struct MethodCall {
    string interfaceName; // The interface label name
    void *value;          // Points to the struct that holds the return value
                          // and the arguments
    void *localState;     // Points to the struct that represents the (local)
                          // state
    vector *prev;         // Method calls that are hb right before me
    vector *next;         // Method calls that are hb right after me
    vector *concurrent;   // Method calls that are concurrent with me
} MethodCall;

We automatically generate two types of structs. One is the state struct, which we call StateStruct; it is shared by the global state and all local states. The other is a per-interface struct that wraps the return value (RET) and the arguments as its fields; we name each such struct after its interface label name.

// Some C/C++ macro definitions to make specifications easier to write

// 1. ForEach --> to iterate over all method calls in a set
// (written as a C++11 range-based for; the original hand-rolled iterator
// version was not valid C++)
#define ForEach(item, list) for (MethodCall *item : *(list))

*********************************************
TODO: We want a subset operation that filters by equality/inequality of the
interface name, by a specific (or combined) state, and by a specific (or
combined) value (RET & arguments)

// 1.1 Subset(set, name) --> get the subset of method calls with a given name
inline vector* Subset(vector *set, string name) {
    vector *_subset = new vector;
    for (int i = 0; i < set->size(); i++) {
        MethodCall *_m = (*set)[i];
        if (_m->interfaceName == name)
            _subset->push_back(_m);
    }
    return _subset;
}

// 1.2 Size(set) --> the size of a method set
#define Size(set) set->size()

// 1.3 Belong(set, method) --> whether method belongs to set
#define Belong(set, method) \
    (std::find(set->begin(), set->end(), method) != set->end())

// 1.4 Intersect(set1, set2) --> the intersection of two method sets
inline MethodSet Intersect(MethodSet set1, MethodSet set2) {
    MethodSet res = NewSet;
    ForEach (m, set1) {
        if (Belong(set2, m))
            res->push_back(m);
    }
    return res;
}

// 1.5 Union(set1, set2) --> the union of two method sets
inline MethodSet Union(MethodSet set1, MethodSet set2) {
    MethodSet res = NewSet(set1);
    ForEach (m, set2) {
        if (!Belong(set1, m))
            res->push_back(m);
    }
    return res;
}

// 1.6 Insert(set, method) --> add a method to the set
inline bool Insert(MethodSet set, Method m) {
    if (Belong(set, m))
        return false;
    set->push_back(m);
    return true;
}

// 1.7 Subtract(set1, set2) --> subtract set2 from set1
inline MethodSet Subtract(MethodSet set1, MethodSet set2) {
    MethodSet res = NewSet;
    ForEach (m, set1) {
        if (!Belong(set2, m))
            res->push_back(m);
    }
    return res;
}

// 1.8 MakeSet(count, ...) --> make a set from the arguments

// 2. Local(method, field)
#define Local(method, field) ((StateStruct*) method->localState)->field

// 3. Value(method, type, field)
#define Value(method, type, field) ((type*) method->value)->field

// 4. Name(method)
#define Name(method) method->interfaceName

// 5. Prev(method)
#define Prev(method) method->prev

// 6. Next(method)
#define Next(method) method->next

// 7. Concurrent(method)
#define Concurrent(method) method->concurrent

---------- Specification ----------
@DeclareVar: int x;
@InitVar: x = 0;

@Interface: Store
@SideEffect: LOCAL(x) = val;
void store(int *loc, int val);

@Interface: Load
@PreCondition:
    // Auto-generated code
    // MethodCall *ME = ((SomeTypeConversion) info)->method;
    Subset(PREV, readSet1, LOCAL(x) == RET);
    if (Size(readSet1) > 0) return true;
    Subset(CONCURRENT, readSet2, NAME == "Store" && VALUE(Store, val) == RET);
    return Size(readSet2) > 0;
@SideEffect: LOCAL(x) = RET;
int load(int *loc);

d. The C/C++ normal memory accesses --- use the admissibility requirement, then the classic linearizability approach on the admissible executions

2. The FIFO queue example.

---------- Specification ----------
// A FIFO queue should maintain the following properties:
// 1. The enq() methods should conflict
// 2. The deq() methods that succeed should conflict
// 3. Corresponding enq() and deq() methods should conflict
// 4. An enqueued item can be dequeued at most once
// 5. A dequeued item must have a corresponding enqueued item
// 6.
// When a queue is NOT "empty" (users can tightly or loosely define
// emptiness) and a deq() method arrives, the deq() method should succeed

@DeclareVar: vector *q;
@InitVar: q = new vector;
@Copy: New.q = new vector(Old.q);

// Fails to dequeue
@Commutativity: Deq <-> Deq (!_M1.RET || !_M2.RET)
// The dequeuer doesn't dequeue from that enqueuer
@Commutativity: Enq <-> Deq (!_M2.RET || (_M2.RET && Enq.val != *Deq.res))

@Interface: Enq
@SideEffect: q->push_back(val);
void enq(queue_t *q, int val);

@Interface: Deq
@PreCondition:
    // Check whether the queue is really empty. Either the local state is an
    // empty queue, or for every remaining element in the local queue there is
    // some concurrent dequeuer to dequeue it
    if (!RET) {
        // Local state is empty
        if (Local(q)->size() == 0) return true;
        // Otherwise check that there are other concurrent dequeuers
        ForEach (item, Local(q)) {
            // Check that some concurrent dequeuer dequeues this item
            Subset(CONCURRENT, concurSet, NAME == "Deq" && VALUE(Deq, RET)
                && *VALUE(Deq, res) == item);
            if (Size(concurSet) == 0) return false;
        }
        return true;
    } else {
        // Check the global queue state; FIFO dequeues from the front
        return q->front() == *res;
    }
@SideEffect: if (RET) q->erase(q->begin()); // FIFO: remove the front element
bool deq(queue_t *q, int *res);

A good example to simulate a queue data structure is as follows. Suppose we have a special structure

typedef struct Q {
    atomic_int x;
    atomic_int y;
} Q;

and we have two interfaces on Q, read() and write(), where the write and read method calls are synchronized by themselves, and they have to read and write the x and y fields in turn.

----------------------------------------------------------------------------------------

We also need to think about the non-ordered queue.

#### Combining Data Structures

For example, a queue, with a -hb-> c and b -hb-> e.
// T1
enq(1) -> {1} - {$}    // a
enq(2) -> {1, 2} - {$} // b

// T2
deq(1) -> {$} - {1}    // c
deq($) -> {$} - {1}    // d

// State before this method:
// JOIN({1, 2} - {$}, {$} - {1}) = {2} - {1}
deq(2) -> {$} - {1, 2}