tengiz wrote: vc, I'm afraid I'm starting to lose the thread of the discussion. Let's try to pin things down and state clearly what we are discussing.
We are discussing the fact (I believe) that an S2PL scheduler by itself ensures serializable histories for static databases. However, there are at least two restrictions:
1. The database should be static. Maybe 'should be' is too strong a statement; the traditional serializability theory simply says nothing about inserts. It just talks about reads and writes of the existing data. However, as soon as we try to reason about inserts (writing new data), S2PL won't ensure serializability any more (phantoms).
2. If we want to use another more powerful language, like SQL, to describe our transactions, we have to preserve the original simplistic serializability theory language (SL) semantics since otherwise serializability theory is not applicable any more and any discussion becomes meaningless.
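To make the phantom point in item 1 concrete, here is a minimal Python sketch. It is a toy model, not how any real scheduler is built: `ToyTable`, `locked_read`, `write`, `insert` and `demo` are all names I made up for illustration, and "locks" are just a set of row ids held by the reading transaction.

```python
class ToyTable:
    """Hypothetical toy table: row id -> value of col, with item-level read locks."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.read_locked = set()  # ids of rows read-locked by the reader (T1)

    def locked_read(self, predicate):
        """T1 reads the rows matching the predicate and read-locks exactly those rows."""
        matched = {rid for rid, col in self.rows.items() if predicate(col)}
        self.read_locked |= matched
        return matched

    def write(self, rid, col):
        """T2 updates an existing row; refused if T1 holds a read lock on it."""
        if rid in self.read_locked:
            raise RuntimeError(f"row {rid} is read-locked")
        self.rows[rid] = col

    def insert(self, rid, col):
        """T2 inserts a new row; no lock on existing rows can cover it."""
        if rid in self.rows:
            raise KeyError(f"row {rid} already exists")
        self.rows[rid] = col


def demo():
    table = ToyTable({1: -5, 2: -1, 3: 7})
    negative = lambda col: col < 0

    first = table.locked_read(negative)   # T1: first read of col < 0
    try:
        table.write(1, 10)                # T2: update of a locked row
        update_blocked = False
    except RuntimeError:
        update_blocked = True
    table.insert(4, -3)                   # T2: insert is not stopped by item locks
    second = table.locked_read(negative)  # T1: same read now sees a phantom
    return first, second, update_blocked
```

Calling `demo()` shows the update of an already-read row being refused while the insert goes through, so the second read of the same predicate returns an extra row.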
SL's vocabulary contains {read, write, precedence, conflict}.
A translation example:
if (select count (*) from t where col < 0) between 5 and 10 then do some stuff...
We cannot offer a translation at this stage because we do not know what 'do some stuff' is (I'll elaborate on why we need to know that later). Let's assume it's 'update t where col > 0'.
Then the SQL can be translated as:
if count(read(Xi|col >= 0)) between ...; /* we have to invert the predicate in order to create a *conflict* with concurrent writes */; write(Xi|col>0);
We need to know what 'do some stuff' is because we have to determine whether the original statement can be re-stated in such a way as to create conflicts, a crucial component of S2PL. Generally speaking, we can ensure that conflicts are in place when the subsets on which read/write operate form a partial order with respect to the 'subset of' operation. In other words, only SQL statements in which all the predicates define (or can be re-stated to define) such subsets can be analyzed in terms of serializability theory.
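For reference, the SL notions of conflict and precedence can be sketched in a few lines of Python. This is only an illustration under my own assumptions: a history is a list of `(transaction, kind, item)` tuples with `kind` being `"r"` or `"w"`, and the function names are hypothetical. The test is the textbook one: build the precedence graph from conflicting pairs and check that it has no cycle.

```python
from itertools import combinations


def conflicts(a, b):
    """Two operations conflict if they belong to different transactions,
    touch the same item, and at least one of them is a write."""
    (txn_a, kind_a, item_a), (txn_b, kind_b, item_b) = a, b
    return txn_a != txn_b and item_a == item_b and "w" in (kind_a, kind_b)


def precedence_edges(history):
    """Edge (Ti, Tj) whenever an operation of Ti precedes a conflicting one of Tj."""
    edges = set()
    for i, j in combinations(range(len(history)), 2):  # always i < j
        if conflicts(history[i], history[j]):
            edges.add((history[i][0], history[j][0]))
    return edges


def is_conflict_serializable(history):
    """True if the precedence graph is acyclic (checked by peeling off source nodes)."""
    edges = precedence_edges(history)
    remaining = {op[0] for op in history}
    while remaining:
        sources = {
            node for node in remaining
            if not any(dst == node and src in remaining for src, dst in edges)
        }
        if not sources:  # every remaining node has an incoming edge: a cycle
            return False
        remaining -= sources
    return True
```

For example, the history r1(x) w2(x) w1(x) yields edges T1 -> T2 and T2 -> T1 and is rejected, while r1(x) w1(x) r2(x) w2(x) yields only T1 -> T2 and is accepted.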
tengiz wrote: First, transaction processing and relational theory are orthogonal and completely independent things. I use SQL in the examples only because it is the most popular way of talking to modern transaction processing systems.
I've been talking about transactions all along; not a word was said about relational theory. When I mentioned sets, I was trying to describe how to re-formulate a SQL statement so that it could be analyzed using serializability theory.
tengiz wrote: Second, the essence of the promises that transaction processing systems make is very simple: if programs that manipulate data (in any way, written in any language) work correctly when run as a single instance, then the transaction processing system guarantees that any concurrently running combination of them will work correctly too. It follows immediately that there are practically no artificial restrictions on what these programs can and cannot do.
Informally, yes, you are right. Unfortunately, S2PL theory, the cornerstone of all the locking schedulers, limits the set of serializable transactions to conflict-serializable ones only, with the additional restriction that the database had better be static.
tengiz wrote: But the formal theory has no such restrictions.
If by formal theory you mean S2PL, see above regarding restrictions.
tengiz wrote: A theoretical paper describing the phantom problem and its solution by means of predicate locks was published as far back as 1976.
Well, yes, Eswaran and others did indeed publish their work in 1976. However, whilst theoretically interesting, a generic predicate locking implementation has at least two problems: detecting conflicts is very expensive, and concurrency is not great. What we have today are partial implementations, either via entire-table locking or key-range locking (if you are lucky enough to have an appropriate index and the optimizer decides to use it).
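To illustrate the key-range idea, here is a deliberately simplified Python sketch. It assumes a single indexed numeric column and half-open ranges `[low, high)`; `RangeLockTable`, `lock_range` and `can_insert` are hypothetical names, and real engines lock the gaps between existing index keys rather than arbitrary intervals.

```python
class RangeLockTable:
    """Hypothetical key-range lock table over one indexed numeric column."""

    def __init__(self):
        self.locks = []  # list of (txn, low, high), each a half-open range [low, high)

    def lock_range(self, txn, low, high):
        """Record that txn has read (and so locked) every key in [low, high)."""
        self.locks.append((txn, low, high))

    def can_insert(self, txn, key):
        """An insert is allowed unless another transaction's locked range covers the key."""
        return not any(
            owner != txn and low <= key < high
            for owner, low, high in self.locks
        )


locks = RangeLockTable()
# T1 ran "select count(*) from t where col < 0", so it locks the range (-inf, 0).
locks.lock_range("T1", float("-inf"), 0)
```

With that lock in place, `locks.can_insert("T2", -3)` is refused (it would be a phantom for T1), while `locks.can_insert("T2", 5)` is allowed. The conflict check here is cheap only because the predicate is a simple range on one column, which is exactly the case where an index can help.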
Besides, predicate locking is an addition to S2PL, so we cannot say there is some elegant unified serializability theory. Instead, we have a rather eclectic architecture created to ensure decent concurrency in lower isolation modes through S2PL, plus serializability via table/index locking under the Serializable isolation level.