Distributed real-time systems are increasing in size and complexity, and the nodes in such systems are becoming difficult to implement and test. In particular, communication for synchronizing shared information among groups of nodes becomes complex to manage. Several authors have proposed using a distributed database as a communication subsystem, to relieve applications of explicit communication. Information dissemination is then carried out by the replication mechanisms of the database. As systems grow larger, however, the scalability of such a database approach must be managed. Furthermore, timeliness for database clients requires predictable resource usage, and scalability requires bounded resource usage in the database system. Thus, predictable resource management is essential for realizing timeliness in a large-scale setting.
We discuss scalability problems and methods for distributed real-time databases in the context of the DeeDS database prototype. Here, all transactions can be executed in a timely manner at the local node, owing to main-memory residence, full replication, and detached replication of updates. Full replication contributes to timeliness and availability, but at a high cost in bandwidth, storage, and processing, since all updates are sent to all nodes regardless of whether they will be used there. In particular, unbounded resource usage is an obstacle to building large-scale distributed databases. For many application scenarios, it can be assumed that most of the database is shared by only a limited number of nodes. Under this assumption, it is reasonable to believe that the degree of replication can be bounded, so that a bound can also be set on resource usage.
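The difference between full and bounded replication can be illustrated with a rough count of update messages. The sketch below is only an illustration under simplifying assumptions that are not taken from DeeDS: each update is propagated as one message per remote replica, with no batching or multicast, and every object has the same number of replicas. The function names and the parameter `degree` are hypothetical.

```python
def update_messages(num_updates: int, replicas_per_object: int) -> int:
    """Messages needed to propagate updates to all remote replicas.

    Assumes one message per update per remote replica (no batching).
    """
    if replicas_per_object < 1:
        raise ValueError("an object needs at least one replica")
    return num_updates * (replicas_per_object - 1)


def full_replication_cost(num_nodes: int, num_updates: int) -> int:
    # Full replication: every node holds every object, so the cost of
    # each update grows with the total number of nodes.
    return update_messages(num_updates, num_nodes)


def bounded_replication_cost(num_nodes: int, num_updates: int, degree: int) -> int:
    # Bounded replication: each object lives on at most `degree` nodes,
    # so the cost of each update is independent of system size.
    return update_messages(num_updates, min(degree, num_nodes))
```

Under these assumptions, the cost per update with full replication grows linearly with the number of nodes, whereas with a bounded degree of replication it stays constant as nodes are added.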
This thesis proposal identifies and elaborates research problems for bounding resource usage in large-scale distributed real-time databases. One objective is to bound resource usage by taking advantage of pre-specified data needs, and also by detecting unspecified data needs and adapting resource management accordingly. We elaborate and evaluate the concept of virtual full replication, which presents an image of a fully replicated database to database clients. It makes data objects available where needed, while fulfilling timeliness and consistency requirements on the data.
In the first part of our work, virtual full replication makes data available where needed by taking advantage of pre-specified data accesses to the distributed database. For hard real-time systems, the required data accesses are usually known, since such systems need to be well specified to guarantee timeliness. However, there are many applications where data accesses cannot be specified before execution. The second part of our work extends virtual full replication for use with such applications. By detecting new and changed data accesses during execution and adapting database replication accordingly, virtual full replication can continuously provide the image of full replication while preserving scalability.
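The two parts described above can be sketched as a small replica-placement model. This is a minimal illustration, not the design proposed in the thesis: the class `VirtualReplicationSketch`, its methods, and the node and object names are hypothetical, and the sketch ignores the timeliness and consistency requirements that a real system must meet when a replica is established.

```python
class VirtualReplicationSketch:
    """Toy model: replicate each object only at the nodes that use it."""

    def __init__(self, declared_accesses: dict[str, set[str]]):
        # Static part: place replicas according to pre-specified accesses.
        # replicas[obj] is the set of nodes holding a replica of obj.
        self.replicas: dict[str, set[str]] = {}
        for node, objects in declared_accesses.items():
            for obj in objects:
                self.replicas.setdefault(obj, set()).add(node)

    def access(self, node: str, obj: str) -> bool:
        """Return True if obj was already local to node.

        Dynamic part: an unspecified access is detected and a replica
        is added at the node, so later accesses are local.
        """
        holders = self.replicas.setdefault(obj, set())
        if node in holders:
            return True
        holders.add(node)
        return False

    def update_targets(self, writer: str, obj: str) -> set[str]:
        # Updates are sent only to the other nodes that hold a replica.
        return self.replicas.get(obj, set()) - {writer}
```

For example, with `VirtualReplicationSketch({"n1": {"a", "b"}, "n2": {"b"}})`, an update to `b` at `n1` is sent only to `n2`. A first access to `a` from an undeclared node `n3` is detected as new, after which `n3` also receives updates to `a`.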
One of the objectives of the thesis work is to quantify scalability in the database context, so that actual benefits and achievements can be evaluated. Further, we identify the conditions for setting bounds on resource usage for scalability, under both static and dynamic data requirements.