Replication is widely used in information systems to achieve goals such as better performance, fault tolerance, and backup. Four classes of replication technology are available. Which one to use depends on the logical representation of the data you are trying to replicate.

1、Storage-level replication

At the storage level, replication operates on blocks of binary data. It may be done either on block devices or at the file-system level. In both cases, replication deals with unstructured binary data. The range of technologies for storage-level replication is very broad, from commodity RAID arrays to network file systems and NAS.
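
As a rough illustration of replication that treats data as opaque blocks, the sketch below mirrors every block write to several in-memory "devices". The class name `MirroredDevice`, the 512-byte `BLOCK_SIZE`, and the use of `io.BytesIO` as a stand-in for a block device are assumptions made for this example; real RAID or NAS products work quite differently in detail.

```python
import io

BLOCK_SIZE = 512  # assumed block size for this sketch


class MirroredDevice:
    """Hypothetical mirror: every block write goes to all underlying devices."""

    def __init__(self, *devices):
        self.devices = devices

    def write_block(self, index, data):
        # The replication layer never interprets the bytes; it only copies them.
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        for dev in self.devices:
            dev.seek(index * BLOCK_SIZE)
            dev.write(data)

    def read_block(self, index, replica=0):
        # Any replica can serve the read because all hold the same bytes.
        dev = self.devices[replica]
        dev.seek(index * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


if __name__ == "__main__":
    first, second = io.BytesIO(), io.BytesIO()
    mirror = MirroredDevice(first, second)
    mirror.write_block(2, bytes([0xAB]) * BLOCK_SIZE)
    print(first.getvalue() == second.getvalue())
```

Note that nothing in the mirror knows whether a block belongs to a file, a table, or free space, which is exactly what "unstructured binary data" means here.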

2、Database-level replication

Applications usually do not use file storage directly; they use databases instead. Databases deal with structured data, and database-level replication technologies also operate on structured data (e.g., records for an RDBMS, objects for an object database). There are two major techniques for database replication.

Record-based replication

Every database works with some minimal identifiable unit of data (e.g., for an RDBMS, it is a record in a table). In record-based replication, each time a record is modified, a logical change record (LCR) event is delivered to the replication partner to keep it in sync.
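
The idea can be sketched in a few lines. This is a toy model under stated assumptions, not the format any real DBMS uses: the `LogicalChangeRecord` fields, the `Replica` and `Primary` classes, and the synchronous in-process delivery are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class LogicalChangeRecord:
    """One row-level change (hypothetical field layout)."""
    table: str
    op: str                      # "insert", "update" or "delete"
    key: Any                     # primary key of the affected record
    row: Optional[dict] = None   # new record image; None for a delete


class Replica:
    def __init__(self):
        self.tables = {}  # table name -> {primary key -> record}

    def apply(self, lcr):
        rows = self.tables.setdefault(lcr.table, {})
        if lcr.op == "delete":
            rows.pop(lcr.key, None)
        else:
            rows[lcr.key] = dict(lcr.row)


class Primary(Replica):
    def __init__(self, partners):
        super().__init__()
        self.partners = partners

    def write(self, lcr):
        # Apply locally, then deliver one LCR per modified record to each partner.
        self.apply(lcr)
        for partner in self.partners:
            partner.apply(lcr)


if __name__ == "__main__":
    replica = Replica()
    primary = Primary([replica])
    primary.write(LogicalChangeRecord("users", "insert", 1, {"name": "Ann"}))
    primary.write(LogicalChangeRecord("users", "delete", 1))
    print(replica.tables == primary.tables)
```

Each LCR carries the resulting record image, so the replica never has to re-run the logic that produced the change.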

Statement-based replication

Sometimes a single statement may modify a large number of records. In this case, it would be more efficient to deliver the statement itself to the replication partner rather than copy all modified records. This is a statement-based approach.
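
A minimal sketch of the trade-off, using two in-memory SQLite databases as stand-ins for the replication partners. The `accounts` table and the helpers `make_db` and `replicate_statement` are hypothetical names chosen for this example; real products ship statements through a replication log rather than a direct function call.

```python
import sqlite3


def make_db(n_rows):
    """Create an in-memory database with n_rows identical accounts."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    db.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [(i, 100) for i in range(n_rows)]
    )
    db.commit()
    return db


def replicate_statement(primary, replica, sql, params=()):
    """Run the statement on the primary, then ship the statement text itself.

    Returns (records_modified, messages_shipped).
    """
    cursor = primary.execute(sql, params)
    primary.commit()
    replica.execute(sql, params)  # one message, however many records changed
    replica.commit()
    return cursor.rowcount, 1


if __name__ == "__main__":
    primary, replica = make_db(1000), make_db(1000)
    modified, shipped = replicate_statement(
        primary, replica, "UPDATE accounts SET balance = balance + ?", (10,)
    )
    print(modified, shipped)
```

Here a record-based approach would have to deliver one change record per modified row, while the statement-based approach delivers a single message. The sketch only works because the statement is deterministic; a statement that depends on the current time or on random values could produce different results on each partner.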

All mainstream DBMS offerings have replication features built in. Sometimes you can even choose the type of replication (e.g., record-based or statement-based) in your database configuration. MySQL is one product that allows this.

3、Message-centric replication

Often what we really want to do is not replicate data, but replicate the application state. Much as statement-based replication in databases ships statements rather than modified records, you can replicate a stream of business events, which change the application state, instead of replicating data changes. Processing the same events should lead to the same application state (or at least a business-equivalent state).

Message-centric replication has a few practical benefits. First, replication is usually done at the boundary of a business transaction, which may be different from the database transaction boundary. Second, replication is independent of the structure of data in the database. This may be very helpful when you are rolling out updates to the database schema, because different schemas can coexist in the replication topology.
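
The schema-independence point can be illustrated with a small sketch. Everything here is hypothetical: the `FundsTransferred` event, the two replica classes with deliberately different internal layouts, and the in-process `replicate` loop standing in for whatever transport carries the messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FundsTransferred:
    """A business event (hypothetical), not a row-level data change."""
    source: str
    target: str
    amount: int


class ReplicaV1:
    """Old schema: one current balance per account."""

    def __init__(self, opening):
        self.balances = dict(opening)

    def handle(self, event):
        self.balances[event.source] -= event.amount
        self.balances[event.target] += event.amount

    def balance(self, account):
        return self.balances[account]


class ReplicaV2:
    """New schema: opening balances plus an append-only ledger."""

    def __init__(self, opening):
        self.opening = dict(opening)
        self.ledger = []  # list of (account, delta) entries

    def handle(self, event):
        self.ledger.append((event.source, -event.amount))
        self.ledger.append((event.target, event.amount))

    def balance(self, account):
        deltas = sum(delta for name, delta in self.ledger if name == account)
        return self.opening[account] + deltas


def replicate(events, partners):
    """Deliver every business event to every replication partner, in order."""
    for event in events:
        for partner in partners:
            partner.handle(event)


if __name__ == "__main__":
    opening = {"alice": 100, "bob": 50}
    old, new = ReplicaV1(opening), ReplicaV2(opening)
    replicate([FundsTransferred("alice", "bob", 30)], [old, new])
    print(old.balance("bob") == new.balance("bob"))
```

The two replicas store entirely different structures, yet they agree on every balance: a business-equivalent state reached from the same event stream.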

Message-centric replication is tightly coupled with application logic, so there is no generic solution for this kind of replication. Message queue middleware, however, is widely used to transfer messages between replication partners.

4、Application-state replication (grid technologies)

Converting data from its in-memory representation to a database is expensive, and it may become a major bottleneck in certain cases. Application-state replication addresses this problem by replicating in-memory data structures between application processes (and servers). There are two main subclasses of such technologies.

In-memory data grid

In-memory data grids offer a storage service that behaves like shared memory: all grid participants have a coherent view of this storage and may access it equally. The grid offers an API that applications use to manipulate the shared storage, which holds application-level objects in serialized form. Under the hood, grid middleware is responsible for replicating and distributing storage content across servers and processes.
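
To make the shape of such an API concrete, here is a toy sketch. It is not the API of any real grid product: `ToyNode` and `ToyGrid` are invented names, all nodes live in a single process, and every node holds a full copy of the data, whereas real grids typically partition data and replicate it over the network.

```python
import pickle


class ToyNode:
    """One grid participant holding a full copy of the shared storage."""

    def __init__(self, grid):
        self._grid = grid
        self._store = {}  # key -> serialized object (bytes)

    def put(self, key, obj):
        # Objects are kept in serialized form and copied to every participant.
        blob = pickle.dumps(obj)
        for node in self._grid.nodes:
            node._store[key] = blob

    def get(self, key):
        # Reads are served locally and return a fresh deserialized copy.
        return pickle.loads(self._store[key])


class ToyGrid:
    """Stand-in for the grid middleware that tracks participants."""

    def __init__(self):
        self.nodes = []

    def add_node(self):
        node = ToyNode(self)
        if self.nodes:
            # Bring the newcomer up to date from an existing participant.
            node._store.update(self.nodes[0]._store)
        self.nodes.append(node)
        return node


if __name__ == "__main__":
    grid = ToyGrid()
    a, b = grid.add_node(), grid.add_node()
    a.put("session:42", {"user": "ann", "cart": [1, 2]})
    print(b.get("session:42"))
```

Application code only calls `put` and `get`; how the content reaches the other participants is entirely the middleware's concern.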

Transparent application clustering

This approach tries to hide as much detail about replication as possible and to minimize changes to application code. Typically for this class of technologies, processes are separated into worker processes (running application code) and management processes. Management processes do not run any application code; they coordinate replication of state between instances of worker processes. In the simplest cases, the two roles can be combined into one process. Examples of products using this paradigm are Terracotta, which implements clustering for the Java runtime, and GemStone, a clustered runtime for Smalltalk (with upcoming support for Ruby).