Using collection lexicon for tiered storage
28 February 2017 05:13 PM
Tiered Storage

MarkLogic Server allows you to manage your data at different tiers of storage and computation environments, with the top-most tier providing the fastest access to your most critical data and the lowest tier providing the slowest access to your least critical data.

MarkLogic Server tiered storage manages data in partitions. Each partition consists of a group of database forests that share the same name prefix and the same partition range. The range of a partition defines the scope of element or attribute values for the documents to be stored in the partition. This element or attribute is called the partition key. The partition key is based on a range index, collection lexicon, or field set on the database. The partition key is set on the database and the partition range is set on the partition, so there can be several partitions in a database with different ranges.

This article provides a simple, generic example of using a collection lexicon as the partition key.

Collection Lexicon with Tiered Storage

Consider a database 'test-db' with 4 forests that are grouped into 2 partitions. The following settings are required to set up this database for tiered storage. They can be configured on the Admin UI database configuration page (Admin UI -> Databases -> {database-name}):
- Set 'rebalancer enable' to true
- Set 'Locking' to strict
- Set 'Rebalancer Assignment Policy' to range
- Set 'Collection lexicon' to true

Under the assignment policy, choose 'Collection Lexicon' as the 'Range index type'. This sets the collection lexicon as the partition key.

Partitions are based on forest naming conventions. A forest's partition name prefix and the rest of the forest name are separated by a dash (-). For our example, consider the following forest names and the partitions they will be grouped into:

tier1-forest1
tier1-forest2
tier2-forest1
tier2-forest2

All forests with the same name prefix are grouped under one partition. In this case, the forests with the prefix tier1 form the first partition and the forests with the prefix tier2 form the second. Note that all of the forests in a database configured for tiered storage must be part of a partition.

The partition range determines which partition ingested data is placed in. All the forests in one partition have a common range. The range is defined on the forest configuration page (Admin UI -> Forests -> {forest-name} -> range).

For this example, since we are using the collection lexicon as the partition key, consider the following ranges for the two partitions:

tier1: lower bound - accounts, upper bound - files
tier2: lower bound - journals, upper bound - magazines

Alternatively, partitions can be created using the REST Management API or the XQuery/JavaScript APIs.

Once this is done, a document ingested with the collection "books", for example, is placed in one of the forests in the tier1 partition, because "books" sorts between the tier1 bounds "accounts" and "files".
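The grouping and range lookup described above can be modelled with a short Python sketch. This is only an illustration of the logic, not MarkLogic's rebalancer code or any MarkLogic API. The function names `group_forests_by_partition` and `find_partition` are hypothetical, collection names are assumed to compare in codepoint order, and the `lower_bound_included` flag is an assumption about how the bounds are treated; check your database's range assignment policy settings for the actual behaviour.

```python
from collections import defaultdict


def group_forests_by_partition(forest_names):
    """Group forests by the partition prefix that precedes the first dash."""
    partitions = defaultdict(list)
    for name in forest_names:
        prefix, dash, rest = name.partition("-")
        if not dash or not prefix or not rest:
            raise ValueError(f"forest name has no partition prefix: {name!r}")
        partitions[prefix].append(name)
    return dict(partitions)


def find_partition(collection, ranges, lower_bound_included=True):
    """Return the partition whose range contains the collection name, or None.

    Assumption: names compare in codepoint order (plain Python string
    comparison), and one bound is inclusive while the other is exclusive.
    """
    for partition, (lower, upper) in ranges.items():
        if lower_bound_included:
            in_range = lower <= collection < upper
        else:
            in_range = lower < collection <= upper
        if in_range:
            return partition
    return None


# Forest names and partition ranges from the example in this article.
FORESTS = ["tier1-forest1", "tier1-forest2", "tier2-forest1", "tier2-forest2"]
RANGES = {
    "tier1": ("accounts", "files"),
    "tier2": ("journals", "magazines"),
}

if __name__ == "__main__":
    print(group_forests_by_partition(FORESTS))
    print(find_partition("books", RANGES))  # tier1
    print(find_partition("games", RANGES))  # None: falls between the two ranges
```

In this model, a collection name such as "games" falls outside both ranges and matches no partition, which shows why the ranges should be planned to cover every collection you expect to ingest.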