Knowledgebase:
Rebalancing, replication and forest reordering
02 September 2020 10:07 PM

Context

This KB article describes how the Rebalancer interacts with database replication, and how to resolve the issues that can arise when the two are not configured consistently.

For a general discussion on how rebalancing works in MarkLogic, refer to this article and the server documentation.

Rebalancing and Database Replication

When database replication is configured for a database, rebalancing is disabled by default on the Replica database, and no rebalancing occurs until the database replication configuration is deleted. As long as the Master remains available, the forest-to-forest mapping stays in place.

Note that the Replica database must have at least as many forests as the Master database. Otherwise, not all of the data on the Master database will be replicated.

It is important that the assignment policy on the Replica is the same as on the Master, so that in a DR (disaster recovery) situation, when the Replica takes over as the primary, rebalancing is not triggered.
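One way to compare the policies is to run a short query on each cluster and inspect the results side by side. The following is a minimal sketch, not an official procedure: the database name "content-db-master" is the example name used later in this article, and it should be replaced with the actual database name on each cluster.

```xquery
xquery version "1.0-ml";

(: Returns the rebalancer assignment policy configured for a database.
   Run on the Master and on the Replica, then compare the two results.
   The database name is an example; substitute your own. :)

import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $dbid := admin:database-get-id($config, "content-db-master")

return admin:database-get-assignment-policy($config, $dbid)
```

If the two results differ, align the Replica's policy with the Master's before relying on the Replica for failover.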

Forest order mismatch can cause Rebalancing

Forest order is the order in which forests are attached to the database. When the document assignment policy is set to 'Segment', 'Legacy' or 'Bucket', the Replica database must have the same forest order as the Master, so that rebalancing does not occur if or when replication is deconfigured.

If there is a difference in forest orders between the Master and the Replica, a Warning level message is logged on the Replica, which looks like this:

2015-10-21 13:34:59.359 Warning: forest order mismatch: local forest Test_12 is at position 15 while foreign master forest 2108358988113530610 (cluster=8893136914265436826) is at position 12

In this state, when database replication between the clusters is deleted, the database on the Replica cluster starts to rebalance right away. How long this takes depends on how many documents need to be rebalanced.

Fixing the forest order:

On clusters with database replication enabled and with the Master and Replica databases in sync (document counts match and all primary forests on the Replica database are in the 'open replica' state), the following steps remove the mismatch and make the forest order the same on both Master and Replica:

i. Make sure that both Master and Replica databases have the same rebalancer assignment policy.

ii. On both clusters, disable the rebalancer and the reindexer for the database in question, if they are enabled.

iii. Obtain the forest order from the Master cluster - below is the query for an example database:

xquery version "1.0-ml";

(: Returns a list of forests in order for a given database :)

import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $dbid := admin:database-get-id($config, "content-db-master")

return admin:database-get-attached-forests($config,$dbid) ! xdmp:forest-name(.)

Example output for this query:

content-forest-2, content-forest-1, content-forest-3

iv. On the Replica cluster, reorder the forests according to the order returned on the Master from step iii:

xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";
let $config := admin:get-configuration()
let $dbid := admin:database-get-id($config, "content-db-replica")

(: Forest names in the order returned on the Master in step iii :)
let $forest-names-in-order := (
  "content-forest-2",
  "content-forest-1",
  "content-forest-3"
)

let $forest-ids := $forest-names-in-order ! xdmp:forest(.)
let $config := admin:database-reorder-forests($config, $dbid, $forest-ids)
return (
  'reordering to: ' || fn:string-join($forest-names-in-order, ', '),
  admin:save-configuration($config)
)

v. Re-enable rebalancer and reindexer on both clusters, if you had them enabled previously.

vi. Verify that the Warning messages no longer appear on the Replica cluster. These messages are logged once every hour, so allow at least an hour before concluding that the mismatch is resolved.
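Steps ii and v can also be scripted through the Admin API instead of the Admin UI. The following is a minimal sketch under stated assumptions: the database name "content-db-replica" is the example name from step iv, and the $enable variable is a hypothetical switch introduced here for illustration. The query reports the previous settings so that they can be restored in step v, and it must be run on each cluster against that cluster's database.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";

(: Hypothetical switch: fn:false() for step ii (disable),
   fn:true() for step v (re-enable) :)
let $enable := fn:false()

let $config := admin:get-configuration()
(: Example database name; substitute your own :)
let $dbid := admin:database-get-id($config, "content-db-replica")

(: Capture the current settings so they can be restored later :)
let $was-rebalancing := admin:database-get-rebalancer-enable($config, $dbid)
let $was-reindexing := admin:database-get-reindexer-enable($config, $dbid)

let $config := admin:database-set-rebalancer-enable($config, $dbid, $enable)
let $config := admin:database-set-reindexer-enable($config, $dbid, $enable)
return (
  'rebalancer was enabled: ' || $was-rebalancing,
  'reindexer was enabled: ' || $was-reindexing,
  admin:save-configuration($config)
)
```

In step v, re-enable only the settings that were reported as enabled before step ii, rather than switching both on unconditionally.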

Further Reading:

Database Rebalancing

Understanding what work the rebalancer will do

Using the rebalancer to move the content in one forest to another location

Checking database replication status
