Using the rebalancer to move the content in one forest to another location
13 September 2017 10:18 AM
In this Knowledgebase article, we will show you how the rebalancer can be used to migrate data from one filesystem mount to a separate mount.
This scenario could occur, for example, if you created some forests and initially did not specify a data directory and later on, a new volume was added.
Understanding Forest configuration
It's worth noting that you can't modify a forest's data directory location after the forest has been created. If a forest is created and you later realise that the data directory path was incorrect, the fastest way to remedy the issue is to delete the current configuration and create a new forest with the correct (and updated) configuration.
Scenario: migrating all forest content from one location to a new location
In this scenario, we are going to use MarkLogic's rebalancer to migrate the data held in two forests onto two other forests. We are working on the premise that there is a database (called MyDatabase) which contains two forests; both of these forests use the default data directory.
We want to migrate all the data into two new forests on a different mount point. This example uses a different directory rather than a separate mount, but the principle remains the same.
We will run through this scenario step-by-step to show you how to migrate data from one location to another.
1. Identify the database that contains the data that you want to move
In this situation, we're using an example database called MyDatabase.
2. Ensure that the rebalancer is enabled for this database
In the admin GUI on port 8001, go to Configure > Databases > MyDatabase and scroll down the options until you see the one for "rebalancer enable". This needs to be set to true.
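If you would rather script this step than use the admin GUI, the following is a minimal sketch of the same change made through the Management REST API. It assumes the API is listening on its default port (8002) on localhost, that digest authentication is in use, and that the database property is named "rebalancer-enable"; the function names and credentials are hypothetical, so check them against your own environment before use.

```python
import json
import urllib.request
from urllib.parse import quote

MANAGE_URL = "http://localhost:8002"  # assumption: default Management API port


def rebalancer_enable_request(database, base_url=MANAGE_URL):
    """Build (method, url, body) for turning the rebalancer on for one database."""
    url = f"{base_url}/manage/v2/databases/{quote(database)}/properties"
    body = json.dumps({"rebalancer-enable": True})
    return "PUT", url, body


def send(method, url, body, user, password):
    """Send a JSON request using HTTP digest authentication and return the status code."""
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(passwords)
    )
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with opener.open(request) as response:
        return response.status
```

To apply the change, you would call something like send(*rebalancer_enable_request("MyDatabase"), "admin-user", "admin-password"), substituting real credentials for the placeholder values.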
3. Make a note of the current forests for that database
In the admin GUI on port 8001, go to Configure > Databases > MyDatabase and then select the "Status" tab:
Note that we have two forests listed: Forest-1 and Forest-2. These are the forests that will be getting retired and removed from service.
4. Create your new forests on the new mountpoint
In this case, we have created two new forests: NewForest-1 and NewForest-2 and in both cases, we've specified a new filesystem location. In the example below I've called this C:\MarkLogicData to demonstrate the process:
Note that you can go to this view in the admin GUI by going to Configure > Forests and by looking at the content in the Summary tab.
Also note that at this stage, these forests have been created but they're not listed as being attached to any database; this is indicated by the blank entry in the dropdown menus next to the forest names.
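The forest creation in this step can also be scripted. The sketch below builds the requests that would create NewForest-1 and NewForest-2 through the Management REST API, assuming a POST to /manage/v2/forests with the properties "forest-name", "host" and "data-directory". The host name "localhost" is a placeholder (the article does not name the host) and the helper name is hypothetical. No database is specified, so the forests would be created unattached, as described above.

```python
import json

MANAGE_URL = "http://localhost:8002"  # assumption: default Management API port


def forest_create_request(name, host, data_directory, base_url=MANAGE_URL):
    """Build (method, url, body) for creating one unattached forest."""
    body = json.dumps(
        {
            "forest-name": name,
            "host": host,  # placeholder host name; use your own
            "data-directory": data_directory,
        }
    )
    return "POST", f"{base_url}/manage/v2/forests", body


# One request per new forest, both pointing at the new filesystem location.
new_forest_requests = [
    forest_create_request(name, "localhost", "C:\\MarkLogicData")
    for name in ("NewForest-1", "NewForest-2")
]
```

Each tuple can be sent with any HTTP client that supports digest authentication, using a Content-Type of application/json.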
5. Review the current forest configuration of your database
You can see the status of the current configuration in the admin GUI on port 8001 by selecting Configure > Databases > MyDatabase > Forests.
Note that you should see your two current forests (Forest-1 and Forest-2) listed as "attached" and immediately below that you should see that there are two forests that are not yet attached to any database (NewForest-1 and NewForest-2):
6. Retire the current forests and attach the new forests to the database
Ensure that the original forests (Forest-1 and Forest-2) are now set to "retired" using the checkboxes and ensure the new forests (NewForest-1 and NewForest-2) are now attached to this database so the rebalancer can migrate all the data from the original forests to the newly added forests:
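For a scripted equivalent of this step, the sketch below builds the requests in the order described: attach the new forests, then retire the original ones. It assumes that the Management REST API's forest state-change endpoint (POST /manage/v2/forests/{name}) accepts the form parameters state=attach and state=retire together with a database parameter. Treat that as an assumption to confirm against the REST reference for your MarkLogic version; the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode

MANAGE_URL = "http://localhost:8002"  # assumption: default Management API port


def forest_state_request(forest, state, database, base_url=MANAGE_URL):
    """Build (method, url, form_body) for one forest state change."""
    url = f"{base_url}/manage/v2/forests/{quote(forest)}"
    body = urlencode({"state": state, "database": database})
    return "POST", url, body


def migration_requests(old_forests, new_forests, database):
    """Attach the new forests first, then retire the old ones."""
    requests = [forest_state_request(f, "attach", database) for f in new_forests]
    requests += [forest_state_request(f, "retire", database) for f in old_forests]
    return requests


plan = migration_requests(
    ["Forest-1", "Forest-2"], ["NewForest-1", "NewForest-2"], "MyDatabase"
)
```

The bodies are form-encoded, so they would be sent with a Content-Type of application/x-www-form-urlencoded. Attaching before retiring means the rebalancer always has a destination forest available when the originals are retired.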
7. Confirm that the rebalancer is now operational on the database status page
In the admin GUI on port 8001 go back to Configure > Databases > MyDatabase and then look at the Status tab:
Note that you should see information listed under the heading "Rebalancing State"; this should give you an indication of how long MarkLogic Server expects the operation to take and how many fragments need to be migrated out.
You should also see 4 forests listed; the original forests should now show fewer documents than before and a number of deleted fragments, whereas the document counts on the newly added forests should be beginning to increase.
8. Confirm that the original forests are now empty
When the process is complete, the Rebalancing State should read "Not rebalancing" and you should see that your original 2 forests now list 0 documents:
At this stage, we can see that the rebalancing work is done and the retired forests are now safe to remove from the system.
9. Detach the original forests from your database
In the admin GUI on port 8001, go back to the Forest Configuration page for the database (MyDatabase in this example) by selecting Configure > Databases > MyDatabase > Forests.
You can now uncheck the "attached" checkboxes for both of the original (now retired) forests. Save the changes with the "ok" button:
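If you script the detach, it is worth guarding it with the check from step 8, so that a forest is never detached while it still holds documents. The sketch below takes the document counts you read from the Status tab and only builds the detach requests when every count is zero. As an assumption to verify for your version, it uses the Management REST API's forest state-change endpoint with state=detach; the function name is hypothetical.

```python
from urllib.parse import quote, urlencode

MANAGE_URL = "http://localhost:8002"  # assumption: default Management API port


def detach_requests(document_counts, database, base_url=MANAGE_URL):
    """Build one detach request per retired forest.

    document_counts maps forest name -> document count, as read from the
    database Status tab. Raises ValueError if any forest is not yet empty.
    """
    not_empty = [name for name, count in document_counts.items() if count != 0]
    if not_empty:
        raise ValueError(
            f"forests still hold documents: {', '.join(sorted(not_empty))}"
        )
    body = urlencode({"state": "detach", "database": database})
    return [
        ("POST", f"{base_url}/manage/v2/forests/{quote(name)}", body)
        for name in document_counts
    ]
```

For example, detach_requests({"Forest-1": 0, "Forest-2": 0}, "MyDatabase") yields two form-encoded POST requests, whereas passing a non-zero count raises an error instead of producing a request.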
10. Confirm that your database only lists the new forests on the status page
Note that we should only see two forests (NewForest-1 and NewForest-2) listed this time: