MarkLogic Server Backup is Slower than a File Copy.
22 March 2013 07:07 PM

Summary

A database or forest backup in MarkLogic Server may be significantly slower than copying the same files directly (for example, with cp on Linux). Why is this so?

Details

Using cp on very large files on a Linux system with a large amount of memory can produce a huge number of dirty pages (data written to the page cache but not yet flushed to disk), and flushing them can saturate the I/O channels for minutes. cp also returns without waiting for the data to be written to disk. As a result, cp is very unfriendly to other applications running on the same system.
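To observe this effect on a Linux host, one option is to watch the kernel's dirty-page counters while a large cp runs. The following is a minimal sketch, not a MarkLogic tool: it assumes the standard Linux /proc/meminfo file with its Dirty and Writeback fields, and the function names are hypothetical.

```python
def parse_meminfo_kb(text, fields=("Dirty", "Writeback")):
    """Return the requested /proc/meminfo fields as {name: kibibytes}."""
    wanted = set(fields)
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "Dirty:              1234 kB".
            values[name] = int(rest.split()[0])
    return values


def read_dirty_kb(path="/proc/meminfo"):
    """Read the current Dirty and Writeback counters (Linux only)."""
    with open(path) as f:
        return parse_meminfo_kb(f.read())
```

Calling read_dirty_kb() every second or so during a large copy should show whether the Dirty value climbs sharply and then takes a long time to drain.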

When MarkLogic Server performs a backup, it works hard not to saturate any subsystem or resource. MarkLogic takes care that the number of dirty pages at any one time is never very large, and it keeps the I/O queues short so that concurrent database queries and updates are not significantly impacted by the backup. Finishing the backup in the fastest possible time is not the priority.
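The general idea of a copy that limits dirty pages can be illustrated with a short sketch. This is not MarkLogic's implementation, which is not described here. It is a hypothetical Python function that shows one common approach: write in chunks, call fsync periodically so unwritten data never accumulates beyond a bound, and optionally cap the average throughput. The function name, parameters, and default sizes are all assumptions chosen for illustration.

```python
import os
import time


def throttled_copy(src, dst, chunk_size=1 << 20, sync_every=16 << 20,
                   max_bytes_per_sec=None):
    """Copy src to dst while bounding unflushed data (illustrative only).

    sync_every: flush to disk after this many bytes have been written,
                so at most about that much of the copy is ever dirty.
    max_bytes_per_sec: optional cap on average throughput.
    Returns the number of bytes copied.
    """
    copied = 0
    unsynced = 0
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            unsynced += len(chunk)
            if unsynced >= sync_every:
                # Force the data written so far out to disk.
                fout.flush()
                os.fsync(fout.fileno())
                unsynced = 0
            if max_bytes_per_sec:
                # Sleep if we are ahead of the target average rate.
                delay = copied / max_bytes_per_sec - (time.monotonic() - start)
                if delay > 0:
                    time.sleep(delay)
        # Final flush so the copy is durable before returning.
        fout.flush()
        os.fsync(fout.fileno())
    return copied
```

Unlike cp, this function returns only after the data has been flushed, and the periodic fsync calls trade total copy time for a smaller impact on other I/O. That is the same trade-off the backup makes.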

Can I make it go faster?

Yes. There is a diagnostic trace event, “Unthrottle Backup”, that turns off backup throttling in MarkLogic. However, even with throttling turned off, MarkLogic will still work to keep the number of dirty pages low.

The diagnostic trace event can be enabled from the MarkLogic Server Admin UI: navigate to Configure -> Groups -> {group-name} -> Diagnostic, set "trace events activated" to true, add the event “Unthrottle Backup” (without quotes), and press "ok".
