Solutions

Stay on top of everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

Learn

Stay on top of everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

Community

Stay on top of everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

Company

Stay on top of everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

 
Knowledgebase:
Best Practice for Adding an Index in Production
01 November 2017 02:21 PM

Best Practice for Adding an Index in Production

Summary

It is sometimes necessary to remove or add an index to your production cluster. For a large database with more than a few GB of content, the resulting workload from reindexing your database can be a time and resource intensive process, that can affect query performance while the server is reindexing. This article points out some strategies for avoiding some of the pain-points associated with changing your database configuration on a production cluster.

Preparing your Server for Production

In general, high performance production search implementations run with tight controls on the automatic features of MarkLogic Server. 

  • Re-indexer disabled by default
  • Format-compatibility set to the latest format
  • Index-detection set to none.
  • On a very large cluster (several dozen or more hosts), consider running with expunge-locks set to none
  • On large clusters with insufficient resources, consider bumping up the default group settings
    • xdqp-timeout: from 10 to 30
    • host-timeout: from 30 to 90

The xdqp and host timeouts will prevent the server from disconnecting prematurely when a data-node is busy, possibly triggering a false failover event. However, these changes will affect the legitimate time to failover in an HA configuration. 

Preparing to Re-index

When an index configuration must be changed in production, you should:

  • First, index-detection should is set back to automatic,
  • Then, the index configuration change should be made,
  • Finally, the reindexer should be enabled during off-hours to reindex the content.

Reindexing works by reloading all the Uris that are affected by the index change, this process tends to create lots of new/deleted fragments which then need to be merged. Given that reindexing is very CPU and disk I/O intensive, the re-indexer-throttle can be set to 3 or 2 to minimize impact of the reindex.

After the Re-index

After the re-index has completed, it is important to return to the old settings by disabling the reindexer and setting index-detection back to none.

If you're reindexing over several nights or weekends, be sure to allow some time for the merging to complete. So for example, if your regular busy time starts at 5AM, you may want to disable the reindexer at around midnight to make sure all your merging is completed before business hours.

By following the above recommendations, you should be able to complete a large re-index without any disruption to your production environment.

(7 vote(s))
Helpful
Not helpful

Comments (0)