Knowledgebase: MarkLogic Server
Adding RAM to your host
16 November 2015 07:33 PM

Summary

When changing the amount of RAM on your MarkLogic Server host, there are additional considerations such as cache settings and swap space.

Group Cache Settings

As a rule of thumb, the memory allocated to the group caches (List, Compressed Tree and Expanded Tree) on a host should total about 1/3 to 3/8 of main memory. Increasing the group caches beyond this ratio can cause excessive swapping, which will adversely affect performance.

  • For E/D-nodes: the allocation can be split as 1/8 of main memory for the List Cache, 1/16 for the Compressed Tree Cache, and 1/8 for the Expanded Tree Cache.
  • For E-nodes: configure a larger Expanded Tree Cache; the List Cache and the Compressed Tree Cache can each be set to 128MB.
  • For D-nodes: configure a larger List Cache and Compressed Tree Cache; the Expanded Tree Cache can be set to 128MB.
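The E/D-node split above can be worked out for a given host with a few lines of Python. This is a minimal sketch, not a MarkLogic tool: the function and constant names are hypothetical, sizes are in MB, and the 64GB host is an assumed example. It also checks a set of cache sizes against the 3/8 upper bound of the rule of thumb.

```python
from fractions import Fraction

# Fractions of main memory for a combined E/D-node, taken from the list above.
ED_NODE_SPLIT = {
    "list_cache": Fraction(1, 8),
    "compressed_tree_cache": Fraction(1, 16),
    "expanded_tree_cache": Fraction(1, 8),
}

# Upper end of the rule of thumb for all group caches combined.
MAX_GROUP_CACHE_FRACTION = Fraction(3, 8)


def ed_node_cache_sizes_mb(ram_mb: int) -> dict[str, int]:
    """Return suggested cache sizes in MB for a combined E/D-node."""
    return {name: int(ram_mb * share) for name, share in ED_NODE_SPLIT.items()}


def within_rule_of_thumb(cache_sizes_mb: dict[str, int], ram_mb: int) -> bool:
    """True if the combined caches stay at or below 3/8 of main memory."""
    return sum(cache_sizes_mb.values()) <= ram_mb * MAX_GROUP_CACHE_FRACTION


if __name__ == "__main__":
    ram_mb = 64 * 1024  # hypothetical 64GB host
    sizes = ed_node_cache_sizes_mb(ram_mb)
    print(sizes)
    print(within_rule_of_thumb(sizes, ram_mb))
```

Note that 1/8 + 1/16 + 1/8 comes to 5/16 of main memory, which sits just under the 1/3 lower end of the range, so the sketch only tests the upper bound.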

Swap Space (Linux)

  • Linux Huge Pages should be set to 3/8 the size of your physical memory. 
  • Swap space should be set to the size of your physical memory minus the size of your Huge Pages (because Linux Huge Pages are not swapped), or 32GB, whichever is lower.

For example, if you have 96GB of RAM, Huge Pages should be set to 36GB (3/8 of 96GB) and swap space to 32GB (the lower of 96GB - 36GB = 60GB and 32GB).
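The same arithmetic can be expressed as a small helper. This is only a sketch of the two rules above; the function name is hypothetical and all sizes are in GB.

```python
def linux_memory_plan_gb(ram_gb: float) -> tuple[float, float]:
    """Return (huge_pages_gb, swap_gb) for a Linux host using the rules above."""
    # Huge Pages: 3/8 of physical memory.
    huge_pages_gb = ram_gb * 3 / 8
    # Swap: physical memory minus Huge Pages, capped at 32GB.
    swap_gb = min(ram_gb - huge_pages_gb, 32)
    return huge_pages_gb, swap_gb


if __name__ == "__main__":
    for ram_gb in (16, 96):
        huge_pages_gb, swap_gb = linux_memory_plan_gb(ram_gb)
        print(f"{ram_gb}GB RAM: Huge Pages {huge_pages_gb}GB, swap {swap_gb}GB")
```

For hosts with less than about 51GB of RAM, the 32GB cap does not come into play and swap is simply 5/8 of physical memory.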

Swap Space (Solaris)

Swap space should be twice the size of physical memory.

Solaris technical note:

Why do we recommend 2X the size of main memory allocated to swap space? When MarkLogic allocates virtual memory on Solaris with mmap and the MAP_ANON option, we do not specify MAP_NORESERVE, and instead let the kernel reserve swap space at the time of allocation.  We do this so that if we run out of swap space, we find out through the memory allocation failing with an error, rather than through the process being killed at some inopportune time with SIGBUS.  The process could be using nearly all of physical memory, which explains why you need at least 1X physical memory in swap space.

MarkLogic Server uses the standard fork() and exec() idiom to run the pstack program.  The pstack program can be a critically important tool for the diagnosis of server problems.  We can’t use vfork() to run pstack instead of fork() because it’s not safe for multithreaded programs.  When a process calls fork(), the kernel makes a virtual copy of all the address space of the process, so it also reserves swap space for this anonymously mapped memory for the forked process.  Of course, immediately after forking, the forked process calls exec() on the pstack program, which frees that reserved memory.  Unlike Linux, Solaris doesn’t overbook swap space, so if the kernel cannot reserve the swap space, fork() fails.  The forked copy can therefore require another 1X of reserved swap on top of the 1X for the server process itself, which is why you need 2X physical memory for swap space on Solaris.
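For readers unfamiliar with the fork() and exec() idiom described above, here is a minimal Python sketch of it. It is not MarkLogic's code; the function name is hypothetical, and it runs an arbitrary command rather than pstack. It only works on Unix-like systems, where os.fork() is available.

```python
import os
import sys


def run_with_fork_exec(argv: list[str]) -> int:
    """Run argv[0] in a child process via fork() then exec(); return its exit code."""
    # fork() gives the child a virtual copy of the parent's address space.
    # If the kernel refuses (for example, no swap to reserve), os.fork() raises OSError.
    pid = os.fork()
    if pid == 0:
        try:
            # exec() replaces the copied process image with the new program.
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec() itself failed.
            os._exit(127)
    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    code = run_with_fork_exec([sys.executable, "-c", "print('hello from child')"])
    print("child exit code:", code)
```

The window between fork() and exec() is short, but on a system that does not overbook swap the reservation for the copied address space must still succeed for fork() to return at all.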

Page File (Windows)

On a Windows system, the page file should be twice the size of physical memory. You can set the page file size in the virtual memory section in the advanced system settings from the Windows Control Panel. 

Performance Solution?

Increasing the amount of RAM is not a "cure all" for performance issues on the server.  Even when memory appears to be the resource bottleneck, increasing RAM may not be the required solution.  However, here are a few scenarios where increasing RAM on your server may be appropriate:

  • You need to increase group cache sizes because your cache miss rate is too high or your queries are failing with cache full errors. In this case, increasing RAM can give you additional flexibility in how the group caches are configured.  However, an alternative solution that could result in even greater performance improvements may involve reworking your queries and index settings so that the queries can be fully resolved during the index resolution phase of query evaluation.
  • While monitoring your server for swap utilization, you notice that the server is swapping often, and you have already checked that your memory and cache settings are within the MarkLogic recommendations.  The system should be configured so that swapping does not occur during normal operation of the server, as swapping can severely degrade performance. If the server still swaps with recommended settings, then adding RAM may improve performance.
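One way to watch swap utilization on a Linux host is to read the SwapTotal and SwapFree fields from /proc/meminfo. The sketch below assumes that file's usual "Name: value kB" layout; the function name is hypothetical. A single reading is only a snapshot, so it would need to be sampled over time to tell whether the server is swapping often.

```python
def swap_used_kb(meminfo_text: str) -> int:
    """Return swap in use (kB) from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # first token is the number
    return fields["SwapTotal"] - fields["SwapFree"]


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(swap_used_kb(f.read()), "kB of swap in use")
```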

Increasing RAM on your server may only be a temporary fix.  If your queries do not scale, then, as the data size in your forests grows, you may once again hit the issues that caused you to increase your RAM in the first place. If this is the case, evaluate your queries and indexes to make them more efficient.


Comments (1)
Peter Kester
23 January 2018 05:42 AM
Do we have an update on this for Marklogic 9? Or do the same rules of thumb apply?