Knowledgebase : Administration

Introduction

When configuring database replication, it is important to note that the Connect Forests by Name field is true by default. This works great because, when new forests of the same name are later added to the Master and Replica databases, they will be automatically configured for Database Replication.

The issue

The problem arises when you use replica forest names that do not match the original Master forest names. In that case, you may find that failover events cause forests to get stuck in the Wait Replication state. The usual methods of failing back to the designated masters will not work - restarting the replicas will not work, and neither will shutting down cluster/removing labels/restarting cluster.

Resolution

In this case, the way to fix the issue is to set Connect Forests by Name to false, and then you must manually connect the Master forests on the local cluster to the Replica forests on the foreign cluster, as described in the documentation: Connecting Master and Replica Forests with Different Names.

It is worth noting that, starting with MarkLogic 7, you are also allowed to rename the replica forests. Once you rename the replica forests to match the forest names of the designated master database (e.g., the Security database should have a Security forest in both the master and replica), they will be automatically configured for Database Replication, as expected.

Introduction

This article will show you how to add a Fast Data Directory (FDD) to an existing forest.

Details

The fast data directory stores transaction journals and stands. When the directory becomes full, larger stands will be merged into the data directory. Once the size of the fast data directory approaches its limit, then stands are created in the data directory.

Although it is not possible to add an FDD path to a currently-existing forest, it is possible to do the following:

1. Destroy an existing forest configuration (while preserving the data)

2. Re-create a forest with the same name & data, with an FDD added

 

The queries below illustrate steps one and two of the process. Note that you can also do this with the Admin UI.

The query below will delete the forest configurations but not data.

Preparation:

1. Schedule a downtime window for this procedure (DO NOT DO THIS ON A LIVE PRODUCTION SYSTEM)

2. Ensure that all ingestion and merging has stopped

3. To be on the safe side, take a backup of the forest before applying this in production

4. Detach the forest before running these queries


1) Use the following API to Delete an existing forest configuration

NOTE: make sure to set the $delete-data parameter to false().

admin:forest-delete(
$config as element(configuration),
$forest-ids as xs:unsignedLong*,
$delete-data as xs:boolean
) as element(configuration)


2) Use the following API to create a new forest that points to the old data directory and includes the configured FDD:

admin:forest-create(
$config as element(configuration),
$forest-name as xs:string,
$host-id as xs:unsignedLong,
$data-directory as xs:string?,
[$large-data-directory as xs:string?],
[$fast-data-directory as xs:string?]
) as element(configuration)



Here's an example query that uses these APIs:

xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()

(: preserve some path values from the old forest :)
let $forest-name := "YOUR_FOREST_NAME"
let $new-fast-data := "YOUR_NEW_FAST_DATA_DIR"
let $forest-id := admin:forest-get-id($config, $forest-name)
let $old-data := admin:forest-get-data-directory($config, $forest-id)
let $old-large-data := admin:forest-get-large-data-directory($config, $forest-id)

(: the parentheses keep the variables above in scope for both steps :)
return (
  (: step 1: delete the forest configuration only; fn:false() keeps the data :)
  admin:save-configuration(
    admin:forest-delete($config, $forest-id, fn:false())),

  (: step 2: re-create the forest on this host with the fast data directory :)
  let $config1 := admin:get-configuration()
  return
    admin:save-configuration(admin:forest-create(
      $config1,
      $forest-name,
      xdmp:host(),
      $old-data,
      $old-large-data,
      $new-fast-data
    ))
)

You can create and attach the forest in a single transaction. This is also possible using the Admin UI (as two separate transactions), i.e. by deleting only the forest configuration and not the data.

After attaching the forest, please re-index and data will then migrate to FDD. Note that the sample query needs to be executed on the host where the forest resides.


 

 

Alternatives to Configuration Manager

Overview

The MarkLogic Server Configuration Manager provided a read-only user interface to the MarkLogic Admin UI and could be used for saving and restoring configuration settings. The Configuration Manager tool was deprecated starting with MarkLogic 9.0-5, and is no longer available in MarkLogic 10.

Alternatives

There are a number of alternatives to the Configuration Manager. Most of the options take advantage of the MarkLogic Admin API, either directly or behind the scenes. The following is a list of the most commonly used options:

  • Manual Configuration
  • ml-gradle
  • Configuration Management API

Manual Configuration

For a single environment, the following Knowledgebase article covers the process: Transporting Resources to a New Cluster.

ml-gradle

For a repeatable process, the most widely used approach is ml-gradle.

A project would be created in Gradle, with the desired configurations. The project can then be used to deploy to any environment - test, prod, qa etc - creating a known configuration that can be maintained under source control, which is a best practice.

Similar to Configuration Manager, ml-gradle also allows for exporting the configuration of an existing cluster.

While ml-gradle is an open source community project that is not directly supported, it enjoys very good community and developer support.  The underlying APIs that ml-gradle uses are fully supported by MarkLogic.

Configuration Management API

An additional option is to use the Configuration Management API directly to export and import resources.

Summary

Both ml-gradle and the Configuration Management API use the MarkLogic Admin API behind the scenes but, for most use cases, our recommendation is to use ml-gradle rather than writing the same functionality from scratch.

Summary

This article answers the question "What ports need to be open in my Security Group in order to run MarkLogic Server on Amazon's EC2?"

Details

To run MarkLogic Server on Amazon's EC2, you'll need to open a port range from 7998-8002 in the appropriate Security Group.
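As a rough way to confirm the Security Group change from a client machine, the sketch below attempts a TCP connection to each port in the 7998-8002 range. The names `MARKLOGIC_PORTS`, `is_port_open` and `check_ports` are illustrative only and are not part of any MarkLogic or AWS tooling. A port is also reported as unreachable when nothing is listening on it, so a failure does not by itself prove that the Security Group is misconfigured.

```python
import socket

# Port range quoted above for MarkLogic Server on EC2 (inclusive).
MARKLOGIC_PORTS = range(7998, 8003)


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host, ports=MARKLOGIC_PORTS, timeout=2.0):
    """Map each port to True (reachable) or False (blocked or not listening)."""
    return {port: is_port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("your-instance-hostname")` (a placeholder for your own instance) returns a dictionary keyed by port number, which can be printed or fed into an existing monitoring script.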

Summary

Customers using the MarkLogic AWS Cloud Formation Templates may encounter a situation where someone has deleted an EBS volume that stored MarkLogic data (mounted at /var/opt/MarkLogic).  Because the volume, and the associated data are no longer available, the host is unable to rejoin the cluster.  

Getting the host to rejoin the cluster can be complicated, but it will typically be worth the effort if you are running an HA configuration with Primary and Replica forests.

This article details the procedures to get the host to rejoin the cluster.

Preparing the New Volume and New Host

The easiest way to create the new volume is using a snapshot of an existing host's MarkLogic data volume.  This saves the work of manually copying configuration files between hosts, which is necessary to get the host to rejoin the cluster.

In the AWS EC2 Dashboard:Elastic Block Store:Volumes section, create a snapshot of the data volume from one of the operational hosts.

Next, in the AWS EC2 Dashboard:Elastic Block Store:Snapshots section, create a new volume from the snapshot in the correct zone and note the new volume id for use later.

(optional) Update the name of the new volume to match the format of the other data volumes

(optional) Delete the snapshot

Edit the Auto Scaling Group with the missing host to bring up a new instance, by increasing the Desired Capacity by 1

This will trigger the Auto Scaling Group to bring up a new instance. 

Attaching the New Volume to the New Instance

Once the instance is online and startup is complete, connect to the new instance via ssh.

Ensure MarkLogic is not running, by stopping the service and checking for any remaining processes.

  • sudo service MarkLogic stop
  • pgrep -la MarkLogic

Remove /var/opt/MarkLogic if it exists, and is mounted on the root partition.

  • sudo rm -rf /var/opt/MarkLogic

Edit /var/local/mlcmd and update the volume id listed in the MARKLOGIC_EBS_VOLUME variable to the volume created above.

  • MARKLOGIC_EBS_VOLUME="[new volume id],:25::gp2::,*"
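If you prefer to script this edit, the following sketch rewrites only the volume id and leaves the rest of the value untouched. It assumes, based on the example above, that the volume id is everything between the opening quote and the first comma; `set_ebs_volume_id` is a hypothetical helper name, and the function works on text you have already read from the file.

```python
import re

# Matches the start of a MARKLOGIC_EBS_VOLUME="..." line up to the first comma.
_VOLUME_LINE = re.compile(r'^([ \t]*MARKLOGIC_EBS_VOLUME=")[^,"]*', re.MULTILINE)


def set_ebs_volume_id(text, new_volume_id):
    """Return text with the volume id in MARKLOGIC_EBS_VOLUME replaced.

    Assumption: the volume id is the part of the value before the first
    comma; the remaining fields (size, type, and so on) are kept as they are.
    """
    return _VOLUME_LINE.sub(lambda m: m.group(1) + new_volume_id, text)
```

Keep a backup copy of the original file before writing the modified text back, since the instance relies on this setting to find its data volume.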

Run mlcmd to attach and mount the new volume to /var/opt/MarkLogic on the instance

  • sudo /opt/MarkLogic/mlcmd/bin/mlcmd init-volumes-from-system
  • Check that the volume has been correctly attached and mounted

Remove contents of /var/opt/MarkLogic/Forests (if they exist)

  • sudo rm -rf /var/opt/MarkLogic/Forests/*

Run mlcmd to sync the new volume information to the DynamoDB table

  • sudo /opt/MarkLogic/mlcmd/bin/mlcmd sync-volumes-to-mdb

Configuring MarkLogic With Empty /var/opt/MarkLogic

If you did not create your volume from a snapshot as detailed above, complete the following steps.  If you created your volume from a snapshot, then skip these steps, and continue with Configuring MarkLogic and Rejoining Existing Cluster

  • Start the MarkLogic service, wait for it to complete its initialization, then stop the MarkLogic service:
    • sudo service MarkLogic start
    • sudo service MarkLogic stop
  • Move the configuration files out of /var/opt/MarkLogic/
    • sudo mv /var/opt/MarkLogic/*.xml /secure/place (using default settings; destination can be adjusted)
  • Copy the configuration files from one of the working instances to the new instance
    • Configuration files are stored here: /var/opt/MarkLogic/*.xml
    • Place a copy of the xml files on the new instance under /var/opt/MarkLogic

Configuring MarkLogic and Rejoining Existing Cluster

Note the host-id of the missing host found in /var/opt/MarkLogic/hosts.xml

  • For example, if the missing host is ip-10-0-64-14.ec2.internal
    • sudo grep "ip-10-0-64-14.ec2.internal" -B1 /var/opt/MarkLogic/hosts.xml

  • Edit /var/opt/MarkLogic/server.xml and update the value for host-id to match the value retrieved above
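As an alternative to reading the grep output by eye, the host-id lookup can be sketched as below. This assumes that hosts.xml contains one `host` element per host with `host-id` and `host-name` children (consistent with the grep above, but not verified against every release); element namespaces are ignored so the sketch does not depend on the exact namespace URI. `find_host_id` is an illustrative name.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def find_host_id(hosts_xml, host_name):
    """Return the host-id recorded for host_name, or None if it is not listed."""
    root = ET.fromstring(hosts_xml)
    for element in root.iter():
        if local_name(element.tag) != "host":
            continue
        # Collect the direct children of this <host> element by local name.
        fields = {local_name(c.tag): (c.text or "").strip() for c in element}
        if fields.get("host-name") == host_name:
            return fields.get("host-id")
    return None
```

Pass the file contents (for example, the bytes read from /var/opt/MarkLogic/hosts.xml) together with the missing host's name; the returned value is the one to place in server.xml.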

Start MarkLogic and view the ErrorLog for any issues

  • sudo service MarkLogic start; sudo tail -f /var/opt/MarkLogic/Logs/ErrorLog.txt

You should see messages about forests synchronizing (if you have local disk failover enabled, with replicas) and changing states from wait or async replication to sync replication.  Once all the forests are either 'open' or 'sync replicating', then your cluster is fully operational with the correct number of hosts.

At this point you can fail back to the primary forests on the new instances to rebalance the workload for the cluster.

If you disabled the "xdqp ssl enabled" setting as part of these procedures, you can now re-enable it by setting the value to true on the Group Configuration page.

Update the Userdata In the Auto Scaling Group

To ensure that the correct volume will be attached if the instance is terminated, the Userdata needs to be updated in a Launch Configuration.

Copy the Launch Configuration associated with the missing host.

Edit the details

  • (optional) Update the name of the Launch Configuration
  • Update the User data variable MARKLOGIC_EBS_VOLUME and replace the old volume id with the id for the volume created above.
    • MARKLOGIC_EBS_VOLUME="[new volume id],:25::gp2::,*"
  • Save the new Launch Configuration

Edit the Auto Scaling Group associated with the new node

Change the Launch Configuration to the one that was just created and save the Auto Scaling Group.

Next Steps

Now that normal operations have been restored, it's a good opportunity to ensure you have all the necessary database backups, and that your backup schedule has been reviewed to ensure it meets your requirements.

Summary

MarkLogic recommends that all production servers be monitored for system health.

Recommendations

For production MarkLogic Server clusters, the system monitoring solution should include the following features:  

  • Enable monitoring history, which allows for the capture and viewing of critical performance data from your cluster. You can learn more about the Monitoring History features by following this link: http://docs.marklogic.com/guide/monitoring/history
    • Monitor processes that are running on the system
    • Monitor RAM & swap space utilization.
    • Monitor I/O device service time, wait time, and queue size; any of these could be indications that the storage system is underpowered or poorly configured.
    • Monitor the network for signs of problems that impact application performance. A misconfigured or poorly performing network can have drastic impacts on the performance of an application running on MarkLogic Server. 
  • MarkLogic Error logs should be constantly monitored (and notifications sent) for the following keywords: 'exception', 'SVC-', 'XDMP-', and 'start'. Over time you may want to refine the keywords, but any of these may indicate that something is wrong.
  • Log-file messages should also be monitored based on message level, see Understanding the Log Levels.  It's good practice to investigate and resolve important messages promptly.
  • Switch to debug level logging.  Not only will this provide additional information for you to monitor system health, but will also provide additional information to analyze in the event a problem does occur.
  • Monitor forest sizes - in particular the ratio of forest size to total available disk size (see Memory, Disk Space, and Swap Space Requirements).  Alarms should sound if the forest sizes increase significantly beyond the target available disk space.
  • Ensure that the server time is synchronized across all the hosts in the cluster.  For example: Use NTP to manage system time across the cluster.
  • Monitor for host “hot spots.”  Uneven host workload could be a symptom that there is an uneven distribution of data across the hosts which may result in performance issues.   
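The keyword monitoring recommended in the list above can be sketched as a small scanner. This is a minimal illustration, not a replacement for a monitoring product: `KEYWORDS` mirrors the four keywords named above, `find_alerts` is a hypothetical helper, matching is case-insensitive by choice, and sending notifications is left to whatever alerting system you already use.

```python
# Keywords taken from the recommendation above; refine them over time.
KEYWORDS = ("exception", "SVC-", "XDMP-", "start")


def find_alerts(lines, keywords=KEYWORDS):
    """Yield (line_number, keyword, line) for each log line containing a keyword.

    Matching is case-insensitive, and a line is reported once, for the
    first keyword (in the order given) that it contains.
    """
    lowered = [k.lower() for k in keywords]
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        for keyword, needle in zip(keywords, lowered):
            if needle in text:
                yield number, keyword, line.rstrip("\n")
                break
```

Because `find_alerts` accepts any iterable of lines, it can be given an open file object for ErrorLog.txt or the output of a log-tailing process.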

MarkLogic Server provides a rich set of monitoring features that include a pre-configured monitoring dashboard, and a Management API that allows you to integrate MarkLogic Server with existing monitoring applications or create your own custom monitoring applications.

For additional information regarding the monitoring support in MarkLogic, please refer to the Monitoring MarkLogic Guide available on the MarkLogic developer website.

Best Practice for Adding an Index in Production

Summary

It is sometimes necessary to remove or add an index to your production cluster. For a large database with more than a few GB of content, reindexing can be a time- and resource-intensive process that affects query performance while the server is reindexing. This article points out some strategies for avoiding some of the pain points associated with changing your database configuration on a production cluster.

Preparing your Server for Production

In general, high performance production search implementations run with tight controls on the automatic features of MarkLogic Server. 

  • Re-indexer disabled by default
  • Format-compatibility set to the latest format
  • Index-detection set to none.
  • On a very large cluster (several dozen or more hosts), consider running with expunge-locks set to none
  • On large clusters with insufficient resources, consider bumping up the default group settings
    • xdqp-timeout: from 10 to 30
    • host-timeout: from 30 to 90

The xdqp and host timeouts will prevent the server from disconnecting prematurely when a data-node is busy, possibly triggering a false failover event. However, these changes will affect the legitimate time to failover in an HA configuration. 

Preparing to Re-index

When an index configuration must be changed in production, you should:

  • First, index-detection should be set back to automatic
  • Then, the index configuration change should be made
  • Finally, the reindexer should be enabled during off-hours to reindex the content.

Reindexing works by reloading all the URIs that are affected by the index change. This process tends to create lots of new/deleted fragments, which then need to be merged. Given that reindexing is very CPU and disk I/O intensive, the re-indexer-throttle can be set to 3 or 2 to minimize the impact of the reindex.

When you have Database Replication Configured:

If you have to add or modify indexes on a database which has database replication configured, make sure the same changes are made on the Replica cluster as well. Starting with MarkLogic Server 9.0-7, index data is also replicated from the Master to the Replica, but the server does not automatically check whether both sides have the same index settings. Reindexing is disabled by default on a replica cluster. However, when the database replication configuration is removed (such as after a disaster), the replica database will reindex as necessary. It is therefore important that the Replica database index configuration matches the Master's to avoid unnecessary reindexing.

Note: If you are on a version prior to 9.0-7, it is recommended that you update the index settings on the Replica database before updating those on the Master database. This is because changes to the index settings on the Replica database only affect newly replicated documents and will not trigger reindexing of existing documents.

Further reading:

Master and Replica Database Index Settings

Database Replication - Indexing on Replica Explained

After the Re-index

After the re-index has completed, it is important to return to the old settings by disabling the reindexer and setting index-detection back to none.

If you're reindexing over several nights or weekends, be sure to allow some time for the merging to complete. So for example, if your regular busy time starts at 5AM, you may want to disable the reindexer at around midnight to make sure all your merging is completed before business hours.

By following the above recommendations, you should be able to complete a large re-index without any disruption to your production environment.

Introduction

Sometimes you may find that there are one or more tasks that are taking too long to complete or are hogging too many server resources, and you would like to remove them from the Task Server.  This article presents a way to cancel active tasks in the Task Server.

Details

To cancel active tasks in the Task Server, you can browse to the Admin UI, navigate to the Status tab of the Group's Task Server, and cancel the tasks. However, this may get tedious if there are many tasks to be terminated.

As an alternative, you can use the server monitoring built-ins to programmatically find and cancel the tasks. The documentation for the MarkLogic Server API includes information for all the built-in functions you will need (refer to http://docs.marklogic.com/xdmp/server-monitoring).

Sample Script

Here is a sample script that removes the task based on the path to the module that is being executed:

let $host-id := xdmp:host()
let $host-task-server-id := xdmp:host-status($host-id)//*:task-server/*:task-server-id/text()
let $task-server-status := xdmp:server-status($host-id,$host-task-server-id)
let $task-server-requests := $task-server-status/*:request-statuses
let $scheduled-task-request := $task-server-requests/*:request-status[*:request-text = "/PATH/TO/SCHEDULED/TASK/MODULE.XQY"]/*:request-id/text()
return
   xdmp:request-cancel($host-id,$host-task-server-id,$scheduled-task-request)

Introduction

This article discusses some of the issues you should think about when preparing to change the IP address of a MarkLogic Server host.

Detail: 

If the hostnames stay the same, then changing IP addresses should not have any adverse side effects since none of the default MarkLogic Server settings require an IP address.

Here are some caveats:

  1. Make sure there are no application servers that have an 'address' setting to an IP address that will no longer be accessible/exist after the change.
  2. Similarly, make sure there are no external (to MarkLogic Server) dependencies on the original IP addresses.
  3. Make sure you allow some time (on the order of minutes) for the updated DNS records to propagate across the DNS servers before bringing up MarkLogic Server.
  4. Make sure the hosts themselves are reachable via the standard Unix channels (ping, ssh, etc) before starting MarkLogic Server.
  5. Make sure you test this in a non-production environment before you implement it in production.
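Before starting MarkLogic Server, the name-resolution part of these checks can be sketched as follows. The helper names `resolve_hosts` and `mismatches` are hypothetical, the expected addresses are values you supply yourself, and only IPv4 resolution via the standard library is covered; reachability (ping, ssh) still needs to be checked separately.

```python
import socket


def resolve_hosts(host_names, resolver=socket.gethostbyname):
    """Map each host name to its resolved IPv4 address, or None on failure."""
    results = {}
    for name in host_names:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            results[name] = None
    return results


def mismatches(host_names, expected, resolver=socket.gethostbyname):
    """Return {name: (expected_ip, resolved_ip)} for names that differ."""
    resolved = resolve_hosts(host_names, resolver)
    return {
        name: (expected.get(name), resolved[name])
        for name in host_names
        if resolved[name] != expected.get(name)
    }
```

An empty result from `mismatches` suggests that every cluster hostname already resolves to its new address from the machine where the check is run; it is worth repeating the check on each host in the cluster.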

Introduction

If you have an existing MarkLogic Server instance running on EC2, there may be circumstances where you need to change the size of available storage.

This article discusses approaches to ensure a safe increase in the amount of available storage for your EC2 instances without compromising MarkLogic data integrity.

This article assumes that you have started your cluster using the CloudFormation templates provided by MarkLogic.

The recommended method (I.) is to shut down the cluster, resize using snapshots, and start the cluster again. If you wish to avoid downtime, an alternative procedure (II.) using multiple volumes and rebalancing is described below.

In both procedures we are recommending a single, large EBS volume as opposed to multiple smaller ones because:

1. Larger EBS volumes have faster IO as described by the Amazon EBS Volume types at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html

2. You have to keep enough spare capacity on every single volume to allow for merges.  MarkLogic disk space requirements are described in our Installation Guide.

I. Resizing using AWS snapshots

This is the recommended method. This procedure follows the same steps as official Amazon AWS documentation, but highlights MarkLogic specific steps. Please review AWS Documentation in detail before proceeding:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html

1. Make sure that you have an up to date backup of your data and a working restore plan.

2. Stop the MarkLogic cluster by going to AWS Console -> CloudFormation -> Actions -> Update Stack


Click through the pages and leave all other settings intact, but change Nodes to and review and confirm updating the stack. This will stop the cluster.

This is also covered in the MarkLogic EC2 documentation:

https://docs.marklogic.com/guide/ec2/managing#id_59478

4. Create a snapshot of the volume to resize.

5. Create a new volume from the snapshot.

Ensure that the new volume is sufficiently large to cover MarkLogic disk space requirements (generally at least 1.5x of the planned total forest size).

6. Detach the old volume.

7. Attach the newly expanded volume.

Steps 4-7 are exactly as covered in the AWS documentation and have no MarkLogic-specific parts.

8. Restart MarkLogic cluster, by going to AWS Console -> CloudFormation -> Actions -> Update Stack and changing Nodes to the original setting.

9. Connect to the machine using SSH and resize the logical partition to match the new size. This is covered in AWS documentation, the commands are:

- resize2fs for ext3 and ext4

- xfs_growfs for xfs

10. The new volume will have a different id. You need to update the CloudFormation template so that the data volumes are retained and remounted when the cluster or nodes are restarted. The easiest way is to use the mlcmd shell script provided by MarkLogic. While still connected via SSH, run the following:

/opt/MarkLogic/bin/mlcmd sync-volumes-to-mdb

This will synchronise the EBS volume id with the CloudFormation template.

At this point the procedure is complete and you can delete the old EBS volume. Once you have verified that everything is working correctly, you can also delete the snapshot created in step 4.
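The sizing guideline from step 5 (a volume of at least 1.5x the planned total forest size) can be expressed as a small calculation. This is only a sketch of that rule of thumb: the function names are illustrative, sizes are assumed to be in GB, and the 1.5 factor should be replaced with whatever the disk space requirements in the Installation Guide imply for your workload.

```python
import math


def minimum_volume_gb(planned_forest_gb, factor=1.5):
    """Smallest whole-GB volume that is at least factor x the planned forest size."""
    if planned_forest_gb < 0 or factor < 1:
        raise ValueError("forest size must be >= 0 and factor must be >= 1")
    return math.ceil(planned_forest_gb * factor)


def is_large_enough(volume_gb, planned_forest_gb, factor=1.5):
    """Return True if volume_gb meets the factor x planned forest size guideline."""
    return volume_gb >= planned_forest_gb * factor
```

For example, a planned total forest size of 200 GB gives a minimum volume size of 300 GB under this guideline.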

II. Resizing with no downtime, using MarkLogic Rebalancing

This method avoids cluster downtime, but it is slightly more complicated than procedure I, and rebalancing will take additional time and add load to the cluster. In most cases procedure I takes far less time to complete; however, the cluster is down for the duration. With this procedure the cluster can serve requests at all times.

This procedure follows the same steps as official Amazon AWS documentation where possible, but highlights MarkLogic specific steps. Please review AWS Documentation in detail before proceeding:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html

The procedure is described in more detail in the MarkLogic Server on Amazon EC2 Guide at https://docs.marklogic.com/guide/ec2/managing#id_81403

1. Create a new volume.

Ensure that the new volume is sufficiently large to cover MarkLogic disk space requirements (generally at least 1.5x of the planned total forest size).

2. Attach the volume to the EC2 instance. Please take a note of the EC2 device mount point, for example /dev/sdg and see here where it maps to in Linux and in RedHat: https://docs.marklogic.com/guide/ec2/managing#id_17077

3. SSH into the instance and execute the /opt/MarkLogic/bin/mlcmd init-volumes-from-system command to create a filesystem for the volume and update the Metadata Database with the new volume configuration. The init-volumes-from-system command will output a detailed report of what it is doing. Note the mount directory of the volume from this report.

4. Once the volume is attached and mounted to the instance, log into the Administrator Interface on that host and create a forest or forests, specifying host name of the instance and the mount directory of the volume as the forest Data Directory. For details on how to create a forest, see Creating a Forest in the Administrator's Guide.

5. Once the status of the new forest is set to "open", attach the new forest(s) to the database and retire all the forest(s) on the old volume. If you only have 1 data volume then this includes forests for Schemas, Security, Triggers, Modules etc. It is possible to script this part using XQuery, JS or REST:

https://docs.marklogic.com/admin:forest-create

This will trigger rebalancing - database fragments will start to move to the new forests. This process will take several hours or days, depending on the size of data and the Admin UI will show you an estimate.

The Admin UI for this is covered here: https://docs.marklogic.com/guide/admin/forests#id_93728

and here is more information on rebalancing: https://docs.marklogic.com/guide/admin/database-rebalancing#id_87979

6. Once the old forest(s) have 0 fragments in them you can detach them and delete the old forest(s). The migration to a new volume is complete.

7. Optionally, remove the old volume. If your original volume was data only, it should be empty after this procedure and you can:

a) unmount the volume in Linux

b) delete the volume in AWS EC2 console

c) issue /opt/MarkLogic/bin/mlcmd sync-volumes-to-mdb. This will preserve the new volume mappings in the Cloud Formation template and the volumes will be preserved and remounted when nodes are restarted or even terminated.

Introduction

MarkLogic Server has shipped with full support for the W3C XML Schema specification and schema validation capabilities since version 4.1 (released in 2009).

These features allow for the validation of complete XML documents or elements within documents against an existing XML Schema (or group of Schemas), whose purpose is to define the structure, content, and typing of elements within XML documents.

You can read more about the concepts behind XML Schemas and MarkLogic's support for schema based validation in our documentation:

https://docs.marklogic.com/guide/admin/schemas

Caching XML Schema data

In order to ensure the best possible performance at scale, all user created XML Schemas are cached in memory on each individual node within the cluster using a portion of that node's Expanded Tree Cache.

Best practices when making changes to pre-existing XML Schemas: clearing the Expanded Tree Cache

When you redeploy a revised XML Schema to an existing schema database, MarkLogic can sometimes refer to an older, cached version of the schema data associated with a given document.

Therefore, whenever you deploy a new or revised version of a Schema that you maintain, it may be necessary to clear the cache, as a best practice, to ensure that all cached data for older versions of your schemas has been evicted.

If you don't clear the cache, MarkLogic may continue to use the old, cached schema and, as a result, you may get errors like:

XDMP-LEXVAL (...) Invalid lexical value

You can clear all data stored in the Expanded Tree Cache in two ways:

  1. By restarting MarkLogic service on every host in the cluster. This will automatically clear the cache, but it may not be practical on production clusters.
  2. By issuing a call to xdmp:expanded-tree-cache-clear() command on each host in the cluster. You can run the function in query console or via REST endpoint and you will need a user with admin rights to actually clear the cache.

An example script has been provided that demonstrates the use of XQuery to execute the call to clear the Expanded Tree Cache against each host in the cluster:

Please contact MarkLogic Support if you encounter any issues with this process.


Introduction

There may be situations where a MarkLogic cluster has evolved over time and there may be a need to create a clone of that setup in order to create another separate development environment.

In this Knowledgebase article we will outline the necessary steps to clone / move a MarkLogic environment.

Important notes:

  • The target environment must have the same architecture as the source cluster (e.g., Windows, Linux, Solaris). The CPUs must also have the same number of bits (e.g., 64, 32).
  • The target environment must have the same allocated disk space as the master environment if all databases are to be transferred
  • The approach outlined below performs a match on forest, app server and database names; these have to match in both environments but can be changed at a later stage
  • Forest allocation (per database) should be identical in both environments; this process will not take into account the case where extra forests are required, but this scenario is covered briefly in the following places:
  • The number of nodes in the cluster can be different as long as the number of forests remains the same
  • Many configuration steps can be scripted using MarkLogic's extensive list of Admin APIs - these are not covered here but more information is detailed in the product documentation

Please note that there may be subtle differences between major releases of MarkLogic versions; for example, the URI to the configuration manager has changed and new features have been added in newer releases of the product

Preparation steps for cloning the master environment

  • Backup all of your databases that you want to clone including their dependencies (you should exclude Security, Schemas, Triggers, Modules from this list)
    • You can also consider making backups of the master forests - but be sure to exclude the journals as you do this
  • Security, Schemas, Triggers, Modules should be individually backed up as separate tasks in this process
  • Open Configuration Manager in your browser to export all your current database settings
    • http://:8002/manage/v1/list/package/ for MarkLogic 5 and 6
    • http://:8002/nav/ for MarkLogic 7 (only select settings)
    • select only your custom app-servers and databases; leave the default ones

The Configuration Manager tool was deprecated starting with MarkLogic 9.0-5, and is no longer available in MarkLogic 10. Alternatives to Configuration Manager are covered here

  • For any non-MarkLogic data (such as XQuery code, deployment scripts, etc.) required to run your application, ensure these are manually zipped and copied over as part of this stage of the process
  • Copy all the data to a mountpoint which is accessible by your target environment.

If you don't have an installer for your current MarkLogic version, you can download the install binary using curl; this also allows you to download older versions if required. For example, to download the binary for MarkLogic Server 5.0-3.3, you could use the following call:

curl -O -XPOST -d'email=you@company.com&pass=yoursecret' 'https://developer.marklogic.com/download/binaries/5.0/MarkLogic-5.0-3.3.x86_64.rpm'

Steps for installation of the new data center / target environment

  • Perform a standard MarkLogic server/cluster installation on each of the new target nodes
  • Configure individual groups as required for the target environment
    • Values may differ when compared to the master environment in cases where - for example - different hardware is used
    • Manual customization is recommended for this
  • Restore the Security and Schemas databases from backup
    • This will restore all the user settings from your master environment
  • Verify that you still can access the server/cluster
  • Create all your forests (including replicas if you're using any of the replication features in the product)
    • Forest names must match those in the master server/cluster as mentioned previously
    • Configure and attach the replica forests as required (optional)
  • Create all application servers and databases (only required for MarkLogic 5 and 6)
    • Again, note that these should have the same names as per the master server/cluster
    • Leave all other settings as per the defaults
  • Verify all assignments match those in the master server/cluster
  • On a node in the new target environment, open Configuration Manager in your browser
    • http://:8002/manage/v1/list/package/ (MarkLogic 5 and 6)
    • http://:8002/nav/ for MarkLogic 7 (only select settings)
    • import the packages you exported during the preparation steps
    • apply all changes
    • verify the changes in the Admin-UI
  • Restore all other databases
  • Place your custom scripts onto the file system
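
The step "Verify all assignments match those in the master server/cluster" can be made less error-prone with a small comparison script. The following is only a sketch in Python: the database and forest names are hypothetical, and it assumes you have already collected the database-to-forest assignments of both clusters (for example, by hand from the Admin UI) into two dictionaries.

```python
def diff_assignments(source, target):
    """Compare two {database name: [forest names]} maps and report differences."""
    shared = set(source) & set(target)
    return {
        "missing_in_target": sorted(set(source) - set(target)),
        "extra_in_target": sorted(set(target) - set(source)),
        "forest_mismatch": sorted(
            db for db in shared if sorted(source[db]) != sorted(target[db])
        ),
    }


# Hypothetical assignments collected from the master and target clusters.
source_cluster = {
    "Documents": ["Documents"],
    "app-content": ["app-content-1", "app-content-2"],
    "app-modules": ["app-modules-1"],
}
target_cluster = {
    "Documents": ["Documents"],
    "app-content": ["app-content-1"],
}

report = diff_assignments(source_cluster, target_cluster)
print(report)
```

An empty list for every key in the report would indicate that database names and forest assignments line up in both environments.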

After following these steps, you should now have a clone of your master server/cluster.

Please note that some steps may differ if you have created custom Security or Schemas databases; in this case you would need to create those first on the target environment and then perform the restore from the backup.

If you don't have any configuration scripts already written to set up your master server/cluster, you may want to look into the Admin API documentation (please ensure you view the correct document for the installed version of the product) for a basic setup of all your app servers, databases and forests:

Summary

Meters data can be a good resource for approximating the number of requests being managed by the server at a given time. It's also important to understand how Meters data is generated, should there be a discrepancy between the Meters samples and the entries in the access log.

Meters Request Data

The Meters data is designed to record a sampling of activity every few seconds. It is not designed to accurately record request rates much lower than one request every few seconds. Request rates are 15-second moving averages, recalculated every second and available in real time through the xdmp:host-status, xdmp:server-status and xdmp:forest-status built-in functions.

Meters Samples

The metering subsystem samples these real-time rates on the minute and saves the samples in the Meters database. Meters sampled data of events that occur less frequently than the moving average period will be lower than the number of access log entries. The difference between the two will depend on when the last event happened and when the sample was taken.

This means that if an event happens once a minute, the request rate will rise when the event happens, but then decay away within a few seconds. If the sample is taken after the event has decayed, the saved Meters data will be lower than the actual number of requests.
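
The effect can be illustrated with a small simulation. This is a sketch in Python, not the server's actual algorithm: it assumes the moving average is a simple 15-second window and that samples are taken exactly on the minute. The arrival time of 5 seconds past each minute is an arbitrary choice for illustration.

```python
WINDOW_SECONDS = 15    # moving-average period described above
SAMPLE_INTERVAL = 60   # samples are saved on the minute


def moving_average_rate(event_times, now, window=WINDOW_SECONDS):
    """Requests per second, averaged over the `window` seconds ending at `now`."""
    recent = [t for t in event_times if now - window < t <= now]
    return len(recent) / window


def sampled_rates(event_times, duration, interval=SAMPLE_INTERVAL):
    """Rates that would be saved if the moving average were sampled every `interval` seconds."""
    return [
        moving_average_rate(event_times, t)
        for t in range(interval, duration + 1, interval)
    ]


# One request per minute for ten minutes, arriving 5 seconds after each minute boundary.
duration = 600
events = list(range(5, duration, 60))

samples = sampled_rates(events, duration)
access_log_count = len(events)
metered_estimate = sum(rate * SAMPLE_INTERVAL for rate in samples)

print("access log entries:", access_log_count)
print("requests implied by samples:", metered_estimate)
```

Every request has already left the 15-second window by the time the next sample is taken, so in this scenario the saved samples imply no requests at all, while the access log contains one entry per minute.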

Conclusion

Because of the way Meters data is sampled, it is not unusual for Meters to under-report the number of requests in certain circumstances.

Introduction

MarkLogic Server persists its configuration in XML files (for example, databases.xml, hosts.xml, etc.), copies of which exist on each node of a cluster. While you could use the Admin UI to manage an individual cluster's configuration, at scale and over multiple environments, the best practice is to build a source-control managed script that uses MarkLogic's configuration management APIs - specifically Configuration Management API (CMA) and REST Management API (RMA).

What is RMA?

RMA (REST Management API, /manage/v2) provides the ability to easily capture detailed information about MarkLogic Server objects and processes such as hosts, databases, forests, application servers, and groups from any tool that can make a RESTful call.

You can read more about the MarkLogic REST Management API at - http://docs.marklogic.com/REST/management.

What is CMA?

Configuration Management API (CMA) is a new, higher-level interface built on top of the REST Management API (RMA). CMA is intended to more easily integrate with downstream tooling like MarkLogic's ml-gradle, Ops Director, and Java API, as well as third-party options like Node.js, bash, and curl.

Customers will typically want to package up configurations to manage the deployment of applications into development, test, and production environments. CMA makes it easier to set up complex MarkLogic features such as replication and failover across these different environments by providing common 'canned' scenarios.

You can read more about the MarkLogic Configuration Management API at - http://docs.marklogic.com/REST/configuration-management-api

How to invoke CMA?

Customers can create and apply configurations in three ways:

1. REST Management API: /manage/v3 --- REST endpoint for generating and applying configurations.

Please refer to http://docs.marklogic.com/REST/configuration-management-api

For example: 

1.1 http://host:8002/manage/v3?format=json : This will return the configuration data in JSON format. If we change the format to xml, we will get the configuration in XML format.

1.2 http://host:8002/manage/v3?format=zip : This will return package.zip. This archive contains the configuration files (databases, forests, hosts, etc.), a README, and ml-gradle property/build files. The README in the zip file explains how to install ml-gradle and apply the configuration to other instances.

2. XQuery: cma.xqy --- XQuery library for generating and applying configurations.

Please refer to https://docs.marklogic.com/cma for more details.

cma:apply-config Apply a named configuration, overriding parameters and setting options.
cma:generate-config Retrieve an individual resource, set of resources, or full cluster configuration; generate a configuration from scenarios.

For example:

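xquery version "1.0-ml";
import module namespace cma="http://marklogic.com/manage/config"
   at "/MarkLogic/cma.xqy";
(: $zip is assumed to be supplied by the caller, for example the
   package.zip returned by /manage/v3?format=zip :)
declare variable $zip external;
cma:apply-config($zip)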

3. JavaScript: cma.sjs --- JavaScript library for generating and applying configurations.

Please refer to https://docs.marklogic.com/js/cma for more details.

Function name Description
cma.applyConfig Apply a named configuration, overriding parameters and setting options.
cma.generateConfig Retrieve an individual resource, set of resources, or full cluster configuration; generate a configuration from scenarios.

For example, to create a REST server:

// Create a REST server.
'use strict';

var cma = require('/MarkLogic/cma.sjs');

var json = {
    "config": [{
        "forest": [{"forest-name": "mydb1-f1"}, {"forest-name": "mymodulesdb-f1"}],
        "database": [{
                "database-name": "myDb",
                "forest": ["mydb1-f1"]
            },
            {
                "database-name": "myModulesDb",
                // must match a forest name declared in the "forest" array above
                "forest": ["mymodulesdb-f1"]
            }
        ],
        "server": [{
            "server-name": "restapiServer",
            "server-type": "http",
            "group-name": "Default",
            "root": "/",
            "port": "8900",
            "url-rewriter": "/MarkLogic/rest-api/8000-rewriter.xml",
            "content-database": "myDb",
            "modules-database": "myModulesDb"
        }]
    }]
};
 
cma.applyConfig(json);
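
Because a configuration like this is plain JSON, it can be sanity-checked in source control before it is applied. The following Python sketch is not part of CMA: the helper name undeclared_forests is hypothetical, and it only checks that every forest a database refers to is also declared in the same configuration entry. A forest that already exists on the cluster would be reported too, so treat the output as a hint rather than an error.

```python
def undeclared_forests(config):
    """Return {database name: [forest names that are not declared in the same entry]}."""
    problems = {}
    for entry in config.get("config", []):
        declared = {forest["forest-name"] for forest in entry.get("forest", [])}
        for database in entry.get("database", []):
            missing = [
                name for name in database.get("forest", []) if name not in declared
            ]
            if missing:
                problems[database["database-name"]] = missing
    return problems


# Same shape as the configuration passed to cma.applyConfig, with a deliberate typo
# in the forest name attached to myModulesDb.
config = {
    "config": [{
        "forest": [{"forest-name": "mydb1-f1"}, {"forest-name": "mymodulesdb-f1"}],
        "database": [
            {"database-name": "myDb", "forest": ["mydb1-f1"]},
            {"database-name": "myModulesDb", "forest": ["mymodulesdb1-f1"]},
        ],
    }]
}

problems = undeclared_forests(config)
print(problems)
```

A check of this kind fits naturally into the per-database scripts recommended in the takeaways, since each script can be validated on its own.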

Takeaways

  • Use source controlled scripts exercising MarkLogic's Configuration Management API to reliably and consistently manage configuration changes across your environments.
  • Avoid a single monolithic configuration script. The best practice here is to modularize your configuration changes in the form of multiple scripts. If you have multiple databases and configurations, one recommendation would be to maintain the scripts per database. It is easier to maintain and also apply these changes to the instance.
  • Depending on your configuration requirements, you may find your script needing to make calls to both the higher level Configuration Management API as well as the lower level REST Management API. If you have performance issues with the CMA/RMA REST APIs, you can even call the JavaScript/XQuery APIs directly from your code.

Introduction

HAProxy (http://www.haproxy.org/) is a free, fast and reliable solution offering high availability, load balancing and proxying for TCP and HTTP-based applications.

MarkLogic 8 (8.0-8 and above) and MarkLogic 9 (9.0-4 and above) include improvements to allow you to use HAProxy to connect to MarkLogic Server.

MarkLogic Server supports balancing application requests using both the HAProxy TCP and HTTP balancing modes depending on the transaction mode being used by the MarkLogic application as detailed below:

  1. For single-statement auto-commit transactions running on MarkLogic version 8.0-7 and earlier or MarkLogic version 9.0-3 and earlier, only TCP mode balancing is supported. This is due to the fact that the SessionID cookie and transaction id (txid) are only generated as part of a multi-statement transaction.
  2. For multi-statement transactions, or for single-statement auto-commit transactions running on MarkLogic version 8.0-8 and later or MarkLogic version 9.0-4 and later, both TCP and HTTP balancing modes can be configured.

The Understanding Transactions in MarkLogic Server and Single vs. Multi-statement Transactions in the MarkLogic documentation should be referenced to determine whether your application is using single or multi-statement transactions.

Note: Attempting to use HAProxy in HTTP mode with single-statement transactions prior to MarkLogic versions 8.0-8 or 9.0-4 can lead to unpredictable results.
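
The rules above can be summarized as a small decision function. This is a Python sketch only: the function names are hypothetical, version strings are assumed to follow the "major.minor-maintenance" pattern used in this article, and releases newer than MarkLogic 9 are assumed to fall under "9.0-4 and later".

```python
import re


def parse_version(version):
    """Turn a version string such as '9.0-4' or '8.0-7.1' into a tuple of integers."""
    match = re.fullmatch(r"(\d+)\.(\d+)-(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized MarkLogic version: {version!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def supported_modes(version, multi_statement):
    """Return the HAProxy balancing modes that apply according to the rules above."""
    if multi_statement:
        return ["tcp", "http"]
    release = parse_version(version)[:3]
    # Single-statement transactions need 8.0-8+ on the 8.x line, 9.0-4+ otherwise.
    minimum = (8, 0, 8) if release[0] <= 8 else (9, 0, 4)
    return ["tcp", "http"] if release >= minimum else ["tcp"]


for version in ("8.0-7", "8.0-8", "9.0-3", "9.0-4"):
    print(version, supported_modes(version, multi_statement=False))
```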

Example configurations

The following example configurations detail only the parameters relevant to enabling load balancing of a MarkLogic application; for details of all parameters that can be used, please refer to the HAProxy documentation.

TCP mode balancing

The following configuration is an example of how to balance requests to a 3-node MarkLogic application using the "roundrobin" balance algorithm, with stickiness based on the source IP address. The health of each node is checked by a TCP probe to the application server every 1 second.

backend app
mode tcp
balance roundrobin
stick-table type ip size 200k expire 30m
stick on src
default-server inter 1s
server app1 ml-node-1:8012 check id 1
server app2 ml-node-2:8012 check id 2
server app3 ml-node-3:8012 check id 3

HTTP mode balancing

The following configuration is an example of how to balance requests to a 3-node MarkLogic application using the "roundrobin" balance algorithm based on the "SessionID" cookie inserted by the MarkLogic server.

The health of each node is checked by issuing an HTTP GET request to the MarkLogic health check port and checking for the "Healthy" response.

backend app
mode http
balance roundrobin
cookie SessionID prefix nocache
option httpchk GET / HTTP/1.1\r\nHost:\ monitoring\r\nConnection:\ close
http-check expect string Healthy
server app1 ml-node-1:8012 check port 7997 cookie app1
server app2 ml-node-2:8012 check port 7997 cookie app2
server app3 ml-node-3:8012 check port 7997 cookie app3

Summary

This article describes how to create a MarkLogic Support Request (commonly known as a Support Dump). To create a Support Request:

Creating a Support Request

1. Use a web browser to log in to the server's Admin interface, which is typically found on the server at port 8001, e.g., http://localhost:8001

2. Click Configure in the navigation frame

3. Click the Support Tab

4. Choose the options based on the level of detail you want to provide.

For MarkLogic Server versions earlier than 9, follow these steps:

1. In general, MarkLogic Support recommends choosing "cluster", "status and logs" and "file".
2. Make sure to zip the file when attaching to a support case.
3. If the resulting zip file is larger than 10MB, please upload the file through the portal using the HTTPS based upload (preferred) or to the MarkLogic FTP server, as you will be unable to attach files greater than 10MB to the support case.

For MarkLogic Server version 9 and onwards:

1. MarkLogic Support recommends choosing "cluster", "status and system logs", "latest" and "upload to MarkLogic Secure Storage".

In MarkLogic 9 ErrorLog files have been split to separate PII (Personally Identifiable Information) from system information:

- "System ErrorLog" which contains only MarkLogic Server related information

- "Application ErrorLog" which contains all application specific logging information including PII

2. Send the file to MarkLogic Support

It is recommended to select "upload to MarkLogic Secure Storage", which will upload all collected data automatically. It only requires that the MarkLogic Server can reach telemetry.services.marklogic.com over SSL. After a successful upload, provide us with the reported Cluster-ID in the support ticket.

For the other options please follow instructions for earlier MarkLogic versions above.

Summary

MarkLogic Server clusters are built on a distributed, shared-nothing architecture. Typical query loads will maximize resource utilization only when database content is evenly distributed across D-node hosts in a cluster. That is, optimal performance will occur when the amount of concurrent work required of each node in a cluster is equivalent. Having your data balanced across the forests in your cluster is necessary in order to achieve optimal performance.

If all of the forests in a multi-forest database are present from the time when the database was created, the forests will likely each have approximately the same number of documents. If forests were added later on, the newer forests will tend to have fewer documents. In cases like this, rebalancing the forests may be in order.

Default Document Forest Assignment (Legacy assignment policy)

Versions before MarkLogic 7 used a default document forest assignment policy, now referred to as the legacy policy. In MarkLogic 7, this is the default assignment policy when the rebalancer enable setting for a database is set to false.

In legacy assignment policy, in a multi-forest database, a new document gets assigned to a forest based on the URI hash.  For practical purposes, the default forest assignment is random. In most cases, the default behavior is sufficient to guarantee evenly distributed content.  
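
To see why hash-based assignment spreads documents evenly, and why forests added later end up with fewer documents, consider the following sketch in Python. It is an illustration only: SHA-256 stands in for whatever hash the server uses internally, the forest names are hypothetical, and existing documents are assumed to stay in the forest where they were first inserted.

```python
import hashlib
from collections import Counter


def assign_forest(uri, forests):
    """Pick a forest from a stable hash of the URI (a stand-in for the server's own hash)."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return forests[int.from_bytes(digest[:8], "big") % len(forests)]


# Load 30,000 documents into a database that has three forests from the start.
forests = ["f1", "f2", "f3"]
counts = Counter(assign_forest(f"/{i}.xml", forests) for i in range(30000))
initial_counts = dict(counts)

# Add a fourth forest later, then load 10,000 more documents.
# Existing documents are not moved; only new URIs can land in the new forest.
forests.append("f4")
counts.update(assign_forest(f"/{i}.xml", forests) for i in range(30000, 40000))

print("before adding f4:", initial_counts)
print("after adding f4: ", dict(counts))
```

The three original forests each hold roughly a third of the first batch, while the forest added later only receives about a quarter of the second batch, which is the imbalance that rebalancing is meant to correct.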

There are API functions that allow you to determine where a document resides or will reside:

  • The xdmp:document-assign() function can be used to determine the forest to which a document URI will be assigned. 
  • For existing documents, document updates will occur in the same forest as the existing document. The xdmp:document-forest() function can be used to determine which forest the document is assigned to. 

In-Forest Placement

'In-forest placement' is a technique that is used to override the default document forest assignments.  

Both xdmp:document-insert() and xdmp:document-load() allow you to specify the forest in which the document will be inserted.

mlcp has a -fastload option which will insert content directly.  See Time vs. Correctness: Understanding -fastload Tradeoffs to understand the tradeoffs when using this option.

Some common open source document loading tools also support in-forest placement. RecordLoader (http://developer.marklogic.com/code/recordloader) and XQsync (http://developer.marklogic.com/code/xqsync) support in-forest placement with the OUTPUT_FORESTS property setting.

Rebalancing

MarkLogic 7 introduced database rebalancing using a database rebalancer configured with one of several assignment policies.

A database rebalancer consists of two parts:

  1. an assignment policy for data insert and rebalancing, and
  2. a rebalancer for data movement.

The rebalancer can be configured with one of several assignment policies, which define what is considered 'balanced' for a database. The rebalancer runs on each forest and consults the database's assignment policy to determine which documents do not 'belong to' this forest and then pushes them to the correct forests. You can read more about database rebalancing at http://docs.marklogic.com/guide/admin/database-rebalancing

For a brand new database, the rebalancer is enabled by default and the assignment policy is bucket.  For older versions (before ML 7), by default, the assignment was done using legacy policy.

(Note that rebalancing forests may result in forests that contain many deleted fragments. To recover disk space, you may wish to force some forests to merge.)

Before Rebalancing, Consider This …

Before embarking on a process to rebalance the documents in your database, consider that rebalancing is generally slower than clearing the database and reloading. The reason is that rebalancing involves updating documents, and updates are more expensive than inserts. Rebalancing the forests may not be the best solution. If you have the luxury of clearing the database and reloading everything, do it. However, if the database must be available throughout the rebalancing process, then using the rebalancer may be appropriate.

Introduction

Prior to MarkLogic 9.0-7

For optimum efficiency, indexing information was not replicated over the network between the Master and Replica databases; it was instead regenerated by the Replica database. If you want the option to switch over to the Replica database after a disaster and have queries that perform as expected, your index settings must be identical on both the Master and Replica clusters. View Documentation on Master and Replica Database Index Settings.

If you need to update index settings after configuring Database Replication, it is recommended that you update the settings on the Replica database before updating those on the Master database; this is because changes to the index settings on the Replica database only affect newly replicated documents and will not trigger reindexing on existing documents.

MarkLogic 9.0-7 and later

In recent versions of MarkLogic, we have further improved synchronization between Master and Replica, and index data is now automatically replicated from the Master. There is also a function, xdmp:forest-validate-replica-index, that validates and optionally repairs replica index data to match the master index data.

If you need to update index settings after configuring Database Replication, make sure they are updated on both the Master and Replica databases.

Note - For both of the version ranges above, index configuration is not automatically replicated. It is still the responsibility of the database administrator to ensure that the replica index configuration matches the master index configuration.

Changes to the index settings on the Master database will trigger reindexing, after which the reindexed documents will be replicated to the Replica.

When a Database Replication configuration is removed for the Replica database (such as after a disaster), the Replica database will reindex, if necessary.

The MarkLogic Server Database Replication Guide contains additional information if you would like to learn more about this feature in the product.

Negative impact of reindexing replica cluster

If a replica database is reindexed after being decoupled from the master and then re-coupled at a later time:

  1. When the databases are reconnected, every reindexed document in the database will be replicated during the bulk synchronization process. Internally, MarkLogic Server generates a "document id" for each document at reindex time. The "document id" is random (i.e. not deterministic); when database replication bulk synchronization occurs, the manifests from master and replica will have different "document ids" for the same document, resulting in the document being replicated.
  2. Because of #1, during bulk replication, there will be transient duplicate URIs on the replica cluster, as the document deletes will not be coordinated with the document inserts. If the replica database is used for (read only) queries, then this could result in query errors. Once the databases complete the asynchronous bulk replication process, the replica database will be in a good state again.
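
The first point can be pictured as a comparison of two manifests. The following Python sketch is a deliberate simplification and not the server's implementation: a manifest is modeled as a map from URI to a random 64-bit id, and a replica that was never reindexed is assumed to carry the same ids as the master.

```python
import random


def build_manifest(uris, rng):
    """Give each URI a random 64-bit id, standing in for the id generated at reindex time."""
    return {uri: rng.getrandbits(64) for uri in uris}


def documents_to_replicate(master, replica):
    """URIs whose id on the replica is absent or differs from the id on the master."""
    return sorted(uri for uri, doc_id in master.items() if replica.get(uri) != doc_id)


uris = [f"/{i}.xml" for i in range(1, 101)]
master = build_manifest(uris, random.Random(1))

in_sync_replica = dict(master)                               # never reindexed on its own
reindexed_replica = build_manifest(uris, random.Random(2))   # ids regenerated by a reindex

print(len(documents_to_replicate(master, in_sync_replica)))
print(len(documents_to_replicate(master, reindexed_replica)))
```

Although both replicas hold exactly the same URIs, the reindexed one no longer shares any ids with the master, so every document is selected for replication again.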

Additional Recommendations

  1. If a new index is added to the database replica cluster, add a verification step to make sure the indexes are available on the replica cluster once reindex is complete on the master. This will allow you to avoid any index usage issues after DR failover - saving time to bring the replica cluster up to a usable state. 
  2. Document and practice DR failover to optimize the procedure. 

Summary

The MarkLogic DBA is responsible for keeping the index settings the same across both the Master and Replica hosts. We recommend using the Admin API to script configuration settings for your databases and to store these configuration scripts in a version control system such as Subversion or Git.

Questions

Q: If the replica was to be used for query resolution and required the index to be applied on the Master, although the data had replicated to the Replica, would the query be able to use the new indexed data with the config option not enabled?

A: Yes - here's a simple test to demonstrate this working:

  1. Create some sample data on the master database
    for $i in (1 to 100) 
    return xdmp:document-insert(fn:concat("/",$i,".xml"), element index {$i})
  2. Confirm that replication is taking place across both databases.
  3. On the replica host, create an integer element range index on the index element.
  4. Confirm that a cts:element-values query on the replica returns an XDMP-ELEMRIDXNOTFOUND error:
    cts:element-values(xs:QName("index"))
  5. Create the element range index on the master
  6. Confirm that both environments can now run the query successfully:
    1 2 3 [...] 100

Summary

This article will provide steps to debug applications using the Alerting API that are not triggering an alert.

Details

1) Check that all required components are present in the database where alerting is set up: config, actions, rules. Run the attached script 'getalertconfigs.xqy' through the Query Console and review the output.

2) As documented in our Search Developer's Guide, test the alert manually with alert:invoke-matching-actions().

Example:

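xquery version "1.0-ml";
import module namespace alert = "http://marklogic.com/xdmp/alert" at "/MarkLogic/alert.xqy";
alert:invoke-matching-actions("my-alert-config-uri", 
      <doc>hello world</doc>, <options/>)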

3) Use the rule's query to test against the database to check that the expected documents are returned by the query.

Take the query text from the rule and run it through Query Console using a cts:search() on the database.  This will confirm whether the expected documents are a positive match.  If the documents are returned and no alert is triggered, then further debugging will be needed on the configuration or the query may need to be modified.

Summary

Terraform from HashiCorp is a deployment tool that many organizations use to manage their infrastructure as code. It is platform agnostic, allowing for the deployment and configuration of on-site physical infrastructure, as well as cloud infrastructure such as AWS, Azure, vSphere and more.

Terraform uses the HashiCorp Configuration Language (HCL) to allow for concise descriptions of infrastructure. HCL is a JSON-compatible language, and was designed to be both human and machine friendly.

This powerful tool can be used to deploy a MarkLogic Cluster to AWS using the MarkLogic CloudFormation Template. The MarkLogic CloudFormation Template is the preferred method recommended by MarkLogic for building out MarkLogic clusters within AWS.

Setting Up Terraform

For the purpose of this example, I will assume that you have already installed Terraform, the AWS CLI and you have configured the credentials. You will also need to have a working directory that has been initialized using terraform init.

Terraform Providers

Terraform uses Providers to provide access to different resources. The Provider is responsible for understanding API interactions and exposing resources. The AWS Provider is used to provide access to AWS resources.

Terraform Resources

Resources are the most important part of the Terraform language. Resource blocks describe one or more infrastructure objects, like compute instances and virtual networks.

The aws_cloudformation_stack resource allows Terraform to create a stack from a CloudFormation template.

Choosing a Template

MarkLogic provides two templates for creating a managed cluster in AWS.

  • MarkLogic cluster in new VPC
  • MarkLogic cluster in an existing VPC

I've chosen to deploy my cluster to an existing VPC. When deploying to an existing VPC, you will need to gather the VPC ID, as well as the Subnet IDs for the public and private subnets.

Defining Variables

The MarkLogic CF Template takes a number of input variables, including the region, availability zones, instance types, EC2 keys, encryption keys, licenses and more. We have to define our variables so they can be used as part of the resource.

Variables in HCL can be declared in a separate file, which allows for deployment flexibility. For instance, you can create a Development resource and a Production resource, but using different variable files.

Here is a snippet from our variables file:

variable "cloudform_resource_name" {
  type    = string
  default = "Dev-Cluster-CF"
}
variable "stack_name" {
  type    = string
  default = "Dev-Cluster"
}
variable "ml_version" {
  type    = string
  default = "10.0-4"
}
variable "availability_zone_names" {
  type    = list(string)
  default = ["us-east-1a","us-east-1b","us-east-1c"]
}
...

In the snippet above, you'll notice that we've defined the availability_zone_names as a list. The MarkLogic CloudFormation template won't take a list as an input, so later we will join the list items into a string for the template to use.

This also applies to any of the other lists defined in the variable files.
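
To make the effect of the join concrete, here is the same operation expressed in Python, using the default availability zones from the variables snippet above. It is only an illustration of the string the CloudFormation template ends up receiving; Terraform performs this step itself through its join function.

```python
# Default value of availability_zone_names from the variables file.
availability_zone_names = ["us-east-1a", "us-east-1b", "us-east-1c"]

# Equivalent of joining the list with "," as the separator.
az_parameter = ",".join(availability_zone_names)

print(az_parameter)  # us-east-1a,us-east-1b,us-east-1c
```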

Using the CloudFormation Resource

So now we need to define the resource in HCL that will allow us to deploy a CloudFormation template to create a new stack.

The first thing we need to do is tell Terraform which provider we will be using, defining some default options:

    provider "aws" {
      profile = "default"
      # Static keys are not needed when a configured CLI profile is used
      #access_key = var.access_key
      #secret_key = var.secret_key
      region = var.aws_region
    }

Next, we need to define the `aws_cloudformation_stack` configuration options, setting the variables that will be passed in when the stack is created:

    resource "aws_cloudformation_stack" "marklogic" {
      name         = var.cloudform_resource_name
      capabilities = ["CAPABILITY_IAM"]

      parameters = {
        IAMRole             = var.iam_role
        AdminUser           = var.ml_admin_user
        AdminPass           = var.ml_admin_password
        Licensee            = "My User - Development"
        LicenseKey          = "B581-REST-OF-LICENSE-KEY"
        VolumeSize          = var.volume_size
        VolumeType          = var.volume_type
        VolumeEncryption    = var.volume_encryption
        VolumeEncryptionKey = var.volume_encryption_key
        InstanceType        = var.instance_type
        SpotPrice           = var.spot_price
        KeyName             = var.secret_key
        # The template expects comma-separated strings, so the lists are joined here
        AZ                  = join(",", var.availability_zone_names)
        LogSNS              = var.log_sns
        NumberOfZones       = var.number_of_zones
        NodesPerZone        = var.nodes_per_zone
        VPC                 = var.vpc_id
        PublicSubnets       = join(",", var.public_subnets)
        PrivateSubnets      = join(",", var.private_subnets)
      }

      template_url = "${var.template_base_url}${var.ml_version}/${var.template_file_name}"
    }
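
The template_url value is built purely by string interpolation, so it is worth checking what it expands to before running Terraform. The Python sketch below mirrors that interpolation. The base URL and template file name are assumptions borrowed from the Ansible variables shown later in this document, and the version is the ml_version default from the variables snippet; substitute your own values.

```python
def template_url(base_url, ml_version, file_name):
    """Mirror the HCL interpolation: base URL, then version, then '/' and the file name."""
    return f"{base_url}{ml_version}/{file_name}"


# Assumed values; note that the base URL must already end with a slash.
url = template_url(
    "https://marklogic-template-releases.s3.amazonaws.com/",
    "10.0-4",
    "mlcluster.template",
)

print(url)
```

Because nothing is inserted between the base URL and the version, a base URL without the trailing slash would produce a malformed address.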

Deploying the Cluster

Now that we have defined our variables and our resources, it's time for the actual deployment.

$> terraform apply

This will show us the work that Terraform is going to attempt to perform, along with the settings that have been defined so far.

Once we confirm that things look correct, we can go ahead and apply the resource.

Now we can check the AWS Console to see our stack.

We can also use the ELB to log in to the Admin UI.

Wrapping Up

We have now deployed a 3-node cluster to an existing VPC using Terraform. The cluster is now ready to have our Data Hub, or another application, installed.

Deploying MarkLogic in AWS with Ansible

Summary

Ansible, owned by Red Hat, is an open source provisioning, configuration and application deployment tool that many organizations use to manage their infrastructure as code. Unlike options such as Chef and Puppet, it is agentless, utilizing SSH to communicate between servers. Ansible also does not need a central host for orchestration; it can run from nearly any server, desktop or laptop. It supports many different platforms and services, allowing for the deployment and configuration of on-site physical infrastructure, as well as cloud and virtual infrastructure such as AWS, Azure, vSphere, and more.

Ansible uses YAML as its configuration management language, making it easier to read than other formats. Ansible also uses Jinja2 for templating to enable dynamic expressions and access to variables.

Ansible is a flexible tool that can be used to deploy a MarkLogic Cluster to AWS using the MarkLogic CloudFormation Template. The MarkLogic CloudFormation Template is the preferred method recommended by MarkLogic for building out MarkLogic clusters within AWS.

Setting Up Ansible

For the purpose of this example, I will assume that you have already installed Ansible, the AWS CLI, and the necessary python packages needed for Ansible to talk to AWS. If you need some help getting started, Free Code Camp has a good tutorial on setting up Ansible with AWS.

Inventory Files

Ansible uses Inventory files to help determine which servers to perform work on. They can also be used to customize settings for individual servers or groups of servers. For our example, we have set up our local system with all the prerequisites, so we need to tell Ansible how to treat the local connections. For this demonstration, here is my inventory, which I've named hosts:

[local]
localhost              ansible_connection=local

Ansible Modules

Ansible modules are discrete units of code that are executed on a target. The target can be the local system, or a remote node. The modules can be executed from the command line, as an ad-hoc command, or as part of a playbook.

Ansible Playbooks

Playbooks are Ansible's configuration, deployment, and orchestration language. Playbooks extend the power of Ansible and its modules from basic configuration or management all the way to complex, multi-tier infrastructure deployments.

Choosing a Template

MarkLogic provides two templates for creating a managed cluster in AWS.

  1. MarkLogic cluster in new VPC
  2. MarkLogic cluster in an existing VPC

I've chosen to deploy my cluster to an existing VPC. When deploying to an existing VPC, you will need to gather the VPC ID, as well as the Subnet IDs for the public and private subnets.

Defining Variables

The MarkLogic CF Template takes a number of input variables, including the region, availability zones, instance types, EC2 keys, encryption keys, licenses and more. We have to define our variables so they can be used as part of the resource.

Variables in Ansible can be declared in a separate file, which allows for deployment flexibility.

Here is a snippet from our variables file:

# vars file for marklogic template and version
ml_version: '10.0-latest'
template_file_name: 'mlcluster.template'
template_base_url: 'https://marklogic-template-releases.s3.amazonaws.com/'

# CF Template Deployment Variables
aws_region: 'us-east-1'
stack_name: 'Dev-Cluster-An3'
IAMRole: 'MarkLogic'
AdminUser: 'admin'
...

Using the CloudFormation Module

Now we need to create our playbook and choose a module that can deploy a CloudFormation template. The cloudformation module allows us to create a CloudFormation stack.

Next, we need to define the cloudformation configuration options, setting the variables that will be passed in when the stack is created.

# Use a template from a URL
- name: Ansible Test
  hosts: local

  vars_files:
    - ml-cluster-vars.yml

  tasks:
    - cloudformation:
        stack_name: "{{ stack_name }}"
        state: "present"
        region: "{{ aws_region }}"
        capabilities: "CAPABILITY_IAM"
        disable_rollback: true
        template_url: "{{ template_base_url+ml_version+'/'+ template_file_name }}"
      args:
        template_parameters:
          IAMRole: "{{ IAMRole }}"
          AdminUser: "{{ AdminUser }}"
          AdminPass: "{{ AdminPass }}"
          Licensee: "{{ Licensee }}"
          LicenseKey: "{{ LicenseKey }}"
          KeyName: "{{ KeyName }}"
          VolumeSize: "{{ VolumeSize }}"
          VolumeType: "{{ VolumeType }}"
          VolumeEncryption: "{{ VolumeEncryption }}"
          VolumeEncryptionKey: "{{ VolumeEncryptionKey }}"
          InstanceType: "{{ InstanceType }}"
          SpotPrice: "{{ SpotPrice }}"
          AZ: "{{ AZ | join(', ') }}"
          LogSNS: "{{ LogSNS }}"
          NumberOfZones: "{{ NumberOfZones }}"
          NodesPerZone: "{{ NodesPerZone }}"
          VPC: "{{ VPC }}"
          PrivateSubnets: "{{ PrivateSubnets | join(', ') }}"
          PublicSubnets: "{{ PublicSubnets | join(', ') }}"
        tags:
          Stack: "ansible-test"
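To make the Jinja2 expressions in the playbook concrete, the following Python sketch mirrors what the template_url expression and the join filter evaluate to. The values of template_base_url, ml_version, and template_file_name come from the variables file above; the subnet IDs are made-up placeholders, since the real ones depend on your VPC.

```python
# Values taken from the variables file shown earlier.
template_base_url = "https://marklogic-template-releases.s3.amazonaws.com/"
ml_version = "10.0-latest"
template_file_name = "mlcluster.template"

# Equivalent of: {{ template_base_url+ml_version+'/'+ template_file_name }}
template_url = template_base_url + ml_version + "/" + template_file_name

# Hypothetical subnet IDs, used only to illustrate the join filter.
private_subnets = ["subnet-aaa111", "subnet-bbb222", "subnet-ccc333"]

# Equivalent of: {{ PrivateSubnets | join(', ') }}
private_subnets_param = ", ".join(private_subnets)

print(template_url)
print(private_subnets_param)
```

This is why list-valued variables such as AZ, PrivateSubnets, and PublicSubnets are declared as YAML lists in the variables file and flattened into a single comma-separated string only when they are handed to the template.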

Deploying the cluster

Now that we have defined our variables and created our playbook, it's time for the actual deployment.

ansible-playbook -i hosts ml-cluster-playbook.yml -vvv

The -i option allows us to reference the inventory file we created. As the playbook runs, it reports each task as it starts and finishes.

PLAY [Ansible Test] ************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************
ok: [localhost]

TASK [cloudformation] **********************************************************************************************************
changed: [localhost]

When the playbook finishes running, it will print out a recap which shows the overall results of the play.

PLAY RECAP *********************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

This recap tells us that 2 tasks ran successfully, 1 of them made a change, and none failed, which is our sign that things worked.
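If the playbook is run from another script, the recap line can also be checked programmatically. The sketch below parses the host line shown above into counters; the helper name parse_recap is hypothetical, and it assumes the recap keeps the key=value layout seen in this output.

```python
import re


def parse_recap(line):
    """Return (host, counters) parsed from one PLAY RECAP host line."""
    host, _, rest = line.partition(":")
    counters = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", rest)}
    return host.strip(), counters


line = ("localhost                  : ok=2    changed=1    unreachable=0    "
        "failed=0    skipped=0    rescued=0    ignored=0")
host, counters = parse_recap(line)

# Treat the run as successful only if nothing failed and the host was reachable.
succeeded = counters["failed"] == 0 and counters["unreachable"] == 0
print(host, counters, succeeded)
```

In practice, checking the exit status of ansible-playbook is usually simpler, since it is non-zero when a host fails; parsing the recap is mainly useful when you want the individual counts.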

If we want to see more information as the playbook runs, we can add one of the verbose flags (-v or -vvv) to show more detail about the parameters the playbook is running with, and the results.

Now we can check the AWS Console to see our stack:

And we can also use the ELB to log in to the Admin UI.

Wrapping Up

We have now deployed a three-node cluster to an existing VPC using Ansible. The cluster is ready to have our Data Hub, or another application, installed. We can now use the git module to get our application code, and deploy that code using ml-gradle.

SUMMARY

This article will help MarkLogic Administrators to monitor the health of their MarkLogic cluster. By studying the attached scripts, you will learn how to find out which hosts are down and which forests have failed over, enabling you to take the necessary recovery actions.

Initial Setup

On a separate Linux host (not a member of the cluster), download the file attachments from this article, making sure that they all reside within the same directory.

Here is a general description of each file:

cluster-name.conf - Example configuration file used by script. Configures information for monitoring one ML cluster. 

ml-ck-for-life.sh - A very simple, low-load check that all the nodes of a cluster are up and running.

ml-ck-for-health.sh - A more detailed check for essential cluster functionality with alerting (paging and/or emails to DBAs) if warranted. This script relies on at least one external XQuery file (mon-report-failed-over-forests.xqy) and makes use of the REST MGMT API as well as REST XQuery requests.

mon-report-failed-over-forests.xqy - External XQuery file used by ml-ck-for-health.sh

 

Preparing the CONF File for Use on Your Cluster

Before running the scripts, the cluster-name.conf needs to be customized for your specific cluster. Start by changing the file name to match the name of your cluster, e.g.,

$ mv cluster-name.conf some-other-name.conf

Where "some-other-name" is the actual name of the cluster, or of the application that is hosted on that cluster.

Next, you will need to customize some of the variables inside the CONF file itself. Here are the contents of the cluster-name.conf file, as downloaded:

CLUSTER_NAME="CLUSTER-NAME"
CLUSTER_NODES=( node1.my-company.com node2.my-company.com node3.my-company.com )
# MarkLogic Credentials for the REST Management port - 8002
USER_PW_MGMT=rest-manager-user:re-manager-password
# MarkLogic Credentials for the XQuery eval port - 8000
USER_PW_XQ=user-name:user-password
UNIX_USER=unix-user-name
PAGE_ADDRESSES=ml.alert.page@my-company.com
MAIL_ADDRESSES=ml.alert.mail@my-company.com

---------  end of listing ---------

For CLUSTER_NAME, provide the cluster-name listed in the cluster's /var/log/MarkLogic/clusters.xml file.

For CLUSTER_NODES, write in the host-names for each node in your cluster.

For USER_PW_MGMT, provide the user-name and password for the REST MANAGEMENT user, in the format name:password.

For USER_PW_XQ, provide the user-name and password for the user who will execute the XQuery scripts, in the format name:password.

The UNIX_USER is a local Unix username with the correct rwx access rights for this directory.

The PAGE_ADDRESSES & MAIL_ADDRESSES are the alert email addresses that will be notified whenever there is a failover event.

Periodicity

The script ml-ck-for-health.sh was created with the idea it would be run repeatedly at a certain interval to keep tabs on system health. For example, it can be configured to be invoked with a cron job. A frequency of 5 to 120 minutes is a good candidate range. Ten minutes is a good time if you would like to be woken up (on average) within 5 minutes of a failover event.
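The relationship between the check interval and how quickly a failover is noticed can be sketched as follows. This assumes a failover is equally likely at any moment between two runs and ignores the time the script itself takes and any alert delivery delay; the function name is hypothetical.

```python
def detection_delay_minutes(interval_minutes):
    """Return (average, worst_case) delay before a periodic check notices an event."""
    return interval_minutes / 2, interval_minutes


for interval in (5, 10, 120):
    average, worst = detection_delay_minutes(interval)
    print(f"every {interval} min: average {average} min, worst case {worst} min")
```

With a 10-minute interval, the average delay is 5 minutes and the worst case is the full 10 minutes.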

Setting up SSH Passwordless Login

Section (6), FOREST STATUS CHANGE, of the monitoring script ml-ck-for-health.sh requires ssh access to the cluster hosts, because it greps through the MarkLogic Server ErrorLogs. To enable this part of the script to run without prompting the user, "ssh passwordless login" should be set up between the monitoring host and all the cluster hosts. There are many examples of how to do this on the internet (for example, http://www.tecmint.com/ssh-passwordless-login-using-ssh-keygen-in-5-easy-steps/). Alternatively, this monitoring section can be commented out.

Also regarding section (6), the “grep” command is set up to grep the latest 10 minutes from the ErrorLog. If this script is configured to run less often than every 10 minutes, the “grep” command line should be adapted to cover the desired period between script runs.

Example Usage

You are now ready to execute the failover monitoring scripts! Here is how you would execute them:


$ ./ml-ck-for-health.sh some-other-name.conf MY-CLUSTER-NAME

$ ./ml-ck-for-life.sh some-other-name.conf

[where "some-other-name" and MY-CLUSTER-NAME are your actual CONF and cluster-name, as described above]

Monitoring Multiple Clusters

Given a monitoring machine with a directory of cluster configuration files in the style of cluster-name.conf, those files can be iterated through to monitor a suite of clusters from a single machine. It should be fairly easy to build a custom shell script that iterates through the various cluster CONF files.
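The same loop can also be sketched in Python. This is a minimal sketch, not part of the attached scripts: the function names are hypothetical, it assumes every cluster has its own *.conf file in one directory next to ml-ck-for-life.sh, and it assumes the script's exit status is meaningful, which you should verify against the script itself.

```python
import subprocess
from pathlib import Path


def build_commands(conf_dir, script="./ml-ck-for-life.sh"):
    """Return one command line per *.conf file found in conf_dir, sorted by name."""
    return [[script, str(conf)] for conf in sorted(Path(conf_dir).glob("*.conf"))]


def run_all(conf_dir):
    """Run the check for every cluster and return {conf file name: exit status}."""
    results = {}
    for command in build_commands(conf_dir):
        # check=False: keep going even if one cluster's check reports a problem.
        completed = subprocess.run(command, check=False)
        results[Path(command[1]).name] = completed.returncode
    return results
```

Calling run_all(".") from the directory that holds the scripts and CONF files runs the life check once per cluster. The ml-ck-for-health.sh script takes the cluster name as a second argument, so a variant for it would need to supply that value as well.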

Final Thoughts and Limitations

Please be aware that the ml-ck-for-health.sh script is only partially implemented. In particular, the Replication Lag and Replication Failure sections are left as exercises for the user.

This script is presented as a backup, lowest-common-denominator monitoring solution. For a more complete solution, you should explore other options, such as Splunk or Nagios.

Introduction

MarkLogic Server provides a variety of disaster recovery (DR) facilities, including full backup, incremental backup, and journal archiving, that can be combined with other MarkLogic features to create a complete disaster recovery strategy. This paper shows some examples of how these features can be combined. It is not comprehensive, nor does it reflect features offered only in the latest releases.

Details

This article covers the topic in stages. First, it gives a quick overview of the metrics businesses use to measure the quality of their Disaster Recovery strategies. Next, it gives an overview of how the features MarkLogic offers in various categories can be combined.

More?: High Availability and Disaster Recovery features, High Availability & Disaster Recovery datasheet, Scalability, Availability, and Failover Guide

Disaster Recovery Criteria

In order to configure MarkLogic Server to perform well in Disaster Recovery situations, we should first define what parameters we will use to measure each possible approach. For most situations, these four measures are used: 

Long Term Retention Policy (LTR): Long Term Retention Policy can be driven by any number of business, regulatory and other criteria. It is included here because MarkLogic's backup files are often a key part of an LTR strategy. 

Recovery Point Objective (RPO): The requirement for how up-to-date the database has to be post-recovery with respect to its state immediately before the incident that required recovery.

Recovery Time Objective (RTO): The requirement for the time elapsed between the incident and the recovery to the RPO.

Cost: The storage cost, the computational resource cost, and the operations cost of the overall deployment strategy.

Flexible Replication Features

Flexible replication can be used to support LTR objectives but is generally not useful for Disaster Recovery.

More? Flexible Replication Guide

Platform Support Features

Flash backup provides a way to leverage backup features of your deployment platform while maintaining transaction integrity. Platform specific solutions can often achieve RPO and RTO targets that would be impossible through other means.

More? Flash Backup

High Availability Features

Forest replication provides recovery from host failures.

More? Scalability, Availability, and Failover Guide

Disaster Recovery Features

Database Replication

Database Replication is the process of maintaining copies of forests on databases in multiple MarkLogic Server clusters.

More? Understanding Database Replication

Backups

Of all your backup options, full backups restore the quickest, but take the most time to back up and possibly the most storage space. Each full backup is a backup set in that it contains everything you need to restore to the point of the backup.

Full backups with journal archiving allow restores to a point after the backup, but the journal archive grows in an unbounded way with the number of transactions, and replaying the journals to get to your recovery point takes time proportional to the number of transactions in the journal archive, so over time, this becomes less efficient.

With full + incremental backups, a backup set is a full backup, plus the incremental backups taken after that full backup. Incremental backups are quick to back up, but take longer to restore, and over time the backup set gets larger and larger, so it may end up consuming more backup space than a full backup alone (depending on your backup retention policy).

Full + incremental backups with journal archiving have the same characteristics as incremental backups, except that you can roll forward from the most recent incremental. With this strategy, the journal archive doesn't grow in an unbounded way because the archive is purged when you take the next incremental backup. Note that if your RPO is between incremental backups, you must also enable a merge timestamp by setting the merge timestamp to a negative value (see below).

More?: Administrator’s Guide to Backing Up and Restoring a Database, How does "point-in-time" recovery work with Journal Archiving?

Forest Merge Configurations

Forest merges recover the disk space occupied by deleted documents. A negative merge timestamp delays that permanent deletion. If we want incremental backups to contain all the fragments that were deleted since the last incremental backup then we want to set the delay to a period greater than the incremental backup period. This requires more disk space for the incremental backups and also requires additional space in the live database, but provides the most flexibility.

Setting retain-until-backup on a given database (through the Admin UI or through an API call) has a similar effect by telling the server to keep the deleted fragments until a full backup or an incremental backup completes. Many clients choose to use both the negative merge timestamp and retain-until-backup options together.

More?: admin:database-set-merge-timestamp, admin:database-set-retain-until-backup
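As a worked example of choosing a negative merge timestamp that exceeds the incremental backup period, the sketch below converts a retention window into a timestamp offset. It assumes that MarkLogic timestamps count 10,000,000 units per second, which you should confirm in the documentation for your release; the backup schedule, safety margin, and function name are hypothetical.

```python
TICKS_PER_SECOND = 10_000_000  # assumption: timestamps count 100-nanosecond units


def negative_merge_timestamp(retention_minutes):
    """Return the negative merge timestamp that retains deleted fragments
    for at least retention_minutes."""
    return -retention_minutes * 60 * TICKS_PER_SECOND


# Hypothetical schedule: incremental backups every 24 hours, plus a 6-hour margin
# so the delay is comfortably longer than the incremental backup period.
incremental_backup_period_minutes = 24 * 60
safety_margin_minutes = 6 * 60

value = negative_merge_timestamp(incremental_backup_period_minutes + safety_margin_minutes)
print(value)
```

Under these assumptions, the resulting value is what would be supplied as the merge timestamp in the database's settings. A longer window keeps more deleted fragments on disk, which is the space cost described above.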

Conclusion

Planning to meet a Long Term Retention (LTR) policy, a Recovery Point Objective (RPO), a Recovery Time Objective (RTO), and a Cost goal is a key part of developing an overall MarkLogic deployment plan. MarkLogic offers a wealth of tools that can complement each other when they are properly coordinated. As is clear from this article, the choices are many, broad, and interrelated.

Introduction

This article talks about best practices for use of external proxies vs using rewriter rules in the Enhanced HTTP server.

Details

Whether to use external proxies or rewriter rules in the Enhanced HTTP application server is an application design tradeoff, not dissimilar to using a single HTTP application server with an XQuery rewriter or endpoint that can dynamically dispatch to different databases and modules (using eval-in). The Enhanced HTTP server does this type of dispatching much more efficiently, but the concept is similar, with the same pros and cons.

It is mostly an application and business management issue: by sharing the same port you share the same server configuration (authentication, server settings), and the "outside world" only sees one port, so configuring port-based security on firewalls, routers, or load balancers is more difficult.

[Deprecated]

MarkLogic no longer supplies entity enrichment libraries.

 

Introduction

Query Console is an interactive web-based query development tool for writing and executing ad-hoc queries in XQuery, Server-Side JavaScript, SQL and SPARQL. Query Console enables you to quickly test code snippets, debug problems, profile queries, and run administrative XQuery scripts.  Query Console uses workspaces to assist users with organizing queries.  A user can have multiple workspaces, and each workspace can have multiple queries.

Issue

In MarkLogic Server v9.0-11, v10.0-3 and earlier releases, users may experience delays, lag or latency between when a key is pressed on the keyboard and when it appears in the Query Console query window. This typically happens when there are a large number of queries in one of the user's workspaces.

Workaround

A workaround to improve performance is to reduce the number of queries in each workspace.  The same number of queries can be managed by increasing the number of workspaces and reducing the number of queries in each workspace.  We suggest keeping no more than 30 queries in a workspace to avoid these latency issues.  

The MarkLogic Development team is looking to improve the performance of Query Console, but at the time of this writing, this performance issue has not yet been resolved. 

Further Reading

Query Console User Guide

Introduction

A "fast data directory" is configurable for each forest, and can be set to a directory built on a fast file system, such as one using SSDs. Refer to Using a mix of SSD and spinning drives. If one is configured, MarkLogic Server will try to direct as many writes and seeks to the Fast Data Directory (FDD) as it can. As such, it will try to put as many on-disk stands as possible onto the FDD. Frequently updated documents tend to reside in the smaller stands and thus are more likely to reside on the FDD.

This article attempts to explain how you should account for the FDD when sizing disk space for your MarkLogic Server.

Details

Forest journals will be placed on the fast data directory. 

Each time an automatic merge is performed, MarkLogic Server will attempt to save the results onto the forest's fast data directory. If there is not sufficient space on the FDD, MarkLogic Server will use the forest's primary data directory. To preserve space for future small stands, MarkLogic Server is conservative in deciding whether to put the merge destination stands on the FDD, which means that even if there is enough available space, it may store the result in the forest's regular data directory. For more details, refer to the Fundamentals of Resource Consumption white paper.

It is also important to know when the Fast Data Directory is not used. Stands created by manually triggered merges are not stored on the fast data directory, but in the forest's primary data directory. Manual merges can be executed by calling the xdmp:merge function or from within the Admin UI. Forest-migrate and restoring backups also do not put stands in the fast data directory.

Conclusion

MarkLogic Server maintains some disk space in the FDD for checkpoints and journaling. However, since the Fast Data Directory is not used in some procedures, we should not count the size of the FDD when sizing the disk space needed for forest data.

Introduction

Attached to this article is an XQuery module, "appserver-status.xqy", which will generate a report on all requests currently "in-flight" across all application servers in your cluster.

Usage

Run this in Query Console (be sure to display results as HTML output). It will generate an HTML table showing all requests currently "in-flight" across all application servers in your cluster. For any transaction taking over 60 seconds, it provides extra detail to help understand and identify bottlenecks where specific modules (or tasks) may be having an adverse effect on the overall performance of the cluster.

The information generated by this module can be used in conjunction with any ticket opened with the support team where assistance is required to better understand and resolve performance issues relating to specific modules. This module could also be used in a situation where DBAs want to perform routine health checks on their cluster to find and identify slow running queries.

SUMMARY

This article discusses the MarkLogic group-level caches and Linux Huge Page configurations.

Group Caches

MarkLogic utilizes caches to increase retrieval performance of frequently-accessed objects. In particular, MarkLogic caches:

1. Expanded trees (Expanded Tree Cache)

On any groups that have app servers configured for your application (E-nodes), the Expanded Tree Cache is used to hold frequently-accessed XML documents. This cache is used as workspace for holding the result set of a particular query. MarkLogic recommends that most customers set the Expanded Tree Cache size to 1/8th of the physical memory on the server.

For groups that only manage forest content and do not have app servers configured (D-nodes), the Expanded Tree Cache is used only during the process of reindexing content. The cache size should be set to 1024 for D-nodes.

2. Compressed trees (Compressed Tree Cache)

On any groups that do not manage forest content (E-nodes), the Compressed Tree Cache is unused, and should be set to 128.

For groups that manage forest content (D-nodes), the Compressed Tree Cache is used to hold recently-accessed XML content in a compressed form. Its purpose is to minimize random disk reads for frequently-accessed content. MarkLogic recommends that most customers set the Compressed Tree Cache size to 1/16th of the physical memory on the server.

3. Lists (List Cache)

On any groups that do not manage forest content (E-nodes), the List Cache is unused, and should be set to 128.

For groups that manage forest content (D-nodes), the List Cache is used to hold recently-accessed index termlists. Its purpose is to minimize disk reads for frequently-accessed index terms, which are used for almost every MarkLogic query. MarkLogic recommends that most customers set the List Cache size to 1/8th of the physical memory on the server.
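The per-role guidance above can be summarized in a small calculation. This is a rule-of-thumb sketch only: the function name and role labels are hypothetical, and it assumes the fixed values 1024 and 128 are in megabytes, like the cache size settings themselves. The defaults MarkLogic actually picks for a given amount of RAM may differ slightly.

```python
def suggested_cache_sizes_mb(physical_memory_mb, role):
    """Return suggested group cache sizes (MB) for an E-node or D-node group."""
    if role == "e-node":
        return {
            "expanded_tree": physical_memory_mb // 8,  # 1/8th of physical memory
            "compressed_tree": 128,                    # unused on E-nodes
            "list": 128,                               # unused on E-nodes
        }
    if role == "d-node":
        return {
            "expanded_tree": 1024,                        # only used while reindexing
            "compressed_tree": physical_memory_mb // 16,  # 1/16th of physical memory
            "list": physical_memory_mb // 8,              # 1/8th of physical memory
        }
    raise ValueError(f"unknown role: {role!r}")


print(suggested_cache_sizes_mb(65536, "e-node"))
print(suggested_cache_sizes_mb(65536, "d-node"))
```

For a group whose hosts both serve requests and manage forests, neither profile applies as written, and the general 1/8, 1/16, 1/8 guidance for the three caches is the starting point.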

Rule of Thirds

By default, MarkLogic Server will allocate roughly one third of physical memory to the aforementioned caches, but the server will try to utilize as much memory as possible. The "Rule of Thirds" provides a conceptual explanation of how MarkLogic uses memory on a server:

  • One third of physical memory for MarkLogic group-level caches
  • One third of physical memory for in-memory content (range indexes and in-memory stands)
  • One third of physical memory for workspace, app server overhead, and Linux filesystem buffer

It is very common for Linux servers running MarkLogic to show high memory utilization. In fact, it is desirable to have MarkLogic utilize much of the memory on the server. However, the server should use very little swap, as that will have a severe negative impact on performance. Adhering to the Rule of Thirds should generally ensure that a server is properly sized, and any cases of memory-related performance degradations should be compared against this rule to identify improper sizing.

Huge Pages

MarkLogic server memory use falls into two major categories: large block and small block. Caches and in-memory stands look for large blocks of contiguous memory space, while range indexes, workspace memory, and the Linux filesystem buffer utilize smaller blocks of memory. In order to efficiently allocate the large blocks of memory for the group-level caches and in-memory stands, MarkLogic recommends the usage of Linux Huge Pages. Instead of the kernel allocating 4k pages of memory, huge pages are 2048k in size and can be quickly allocated for larger blocks of memory. At a minimum, MarkLogic recommends allocating enough huge pages to cover the group-level caches (roughly one third of physical memory). The upper end of recommended huge pages includes both the caches and in-memory stands.

The Installation Guide for All Platforms offers the following guidelines for setting up Linux Huge pages:

On Linux systems, MarkLogic recommends setting Linux Huge Pages to 3/8 the size of your physical memory, and should be configured to reserve the space at boot time. For details on setting up Huge Pages, see the following Red Hat Enterprise Linux (RHEL) KB:

How can I configure huge pages in Red Hat Enterprise Linux

If you have Huge Pages set up on a Linux system, your swap space on that machine should be equal to the size of your physical memory minus the size of your Huge Page (because Linux Huge Pages are not swapped), or 32GB, whichever is lower. For example, if you have 64 GB of physical memory, and if you have Huge Pages set to 24 GB, then physical memory minus Huge Pages is 40 GB (64 - 24); because that exceeds 32 GB, the "whichever is lower" rule gives a swap space of 32 GB.

At system startup on Linux machines, MarkLogic Server logs a message to the ErrorLog.txt file showing the Huge Page size, and the message indicates if the size is below the recommended level.
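The sizing guidelines above translate into a short calculation. The sketch below assumes 2048k huge pages, as described earlier, and applies the "whichever is lower" cap when sizing swap; the function name is hypothetical.

```python
HUGE_PAGE_MB = 2  # each huge page is 2048k


def huge_pages_plan(physical_memory_gb):
    """Return (huge page space in GB, number of huge pages, swap space in GB)."""
    huge_page_space_gb = physical_memory_gb * 3 / 8  # 3/8 of physical memory
    page_count = int(huge_page_space_gb * 1024 // HUGE_PAGE_MB)
    # Physical memory minus huge pages, or 32 GB, whichever is lower.
    swap_gb = min(physical_memory_gb - huge_page_space_gb, 32)
    return huge_page_space_gb, page_count, swap_gb


space_gb, pages, swap_gb = huge_pages_plan(64)
print(space_gb, pages, swap_gb)
```

For a 64 GB host this gives 24 GB of huge pages, which is 12288 pages of 2048k, and 32 GB of swap once the cap is applied. The page count is the number you would reserve at boot time, typically through the vm.nr_hugepages kernel setting.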

Further Reading

Linux Huge Pages and Transparent Huge Pages

MarkLogic default Group Level Cache and Huge Pages settings

The table below shows the default (and recommended) group level cache settings based on a few common RAM configurations for the 9.0-9.1 release of MarkLogic Server:

| Total RAM | List Cache | Compressed Tree Cache | Expanded Tree Cache | Triple Cache | Triple Value Cache | Default Huge Page Ranges |
|---|---|---|---|---|---|---|
| 8192 (8GB) | 1024 (1 partition) | 512 (1 partition) | 1024 (1 partition) | 512 (1 partition) | 1024 (2 partitions) | 1280 to 1994 |
| 16384 (16GB) | 2048 (1 partition) | 1024 (2 partitions) | 2048 (1 partition) | 1024 (2 partitions) | 2048 (2 partitions) | 2560 to 3616 |
| 24576 (24GB) | 3072 (1 partition) | 1536 (2 partitions) | 3072 (1 partition) | 1536 (2 partitions) | 3072 (4 partitions) | 3840 to 4896 |
| 32768 (32GB) | 4096 (2 partitions) | 2048 (3 partitions) | 4096 (2 partitions) | 2048 (3 partitions) | 4096 (6 partitions) | 5120 to 6176 |
| 49152 (48GB) | 6144 (2 partitions) | 3072 (4 partitions) | 6144 (2 partitions) | 3072 (4 partitions) | 6144 (8 partitions) | 7680 to 8736 |
| 65536 (64GB) | 8064 (3 partitions) | 4032 (6 partitions) | 8064 (3 partitions) | 4096 (6 partitions) | 8192 (11 partitions) | 10080 to 11136 |
| 98304 (96GB) | 12160 (4 partitions) | 6080 (8 partitions) | 12160 (4 partitions) | 6144 (8 partitions) | 12160 (16 partitions) | 15200 to 16256 |
| 131072 (128GB) | 16384 (6 partitions) | 8192 (11 partitions) | 16384 (6 partitions) | 8192 (11 partitions) | 16384 (22 partitions) | 20480 to 21020 |
| 147456 (144GB) | 18432 (6 partitions) | 9216 (12 partitions) | 18432 (6 partitions) | 9216 (12 partitions) | 18432 (24 partitions) | 23040 to 24096 |
| 262144 (256GB) | 32768 (9 partitions) | 16384 (11 partitions) | 32768 (9 partitions) | 16128 (22 partitions) | 32256 (32 partitions) | 40320 to 42432 |

Note that these values are safe to use for MarkLogic 7 and above.

For all the databases that ship with MarkLogic Server, the Huge Pages ranges on this table will cover the out-of-the box configuration. Note that adding more forests will cause the second value in the range to increase.

From MarkLogic Server 9.0-7 and above

In the 9.0-7 release and above (and all versions of MarkLogic 10), automatic cache sizing was introduced; this setting is usually recommended.

Maximum group level cache settings

Assuming a Server configured with 256GB RAM (and above), these are the maximum sizes for the three main group level caches and will utilise 180GB (184320MB) per host for the Group Level Caches:

  • Expanded Tree Cache - 73728 (72GB) (with 9 8GB partitions)
  • List Cache - 73728 (72GB) (with 9 8GB partitions)
  • Compressed Tree Cache - 36864 (36GB) (with 11 3 GB partitions)

We have found that configuring 4GB partitions for the Expanded Tree Cache and the List Cache generally works well in most cases; for this you would set the number of partitions to 18.

For the Compressed Tree Cache the number of partitions can be set to 22.

Important note

The maximum number of configurable partitions is 32.

Each cache partition should be no more than 8192 MB.
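The partition counts quoted above follow from dividing the cache size by the desired partition size, subject to the two limits in this note. The sketch below shows that arithmetic; the function name is hypothetical.

```python
import math

MAX_PARTITIONS = 32      # maximum number of configurable partitions
MAX_PARTITION_MB = 8192  # each partition should be no more than 8192 MB


def partitions_for(cache_size_mb, target_partition_mb):
    """Return the number of partitions needed for the given partition size."""
    if target_partition_mb > MAX_PARTITION_MB:
        raise ValueError("partition size exceeds 8192 MB")
    count = math.ceil(cache_size_mb / target_partition_mb)
    if count > MAX_PARTITIONS:
        raise ValueError("more than 32 partitions would be required")
    return count


# 72GB Expanded Tree Cache or List Cache.
print(partitions_for(73728, 8192))  # 8GB partitions
print(partitions_for(73728, 4096))  # 4GB partitions
```

With a 73728 MB cache, 8GB partitions give 9 partitions and 4GB partitions give 18, matching the figures above.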

Introduction

MarkLogic Server has a notion of groups, which are sets of similarly configured hosts within a cluster.

Application servers (and their respective ports) are scoped to their parent group.

Therefore, you need to make sure that the host and its exposed port to which you're trying to connect both exist in the group where the appropriate application server is defined. For example, if you attempt to connect to a host defined in a group made up of d-nodes, you'll only see application servers and ports defined in the d-nodes group. If the application server you actually want is in a different group (say, e-nodes), you'll get a connection error, instead.

Questions

Can I use any xdmp builtins to show which application servers are linked to particular groups?

The code example below should help with this:

Introduction

OK, so you have written an amazing "killer App" using XQuery on MarkLogic Server and you are ready to make it available to the world. Before pressing the deploy button, you may want to verify that your application is not susceptible to hackers and other malicious users. There are many reliable scanners available to help find vulnerabilities in your MarkLogic installation. MarkLogic does not recommend any particular scanner.  

This article presents recommendations for handling some of the issues that might be flagged by a vulnerability scan of your MarkLogic Server application.

Recommendations

Put ports 7998 - 8002 behind a firewall.

Vulnerabilities related to OpenSSH, TCP/IP attacks, and other known OS-related weaknesses can be warded off by taking the following steps:

  • Use a strong name/password.
  • Upgrade to the latest version of MarkLogic Server to get the most recently included OpenSSH library; or don’t use SSH and close port 22. 
  • Place production behind a firewall and only expose ports required by public application. 

It is important to guard against Denial Of Service attacks. Here are some ways you can harden against that:

Introduction: getting more information about the bugs fixed between releases

As a general recommendation, we encourage customers to keep the server up-to-date with patch releases in any case.

If you would like a list of some of the published bugs that were addressed between two releases of the server (for example: 5.0-3 and 5.0-4.1), you can perform the following steps:

- Log into the support portal at http://help.marklogic.com
- Click on the "Fixed bugs" icon to take you to the bugtrack list
- Select 5.0-3 in the From: dropdown box
- Select 5.0-4.1 in the To: dropdown box
- Click 'Show' to generate an HTML table or View PDF to export the results in a PDF document

Step one: login

Provide your credentials in the form on the left-hand side to log in to the support portal.

Log into the support portal

Step two: select the "Fixed bugs" link from the icons on the page

Select 'Fixed Bugs' to go to the bugtrack list

Step three: select the release 'range' from the two dropdown lists on the Fixed Bugs page

Use the Show button to update the page or download the list in PDF format as required

Select the versions from the 'From' and 'To' lists to generate the report

Summary

MarkLogic Server has several different features that can help manage data across multiple database instances. Those features differ from each other in several important ways - this article will focus on high-level distinctions and will provide pointers to other materials to help you decide which of these features could work best for your particular use case.

 Details

Backup/Restore - database backup and restore operations in MarkLogic Server provide consistent database-level views of your data. Propagating data from one instance to another via backup/restore involves a MarkLogic administrator using a completed backup from the source instance as the restore archive on the destination instance. You can read more about Backup/Restore here: http://docs.marklogic.com/guide/admin/backup_restore.

Flexible Replication - can be used to maintain copies of data on multiple MarkLogic Servers. Unlike backup/restore (which relies on taking a consistent, database-level view of the data at a particular timestamp), Flexible Replication creates a copy of a document in another database and keeps that copy in sync (possibly with some time-lag/latency) with the original in the course of normal operations. You can read more about Flexible Replication here: http://docs.marklogic.com/guide/flexrep/rep_intro. Do note that:

  • Flexible Replication is asynchronous. Asynchronous Replication refers to a configuration in which the Master does not wait for confirmation that the update has been received by the Replica before sending further updates.
  • Flexible Replication does not use the same transaction boundaries on the replica as on the master. For example, 10 documents might be inserted in a single transaction on a Flexible Replication master. Those 10 documents will eventually be inserted on a Flexible Replication replica, but there is no guarantee that the replica instance will also use a single transaction to do so.

Database Replication - is used to maintain copies of data on multiple MarkLogic Servers. Database Replication creates a copy of a document in another database and keeps that copy in sync (possibly with some time-lag/latency) with the original in the course of normal operations. You can read more about Database Replication here: http://docs.marklogic.com/guide/database-replication/dbrep_intro. Note that:

a. Database Replication is, like Flexible Replication, asynchronous.

b. In contrast to Flexible Replication, Database Replication operates by copying journal frames from the Master database and replaying the transactions described by those journal frames on the foreign Replica database.

XA Transactions - MarkLogic Server can participate in distributed transactions by acting as a Resource Manager in an XA/JTA transaction. If there are multiple MarkLogic Server instances participating as XA resources in a given XA transaction, then it's possible to use that XA transaction as a synchronized means of replicating data across those multiple MarkLogic instances. You can read more about XA Transactions in MarkLogic Server here: http://docs.marklogic.com/guide/xcc/concepts#id_57048.

Introduction

Upgrading individual MarkLogic instances and clusters is generally very easy to do and in most cases requires very little downtime. In most cases, shutting down the MarkLogic instance on each host in turn, uninstalling the current release, installing the updated release and restarting each MarkLogic instance should be all you need to be concerned about...

However, unanticipated problems do sometimes come to light and the purpose of this Knowledgebase article is to offer some practical advice as to the steps you can take to ensure the process goes as easily as possible - this is particularly important if you're planning an upgrade between major releases of the product.

Prerequisites

While the steps outlined under the process heading below offer practical advice as to what to do to ensure your data is safeguarded (by recommending that backups are taken prior to upgrading), another very useful step would be to ensure you have your current configuration files backed up.

Each host in a MarkLogic cluster is configured using parameters which are stored in XML Documents that are available on each host. These are usually relatively small files and will zip up to a manageable size.

If you cd to your "Data" directory (on Linux this is /var/opt/MarkLogic; on Windows this is C:\Program Files\MarkLogic\Data and on OS X this is /Users/{username}/Library/Application Support/MarkLogic), you should see several xml files (assignments, clusters, databases, groups, hosts, server).

Whenever MarkLogic updates any of these files, it creates a backup using the same naming convention used for older ErrorLog files (_1, _2 etc). We recommend backing up all configuration files before following the steps under the next heading.
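
As a minimal sketch of the backup recommended above, the following Python function zips every top-level XML file in the Data directory. The function name and the archive location are hypothetical, and it assumes the account running it can read the configuration files; it is an illustration rather than a MarkLogic-provided tool.

```python
import zipfile
from pathlib import Path
from typing import List


def backup_config_files(data_dir: str, archive_path: str) -> List[str]:
    """Zip every top-level *.xml file in data_dir and return the archived names."""
    names: List[str] = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # sorted() keeps the archive order stable between runs
        for path in sorted(Path(data_dir).glob("*.xml")):
            archive.write(path, arcname=path.name)
            names.append(path.name)
    return names
```

On Linux this could be called as backup_config_files("/var/opt/MarkLogic", "/tmp/marklogic-config.zip"). Because the pattern is *.xml, the numbered backup copies (_1, _2 etc.) that sit next to the live files are picked up as well.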

Process

1) Take a backup for each database in your cluster

2) Turn reindexing off for each database in your cluster

3) Starting with the node hosting your Security and Schemas forests, uninstall the current maintenance release MarkLogic version on your cluster, then install the latest maintenance release in that feature release (for example, if you're currently running version 10.0-2, you'll want to update to the latest available MarkLogic 10 maintenance release - at the time of this writing, it is 10.0-4).

4) Start up the host in your cluster hosting your Security and Schemas forests, then the remaining hosts in the cluster.

5) Access the Admin UI on the node hosting your Security and Schemas forests and accept the license agreement, either for just that host (Accept button) or for all of the hosts in the cluster (Accept for Cluster button). If you choose the Accept for Cluster button, a summary screen appears showing all of the hosts in the cluster. Click the Accept for Cluster button to confirm acceptance (all of the hosts must be started in order to accept for the cluster). If you accepted the license for just the one host, you must go to the Admin Interface of each of the other hosts and accept the license there before that host can operate.

6) If you're upgrading across feature releases, you may now repeat steps #3-5 until you reach the desired feature and maintenance release on your cluster (for example, if trying to upgrade from MarkLogic 8 to MarkLogic 10,  after installing 8.0-latest, you'll repeat steps 3-5 for version 9.0-latest).

7) After you've finished upgrading across all the relevant feature releases, re-enable reindexing for each database in your cluster.

For more details, please see the section “Upgrading a Cluster to a New Maintenance Release of MarkLogic Server” of the “Scalability, Availability, and Failover” guide.

If you've got database replication in place across both a master and replica cluster, then be aware that:

1) You do not need to break replication between the clusters

2) You should plan to upgrade both the master cluster and replica cluster. If you upgrade just the master, connectivity between the two clusters will stop due to different XDQP versions. 

3) If the Security database isn't replicated, then there shouldn't be anything special you need to do other than upgrade the two clusters.

4) If the security database is replicated, do the following:

  • Upgrade the Replica cluster and run the upgrade scripts. This will update the Replica's Security database to indicate that it is current. It will also do any necessary configuration upgrades.
  • Upgrade the Master cluster and run the upgrade scripts. This will update the Master's Security database to indicate that it is current. It will also do any necessary configuration upgrades.

For more information, see Updating Clusters Configured with Database Replication.

Back-out Plan

MarkLogic does not support restoring a backup made on a newer version of MarkLogic Server onto an older version of MarkLogic Server. Your Back-out plan will need to take this into consideration.

See the section below for recommendations on how this should be handled.

Further reading

Backing out of your upgrade: steps to ensure you can downgrade in an emergency

Product release notes

The "Upgrade Support" section of the release notes.

All known incompatibilities between releases

The "Upgrading from previous releases" section of the documentation

MarkLogic Support Fixed Bug List

Background

A database consists of one or more forests. A forest is a collection of documents (mostly XML trees, thus the name), implemented as a physical directory on disk. Each forest holds a set of documents and all their indexes. 

When a new document is loaded into MarkLogic Server, the server puts this document in an in-memory stand and writes the action to an on-disk journal to maintain transactional integrity in case of system failure. After enough documents are loaded, the in-memory stand will fill up and be flushed to disk, written out as an on-disk stand. As more documents are loaded, they go into a new in-memory stand. At some point this in-memory stand fills up as well, and the in-memory stand gets written as yet another new on-disk stand.

To read a single term list, MarkLogic must read the term list data from each individual stand and unify the results. To keep the number of stands to a manageable level where that unification isn't a performance concern, MarkLogic runs merges in the background. A merge takes some of the stands on disk and creates a new singular stand out of them, coalescing and optimizing the indexes and data, as well as removing any previously deleted fragments.

Each forest has its own in-memory stand and set of on-disk stands. Loading and indexing content is a largely parallelizable activity so splitting the loading effort across forests and potentially across machines in a cluster can help scale the ingestion work.
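
The relationship between stands, term lists and merges can be pictured with a toy model in Python. This is only a sketch under strong assumptions: a stand is reduced to a dictionary from term to fragment ids, and the names read_term_list and merge_stands are invented for illustration and are not MarkLogic APIs.

```python
from typing import Dict, List, Set

# Toy model: a stand maps each term to the ids of the fragments containing it.
Stand = Dict[str, Set[int]]


def read_term_list(stands: List[Stand], term: str) -> Set[int]:
    """Read the term list from every stand and unify the results."""
    result: Set[int] = set()
    for stand in stands:
        result |= stand.get(term, set())
    return result


def merge_stands(stands: List[Stand], deleted: Set[int]) -> Stand:
    """Coalesce several stands into one, leaving out fragments marked as deleted."""
    merged: Stand = {}
    for stand in stands:
        for term, fragments in stand.items():
            live = fragments - deleted
            if live:
                merged.setdefault(term, set()).update(live)
    return merged
```

With more stands, read_term_list has more dictionaries to visit for every lookup, which is the cost that background merges keep in check.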

Deletions and Multi-Version Concurrency Control (MVCC)

What happens if you delete or change a document? If you delete a document, MarkLogic marks the document as deleted but does not immediately remove it from disk. The deleted document will be removed from query results based on its deletion markings, and the next merge of the stand holding the document will bypass the deleted document when writing the new stand. MarkLogic treats any changed document like a new document, and treats the old version like a deleted document.

This approach is known in database circles as MVCC, which stands for Multi-Version Concurrency Control. In an MVCC system, changes are tracked with a timestamp number which increments for each transaction as the database changes. Each fragment gets its own creation-time (the timestamp at which it was created) and deletion-time (the timestamp at which it was marked as deleted, starting at infinity for fragments not yet deleted).

For a request that doesn't modify data the system gets a performance boost by skipping the need for any URI locking. The query is viewed as running at a certain timestamp, and throughout its life it sees a consistent view of the database at that timestamp, even as other (update) requests continue forward and change the data.
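
The timestamp rule can be illustrated with a short Python sketch. The Fragment class and is_visible function are hypothetical names, and the comparison is an assumption drawn from the description above (a query sees a fragment created at or before its timestamp and not yet deleted at that timestamp); it is not the server's implementation.

```python
import math
from dataclasses import dataclass

INFINITY = math.inf


@dataclass
class Fragment:
    uri: str
    created: float
    deleted: float = INFINITY  # infinity means "not yet deleted"


def is_visible(fragment: Fragment, query_timestamp: float) -> bool:
    """A query sees fragments created at or before its timestamp and deleted after it."""
    return fragment.created <= query_timestamp < fragment.deleted


# An update at timestamp 20 marks the old version deleted and creates a new one.
old_version = Fragment("/doc.xml", created=10, deleted=20)
new_version = Fragment("/doc.xml", created=20)
```

A query running at timestamp 15 sees only old_version, while a query running at timestamp 20 or later sees only new_version, and neither query needs a lock to get that consistent view.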

Updates and Deadlocks

An update request, because it isn't read-only, has to use read/write locks to maintain system integrity while making changes. Read-locks block for write-locks; write-locks block for both read and write-locks. An update has to obtain a read-lock before reading a document and a write-lock before changing (adding, deleting, modifying) a document. Lock acquisition is ordered, first-come first-served, and locks are released automatically at the end of a request.
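
The blocking rules above can be sketched as a small lock table in Python. This is a simplified illustration with hypothetical names: it returns False where a real request would block and wait, and it does not model the first-come first-served ordering.

```python
from typing import Dict, Set


class LockTable:
    """Toy read/write lock table keyed by document URI."""

    def __init__(self) -> None:
        self.readers: Dict[str, Set[str]] = {}
        self.writer: Dict[str, str] = {}

    def try_read_lock(self, uri: str, request: str) -> bool:
        # A read-lock has to wait for a write-lock held by another request.
        holder = self.writer.get(uri)
        if holder is not None and holder != request:
            return False
        self.readers.setdefault(uri, set()).add(request)
        return True

    def try_write_lock(self, uri: str, request: str) -> bool:
        # A write-lock has to wait for read-locks and write-locks held by others.
        holder = self.writer.get(uri)
        if holder is not None and holder != request:
            return False
        if self.readers.get(uri, set()) - {request}:
            return False
        self.writer[uri] = request
        return True

    def release_all(self, request: str) -> None:
        # Locks are released together at the end of a request.
        for readers in self.readers.values():
            readers.discard(request)
        for uri in [u for u, holder in self.writer.items() if holder == request]:
            del self.writer[uri]
```

If two requests each hold a read-lock on the same URI and both then ask for a write-lock, neither call can succeed until the other releases, which is the simplest form of the deadlock discussed next.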

In any lock-based system you have to worry about deadlocks, where two or more updates are stalled waiting on locks held by the other. In MarkLogic deadlocks are automatically detected with a background thread. When the deadlock happens on the same host in a cluster, the update farthest along (with the most locks) wins and the other update gets restarted. When it happens on different hosts, because lock count information isn't in the wire protocol, both updates start over. MarkLogic differentiates queries from updates using static analysis. Before running a request, it looks at the code to determine if it includes any calls to update functions. If so, it's an update. If not, it's a query. Even if at execution time the update doesn't actually invoke the updating function, it still runs as an update.

For the most part, locking is not under the control of the user. The one exception is the xdmp:lock-for-update($uri) call, which requests a write-lock on a document URI without actually having to issue a write, and in fact without the URI even having to exist.

When a request potentially touches millions of documents (such as sorting a large data set to find the most recent items), a query request that runs lock-free will outperform an update request that needs to acquire read-locks and write-locks. In some cases you can speed up the query work by isolating the update work to its own transactional context. This technique only works if the update doesn't have a dependency on the outer query, but that turns out to be a common case. For example, let's say you want to execute a content search and record the user's search string to the database for tracking purposes. The database update doesn't need to be in the same transactional context as the search itself, and would slow things down if it were. In this case it's better to run the search in one context (read-only and lock-free) and the update in a different context. See the xdmp:eval() and xdmp:invoke() functions for documentation on how to invoke a request from within another request and manage the transactional contexts between the two.

Document Lifecycle

Let's track the lifecycle of a document from first load to deletion until the eventual removal from disk. A document load request acquires a write-lock for the target URI as part of the xdmp:document-load() function call. If any other request is already doing a write to the same URI, our load will block for it, and vice versa. At some point, when the full update request completes successfully (without any errors that would implicitly cause a rollback), the actual insertion work begins, processing the queue of update work orders. MarkLogic starts by parsing and indexing the document contents, converting the document from XML to a compressed binary fragment representation. The fragment gets added to the in-memory stand. At this point the fragment is considered a nascent fragment, a term you'll see sometimes on the administration console status pages. Being nascent means it exists in a stand but hasn't been fully committed. (On a technical level, nascent fragments have creation and deletion timestamps both set to infinity, so they can be managed by the system while not appearing in queries prematurely.) If you're doing a large transactional insert you'll accumulate a lot of nascent fragments while the documents are being processed. They stay nascent until they've been committed. Once the fragment is placed into the in-memory stand, the request is ready to commit. It obtains the next timestamp value, journals its intent to commit the transaction, and then makes the fragment available by setting the creation timestamp for the new fragment to the transaction's timestamp. At this point it's a durable transaction, replayable in event of server failure, and it's available to any new queries that run at this timestamp or later, as well as any updates from this point forward (even those in progress). As the request terminates, the write-lock gets released.

Our document lives for a time in the in-memory stand, fully queryable and durable, until at some point the in-memory stand fills up and gets written to disk. Our document is now in an on-disk stand. Sometime later, based on merge algorithms, the on-disk stand will get merged with some other on-disk stands to produce a new on-disk stand. The fragment will be carried over, its tree data and indexes incorporated into the larger stand. This might happen several times.

At some point a new request makes a change to the document, such as with an xdmp:node-replace() call. The request making the change first obtains a read-lock on the URI when it first accesses the document, then promotes the read-lock to a write-lock when executing the xdmp:node-replace() call. If another write-lock were already present on the URI from another executing update, the read-lock would have blocked until the other write-lock released. If another read-lock were already present, the lock promotion to a write-lock would have blocked. Assuming the update request finishes successfully, the work runs similar to before: parsing and indexing the document, writing it to the in-memory stand as a nascent fragment, acquiring a timestamp, journaling the work, and setting the creation timestamp to make the fragment live. Because it's an update, it has to mark the old fragment as deleted also, and does that by setting the deletion timestamp of the original fragment to the transaction timestamp. This combination effectively replaces the old fragment with the new. When the request concludes, it releases its locks. Our document is now deleted, replaced by the new version.

The old fragment still exists on disk, of course. In fact, any query that was already in progress before the update incremented the timestamp, or any query doing time travel with an old timestamp, can still see it. Eventually the on-disk stand holding the fragment will be merged again, at which point the old fragment will be completely removed from the system. It won't be written into the new on-disk stand. That is, unless the administration "merge timestamp" was set to allow deep time travel. In that case it will live on, sticking around in case any new queries want to time travel to see old fragments.

Introduction

With the release of MarkLogic 5, a new feature - "Journal Archiving" - was added to the product. This feature allows point-in-time recovery of a given database (forest); essentially, it allows changes to all forests in a given database to be rolled back to a particular moment in time (using a call to a new function, xdmp:forest-rollback()). This article will provide a quick "getting started" demo to show the feature in action.

For more information, documentation is available online

Process overview

In order to successfully perform a point-in-time recovery, the following steps need to take place:

  • A backup of a database at a given timestamp
  • Performing an insert of a number of new documents after that timestamp
  • A restore back to the previous timestamp

Adding Content

We're using xdmp:eval to insert each document as part of a separate (isolated) transaction:

Important - for this demonstration, we're going to rely on properties fragments to keep track of the last modified date. Please ensure 'maintain last modified' is enabled when you create your test database for this exercise.

Check Timestamps

To get an ordered list of the documents in the database (note: ensure 'maintain last modified' is set to 'true' so properties fragments are generated):

First point-in-time: 10 documents now in the database

Running the code above against the database should yield something like this:

Doc: /test/1.xml was inserted at: 2013-02-28T17:54:57Z
Doc: /test/2.xml was inserted at: 2013-02-28T17:54:58Z
Doc: /test/3.xml was inserted at: 2013-02-28T17:54:59Z
Doc: /test/4.xml was inserted at: 2013-02-28T17:55:00Z
Doc: /test/5.xml was inserted at: 2013-02-28T17:55:01Z
Doc: /test/6.xml was inserted at: 2013-02-28T17:55:02Z
Doc: /test/7.xml was inserted at: 2013-02-28T17:55:03Z
Doc: /test/8.xml was inserted at: 2013-02-28T17:55:04Z
Doc: /test/9.xml was inserted at: 2013-02-28T17:55:05Z
Doc: /test/10.xml was inserted at: 2013-02-28T17:55:06Z

Delete Everything

In this scenario, a test script was inadvertently run on the production server resulting in a loss of data:

Rollback (with xdmp:forest-rollback)

We can use the following code to roll back to a safe timestamp. Note that in the example below, we're rolling back the forest to a point just after the 10th document was inserted into the database (2013-02-28T17:55:07Z) but before the "delete-all" script was executed:

Following the same process using Backup and Restore

We're going to repeat the same process, only this time we will back up after the crisis took place (in this case backing up an empty database). We will then use xdmp:forest-rollback to get the newly-restored forests to a safe timestamp.

The steps are as follows:

  • Run the delete script again to clear the database
  • Create the backup (ensure that you set "Archive Journals" to true when doing this)
  • Set the merge timestamp on the database to a value greater than zero, for example: 1 (if you don't do this, when you attempt to restore, you will see the error message 'Error: XDMP-MERGETIMESTAMPMISSING: Merge timestamp must be set to non-zero value when restoring with journal archiving and restore-to-time is zero')
  • Run the restore - remember to add the full path in the 'restore from directory' so the restore routine can find the BackupTag. Please ensure 'Use journal archive' is enabled and leave the 'Restore to time' blank so a full restore is performed
  • Run the xdmp:forest-rollback code as before and confirm the documents are all available after doing so
  • Set the merge timestamp back to zero

Introduction

This Knowledgebase article offers general guidelines for backups that use the journal archiving feature. It covers both the free space requirements and the file sizes you can expect to see written to the journal archive repository when journal archiving is enabled and active.

The MarkLogic environment used here was an out-of-the box version 9.x with one change of adding a new directory specific to storing the archive journal backup files.

It is assumed that the reader of this article already has a basic understanding of the role of Journal Archiving in the Backup and Restore feature of MarkLogic Server. See the references below for further details.

How much free space is needed for the Archive Journal files in a backup?

MarkLogic Server uses the size of the active forest to confirm whether the journal archive repository has enough free space to accommodate that forest. If additional forests already exist on the same volume, the Server's "free-space" calculation may be too optimistic, because the other forests are never included in the algorithm that calculates the free space available for the backup and/or journal archive repositories. Only one forest is used in the free-space calculation.

In other words, if multiple forests exist on the same volume, there may not be enough free space available on that specific volume due to the additional forests, especially during a high rate of ingestion. If that is the case, then it is advised to provide enough free space on that volume to accommodate the sizes of all the forests. Required Free Space (approximately) = (Number of Forests) x (Size of largest Forest).
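
As a worked example of the approximation above, here is a short Python helper. The function name and the sample forest sizes are hypothetical and only illustrate the arithmetic.

```python
from typing import Sequence


def required_free_space(forest_sizes_gb: Sequence[float]) -> float:
    """Approximate free space: (number of forests) x (size of the largest forest)."""
    if not forest_sizes_gb:
        return 0.0
    return len(forest_sizes_gb) * max(forest_sizes_gb)
```

For three forests of 40 GB, 55 GB and 30 GB on the same volume, required_free_space([40, 55, 30]) gives 3 x 55 = 165 GB.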

What can we expect to see in the journal archiving repository in terms of file sizes for specific ingestion types and sizes? The next section addresses that question.

How is the Journal Archive repository filling up?

1 MByte of raw XML data loaded into the server (as either a new document ingestion or a document update) will result in approximately 5 to 6 MBytes of data being written to the corresponding Journal Archive files.  Additionally, adding Range Indexes will contribute to a relatively small increase in consumed space.

Ingesting/updating RDF data results in slightly less data being written to the journal archive files.

In conclusion, for both new document ingestion and document updates, the typical expansion ratio of Journal Archive size to input file size is between 5 and 6, but it can be higher than that depending on the document structure and any added range indexes.
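
The expansion ratio can be turned into a rough sizing estimate. The Python sketch below uses a hypothetical function name and takes the 5 to 6 range from this article as its default; actual ratios depend on document structure and indexes, so treat the result as a starting point only.

```python
from typing import Tuple


def estimate_journal_archive_mb(
    raw_xml_mb: float, low_ratio: float = 5.0, high_ratio: float = 6.0
) -> Tuple[float, float]:
    """Return the (low, high) estimate of journal archive growth for raw_xml_mb of input."""
    return raw_xml_mb * low_ratio, raw_xml_mb * high_ratio
```

Loading 200 MB of raw XML would, under these assumptions, write roughly 1000 to 1200 MB to the Journal Archive files.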

References:

Introduction

Sometimes, when a host is removed from a cluster in an improper manner (e.g., by some means other than the Admin UI or Admin API), the removed host can still try to communicate with its old cluster, but the cluster will recognize it as a "foreign IP" and will log a message like the one below:

2014-12-16 00:00:20.228 Warning: XDQPServerConnection::init(10.0.80.7:7999-10.0.80.39:44247): SVC-SOCRECV: Socket receive error: wait 10.0.80.7:7999-10.0.80.39:44247: Timeout

Explanation: 

XDQP is the protocol that MarkLogic uses for internal communications amongst the hosts in a cluster, and it uses port 7999 by default. In this message, the local host 10.0.80.7 is receiving socket connections from foreign host 10.0.80.39.

 

Debugging Procedure, Step 1

To find out if this message indicates a socket connection from an IP address that is not part of the cluster, the first place to look is the hosts.xml file. If the IP address is not found in hosts.xml, then it is a foreign IP. In that case, the following steps will help to identify the processes that are listening on port 7999.
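
When there are many such messages, the check can be scripted. The Python sketch below pulls the remote address out of a log line shaped like the example above and compares it with a set of known cluster addresses. The function names are hypothetical, and it assumes you have already collected the cluster's IP addresses yourself (for example, by resolving the hosts listed in hosts.xml).

```python
import re
from typing import Optional, Set

# Matches "XDQPServerConnection::init(local_ip:port-remote_ip:port)"
CONNECTION = re.compile(
    r"XDQPServerConnection::init\("
    r"(\d+\.\d+\.\d+\.\d+):(\d+)-(\d+\.\d+\.\d+\.\d+):(\d+)\)"
)


def remote_peer(log_line: str) -> Optional[str]:
    """Return the remote IP address from the log line, or None if it has no match."""
    match = CONNECTION.search(log_line)
    return match.group(3) if match else None


def is_foreign(log_line: str, cluster_ips: Set[str]) -> bool:
    """True when the line names a remote peer that is not a known cluster address."""
    peer = remote_peer(log_line)
    return peer is not None and peer not in cluster_ips
```

For the message shown in the Introduction, remote_peer returns 10.0.80.39, and is_foreign is true for any cluster_ips set that does not contain that address.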

 

Debugging Procedure, Step 2

To find out who is listening on XDQP ports, try running the following command in a shell window on each host:

      $ sudo netstat -tulpn | grep 7999

You should only see MarkLogic as a listener:

     tcp 0 0 0.0.0.0:7999 0.0.0.0:* LISTEN 1605/MarkLogic

If you see any other process listening on 7999, you have found your culprit. Shut down those processes and the messages will go away.
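
If you need to run this check across many hosts, the netstat output can be filtered programmatically. The Python sketch below uses a hypothetical function name and assumes the column layout shown in the sample line above (protocol, two queue counters, local address, foreign address, state, PID/program).

```python
from typing import List


def unexpected_listeners(
    netstat_output: str, port: int = 7999, expected: str = "MarkLogic"
) -> List[str]:
    """Return the PID/program entries listening on port that are not the expected program."""
    culprits: List[str] = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP lines in the LISTEN state have seven or more columns here.
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        local_address, pid_program = fields[3], fields[6]
        if not local_address.endswith(f":{port}"):
            continue
        program = pid_program.split("/", 1)[-1]
        if program != expected:
            culprits.append(pid_program)
    return culprits
```

An empty list means that only MarkLogic is listening on the port; anything else in the list is a candidate for shutting down.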

 

Debugging Procedure, Step 3

If the issue persists, run tcpdump to trace packets to/from "foreign" hosts using the following command:

     tcpdump -n host {unrecognized IP}

Shut down MarkLogic on those hosts. Also, shut down any other applications that are using port 7999.

 

Debugging Procedure, Step 4

If the cluster hosts are on AWS, you may also want to check your Elastic Load Balancer ports. This may be tricky, because instances will change IP addresses if they are rebooted, so work with AWS Support to help you find the AMI or load balancer instance that is pinging your cluster.

In the case that the "foreign host" is an elastic load balancer, be sure to remove port 7999 from its rotation/scheduler. In addition, you should set the load balancer to use port 7997 for the heartbeat functionality.

Introduction

For hosts that don't use a standard US locale (en_US) there are instances where some lower level calls will return data that cannot be parsed by MarkLogic Server. An example of this is shown with a host configured with a different locale when making a call to the Cluster Status page (cluster-status.xqy):

XDMP-LEXVAL exception

The problem

The problem you have encountered is a known issue: MarkLogic Server uses a call to strtof() to parse the values as floats:

http://linux.die.net/man/3/strtof

Unfortunately, this uses a locale-specific decimal point. The issue in this environment is likely due to the Operating System using a numeric locale where the decimal point is a comma, rather than a period.
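
The effect of a locale-specific decimal point can be pictured with a simplified Python model. This is not strtof() itself: parse_float_prefix is a hypothetical helper that, like strtof(), converts the longest leading number it recognises and reports what was left over, with the decimal point character passed in explicitly instead of being read from the locale.

```python
import re
from typing import Tuple


def parse_float_prefix(text: str, decimal_point: str = ".") -> Tuple[float, str]:
    """Parse the longest leading decimal number and return (value, unparsed remainder)."""
    pattern = re.compile(r"[+-]?\d+(?:" + re.escape(decimal_point) + r"\d+)?")
    match = pattern.match(text)
    if match is None:
        return 0.0, text
    number = match.group(0).replace(decimal_point, ".")
    return float(number), text[match.end():]
```

With a period as the decimal point, "0.75" parses completely. With a comma as the decimal point, only the leading "0" is consumed and ".75" is left over, which is the kind of mismatch that surfaces as a lexical error further up the stack.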

Resolving the issue

The workaround for this is as follows:

1. Create a file called /etc/marklogic.conf (unless one already exists)

2. Add the following line to /etc/marklogic.conf:

export LC_NUMERIC=en_US.UTF-8

After this is done, you can restart the MarkLogic process so the change is detected and try to access the cluster status again.

Summary

Diagnostic trace events can be particularly useful in situations where you need access to more internal diagnostic information than is available in the standard MarkLogic ErrorLog or in the Operating System logs.

The host / cluster can be configured to output trace events and from the point at which these diagnostics are enabled and added, the server will write information to the ErrorLog every time the diagnostic event is encountered.

Enabling Server trace events

Trace events need to be enabled and added by an administrator using the following steps:

  1. Log into the Admin Interface.
  2. Select Groups > group_name > Diagnostics
  3. The Diagnostics Configuration page appears
  4. Click the true button for trace events activated
  5. Enter the trace event [for example Query Trace] to enable the diagnostic event
  6. Click the OK button to activate the event(s)
  7. Example: to verify that the "Query Trace" trace event is set
    1. Open a session in Query Console and execute (something like) the following query:
       (xdmp:query-trace(true()), cts:search(doc(), cts:word-query("test")))
    2. Observe the ErrorLog for specific query trace information

Manual trace events

In place of using xdmp:log() you can also create custom trace events. The following steps outline this:

  1. Log into the Admin Interface.
  2. Select Groups > group_name > Diagnostics
  3. The Diagnostics Configuration page appears
  4. Click the true button for trace events activated
  5. Enter the trace event [such as Hello World] to enable the diagnostic event
  6. Click the OK button to activate the event(s)
  7. Example: to verify that the "Hello World" trace event is set
    1. Open a session in Query Console and execute the following query:
       fn:trace("Here is a 'Hello World' trace event.", "Hello World")
    2. Observe the ErrorLog for the trace event output

Trace event groups

Adding any of the following to your diagnostics will cause the server to output a number of trace events relating to that particular group:

  • httpserver
  • xdbcserver
  • forest
  • xqueryeval
  • xdqpserver
  • stand
  • list
  • tree
  • forestlabel
  • relevance
  • lock
  • cluster
  • config
  • module
  • cpf
  • classifier

Further reading

Introduction

This article discusses the effects of the incremental backup implementation on Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

Details

With MarkLogic 8 you can have multiple daily incremental backups with minimal impact on database performance.

Incrementals complete more quickly than full backups reducing the backup window. A smaller backup window enables more frequent backups, reducing the RPO of the database in case of disaster.

However, RTO can be longer when using incremental backups compared to just full backups, because multiple backups must be restored to recover.

There are two modes of operation when using incremental backups:

Incremental since last full. Here, each incremental has to store all the data that has changed since the last full backup. Since a restore only has to go through a single incremental data set, the server is able to perform a faster restore.  However, each incremental data set is bigger and takes longer to complete than the previous data set because it stores all changes that were included in the previous incremental.

Please note when doing “Incremental since last full”:

- Create a new incremental backup directory for each incremental backup

- Call database-incremental-backup with incremental-dir set to the new incremental backup directory

 

Incremental since last incremental.  In this case, a new incremental stores only changes since the last incremental, also known as delta backups. By storing only the changes since the last incremental, the incremental backup sets are smaller in size and are faster to complete.  However, a restore operation would have to go through multiple data sets.

Please note when doing “Incremental since last incremental”:

- Create an incremental backup directory ONCE

- Call database-incremental-backup with the same incremental backup directory.
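
The trade-off between the two modes can be made concrete with a small Python model. The function name is hypothetical, and the model assumes that each day's changes touch different data, so an "incremental since last full" backup grows by the full amount of each day's changes.

```python
from typing import Dict, List


def incremental_plan(daily_changes_gb: List[float], mode: str) -> Dict[str, object]:
    """Model the size of each incremental and how many must be read during a restore."""
    if mode == "since-last-full":
        # Each incremental repeats everything changed since the full backup.
        sizes = [sum(daily_changes_gb[: day + 1]) for day in range(len(daily_changes_gb))]
        to_restore = 1 if sizes else 0
    elif mode == "since-last-incremental":
        # Each incremental holds only that day's delta.
        sizes = list(daily_changes_gb)
        to_restore = len(sizes)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return {"incremental_sizes_gb": sizes, "incrementals_to_restore": to_restore}
```

For daily changes of 2, 3 and 1 GB, the first mode produces incrementals of 2, 5 and 6 GB and a restore reads the full backup plus one incremental. The second mode produces incrementals of 2, 3 and 1 GB and a restore reads the full backup plus all three.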

See also the documentation on Incremental Backup.

 

 

Summary

The MarkLogic Admin GUI is a convenient place to deploy the normal Certificate infrastructure or to use the Temporary Certificate generated by MarkLogic. However, certain advanced solutions and deployments require XQuery-based admin operations to configure MarkLogic.

This knowledgebase article discusses how to deploy a SAN or Wildcard Certificate in a cluster of three or more nodes.

 

Certificate Types and MarkLogic Default Config

Certificate Types

In general, when browsers connect to a server using HTTPS, they check that the SSL Certificate matches the host name in the address bar. There are three ways for browsers to find a match:

a) The host name (in the address bar) exactly matches the Common Name in the certificate's Subject.

b) The host name matches a Wildcard Common Name. An example appears at the end of this article.

c) The host name is listed in the Subject Alternative Name (SAN) field as part of the X509v3 extensions. An example appears at the end of this article.

The most common form of SSL name matching is for the SSL client to compare the server name it connected to with the Common Name (CN field) in the server's Certificate. It's a safe bet that all SSL clients will support exact common name matching.

MarkLogic allows the common scenario (a) to be configured from the Admin GUI. The rest of this article discusses deploying certificates of types (b) and (c).
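The three matching rules can be sketched as a small shell check. This is a simplified illustration, not how any particular browser or MarkLogic implements name matching: the function names name_matches and cert_matches are hypothetical, and the wildcard rule assumed here is that "*" stands for exactly one left-most label.

```shell
# Hypothetical helper: succeeds when a host name matches one certificate name.
# Handles exact names and a single left-most wildcard label ("*.example.com").
name_matches() {
  nm_host=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  nm_pattern=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
  case "$nm_pattern" in
    \*.*)
      nm_suffix=${nm_pattern#\*}           # e.g. ".engrlab.marklogic.com"
      nm_label=${nm_host%"$nm_suffix"}     # what the "*" would have to cover
      # The suffix must have matched, and the covered part must be one non-empty label.
      [ "$nm_label" != "$nm_host" ] && [ -n "$nm_label" ] &&
        case "$nm_label" in *.*) false ;; *) true ;; esac
      ;;
    *)
      [ "$nm_host" = "$nm_pattern" ]
      ;;
  esac
}

# Hypothetical helper: succeeds when the host matches the CN or any SAN entry.
# Usage: cert_matches HOST NAME...
cert_matches() {
  cm_host=$1
  shift
  for cm_name in "$@"; do
    if name_matches "$cm_host" "$cm_name"; then
      return 0
    fi
  done
  return 1
}

cert_matches engrlab-128-164.engrlab.marklogic.com \
  TestDomain.com '*.engrlab.marklogic.com' && echo match
```

Passing the certificate's CN followed by its SAN entries as the NAME arguments covers cases (a), (b), and (c) in a single call.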

Default Admin GUI based Configuration 

By default, MarkLogic generates a Temporary Certificate for every node in the group of the current cluster when a Template is assigned to MarkLogic Server (the exception is when the Template is assigned through XQuery).

The Temporary Certificate generated for each node has that node's hostname in its CN field, which is designed for the common scenario (a).

There are two paths to install a CA-signed Certificate in MarkLogic:

1) Generate a Certificate request, get it signed by a CA, and import it through the Admin GUI

or 2) Generate a Certificate request and Private Key outside of MarkLogic, get the Certificate request signed by a CA, and import the signed Certificate and Private Key using an Admin script

Problem Scenario

In both of the above cases, while installing/importing the signed Certificate, MarkLogic looks for a Temporary Certificate to replace by comparing the CN field of the installed Certificate with the CN field of the Temporary Certificate.

If we have a Wildcard Certificate (b) or a SAN Certificate (c), the signed Certificate's CN field will never match the Temporary Certificate's CN field. MarkLogic will therefore not remove the Temporary Certificates and will continue to use them.

 

Solution

After installing a SAN or wildcard Certificate, an App Server may still use the installed Temporary Certificate (which was not replaced when the SAN/wildcard Certificate was installed).

Run the XQuery below against the Security database to remove all Temporary Certificates. The XQuery requires the URI lexicon to be enabled (it is enabled by default). [Please change the Certificate Template name and the FQDN in the XQuery below to reflect the values from your environment.]

xquery version "1.0-ml";

import module namespace pki = "http://marklogic.com/xdmp/pki"  at "/MarkLogic/pki.xqy";
import module namespace admin = "http://marklogic.com/xdmp/admin"  at "/MarkLogic/admin.xqy";
      

let $hostIdList := let $config := admin:get-configuration()
                   return admin:get-host-ids($config)
                     
for $hostid in $hostIdList
return
  (: FQDN matching the Certificate CN field value :)
  let $fdqn := "TestDomain.com"

  (: Change to your Template Name string :)
  let $templateid := pki:template-get-id(pki:get-template-by-name("YourTemplateName"))

  for $i in cts:uris()
  where 
  (   (: locate Cert file with Public Key :)
      fn:doc($i)//pki:template-id=$templateid 
      and fn:doc($i)//pki:authority=fn:false()
      and fn:doc($i)//pki:host-name=$fdqn
  )
  return <h1> Cert File - {$i} .. inserting host-id {$hostid}
  {xdmp:node-insert-child(doc($i)/pki:certificate, <pki:host-id>{$hostid}</pki:host-id>)}
  {
      (: extract cert-id :)
      let $certid := fn:doc($i)//pki:certificate/pki:certificate-id
      for $j in cts:uris()
      where 
      (
          (: locate Cert file with Private key :)
          fn:doc($j)//pki:certificate-private-key/pki:template-id=$templateid 
          and fn:doc($j)//pki:certificate-private-key/pki:certificate-id=$certid
      )
      return <h2> Cert Key File - {$j}
      {xdmp:node-insert-child(doc($j)/pki:certificate-private-key,
        <pki:host-id>{$hostid}</pki:host-id>)}
      </h2>
  } </h1>

The above will remove all Temporary Certificates (including the Template CA) and their private keys, leaving only the installed Certificate associated with the Template and forcing all nodes to use the installed Certificate.

 

Example: SAN (Subject Alternative Name) Certificate

For a 3-node cluster (engrlab-128-101.engrlab.marklogic.com, engrlab-128-164.engrlab.marklogic.com, engrlab-128-130.engrlab.marklogic.com):

$ openssl x509 -in ML.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 9 (0x9)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=US, ST=NY, L=NewYork, O=MarkLogic, OU=Engineering, CN=Support CA
        Validity
            Not Before: Apr 20 19:50:51 2016 GMT
            Not After : Jun  6 19:50:51 2018 GMT
        Subject: C=US, ST=NJ, L=Princeton, O=MarkLogic, OU=Eng, CN=TestDomain.com
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:97:8e:96:73:16:4a:cd:99:a8:6a:78:5e:cb:12:
                    5d:e5:36:42:d2:b8:52:51:53:6c:cf:ab:e4:c6:37:
                    2c:15:12:80:c1:1b:53:29:4c:52:76:84:80:1d:ee:
                    16:41:a6:31:c5:7b:0d:ca:d7:e5:da:d7:67:fe:80:
                    89:9f:0d:bc:46:4f:f0:7e:46:88:26:d5:a0:24:a6:
                    06:d1:fa:c0:c7:a2:f2:11:7f:5b:d5:8d:47:94:a8:
                    06:d9:46:8f:af:dd:31:d5:15:d2:7a:13:39:3e:81:
                    32:bd:5c:bd:62:9d:5a:98:1d:20:0e:30:d4:57:3f:
                    7f:89:e6:20:ae:88:4d:85:d7
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: 
                Key Encipherment, Data Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Subject Alternative Name: 
                DNS:engrlab-128-101.engrlab.marklogic.com, DNS:engrlab-128-164.engrlab.marklogic.com, DNS:engrlab-128-130.engrlab.marklogic.com
    Signature Algorithm: sha1WithRSAEncryption
        52:68:6d:32:70:35:88:1b:70:df:3a:56:f6:8a:c9:a0:9d:5c:
        32:88:30:f4:cc:45:29:7d:b5:35:18:a0:9a:45:37:e9:22:d1:
        c5:50:1d:50:b8:20:87:60:9b:c1:d6:a8:0c:5a:f2:c0:68:8d:
        b9:5d:02:10:39:40:b3:e5:f6:ae:f3:90:31:57:4c:e0:7f:31:
        e2:79:e6:a8:c0:e6:3f:ea:c5:75:67:3e:cd:ea:88:5d:60:d6:
        01:59:3c:dc:e0:47:96:3b:59:4a:13:85:bb:87:70:d0:a2:6b:
        0f:d4:84:1d:d1:be:e8:a5:67:c3:e3:59:05:0d:5d:a5:86:e6:
        e4:9e

Example: Wild-Card Certificate

For a 3-node cluster (engrlab-128-101.engrlab.marklogic.com, engrlab-128-164.engrlab.marklogic.com, engrlab-128-130.engrlab.marklogic.com):

$ openssl x509 -in ML-wildcard.pem -text -noout
Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number: 7 (0x7)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=US, ST=NY, L=NewYork, O=MarkLogic, OU=Engineering, CN=Support CA
        Validity
            Not Before: Apr 24 17:36:09 2016 GMT
            Not After : Jun 10 17:36:09 2018 GMT
        Subject: C=US, ST=NJ, L=Princeton, O=MarkLogic Corporation, OU=Engineering Support, CN=*.engrlab.marklogic.com
 

Summary

This article contains a high level overview of Transparent Huge Pages and Huge Pages. It covers the configuration of Huge Pages and offers advice as to when Huge Pages should be used and how they can be configured.

To the Linux kernel, "pages" describe a unit of memory; by default this should be 4096 bytes (4 KiB). You can confirm this from the terminal by running getconf PAGESIZE.

Huge Pages (and Transparent Huge Pages) allow areas of memory to be reserved for resources which are likely to be accessed frequently, such as group level caches. Enabling (and configuring) Huge Pages can increase performance because - when enabled - caches should always be resident in memory.

Huge Pages

In general you should follow the recommendation stated in the MarkLogic Installation Guide for All Platforms, which states:

On Linux systems, MarkLogic recommends setting Linux Huge Pages to 3/8 the size of your physical memory. For details on setting up Huge Pages, refer to the following KB

Group Caches and Linux Huge Pages

Caution

Since the OS and server perform many memory allocations that do not and cannot use huge pages, it may not be possible to configure the full 3/8x for huge pages.  It is not advised to configure more than 3/8x of memory to huge pages.

Calculating the number of Huge Pages to configure:

On an x86 system the default Huge Page size is 2048 KiB.  This can be confirmed using the command "cat /proc/meminfo | grep Hugepagesize".  On a system with 64GiB of physical memory it would be advised to configure 12288 Huge Pages or 24GiB.
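The arithmetic behind that figure can be sketched in shell. The function name huge_pages_for is hypothetical; it simply applies the 3/8 guideline to a memory size and a huge page size, both given in KiB.

```shell
# Hypothetical helper: number of huge pages covering 3/8 of physical memory.
# Arguments: total memory in KiB, huge page size in KiB.
huge_pages_for() {
  mem_kib=$1
  page_kib=$2
  echo $(( mem_kib * 3 / 8 / page_kib ))
}

# Worked example from the text: 64 GiB of RAM with 2048 KiB huge pages.
huge_pages_for $((64 * 1024 * 1024)) 2048

# On a live Linux host the inputs could be read from /proc/meminfo, for example:
#   mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   page_kib=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo)
#   huge_pages_for "$mem_kib" "$page_kib"
```

Note that MemTotal is usually slightly lower than the installed physical memory, so a result computed from /proc/meminfo will come out a little below the round figure.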

Alternatively, MarkLogic provides a recommended range for the number of Huge Pages that should be used.  This recommendation can be seen in the ErrorLog.txt file located in /var/opt/MarkLogic/Logs/ just after the server is started.  Right after starting the server, look for a message that looks like this:

2019-09-01 17:33:14.894 Info: Linux Huge Pages: detected 0, recommend 11360 to 15413

The lower bound includes all group level caches, while the upper bound also includes in-memory stand sizes.
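To pull the recommended range out of the log rather than reading it by eye, something like the following sketch may help. The function name huge_page_recommendation is hypothetical, and the pattern assumes the message format shown in the sample line above.

```shell
# Hypothetical helper: read log lines on stdin and print "detected lower upper"
# for the most recent Huge Pages message.
huge_page_recommendation() {
  sed -n 's/.*Linux Huge Pages: detected \([0-9][0-9]*\), recommend \([0-9][0-9]*\) to \([0-9][0-9]*\).*/\1 \2 \3/p' |
    tail -n 1
}

echo '2019-09-01 17:33:14.894 Info: Linux Huge Pages: detected 0, recommend 11360 to 15413' |
  huge_page_recommendation

# Against a real log file:
#   huge_page_recommendation < /var/opt/MarkLogic/Logs/ErrorLog.txt
```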

Allocating Huge Pages

Since Huge Pages require large areas of contiguous physical memory, it is advised to allocate huge pages at boot time.  This can be accomplished by placing vm.nr_hugepages = 12288 into the /etc/sysctl.conf file.

Transparent Huge Pages

The Transparent Huge Page (THP) implementation in the Linux kernel includes functionality that provides compaction. Compaction operations are system level processes that are resource intensive, potentially causing resource starvation to the MarkLogic process. Using static Huge Pages is the preferred memory configuration for several high performance database platforms including MarkLogic Server. The recommended method to disable THP on Red Hat Enterprise Linux (RHEL) 7 and 8, is to disable it in the grub2 configuration file, and then rebuild the grub2 configuration.  The following articles from Red Hat detail the process of disabling THP:

How to disable transparent hugepages (THP) on Red Hat Enterprise Linux 7

How to disable transparent hugepage (THP) on Red Hat Enterprise Linux 8
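After rebooting, it can be useful to confirm which THP mode is active. The kernel reports the available modes on one line and marks the active one with square brackets. The sketch below parses that line; the function name thp_mode is hypothetical, and the sysfs path in the comment is the one commonly found on RHEL 7 and 8 kernels.

```shell
# Hypothetical helper: extract the active THP mode, which the kernel
# marks with square brackets, e.g. "always madvise [never]".
thp_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

thp_mode 'always madvise [never]'

# On a live host:
#   thp_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"
```

A result of "never" indicates that THP is disabled.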

Previous Releases

If you are using Red Hat Enterprise Linux or CentOS 6, you must turn off Transparent Huge Pages (Transparent Huge Pages are configured automatically by the operating system).

The preferred method to disable Transparent HugePages is to add "transparent_hugepage=never" to the kernel boot line in the "/etc/grub.conf" file.

This solution (disabling Transparent HugePages) is covered in detail in this article on RedHat's website

[ref:https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-memory-tuning#sect-Virtualization_Tuning_Optimization_Guide-Memory-Huge_Pages]

[ref: http://docs.marklogic.com/guide/installation/intro#id_11335]


RHEL 6 | CentOS 6: Kernels newer than kernel-2.6.32-504.23.4. A race condition can manifest itself as system crash in region_* functions when HugePages are enabled. [1]

Resolution options:
1. Update kernel to 2.6.32-504.40.1.el6 or later, which includes the fix for this issue.

2. Downgrade the kernel to a version before 2.6.32-504.23.4.

3. Disable Huge Pages completely, which might impact MarkLogic’s performance by ~5-15%.
    (Set the value of vm.nr_hugepages=0)

References:
[1] [ref: RedHat Knowledge-base article: https://access.redhat.com/solutions/1992703]

Summary

MarkLogic performs best if swap space is not used.  There are other knowledge base articles that discuss sizing recommendations when configuring your MarkLogic Server. This article discusses the Linux swappiness setting that can limit the amount of swap activity in the event that swap space is required.

Details

Beginning with kernel version 3.5 (RHEL7) and 2.6.32-303 (RHEL 6.4),  MarkLogic suggests setting vm.swappiness to '1', and our recommendation is for it to be set to a value no greater than 10.  With older Linux kernel versions, vm.swappiness can be set to zero, but we do not recommend setting swappiness to 0 on newer kernels. 

You can check the swappiness setting on your Linux servers with the following command: 

    sysctl -a | grep swapp

If it doesn't exist already, add vm.swappiness=1 to the /etc/sysctl.conf file and execute the following command to apply this change:

     sudo sysctl -f
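If the setting already exists with a different value, appending a second line would leave two conflicting entries. The following sketch rewrites the file so that it ends up with exactly one vm.swappiness line. The function name set_swappiness is hypothetical, and the configuration file path is passed as an argument so the function can be tried on a copy first.

```shell
# Hypothetical helper: ensure a sysctl-style file has exactly one vm.swappiness line.
# Usage: set_swappiness FILE VALUE
set_swappiness() {
  conf=$1
  value=$2
  tmp=$(mktemp) || return 1
  # Keep every line except existing vm.swappiness entries
  # (grep exits non-zero when nothing is left, which is fine here).
  grep -v '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" > "$tmp" || true
  echo "vm.swappiness=$value" >> "$tmp"
  # Overwrite in place so the original file keeps its owner and permissions.
  cat "$tmp" > "$conf"
  rm -f "$tmp"
}

# Example (as root), followed by the reload command shown above:
#   set_swappiness /etc/sysctl.conf 1
```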

Depending on kernel version, the default Linux value for swappiness is 40-60. This is good for a desktop system that runs a variety of applications but has a very small amount of memory. The default setting is not good for a server system that wants to run a single dedicated process – such as MarkLogic Server.

A warning on swappiness

The behaviour of swappiness on newer Linux kernels has changed. On kernels for Linux greater than 3.5 and 2.6.32-303, setting swappiness to 0 will more aggressively avoid swapping, which increases the risk of out-of-memory (OOM) killing under strong memory and I/O pressure. To achieve the same behavior of swappiness as previous versions in which the recommendation was to set swappiness to 0, set swappiness to the value of 1. We do not recommend setting swappiness to 0 on newer kernels. 

Other Settings

While you are making changes, the following kernel settings will also help:

vm.dirty_background_ratio=1

vm.dirty_ratio=40

vm.dirty_background_ratio is the percentage of system memory that can be filled with “dirty” pages — memory pages that still need to be written to disk — before the pdflush/flush/kdmflush background processes kick in to write it to disk. We suggest setting it to 1%, so if your virtual server has 64 GB of memory that’s ~2/3 GB of data that can be sitting in RAM before something is done.

vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk. This is often the source of long I/O pauses, but is a safeguard against too much data being cached unsafely in memory.
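As a rough illustration of what these two percentages mean in absolute terms, the sketch below converts a ratio into MiB for a given amount of memory. The function name dirty_threshold_mib is hypothetical, and the calculation is an approximation: it uses total memory as the base, whereas the kernel works from the memory that is actually available for caching.

```shell
# Hypothetical helper: approximate dirty-page threshold in MiB.
# Arguments: memory in MiB, ratio in percent.
dirty_threshold_mib() {
  mem_mib=$1
  ratio_percent=$2
  echo $(( mem_mib * ratio_percent / 100 ))
}

dirty_threshold_mib $((64 * 1024)) 1     # background flushing starts around here
dirty_threshold_mib $((64 * 1024)) 40    # new writes block around here
```

For a 64 GB server this gives roughly 655 MiB (about 2/3 GB) for the background threshold, in line with the figure quoted above.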

Swappiness explained

When the computer runs out of memory, it has to kick out some memory.  At that moment, it has 2 choices:

1)    Kick out mmap’ed files.  This is cheaper if the mmap’ed file is read-only (for example, MarkLogic range indexes are read-only)

2)    Kick out anon memory.

If swappiness is large, Linux would prefer #2 over #1.  If swappiness is small, Linux would prefer #1 over #2.

Summary

MarkLogic Server maintains an access log for logging each HTTP application server request.  However, the access log only contains summary information. In order to log additional HTTP request detail along with parameters, you can do so in the error log by using a URL rewriter.

Detail

A URL rewriter can be configured for each application server.  The URL rewriter receives the request object and can log the request details accordingly. Below is a sample URL rewriter that can be used to log HTTP request fields:

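xquery version "1.0-ml";

(: Log the request URI and every request field, then pass the request through unchanged :)
(
xdmp:log(fn:concat("Request URI: ", xdmp:get-request-url())),
for $field in xdmp:get-request-field-names()
return
xdmp:log(fn:concat("Request Field - [Name:] ", $field, " [Value:] ", xdmp:get-request-field($field))),
(: a rewriter is expected to return the URL to dispatch to; here, the original URL :)
xdmp:get-request-url()
)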


To configure a URL rewriter on an application server using the MarkLogic Admin UI, navigate to Configure -> Groups -> {group-name} -> App Servers -> {app-server-name} and set the 'url rewriter' value to the rewriter script URI.
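Once the rewriter is in place, the logged fields can be pulled out of the error log with standard tools. The sketch below assumes the message text produced by the sample rewriter above; the function name request_fields and the sample log line (including its timestamp and level prefix) are hypothetical.

```shell
# Hypothetical helper: read log lines on stdin and print "name=value"
# for each request field logged by the sample rewriter.
request_fields() {
  sed -n 's/.*Request Field - \[Name:\] \(.*\) \[Value:\] \(.*\)$/\1=\2/p'
}

# Assumed sample line; a real log line may carry a different prefix.
echo '2020-03-03 12:40:25.512 Info: Request Field - [Name:] q [Value:] marklogic' |
  request_fields
```

Because the pattern ignores everything before "Request Field", it should tolerate different log prefixes, but that has not been checked against every MarkLogic version.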

Introduction

If you have an existing MarkLogic Server cluster running on EC2, there may be circumstances where you need to upgrade the existing AMI with the latest MarkLogic rpm available. You can also add a custom OS configuration.

This article assumes that you have started your cluster using the CloudFormation templates with Managed Cluster feature provided by MarkLogic.

Procedure
To manually upgrade the MarkLogic AMI, follow these steps:

1. Launch a new small MarkLogic instance from the AWS MarketPlace, based on the latest available image. For example, t2.small based on MarkLogic Developer 9 (BYOL). The instance should be launched only with the root OS EBS volume.
Note: If you are planning to leverage the PAYG-PayAsYouGo model, you must choose MarkLogic Essential Enterprise.
a. Launch a MarkLogic instance from AWS MarketPlace, click Select and then click Continue.

b. Choose instance type. For example, one of the smallest available, t2.small
c. Configure instance details. For example, default VPC with a public IP for easy access
d. Remove the second EBS data volume (/dev/sdf)
e. Optional - Add Tags
f. Configure Security Group - only SSH access is needed for the upgrade procedure
g. Review and Launch

2. SSH into your new instance and switch the user to root in order to execute the commands in the following steps.

$ sudo su -

Note: As an option, you can also use "sudo ..." for each individual command.

3. Stop MarkLogic and uninstall MarkLogic rpm:

$ service MarkLogic stop
$ rpm -e MarkLogic

4. Update and patch the OS:

$ yum -y update

Note: If needed, restart the instance (For example: after a kernel upgrade/core-libraries).
Note: If you would like to add more custom options/configuration/..., they should be done between steps 4 and 5.

5. Install the new MarkLogic rpm
a. Upload ML's rpm to the instance. (For example, via "scp" or S3)
b. Install the rpm:

$ yum install [<path_to_MarkLogic_RPM>]/[MarkLogic_RPM]

Note: Do not start MarkLogic at any point of AMI's preparation.

6. Double check to be sure that the following files and log traces do not exist. If they do, they must be deleted.

$ rm -f /var/local/mlcmd.conf
$ rm -f /var/tmp/mlcmd.trace
$ rm -f /tmp/marklogic.host

7. Remove artifacts
Note: Performing the following actions will remove the ability to ssh back into the baseline image. New credentials are applied to the AMI when launched as an instance. If you need to add/change something, mount the root drive to another instance to make changes.

$ rm -f /root/.ssh/authorized_keys
$ rm -f /home/ec2-user/.ssh/authorized_keys
$ rm -f /home/ec2-user/.bash_history
$ rm -rf /var/spool/mail/*
$ rm -rf /tmp/userdata*
$ rm -f [<path_to_MarkLogic_RPM>]/[MarkLogic_RPM]
$ rm -f /root/.bash_history
$ rm -rf /var/log/*
$ sync

8. Optional - Create an AMI from the stopped instance.[1] The AMI can be created at the end of step 7.

$ init 0

[1] For more information: https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/tkv-create-ami-from-instance.html

At this point, your custom AMI should be ready and it can be used for your deployments. If you are using multiple AWS regions, you will have to copy the AMI as needed.

Additional references:
[2] Upgrading the MarkLogic AMI - https://docs.marklogic.com/8.0/guide/ec2/managing#id_69624

Summary

All hosts in a MarkLogic cluster of two or more servers must run the same MarkLogic Server installation package.

Operating System Architecture

MarkLogic Server installation packages are created for each supported operating system architecture (e.g. Windows 64 bit, Linux 64 bit …). Consequently, all hosts in a MarkLogic cluster must employ the same operating system architecture. 

Version

For MarkLogic 8 and previous releases, all hosts within a MarkLogic cluster must be running the same version of MarkLogic Server. Mixed version clusters are not recommended, are not tested by MarkLogic, and are not supported. The behavior of a mixed version cluster is not defined and could lead to corrupt or inconsistent data.

MarkLogic 9 and MarkLogic 10 include the "Rolling Upgrade" feature that allows a cluster to run on a mixed version for a very short period of time.  For additional details and restrictions, please refer to the Rolling Upgrade section of our Administrators Guide.

Security and Schema Databases

In a MarkLogic Cluster, if the Security database is configured with a Schemas database, the forests of both databases must be placed on the same host.

Cluster Upgrades

When upgrading a MarkLogic cluster to a new release, the upgrade should occur on all hosts in the cluster within a short period of time. The first server to be upgraded must be the server on which the Security database is mounted. 

In addition to the MarkLogic Server’s Installation Guide, you will want to refer to the “Upgrading a Cluster to a New Maintenance Release of MarkLogic Server” section in the MarkLogic Server’s Scalability, Availability, and Failover Guide for details regarding the required procedure to upgrade a cluster.  

Summary

There are scenarios where you may want to restore a database from a MarkLogic Server backup that was taken from a database on a different cluster. 

Examples

Two example scenarios where this may be appropriate:

- For development or testing purposes - you may want to take the content from one system to perform development or testing on a different cluster.

- A system failed, and you need to recreate a cluster and restore the database to the last known good state.

Constraints

There are constraints on performing a database restore from a MarkLogic database backup across clusters

  1. The source and target servers must be the same Operating System.  More specifically, they must be able to use the same MarkLogic Server installation package.
  2. The backups must be accessible from all servers on which a forest in the target database resides.   
  3. The path to the backups must be identical on all of the servers.
  4. The MarkLogic process must have sufficient access credentials to read the files in the backup.
  5. If the number of hosts and/or forests is different, see Restoring a Reconfigured Database.

If you are running a MarkLogic version prior to 9.0-4, the following conditions must also be met:

  1. The forest names must be identical in both the source database and the target database.
  2. The number of forests in both the source and target databases should be the same.  If the source database has a forest that does not reside on the target, then that forest data will not be included in the target after the database restore is complete.

Note: Differences in index configuration and/or forest order may result in reindexing or rebalancing after the restore is complete

Debugging Problems

If you are experiencing difficulties restoring a database backup, you can validate the backup using xdmp:database-backup-validate, or xdmp:database-incremental-backup-validate:

1. In the Query Console, execute a simple script that validates restoring the backup, for example:

xquery version "1.0-ml";

let $db-name := "Documents"
let $db-backup-path := "/my-backup-dir/test"
return xdmp:database-restore-validate(
    xdmp:database-forests(xdmp:database($db-name)),
    $db-backup-path)

Set $db-name and $db-backup-path appropriately for your environment.  The result is a backup plan in XML format. Look at both the ‘forest-status’ and ‘directory-status’ for each of the forests.  Both should have the “okay” value.

A common error for the ‘directory-status’ is “non-existent”.  If you get this error, check the following.

- Verify that the backup directory exists on each server in the cluster that has a forest in the database;

- Verify that the backup directory has a “Forests” subdirectory, and the “Forests” directory contains subdirectories for each of the forests that reside on the Server.

- For the above directories, subdirectories and file contents, verify that the MarkLogic process has the proper credentials to access them.

2. If xdmp:database-backup-validate, or xdmp:database-incremental-backup-validate does not indicate any errors, then look in the MarkLogic Server’s ErrorLog.txt for entries during the time of the restore for any errors reported.  It is a good idea to set the MarkLogic Server group’s ‘File log level’ to ‘debug’ in order to get detailed error messages.

Helpful Commands:  

On Unix Systems, the following commands may be useful in troubleshooting:

  • Check the 'file system access user ID' for the MarkLogic process
    • ps -A -o fuser,pid,comm | grep MarkLogic
  • View file/directory permissions, owner and group
    • ls -l
  • Change ownership recursively.  In a default installation this should be daemon
    • chown -R daemon:daemon /path/to/Backup
  • Add read and write permissions recursively
    • chmod -R +rw /path/to/Backup

Further Reading

Transporting Resources to a New Cluster

Phases of Backup or Restore Operation

Restoring a Reconfigured Database

Introduction

By default, at present, MarkLogic 10 comes with four default users already configured on a new install.

These are admin, healthcheck, nobody, and infostudio-admin.

About these default users

Customers often want to know more about these users, the reasons they exist, and whether they can be removed after installing MarkLogic for security purposes.

The list below provides some basic details about these users and the reasons for their existence:

    admin

During the installation, you are prompted to specify the username and password for this user. Most of the time, 'admin' is used as a username and is created as an authorized administrator with the admin role.

See https://docs.marklogic.com/guide/security/authentication#id_95214

However, it can be created with a different name as well.

If there are other users with the 'admin role' assigned to them, and if there is an 'admin' user too, then this default 'admin' user can be deleted. In general, it is good security practice to have administrator users with names other than 'admin'.

Note that an administrator is the most important user, and you must not lose the password for users with the admin role. At least one such user (or superuser) must exist to perform admin tasks and to recover in case of External Authentication failures.

   healthcheck

The healthcheck user is created by default and is used with the HealthCheck app server, so it cannot be deleted. The app server simply reports back the message "Healthy" if the server is responding well (port 7997). For example, load balancers detect whether MarkLogic is running properly via the HealthCheck App Server on port 7997.
Also, as it has no privileges by default apart from the above, it cannot be used to access MarkLogic Server via the Admin GUI or Query Console.

Note: The healthcheck user is the default user of the HealthCheck app server, which uses application-level authentication with internal security (that is, the Security database) and requires no authentication. Thus, it is fine to use any password.

   infostudio-admin

This user is now obsolete and can be deleted. However, if you are upgrading, make sure your systems are not using it anywhere. It is mainly left in place for backward compatibility and may be removed in future releases.

    nobody

The nobody user is created by default when MarkLogic Server is installed and by default, it just has the app-user role (inherits rest-reader role). User nobody is given a password that is randomly generated. The nobody user can only access pages and perform functions for which no privileges are required.

Note: This is the default user for various app servers such as App-Services, Admin, and Manage, so it cannot be deleted. By default, it cannot be used to log in to MarkLogic Server.

Refer to the Security Guide for more details.

Introduction

This article details changes to the upgrade procedures for MarkLogic 9 AMIs.

MarkLogic 9 now supports 1-click deployment in AWS Marketplace. This is in addition to the existing options of manually launching an AMI and launching MarkLogic clusters via CloudFormation templates. In order to make 1-click launch possible, our AMIs have a pre-configured data volume (device on /dev/sdf).  The updated CloudFormation templates account for the pre-configured data volume. This change also requires a different approach to our documented upgrade process.

Details

As per MarkLogic EC2 Guide, the main goal of the upgrade is to update AMI IDs in CloudFormation in order to upgrade all instances in the stack. There is now an additional step to handle the blank data volume that is pre-configured on MarkLogic AMIs.

Always backup your data before attempting any upgrade procedures!

Scenario 1:  You are using unmodified CF templates that were published by MarkLogic on http://developer.marklogic.com/products/cloud/aws starting from version 8.0-3.

  1. Update your CloudFormation stack with the latest template as there were no breaking changes since 8.0-3. The current templates for MarkLogic 9 include new AWS regions, new AMI IDs, and code to remove blank data volume that is bundled with current AMIs.
  2. In the EC2 Dashboard, stop one instance at a time and wait for it to be replaced with a new one.
  3. For a rolling upgrade (and as a good practice), terminate the other nodes one by one, starting with the node that has the Security database. They will come up and reconnect without any UI interaction.
  4. Go to 8001 port on any new instance where an upgrade prompt should be displayed.
  5. Click OK and wait for the upgrade to complete on the instance.

Scenario 2: You made some changes to MarkLogic templates or you are using custom templates.

  1. Download current templates from http://developer.marklogic.com/products/cloud/aws.
  2. Locate the AMI IDs by searching for "AWSRegionArch2AMI" block in the template.
    "AWSRegionArch2AMI": {
          "us-east-1": {
            "HVM": "ami-54a8652e"
          },
          "us-east-2": {
            "HVM": "ami-2ab29f4f"
          }, ...
  3. Locate AMI IDs in the old template and replace them with the ones from the new template. 
  4. Locate "BlockDeviceMappings" section in the new template that was downloaded in step 1. This block of code was added to remove blank volume that is part of the new 1-click AMIs.
  5. Update the old template to include "BlockDeviceMappings" as a property of LaunchConfig. There will be one or three LaunchConfig blocks depending on the template used. Those can be located by searching for "AWS::AutoScaling::LaunchConfiguration". Here is an example of the new property under LaunchConfig.
    "LaunchConfig": {
      "Type": "AWS::AutoScaling::LaunchConfiguration",
      "Properties": {
        "BlockDeviceMappings": [{
          "DeviceName": "/dev/sdf",
          "NoDevice": true,
          "Ebs": {}
        }],
        ...
  6. Once all the changes are saved, update your stack with the updated CloudFormation template. Make sure the stack update is complete.
  7. In the EC2 Dashboard, terminate nodes one by one starting with the node that has Security database. New nodes will come up after a couple of minutes and reconnect without any UI interaction.
  8. Wait for all nodes to be up and in green state.
  9. Go to 8001 port on any new instance where an upgrade prompt should be displayed.
  10. Click OK and wait for the upgrade to complete on the instance.
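For steps 2 and 3 of this scenario, the AMI ID for a region can be looked up in a downloaded template with a short text search. This is a sketch under the assumption that the template is laid out like the "AWSRegionArch2AMI" excerpt above, with the region key on one line and the AMI ID on a following line; the function name ami_for_region is hypothetical, and a JSON-aware tool would be more robust for templates formatted differently.

```shell
# Hypothetical helper: print the first AMI ID that follows a region key in a template.
# Usage: ami_for_region TEMPLATE_FILE REGION
ami_for_region() {
  template=$1
  region=$2
  awk -v region="\"$region\"" '
    index($0, region) { in_region = 1 }          # found the quoted region key
    in_region && match($0, /ami-[0-9a-f]+/) {    # first AMI ID after the key
      print substr($0, RSTART, RLENGTH)
      exit
    }
  ' "$template"
}

# Example, with a hypothetical file name:
#   ami_for_region new-template.json us-east-1
```

Running it against both the old and the new template for each region in use gives the pairs of IDs to swap.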

Scenario 3: You have instances that were brought up directly from a MarkLogic AMI. For each MarkLogic instance in your cluster, do the following:

  1. Terminate the instance.
  2. Launch a new instance from the upgraded AMI.
  3. Detach the blank volume that is mounted on /dev/sdf (it should be 10GB in size).
  4. Attach the EBS data volume associated with the original instance.

More details on how to update CloudFormation stack can be found at http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks.html

There are various operating system settings that MarkLogic prescribes for best performance. During the startup of a MarkLogic Server instance, some of these parameters are set to the recommended values. These parameters include:

  • File descriptor limit
  • Number of processes per user
  • Swappiness
  • Dirty background ratio
  • Max sectors
  • Read ahead

For some settings, Info level error log messages are recorded to indicate that these values were changed. For example, the MarkLogic Server error log might include lines similar to:

2020-03-03 12:40:25.512 Info: Reduced Linux kernel swappiness to 1
2020-03-03 12:40:25.512 Info: Reduced Linux kernel dirty background ratio to 1
2020-03-03 12:40:25.513 Info: Reduced Linux kernel read ahead to 512KB for vda
2020-03-03 12:40:25.513 Info: Increased Linux kernel max sectors to 2048KB for vda
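
If you want to audit which settings were adjusted at startup, messages like these can be extracted from the error log with a short script. The sketch below is in Python and assumes that the messages follow the exact wording of the four sample lines above; other settings or other server versions may be worded differently. The function name parse_kernel_changes is hypothetical.

```python
import re

# Matches lines such as:
#   2020-03-03 12:40:25.513 Info: Reduced Linux kernel read ahead to 512KB for vda
# Assumption: the wording is the same as in the sample log lines.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) Info: "
    r"(?P<action>Reduced|Increased) Linux kernel (?P<setting>.+?) to (?P<value>\S+)"
    r"(?: for (?P<device>\S+))?$"
)


def parse_kernel_changes(lines):
    """Return (setting, value, device) tuples for each matching log line.

    `device` is None for settings that are not tied to a block device.
    """
    changes = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            changes.append((match["setting"], match["value"], match["device"]))
    return changes
```

Pass the function an open ErrorLog.txt file (or any iterable of lines) to get a list of the kernel settings that MarkLogic changed.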

SUMMARY

This article will help MarkLogic Administrators and System Architects who need to understand how to provision the I/O capacity of their MarkLogic installation.

MarkLogic Disk Usage

Databases in MarkLogic Server are made up of forests. Individual forests are made up of stands. In the interests of both read and write performance, MarkLogic Server doesn't update data already on disk. Instead, it simply writes to the current in-memory stand, which will then contain the latest version of any new or changed fragments, and old versions are marked as obsolete. The current in-memory stand will eventually become yet another on-disk stand in a particular forest.

Ultimately, however, the more stands or obsolete fragments there are in a forest, the more time it takes to resolve a query. Merging is a background process that reduces the number of stands and purges obsolete fragments in each forest in a database, thereby improving the time it takes to resolve queries. Because merges are so important to the optimal operation of MarkLogic Server, it's important to provision the appropriate amount of I/O bandwidth: each forest will typically need 20MB/sec read and 20MB/sec write. For example, a machine hosting four forests will typically need sufficient I/O bandwidth for both 80MB/sec read and 80MB/sec write.
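
The sizing rule above is simple multiplication, and can be written down as a small helper. This Python sketch uses the 20MB/sec per-forest figure from this article as a default; the function name required_io_bandwidth is hypothetical, and the result is a starting point for provisioning rather than a measured requirement.

```python
PER_FOREST_MB_PER_SEC = 20  # typical per-forest figure quoted in this article


def required_io_bandwidth(forests_per_host, per_forest=PER_FOREST_MB_PER_SEC):
    """Estimate the read and write bandwidth (MB/sec) a host should provide."""
    if forests_per_host < 0:
        raise ValueError("forests_per_host must not be negative")
    total = forests_per_host * per_forest
    # Read and write are provisioned separately, each at the same rate.
    return {"read_mb_per_sec": total, "write_mb_per_sec": total}


if __name__ == "__main__":
    # A host with four forests needs roughly 80MB/sec read and 80MB/sec write.
    print(required_io_bandwidth(4))
```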

Determining I/O Bandwidth

One way to determine I/O bandwidth would be to use a synthetic benchmarking utility to return the available read and write bandwidth for the system as currently provisioned. While useful in terms of getting a ballpark sense of the I/O capacity, this approach unfortunately does not provide any information about the real world demand that will ultimately be placed on that capacity.

Another way would be to actually load test a candidate provisioning against the application you're going to deploy on this cluster. If you start from our general recommendations (from MarkLogic: Understanding System Resources) then do an application level load test (paying special attention to I/O heavy activities like ingestion or re-indexing, and the subsequent merging), the system metrics from that load test will then tell you what, if any, bottlenecks or extra capacity may exist on the system across not only your I/O subsystem, but for your CPU and RAM usage as well.

For both of these approaches (measuring capacity via synthetic benchmarks or measuring both capacity and demand vs. synthetic application load), it would also be useful to have some sense of the theoretical available I/O bandwidth before doing any testing. In other words, if you're provisioning shared storage like SAN or NAS, your storage admin should have some idea of the bandwidth available to each of the hosts. If you're provisioning local disk, you probably already have some performance guidance from the vendors of the I/O controllers or disks being used in your nodes. We've seen situations in the past where actual available bandwidth has been much different from expected, but at a minimum the expected values will provide a decent baseline for comparison against your eventual testing results.


Introduction

This article provides a list of IP ports that MarkLogic Server uses.

MarkLogic Server Ports

The following IP ports should be open and accessible on every host in the cluster:

Port 7997 (TCP/HTTP) is the default HealthCheck application server port and is required to check health/proper running of a MarkLogic instance.

Port 7998 (TCP/XDQP) is the default "foreign bind port" on which the server listens for foreign inter-host communication - required for the database replication feature. This port is configurable and can be set with the admin:host-set-foreign-port() function. 

Port 7999 (TCP/XDQP) is the default "bind port" on which the server listens for inter-host communication within the cluster. The bind port is required for all MarkLogic Server clusters. This port is configurable and can be set with the admin:host-set-port() function.

Port 8000 (TCP/HTTP) is the default App-Services application server port and is required by Query Console.

Port 8001 (TCP/HTTP) is the default Admin application server port and is required by the Admin UI.

Port 8002 (TCP/HTTP) is the default Manage application server port and is required by Configuration Manager and Monitoring Dashboard.

MarkLogic 9 and Telemetry

Port 443 (TCP/HTTPS): Outbound connections on this port must be allowed in order to use the MarkLogic Telemetry feature, introduced in MarkLogic 9.

Ops Director Ports

The following ports are the default ports used by Ops Director.  These can be changed during the installation process.

Port 8003 (TCP/HTTP) is the "SecureManage" default port and must be open on the managed cluster to allow the Ops Director cluster to monitor it. If 8003 is already in use, the next open port above 8003 is chosen.

Port 8008 (TCP/HTTP) is the "OpsDirectorApplication" default application server port, and allows access to the Ops Director UI.

Port 8009 (TCP/HTTP) is the "OpsDirectorSystem" default application server port, and allows access to the Ops Director APIs.

Data Hub Framework (DHF) Ports

The following ports are the default ports used by the Data Hub Framework.  Both the ports and the database/app server names can be changed during the installation process.

Port 8010 (TCP/HTTP) is the "data-hub-STAGING" default application server port for accessing ingested data for further processing.

Port 8011 (TCP/HTTP) is the "data-hub-FINAL" default application server port for downstream applications to access harmonized data.

Port 8013 (TCP/HTTP) is the "data-hub-JOBS" default application server port for jobs (flow runs).

Recommendations

In production, the ports listed above should be hidden behind a firewall. Only your customer application ports should be accessible to outside users. We also recommend disabling Query Console and CQ instances in production to avoid an errant query that may run away with system resources.

Netstat Utility

The netstat utility is useful for checking open ports:

Linux:

  • netstat -an | egrep 'Proto|LISTEN'

Windows:

Open cmd as Administrator

  • C:\Windows\System32>netstat -abon -p tcp

Look for MarkLogic.exe entries in the list.
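
netstat shows which ports are listening on the local host. To check whether the default MarkLogic ports are reachable from another machine (for example, through a firewall), a TCP connection test can be used instead. The following Python sketch is an illustration using only the standard library; the port list is taken from this article, and the function names are hypothetical.

```python
import socket

# Default MarkLogic ports listed in this article.
MARKLOGIC_DEFAULT_PORTS = {
    7997: "HealthCheck",
    7998: "foreign bind port (XDQP)",
    7999: "bind port (XDQP)",
    8000: "App-Services",
    8001: "Admin",
    8002: "Manage",
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def check_ports(host, ports=MARKLOGIC_DEFAULT_PORTS, timeout=1.0):
    """Return a {port: reachable} mapping for the given host."""
    return {port: is_port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    for port, reachable in check_ports("localhost").items():
        state = "open" if reachable else "closed or filtered"
        print(f"{port} ({MARKLOGIC_DEFAULT_PORTS[port]}): {state}")
```

A port reported as closed may simply be blocked by a firewall between the two machines, so compare the result with the netstat output on the host itself.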

Commonly Used Ports

The following is a list of commonly used ports and services that may need to have access limited or be disabled/blocked, based on your local network security policies.

General Service Ports

  • 20: FTP (data transfer mode)
  • 21: FTP (control/command mode)
  • 22: SSH
  • 23: Telnet
  • 43: WHOIS
  • 53: DNS

Web Service Ports

  • 80: HTTP
  • 119: NNTP
  • 3306: MySQL

Control Panel Default Ports

  • 2082: cPanel
  • 2083: Secure cPanel
  • 2086: WHM
  • 2087: Secure WHM
  • 2095: cPanel Webmail
  • 2096: Secure cPanel Webmail
  • 8443: Secure Plesk
  • 8880: Plesk
  • 10000: Webmin

E-mail Service Ports

  • 25: SMTP
  • 109: POP2
  • 110: POP3
  • 143: IMAP
  • 465: SMTPS
  • 993: IMAPS

This Wikipedia article contains a more comprehensive list:

http://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers

Summary

Version downgrades are not supported by MarkLogic Server.

Back up your configuration files before you do anything else

Please ensure you have all your current configuration files backed up.

Each host in a MarkLogic cluster is configured using parameters which are stored in XML Documents that are available on each host. These are usually relatively small files and will zip up to a manageable size.

If you cd to your "Data" directory (/var/opt/MarkLogic on Linux, C:\Program Files\MarkLogic\Data on Windows, and /Users/{username}/Library/Application Support/MarkLogic on OS X), you should see several XML files (assignments, clusters, databases, groups, hosts, server).

Whenever MarkLogic updates any of these files, it creates a backup using the same naming convention used for older ErrorLog files (_1, _2 etc). We recommend backing up all configuration files before following the steps under the next heading.
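
One way to take this backup is to zip every XML file in the Data directory. The Python sketch below is a minimal illustration, not a MarkLogic utility. It assumes the configuration files sit directly in the Data directory with an .xml extension, and it also picks up the numbered backup copies if they match that pattern. The function name backup_config_files is hypothetical.

```python
import zipfile
from pathlib import Path


def backup_config_files(data_dir, archive_path):
    """Zip every *.xml file found directly in `data_dir`.

    Returns the sorted list of file names that were added to the archive.
    """
    xml_files = sorted(p for p in Path(data_dir).glob("*.xml") if p.is_file())
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in xml_files:
            # Store only the file name so the archive unpacks into a flat directory.
            archive.write(path, arcname=path.name)
    return [p.name for p in xml_files]


if __name__ == "__main__":
    # Linux default Data directory; adjust for Windows or OS X.
    saved = backup_config_files("/var/opt/MarkLogic", "marklogic-config-backup.zip")
    print("Backed up:", ", ".join(saved))
```

Run the script on every host in the cluster, since each host keeps its own copy of the configuration files.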

Upgrade Guidelines:

If you follow these upgrade guidelines and recommendations, then it is unlikely that you will need to downgrade your MarkLogic Server instance.

  • Stay current with upgrades. MarkLogic Server typically gets faster and more stable with each release.  
  • Upgrade to the latest maintenance release of your current version before upgrading to the next feature version (for example: when upgrading from v6.0-2 to v7.0-2.3, first upgrade to the latest 6.0-x release). This may be required for older maintenance releases.
  • [Note] Upgrading across 2 feature releases in a single step may not be supported - please refer to the Installation and Upgrade Guide for supported upgrade paths.
  • When planning an upgrade across feature releases of MarkLogic Server, plan for a full test cycle of your application.
  • In addition to testing your application on the new version, test the upgrade process.
  • Always read the release notes before performing an upgrade. Pay particular attention to the "Known Incompatibilities" section.

Back-out Plan

Although it is not expected that you will ever need to back out a version upgrade of MarkLogic Server, it is always prudent to have a contingency plan for the worst case scenario.

Before an upgrade, you will want to:

  • Create a backup of all of your forests in all of your databases.
  • Save your deployment scripts (used for creating your system configuration).

In the unlikely event you want to restore your system to a previous version level, you will need to first make a decision regarding the data in your databases.

MarkLogic does not support restoring a backup made on a newer version of MarkLogic Server onto an older version of MarkLogic Server. Your Back-Out Plan will need to take this into consideration.

  • If it is sufficient to roll back to the data as it existed before the upgrade, you can simply restore the data from the backup you made prior to the upgrade.
  • If you can recreate or reload all of the data that was inserted after the upgrade, you can restore the data from the pre-upgrade backup and then reload or recreate the new data.
  • If you need to capture the current data as it now exists in the database, you can use a tool like XQSync to back up the database contents to a Zip file, and then use XQSync again to load this backup.

Once you have decided how to handle your data, you will need to recreate your MarkLogic instance from a fresh install. This can be done on fresh hardware, or in place if you are careful with your data. The steps will include:

  • Uninstall MarkLogic on each host.
  • Remove any previous instance of MarkLogic data and configuration files from each host.
  • Install MarkLogic Server.
  • Recreate your configuration.
  • Restore your data using the method you decided on previously.

In addition to testing an upgrade, you should also test your Back-Out Plan.

What is MarkLogic Data Hub?

MarkLogic’s Data Hub increases data integration agility, in contrast to time consuming upfront data modeling and ETL. Grouping all of an entity’s data into one consolidated record with that data’s context and history, a MarkLogic Data Hub provides a 360° view of data across silos. You can ingest your data from various sources into the Data Hub, standardize your data - then more easily consume that data in downstream applications. For more details, please see our Data Hub documentation.

Note: Prior to version 5.x, Data Hub was known as Data Hub Framework (DHF).

Takeaways:

  • In contrast to previous versions, Data Hub 5 is largely configuration-based. Upgrading to Data Hub 5 will require either:
    • Conversion of legacy flows from the code-based approach of previous versions to the configuration-based format of Data Hub 5
    • Executing your legacy flows with the “hubRunLegacyFlow” Gradle task
  • It’s very important to verify the “Version Support” information on the Data Hub GitHub README.md before installing or upgrading to any major Data Hub release

Pre-requisites:

One of the pre-requisites for installing Data Hub is to check for the supported/compatible MarkLogic Server version. For details, see our version compatibility matrix. Other pre-requisites can be seen here.

New installations of Data Hub

We always recommend installing the latest Data Hub version compatible with your current MarkLogic Server version. For example:

  • If a customer is running MarkLogic Server 9.0-7, they should install the most recent compatible Data Hub version (5.0.2), even if previous Data Hub versions (such as 5.0.1, 5.0.0, 4.x and 3.x) also work with server version 9.0-7.

  • Similarly, if a customer is running 9.0-6, the recommended Data Hub version would be 4.3.1 instead of previous versions 4.0.0, 4.1.x, 4.2.x and 3.x.

Note: A specific MarkLogic server version can be compatible with multiple Data Hub versions and vice versa, which allows independent upgrades of either Data Hub or MarkLogic Server.

 

Upgrading from a previous version

  1. To determine your upgrade path, first find your current Data Hub version in the “Can upgrade from” column in the version compatibility matrix.
  2. While Data Hub should generally work with future server versions, it’s always best to run the latest Data Hub version that's also explicitly listed as compatible with your installed MarkLogic Server version.
  3. If required, make sure to upgrade your MarkLogic Server version to be compatible with your desired Data Hub version. You can upgrade MarkLogic Server and Data Hub independently of each other as long as you are running a version of MarkLogic Server that is compatible with the Data Hub version you plan to install. If you are running an older version of MarkLogic Server, then you must upgrade MarkLogic Server first, before upgrading Data Hub.

Note: Data Hub is not designed to be 'backwards' compatible with any version before the MarkLogic Server version listed with the release. For example, you can’t use Data Hub 3.0.0 on 9.0-4 – you’ll need to either downgrade to Data Hub 2.0.6 while staying on MarkLogic Server 9.0-4, or alternatively upgrade MarkLogic Server to version 9.0-5 while staying on Data Hub 3.0.0.

  • Example 1 - Scenario where you DO NOT NEED to upgrade MarkLogic Server:

         

  • Current Data Hub version: 4.0.0
  • Target Data Hub version: 4.1.x
  • MarkLogic Server version: 9.0-9
  • The “Can upgrade from” value for the target version shows 2.x, which means you need to be on at least Data Hub 2.x. Since the current Data Hub version is 4.0.0, this requirement has been met.
  • Unless there is a strong reason for choosing 4.1.x, we highly recommend upgrading to the latest 4.x version compatible with MarkLogic Server 9.0-9, which in this example is 4.3.2. Consequently, the recommended upgrade path becomes 4.0.0-->4.3.2 instead of 4.0.0-->4.1.x.
  • Since 9.0-9 is supported by the recommended Data Hub version 4.3.2, there is no need to upgrade MarkLogic Server.
  • Hence, the recommended path is Data Hub 4.0.0-->4.3.2.

 

  • Example 2 - Scenario where you NEED to upgrade MarkLogic Server:

           

  • Current Data Hub version: 3.0.0
  • Target Data Hub version: 5.0.2
  • MarkLogic Server version: 9.0-6
  • The “Can upgrade from” value for the target version shows Data Hub version 4.3.1, which means you need to be on at least 4.3.x (4.3.1 or 4.3.2, depending on your MarkLogic Server version). Since the current Data Hub version 3.0.0 doesn’t satisfy this requirement, the upgrade path after this step becomes Data Hub 3.0.0-->4.3.x.
  • As per the matrix, the latest compatible Data Hub version for 9.0-6 is 4.3.1, so the path becomes 3.0.0-->4.3.1.
  • From the matrix, the minimum supported MarkLogic Server version for 5.0.2 is 9.0-7, so you will have to upgrade your MarkLogic Server version before upgrading your Data Hub version to 5.0.2.
  • Because 9.0-7 is supported by all 3 versions under consideration (3.0.0, 4.3.1 and 5.0.2), the recommended path can be either:
    1. Upgrade Data Hub from 3.0.0 to 4.3.1-->upgrade MarkLogic Server to at least 9.0-7-->upgrade Data Hub to 5.0.2
    2. Upgrade MarkLogic Server to at least 9.0-7-->upgrade Data Hub from 3.0.0 to 4.3.1-->upgrade Data Hub to 5.0.2
  • Recall that Data Hub 5 moved to a configuration-based approach from previous versions’ code-based approach. Upgrading to Data Hub 5 from a previous major version will require either:
    • Conversion of legacy flows from the code-based approach of previous versions to the configuration-based format of Data Hub 5
    • Executing your legacy flows with the “hubRunLegacyFlow” Gradle task
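
The "do I need to upgrade MarkLogic Server first?" check in the examples above comes down to comparing version strings. The Python sketch below illustrates this. Its minimum-version table contains only the two pairs stated in this article (Data Hub 3.0.0 needs 9.0-5, Data Hub 5.0.2 needs 9.0-7) and is not a substitute for the version compatibility matrix. The names MIN_SERVER_VERSION, parse_version and needs_server_upgrade are hypothetical.

```python
import re

# Illustrative subset only, taken from the examples in this article.
# Always consult the version compatibility matrix for real upgrades.
MIN_SERVER_VERSION = {
    "3.0.0": "9.0-5",
    "5.0.2": "9.0-7",
}


def parse_version(version):
    """Turn a MarkLogic version such as '9.0-7' into a comparable tuple (9, 0, 7)."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def needs_server_upgrade(data_hub_version, server_version, minimums=MIN_SERVER_VERSION):
    """Return True if the server is older than the minimum for this Data Hub version."""
    minimum = minimums[data_hub_version]
    return parse_version(server_version) < parse_version(minimum)


if __name__ == "__main__":
    # Example 2: Data Hub 5.0.2 on MarkLogic Server 9.0-6 requires a server upgrade.
    print(needs_server_upgrade("5.0.2", "9.0-6"))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating 9.0-10 as older than 9.0-7.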

Links for Reference:

https://docs.marklogic.com/datahub/upgrade.html

Introduction

The MarkLogic Monitoring History feature allows you to capture and view critical performance data from your cluster. By default, this performance data is stored in the Meters database. This article explains how you can plan for the additional disk space required for the Meters database.

Meters Database Disk Usage

Just like any other database, the Meters database is made up of forests, which in turn are made up of stands that reside physically on disk. Because Monitoring History uses the Meters database to store critical performance data for your cluster, the amount of information can grow significantly as the number of hosts, forests, databases, and so on increases. This makes it important to plan and manage the disk space required by the Meters database.

Recommendation

The Meters database stores critical performance data for your cluster. The size of the data is proportional to the number of hosts, app servers, forests, databases, etc. Typically, the raw retention settings have the largest impact on size.

MarkLogic's recommendation for a new install is to start with the default settings and monitor usage over the first two weeks of an install. The performance history charts, constrained to just show the Meters database, will show an increasing storage utilization over the first week, then leveling off for the second week. This would give you a decent idea of space utilization going forward.

You can then adjust the number of days of raw measurements that are retained.

You can also add additional forests to spread the Meters database over more hosts if needed.

Monitoring History

The Monitoring History feature allows you to capture and view critical performance data from your cluster. Monitoring History capture is enabled at the group level. Once the performance data has been collected, you can view the data in the Monitoring History page.

By default, the performance data is stored in the Meters database. A consolidated Meters database that captures performance metrics from multiple groups can be configured, if there is more than one group in the cluster.

Monitoring History Data Retention Policy

The data retention policy determines how long performance data is kept in the Meters database before it is deleted. (http://docs.marklogic.com/guide/monitoring/history#id_80656)

If meters data is not being cleared according to the retention policy, first check the range indexes configured for the Meters database.

Range indexes and the Meters Database

The Meters database is configured with a set of range indexes. If these indexes are missing or misconfigured, Meters data may not be cleaned up according to the retention policy.

Range indexes can be missing or misconfigured in either of the following scenarios:

  • The cluster was upgraded from a version of MarkLogic before 7.0 and the upgrade had some issues.
  • The indexes were created manually (when using another database for meters data instead of the default Meters database).

The size of the meters database can grow significantly as the cluster grows, so it is important that the meters database is cleared per the retention policy.

The required indexes (as of 8.0-5 and 7.0-6) are attached as a MarkLogic Configuration Manager package (http://docs.marklogic.com/guide/admin/config_manager#id_38038). Once these are added, the Meters database will reindex and the older data should be deleted.

Note that data is deleted no sooner than the retention period specifies. Data older than the retention period may still be kept for an unspecified amount of time.

Related documentation

http://docs.marklogic.com/guide/monitoring

https://help.marklogic.com/Knowledgebase/Article/View/259/0/metering-database-disk-space-requirements

SUMMARY:

Prior to MarkLogic 4.1-5, role-ids were randomly generated. We now use a hash algorithm that ensures that roles created with the same name will be assigned the same role-id. As a result, attempting to migrate data from a forest created prior to MarkLogic 4.1-5 to a newer installation can cause a "role not defined" error. To work around this issue, create a new role with the role-id defined in the legacy system.

Procedure:

This process creates a new role with the same role-id from your legacy installation and assigns this old role to your new role with the correct name.

Step 1: You will need to find the role-id of the legacy role. This will need to be run against the security DB on the legacy server. 

<code>

xquery version "1.0-ml";
import module namespace sec="http://marklogic.com/xdmp/security" at
"/MarkLogic/security.xqy";

let $role-name := "Enter Role Name Here"

return
/sec:role[./sec:role-name=$role-name]/sec:role-id/text()

</code>


Step 2: In the new environment, store the attached module to the following location on the host containing the security DB.

/opt/MarkLogic/Modules/role-edit/create-master-role.xqy

Step 3: Ensure that you have created the role on the new cluster.

Step 4: Run the following code against the new cluster's Security database. This will create a new role with the legacy role-id. Be sure to enter the role name, description, and role-id from Step 1.

<code>
xquery version "1.0-ml";
import module namespace cmr="http://role-edit.com/create-master-role" at
"/role-edit/create-master-role.xqy";

let $role-name := "ENTER ROLE NAME"
let $role-description := "ENTER ROLE DESCRIPTION"
let $legacy-role-id := 11658627418524087702 (: Replace this with the Role ID from Step 1:)

let $legacy-role := fn:concat($role-name,"-legacy")
let $legacy-role-create := cmr:create-role-with-id($legacy-role, $role-description, (), (), (), $legacy-role-id)

return
fn:concat("Inserted role named ",$legacy-role," with id of ",$legacy-role-id)

</code>


Step 5: Run the following code against the new cluster's Security database to assign the legacy role to the new role.

<code>
xquery version "1.0-ml";
import module namespace sec="http://marklogic.com/xdmp/security" at
"/MarkLogic/security.xqy";

let $role-name := "ENTER ROLE NAME"
let $legacy-role := fn:concat($role-name,"-legacy")

return
(
sec:role-set-roles($role-name, ($legacy-role)),
"Assigned ",$legacy-role," role to ",$role-name," role"
)

</code>

 

You should now have a new role named [your-role]-legacy.  This legacy role will contain the role-id from your legacy installation and will be assigned to [your-role] on the new installation.  Legacy documents in your DB will now have the same rights they had in the legacy system.

Introduction

In this article, we discuss use of xdmp:cache-status in monitoring cache status, and explain the values returned.

Details

Note that xdmp:cache-status is a relatively expensive operation, so it’s not something to run every minute, but it may be valuable to run it occasionally for information on current cache usage.

Output format

The values returned by xdmp:cache-status are per host, defaulting to the current host. It takes an optional host-id to allow you to gather values from a specific host in the cluster.

The output of xdmp:cache-status will look something like this:

<cache-status xmlns="http://marklogic.com/xdmp/status/cache">
  <host-id>18349804367231394552</host-id>
  <host-name>macpro-2113.local</host-name>
  <compressed-tree-cache-partitions>
    <compressed-tree-cache-partition>
      <partition-size>512</partition-size>
      <partition-table>0.2</partition-table>
      <partition-used>0.8</partition-used>
      <partition-free>99.2</partition-free>
      <partition-overhead>0</partition-overhead>
    </compressed-tree-cache-partition>
  </compressed-tree-cache-partitions>
  <expanded-tree-cache-partitions>
    <expanded-tree-cache-partition>
      <partition-size>1024</partition-size>
      <partition-table>0.7</partition-table>
      <partition-busy>0</partition-busy>
      <partition-used>30.4</partition-used>
      <partition-free>69.6</partition-free>
      <partition-overhead>0</partition-overhead>
    </expanded-tree-cache-partition>
  </expanded-tree-cache-partitions>
  <list-cache-partitions>
    <list-cache-partition>
      <partition-size>1024</partition-size>
      <partition-table>0.2</partition-table>
      <partition-busy>0</partition-busy>
      <partition-used>0</partition-used>
      <partition-free>100</partition-free>
      <partition-overhead>0</partition-overhead>
    </list-cache-partition>
  </list-cache-partitions>
  <triple-cache-partitions>
    <triple-cache-partition>
      <partition-size>1024</partition-size>
      <partition-busy>0</partition-busy>
      <partition-used>0</partition-used>
      <partition-free>100</partition-free>
    </triple-cache-partition>
  </triple-cache-partitions>
  <triple-value-cache-partitions>
    <triple-value-cache-partition>
      <partition-size>512</partition-size>
      <partition-busy>0</partition-busy>
      <partition-used>0</partition-used>
      <partition-free>100</partition-free>
    </triple-value-cache-partition>
  </triple-value-cache-partitions>
</cache-status>

Values

cache-status contains information for each partition of the caches:

  • The list cache holds search term lists in memory and helps optimize XPath expressions and text searches.
  • The compressed tree cache holds compressed XML tree data in memory. The data is cached in memory in the same compressed format that is stored on disk.
  • The expanded tree cache holds the uncompressed XML data in memory (in its expanded format).
  • The triple cache holds triple data.
  • The triple value cache holds triple values.

The following are descriptions of the values returned:

  • partition-size: The size of a cache partition, in MB.
  • partition-table: The percentage of the table for a cache partition that is currently used. The table is a data structure with a fixed overhead per cache entry, used for cache administration. Its size fixes the number of entries that can be resident in the cache. If the partition table is full, something will need to be removed before another entry can be added to the cache.
  • partition-busy: The percentage of the space in a cache partition that is currently used and cannot be freed.
  • partition-used: The percentage of the space in a cache partition that is currently used.
  • partition-free: The percentage of the space in a cache partition that is currently free.
  • partition-overhead: The percentage of the space in a cache partition that is currently overhead.

When do I get errors?

You will get a cache-full error when nothing can be removed from the cache to make room for a new entry.

The "partition-busy" value is the most useful indicator of how close you are to a cache-full error. It tells you what percentage of the cache partition is locked down and cannot be freed to make room for a new entry.
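
To keep an eye on partition-busy, the XML returned by xdmp:cache-status can be scanned for partitions above a chosen level. The Python sketch below assumes the output has been saved as a string in the format shown earlier in this article. The 50 percent threshold is an arbitrary illustration, not a MarkLogic recommendation, and the function name busy_partitions is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the cache-status output shown in this article.
NS = {"cs": "http://marklogic.com/xdmp/status/cache"}


def busy_partitions(cache_status_xml, threshold=50.0):
    """Return (cache name, partition index, busy %) for partitions at or above `threshold`.

    Partitions without a partition-busy element are skipped.
    """
    root = ET.fromstring(cache_status_xml)
    report = []
    for group in root:
        # Strip the "{namespace}" prefix that ElementTree adds to tag names.
        tag = group.tag.split("}", 1)[-1]
        if not tag.endswith("-partitions"):
            continue  # host-id, host-name, and so on
        cache_name = tag[: -len("-partitions")]
        for index, partition in enumerate(group):
            busy = partition.findtext("cs:partition-busy", namespaces=NS)
            if busy is not None and float(busy) >= threshold:
                report.append((cache_name, index, float(busy)))
    return report
```

Because xdmp:cache-status is relatively expensive, a check like this is better suited to occasional sampling than to continuous polling.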

 

Update:

Since this article was originally written, MarkLogic has added Forest Rebalancing and Forest Retiring features in more recent versions of MarkLogic Server. For zero downtime movement of forests, please refer to our documentation for these features - http://docs.marklogic.com/guide/admin/database-rebalancing.

The legacy article follows:

Summary

There are many reasons why you may need to move a forest from one storage device to another. For example:

  • Transition from shared storage to dedicated storage (or vice versa);
  • Replace a small storage device with a larger one;
  • Reorganize forest placement.

No matter what the reason, the action of moving forests should be well planned and deliberate, while the procedure should be well tested.  This article lists both the steps that should be followed as well as issues to be considered when planning a move.

We will present two different techniques for moving a forest. The first is appropriate for databases that can be restricted from updates for the duration of the forest move. The second is appropriate for production databases where database downtime needs to be minimized.

Simple Procedure to Move a Forest

The simple procedure to move a forest can be used on any forest whose database can be restricted from updates for the duration of the process.  This will typically be for test, development and staging systems, but may also include production environments that can be disabled for extended maintenance windows.

To retain data integrity, this procedure requires that the associated database be restricted from updates. The update restriction can be enforced in a variety of ways:

  • By setting all forests in the database to “read-only”.
  • By disabling all application servers that reference the database. You will also need to verify that there are no tasks in the task queue that can update the database.
  • By restricting access at the application level.
  • By restricting access procedurally (a common approach in test, development and staging environments).

The following steps can be used to move a forest: 

Step 1: Begin enforcement of update restriction;

Step 2: Create a backup of the forest you would like to move;

Step 3: Create a new forest, specifying the new location for the forest;

Step 4: Restore the forest data from step 2 to the newly created forest;

Step 5: Verify that the forest data is restored successfully;

Step 6: Switch forests attached to the database;

a. Detach the original forest from the database;

b. Attach the new forest to the database;

WARNING: When moving a forest in the Security database, this step must occur in a single transaction (i.e. detaching the original Security forest and attaching a new Security forest in a single transaction). MarkLogic Server must have an operational Security database to function properly.

Step 7: Remove update restriction (from step 1);

Step 8: (Optional) Remove/delete the original forest.

Moving a Forest Minimizing Downtime

If the forest to be moved resides on a production system whose content databases are continually being updated, and if you cannot afford the database to be restricted from updates for the duration of a backup and a restore, then you can use the local disk failover feature to synchronize your forests before switching them.  This approach will minimize the required downtime of the database.

The following steps can be used to move a forest while minimizing downtime: 

Step 1: Create a new forest, specifying the new location for the forest.

Step 2: (Optional) Seed the new forest from backup. Although we will be using the local disk failover feature to synchronize forest content, seeding the new forest from a recent backup will result in faster synchronization and will use fewer resources (i.e. it is less disruptive to the production system).

Step 3: If you do not have a recent forest backup of the forest you would like to move, create one.

Step 4: Perform a forest level restore to the newly created forest.

Step 5: Configure the new forest as a forest replica of the original forest.

Step 6: Wait until the new forest is in the “sync replicating” state.  You can use the Admin UI Forest status page to check for sync replicating.

Step 7: Switch forests: This step requires that the database is OFFLINE for a short period of time.

a. Detach the original forest from the database;

b. Remove the forest replica configuration created in step 5;

c. Attach the new forest to the database;

WARNING: When moving a forest in the Security database, this step must occur in a single transaction (i.e. detaching the original Security forest and attaching the new Security forest in a single transaction). MarkLogic Server must have an operational Security database to function properly.

Step 8: (Optional) Remove/delete the original forest.

Retaining Forest Name

Both forest move procedures presented require the new forest to have a different name than the original because forest names must be unique within a MarkLogic Server cluster and both procedures have the original and new forests existing in the system at the same time. Although rare, some applications have forest name dependencies (i.e. applications that perform in-forest query evaluations or in-forest placement of document inserts). If this is the case, you will either need to update your application, or change the method used to move the forest (since MarkLogic Server does not provide a mechanism to change the name of a forest).  

  • You can modify the “Simple Forest Move” procedure by performing the forest delete after (step 2) ‘creating a successful forest backup’, and before (step 3) ‘creating a new forest’.  This way, in step 3, you can create the new forest with the same name as the forest that was deleted.
  • To retain the forest name while minimizing database downtime, you can perform the “Moving a Forest Minimizing Downtime” procedure twice – the first time to a temporary forest and the second time to the final destination. 

Forest Replicas and Failover Hosts

If the original forest has ‘forest replicas’ or ‘failover hosts’ configured, you will need to detach these configurations before you can delete the original forest.

If you would like the new forest to be configured with ‘forest replicas’ or ‘failover hosts’, you must first detach these configurations from the original forest before reattaching them to the new forest.

Estimate Time

The majority of the time will be spent transferring content from the original forest to the new forest.  You can estimate the amount of time this will take from:

  • The size of the forest on disk (forest-size in MB);
  • The I/O read rate available for the device where the original forest resides (read-rate in MB/second); and
  • The I/O write rate available for the device where the new forest resides (write-rate in MB/second).

Estimate time = (Forest-size / read-rate) + (Forest-size / write-rate)
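As a rough illustration, the estimate above can be computed with a few lines of Python. This is only a sketch: the function name and the sample numbers (a 100 GB forest, 200 MB/second reads, 100 MB/second writes) are assumptions chosen for the example, not measured values.

```python
def estimate_move_seconds(forest_size_mb: float,
                          read_rate_mb_s: float,
                          write_rate_mb_s: float) -> float:
    """Return (forest-size / read-rate) + (forest-size / write-rate) in seconds."""
    if read_rate_mb_s <= 0 or write_rate_mb_s <= 0:
        raise ValueError("I/O rates must be positive")
    return forest_size_mb / read_rate_mb_s + forest_size_mb / write_rate_mb_s


if __name__ == "__main__":
    # Hypothetical example: 100 GB forest, 200 MB/s read, 100 MB/s write.
    seconds = estimate_move_seconds(100 * 1024, 200.0, 100.0)
    print(f"Estimated transfer time: {seconds / 60:.1f} minutes")
```

Treat the result as a lower bound; other activity on either device will reduce the I/O rate actually available to the transfer.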

Sizing Rules and Recommendations

When determining the resources allocated to forest data, it is recommended that you stay within the following guidelines:

[MarkLogic Recommendation] The I/O subsystem should have capacity for sustained I/O at 20 MB/sec per content forest in each direction (i.e., 20 MB/sec reads and 20 MB/sec writes at the same time).

[MarkLogic Recommendation] The size of all forest data on a server should not exceed 1/3 of the available disk space.  The other 2/3 should be available for forest merges and reindexing; otherwise you risk merge or reindex failures.

(The 3x disk space requirement was always true for MarkLogic 6 and earlier releases. However, beginning in MarkLogic 7, the 3x disk space requirement can be reduced if configured and managed.)

[MarkLogic Rule of thumb] Provision at least 2 CPU cores per active forest. This facilitates concurrent operations.

[MarkLogic Rule of thumb] Forests should not grow beyond 200GB or 64-million fragments. These thresholds do not guarantee a particular level of performance and may need to be lowered depending on the application.
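The guidelines above can be turned into a simple planning check. The following is a minimal sketch, assuming you already know the forest sizes, fragment counts and host resources; the Forest class, the function name and the inputs are hypothetical, and the check treats every listed forest as an active content forest.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the guidelines above.
MAX_DATA_FRACTION_OF_DISK = 1 / 3
MIN_IO_MB_PER_SEC_PER_FOREST = 20
MIN_CORES_PER_FOREST = 2
MAX_FOREST_GB = 200
MAX_FOREST_FRAGMENTS = 64_000_000


@dataclass
class Forest:
    name: str
    size_gb: float
    fragments: int


def sizing_warnings(forests: List[Forest], disk_gb: float,
                    cpu_cores: int, io_mb_per_sec: float) -> List[str]:
    """Return one message per guideline that the planned host would exceed."""
    warnings = []
    total_gb = sum(f.size_gb for f in forests)
    if total_gb > disk_gb * MAX_DATA_FRACTION_OF_DISK:
        warnings.append("forest data exceeds 1/3 of available disk space")
    if io_mb_per_sec < MIN_IO_MB_PER_SEC_PER_FOREST * len(forests):
        warnings.append("sustained I/O is below 20 MB/sec per forest")
    if cpu_cores < MIN_CORES_PER_FOREST * len(forests):
        warnings.append("fewer than 2 CPU cores per forest")
    for f in forests:
        if f.size_gb > MAX_FOREST_GB:
            warnings.append(f"forest {f.name} is larger than 200GB")
        if f.fragments > MAX_FOREST_FRAGMENTS:
            warnings.append(f"forest {f.name} has more than 64 million fragments")
    return warnings
```

An empty result only means that none of the rules of thumb are violated; as noted above, the thresholds do not guarantee a particular level of performance.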

Additional Related Knowledgebase articles

Knowledgebase Article: Understand the Logs during rebalancer and reindex activity

Knowledgebase Article: Data Balancing in MarkLogic 

Knowledgebase Article: Rebalancing, replication and forest reordering 

Knowledgebase Article: Diagnosing Rebalancer issues after adding or removing a forest 

 

 

Summary

Clock synchronization plays a critical part in the operation of a MarkLogic Cluster.

MarkLogic Server expects the system clocks to be synchronized across all the nodes in a cluster, as well as between Primary and Replica clusters. The acceptable level of clock skew (or drift) between hosts is less than 0.5 seconds, and values greater than 30 seconds will trigger XDMP-CLOCKSKEW errors, and could impact cluster availability.
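To make the two thresholds concrete, here is a minimal sketch that classifies the skew across a set of hosts. It assumes you have already collected one clock reading per host (in seconds since the epoch) at effectively the same instant; how those readings are gathered is outside the sketch, and the function names and labels are hypothetical.

```python
from typing import Dict

ACCEPTABLE_SKEW_SECONDS = 0.5   # acceptable skew is less than this
CLOCKSKEW_ERROR_SECONDS = 30.0  # skew greater than this triggers XDMP-CLOCKSKEW


def max_skew_seconds(readings: Dict[str, float]) -> float:
    """Largest difference between any two host clock readings."""
    values = list(readings.values())
    return max(values) - min(values)


def classify_skew(skew_seconds: float) -> str:
    """Map a skew value onto the thresholds described above."""
    skew = abs(skew_seconds)
    if skew < ACCEPTABLE_SKEW_SECONDS:
        return "acceptable"
    if skew > CLOCKSKEW_ERROR_SECONDS:
        return "clockskew-error"
    return "out-of-tolerance"
```

A result of "out-of-tolerance" means the hosts are not yet at the point of XDMP-CLOCKSKEW errors, but time synchronization should be investigated before the drift grows further.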

Tools

Network Time Protocol (NTP) is the recommended solution for maintaining system clock synchronization.  NTP services can be provided by public (internet) servers, private servers, network devices, peer servers and more.

NTP Basics

NTP uses a daemon process (ntpd) that runs on the host.  The ntpd daemon periodically wakes up, polls the configured NTP servers to get the current time, and then adjusts the local system clock as necessary.  Time can be adjusted in two ways: by immediately changing to the correct time (stepping), or by gradually speeding up or slowing down the system clock until it reaches the correct time (slewing). The frequency at which ntpd wakes up, called the polling interval, can be adjusted anywhere between 1 and 17 minutes, based on the level of accuracy needed.  NTP uses a hierarchy of server layers called strata.  Each stratum synchronizes with the layer above it, and provides synchronization to the layer below it.

Public NTP Reference Servers

There are many public NTP reference servers available for time synchronization.  It's important to note that the most common public NTP reference server addresses are for a pool of servers, so hosts synchronizing against them may end up using different physical servers.  Additionally, the level of polling recommended for cluster synchronization is usually higher, and excessive polling could result in the reference server throttling or blocking traffic from your systems.

Stand Alone Cluster

For a cluster that is not replicated or connected to another cluster in some way, the primary concern is that all the hosts in the cluster be in sync with each other, rather than being accurate to UTC.

Primary/Replica Clusters

Clusters that act as either Primary or Replicas need to be synchronized with each other for replication to work correctly.  This usually means that the hosts in both clusters should reference the same NTP servers.

NTP Configuration

Time Synchronization Configuration Files

It is common to have multiple servers referenced in the chronyd configuration file, /etc/chrony.conf, or the ntpd configuration file, /etc/ntp.conf. NTP may not choose the server based on the order in the file.  Because of this, hosts could synchronize with different reference servers, introducing differences in the system clocks between the hosts in the cluster. Many organizations already have devices in their infrastructure that can act as NTP servers, as many network devices are capable of acting as NTP servers, as are Windows Primary Domain Controllers.  These devices can use default polling intervals, which avoids excessive polling against public servers.

Once you have identified your NTP server, you can configure the NTP daemon on the cluster hosts. We suggest using a single reference server for all the cluster hosts, then adding all the hosts in the cluster as peers of the current node.  We also suggest adding an entry for the local host as its own server, assigning it a low-priority stratum (stratum 10 in the samples below). Using peers allows the cluster hosts to negotiate and elect a host to act as the reference server, providing redundancy in case the reference server is unavailable.

Common Configuration Options

The burst option sends a burst of 8 packets when polling to increase the average quality of time offset statistics.  Using it against a public NTP server is considered abuse.

The iburst option sends a burst of 8 packets at initial synchronization, which is designed to speed up the initial synchronization at startup.  Using it against a public NTP server is considered aggressive.

The minpoll and maxpoll settings are expressed as powers of two, in seconds: a setting of 4 means 2^4 = 16 seconds. Setting both minpoll and maxpoll to 4 therefore causes the host to check the time approximately every 16 seconds.
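The power-of-two relationship is easy to check with a short calculation. This is only an illustrative sketch; the function name is hypothetical.

```python
def poll_interval_seconds(exponent: int) -> int:
    """Convert a minpoll/maxpoll exponent into a polling interval in seconds."""
    if exponent < 0:
        raise ValueError("poll exponent must not be negative")
    return 2 ** exponent


if __name__ == "__main__":
    for exponent in (4, 6, 10):
        print(f"poll exponent {exponent} = {poll_interval_seconds(exponent)} seconds")
```

For example, an exponent of 4 gives 16 seconds, 6 gives 64 seconds (about one minute), and 10 gives 1024 seconds (about 17 minutes).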

Time Synchronization with chronyd

The following is a sample chrony.conf file:

# Primary NTP Source

server *.*.*.200 burst iburst minpoll 4 maxpoll 4

# Allow peering as a backup to the primary time servers

peer mlHost01 burst iburst minpoll 4 maxpoll 4
peer mlHost02 burst iburst minpoll 4 maxpoll 4
peer mlHost03 burst iburst minpoll 4 maxpoll 4

# Serve time even if not synchronized to a time source (for peering)
local stratum 10

# Allow other hosts on subnet to get time from this host (for peering)
# Can also be specified by individual IP
# https://chrony.tuxfamily.org/manual.html#allow-directive
allow *.*.*.0

# By default chrony will not step the clock after the initial few time checks.
# Changing the makestep option allows the clock to be stepped if its offset is larger than .5 seconds.
makestep 0.5 -1

The other settings (driftfile, rtcsync, log) can be left as is, and the new settings will take effect after the chronyd service is restarted.

Time Synchronization with ntpd

The following is a sample ntp.conf file:

#The current host has an ip of 10.10.0.1
server ntpserver burst iburst minpoll 4 maxpoll 4
 
#All of the cluster hosts are peered with each other.
peer mlHost01 burst iburst minpoll 4 maxpoll 4
peer mlHost02 burst iburst minpoll 4 maxpoll 4
peer mlHost03 burst iburst minpoll 4 maxpoll 4
 
#Add the local host so the peered servers can negotiate
# and choose a host to act as the reference server
server 10.10.0.1
fudge 10.10.0.1 stratum 10

The fudge setting is used to alter the stratum of the server from the default of 0.

Choosing Between NTP Daemons

Red Hat states that chrony is the preferred NTP daemon, and should be used when possible.

Chrony should be preferred for all systems except for the systems that are managed or monitored by tools that do not support chrony, or the systems that have a hardware reference clock which cannot be used with chrony.

As always, system configuration changes should be tested and validated prior to putting them into production use.


Introduction

Ops Director enables you to monitor MarkLogic clusters ranging from a single node to large multi-node deployments. A single Ops Director server can monitor multiple clusters. Ops Director provides a unified browser-based interface for easy access and navigation.

Ops Director presents a consolidated view of your MarkLogic infrastructure, to streamline monitoring and troubleshooting of clusters with alerting, performance, and log data. Ops Director provides enterprise-grade security of your cluster configuration and performance data with robust role-based access control and information security powered by MarkLogic Server.

Problems installing Ops Director 2.0.0, 2.0.1 & 2.0.1-1

Check gradle.properties

To successfully install Ops Director, the value for mlhost in gradle.properties must be a hostname, and that hostname must match the name of one of the hosts in the cluster.  You cannot use localhost to install Ops Director, nor can you use a host name other than one that is listed as a host in the cluster, as this affects the use of certificates for authentication to the OpsDirectorSystem application server.

Check for App-Services

Ops Director can sometimes encounter errors when attempting to install in groups other than Default. To successfully install, the Ops Director installer needs to be able to connect to the App-Services application server on port 8000 in the group where Ops Director is being installed.  There are two ways to work around this issue:

  • Create a copy of the App-Services app server in the new group, then install Ops Director
    • Be aware this allows QConsole access in the new group, for users with appropriate privileges. 
    • If you wish to prevent QConsole access in that group, the App-Services application server should be deleted after Ops Director has been installed.
  • Install Ops Director in the Default group, then move the host to the new group, and create the OpsDirector app servers in the new group.
    • Be aware this allows Ops Director access to remain in the Default group.
    • If you wish to prevent Ops Director access in the Default group, the Ops Director application servers should be deleted from the Default group.
      • To do this you must also copy the scheduled tasks associated with Ops Director over to the new group, and delete the scheduled tasks from the old group

See the attached Workspace OpsDirCopyAppServers.xml which has scripts to do the following:

  • Copy and/or remove the App-Services app server
  • Copy and/or remove the OpsDirectorSystem/OpsDirectorApplication/SecureManage app servers
  • Copy and/or remove the scheduled tasks associated with the Ops Director application.

Also note that Ops Director will install forests on all hosts in the cluster, regardless of group assignments.

Managing a Cluster

Check DNS Settings

When setting up a managed host, it's important to note that the hosts in both the Ops Director cluster, and the cluster being managed must be able to resolve hostnames via DNS.  Modifying the /etc/hosts file is not sufficient.

Check Ops Director Scheduled Tasks

When setting up a managed host, you may encounter an XDMP-DEADLOCK error, or have an issue seeing the data for a managed cluster.  If this occurs, do the following:

  • Un-manage the affected cluster.  If there are any issues un-managing the cluster, use the procedure in the Problems with Un-managing Clusters section of this KB
  • Disable the scheduled tasks associated with Ops Director
    • /common/tasks/info.xqy
    • /common/tasks/running.xqy
    • /common/tasks/expire.xqy
    • /common/tasks/health.xqy
  • Manage the cluster again
  • Enable the scheduled tasks that were disabled

Verify Necessary Ports are Open

Assuming the default installation ports are in use, verify the following access:

  • 8003 Inbound TCP on the Managed Cluster, accessed by the Ops Director Cluster.
  • 8008 Inbound TCP on the Ops Director Cluster, accessed by the Ops Director Users.
  • 8009 Inbound TCP on the Ops Director Cluster, accessed by the Managed Cluster
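A quick way to verify this access is to attempt a TCP connection to each port from the host that needs to reach it. The following is a minimal sketch using only the Python standard library; the host names are placeholders, and each check is only meaningful when run from the side that initiates the connection (for example, run the 8003 check from a host in the Ops Director cluster).

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Default ports from the list above; the host names are hypothetical.
DEFAULT_CHECKS = [
    ("managed-host.example.com", 8003),       # run from the Ops Director cluster
    ("opsdirector-host.example.com", 8008),   # run from an Ops Director user's machine
    ("opsdirector-host.example.com", 8009),   # run from the managed cluster
]

if __name__ == "__main__":
    for host, port in DEFAULT_CHECKS:
        state = "open" if tcp_port_open(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

A successful connection only shows that the port is reachable; it does not validate certificates or credentials.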

Upgrading Ops Director

When upgrading to a new version of Ops Director, it may be necessary to uninstall the previous version.  To do that, you must un-manage any clusters being managed by Ops Director, prior to uninstalling the application.

Un-managing Clusters

The first step in uninstalling Ops Director is to un-manage any clusters that are being managed by Ops Director.  This is done via the Admin UI on a host in the managed cluster, as detailed in the Ops Director Guide: Disconnecting a Managed Cluster from Ops Director

Uninstalling Ops Director 2.0.0 & 2.0.1

These versions of Ops Director use the ml-gradle plugin for deployment.  To uninstall these versions, you will also use gradle, as detailed in the Ops Director Guide: Removing Ops Director 2.0.0 and 2.0.1

Uninstalling Ops Director 1.1 or Earlier

If you are using the 1.1  version that was installed via the Admin UI, then it can be uninstalled via the Admin UI as detailed in the Ops Director Guide: Removing Ops Director 1.1 or Earlier

Problems with Uninstalling Ops Director

Occasionally an Ops Director installation may partially fail, due to misconfiguration, or missing dependencies.  Issues can also occur that prevent the standard removal methods from working correctly.  In these cases, Ops Director can be removed manually using the attached QConsole Workspace, OpsDirRemove.xml.  The instructions for running the scripts are contained in the first tab of the workspace.

Problems with Un-managing Clusters

Occasionally, disconnecting a managed cluster from Ops Director may partially fail.  If this occurs, you can use the attached QConsole Workspace, OpsDirUnmanage.xml.  The instructions for running the scripts are contained in the first tab of the workspace.

Further Reading

Installing, Uninstalling, and Configuring Ops Director

Monitoring MarkLogic with Ops Director

Introduction

Administrators can achieve very fine granularity on restores when incremental backups are used in conjunction with log archiving.

Details

Journal archiving can enable a restore to be performed to any timestamp since the last incremental backup.  For example, when using daily incremental backups in conjunction with 24-hour log archive retention, a restore can be made to any point in the previous 24 hours.

This capability enables administrators to go back to the exact point in time before a user error caused bad data to be ingested into the database, minimizing any data loss on the restore. Although this is a very powerful capability, the entire operation to perform a restore is simplified. Administrators can execute a simple operation as the server restores the backup set and replays the journal starting from the timestamp given by the admin.

For further information, see the documentation Restoring from an Incremental Backup with Journal Archiving.

Introduction

Are you frequently seeing "stand limit" messages in your logs? This article explains what this message means for your application and what actions you should take.

 

What are Stands and how can their number increase?

A stand holds a subset of the forest data and exists as a physical subdirectory under the forest directory. This directory contains a set of compressed binary files with names like TreeData, IndexData, Frequencies, Qualities, and such. This is where the actual compressed XML data (in TreeData) and indexes (in IndexData) can be found.

At any given time, a forest can have multiple stands. To keep the number of stands to a manageable level MarkLogic runs merges in the background. A merge takes some of the stands on disk and creates a new singular stand out of them, coalescing and optimizing the indexes and data, as well as removing any previously deleted fragments.

MarkLogic Server has a fixed limit for the maximum number of stands (64). When that limit is reached you will no longer be able to update your system. While MarkLogic automatically manages merges and it is unlikely that you will reach this limit, there are a few configurations under user control that may impact merges and cause this issue, e.g.

1.) You can manage merges using Merge Policy Controls. e.g. setting a low merge max size would stop merges beyond the configured size and hence the overall number of stands would keep growing.

2.) A low value of background-io-limit means less I/O is available for background tasks such as merges. This may also adversely affect the merge rate and hence the number of stands may grow.

3.) Low in-memory settings not keeping up with an aggressive data load. e.g. If you are bulk loading large documents and have low in memory tree size then stands may accumulate and reach the hard limit.

 

What can you do to keep the number of stands within a manageable limit?

While MarkLogic automatically manages merges to keep the number of stands at a manageable level, it adds a Warning entry to the logs when it sees the number of stands growing alarmingly, e.g. Warning: Forest XXXXX is at 92% of stand limit

If you see such messages in your logs, you should take some action as reaching the hard limit of 64 would mean you will no longer be able to update your system.
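If you want to watch for these messages automatically, a small log scan is enough. The sketch below assumes the message wording shown above ("Forest XXXXX is at 92% of stand limit") and a forest name without spaces; the function name is hypothetical, and the stand count it reports is only an approximation derived from the percentage and the limit of 64.

```python
import re
from typing import Iterable, List, Tuple

STAND_LIMIT = 64  # hard limit described above

# Matches the warning text quoted above; the rest of the log line is ignored.
STAND_WARNING = re.compile(
    r"Forest (?P<forest>\S+) is at (?P<pct>\d+)% of stand limit"
)


def stand_limit_warnings(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (forest name, percent of limit, approximate stand count) per warning."""
    found = []
    for line in lines:
        match = STAND_WARNING.search(line)
        if match:
            pct = int(match.group("pct"))
            approx_stands = round(STAND_LIMIT * pct / 100)
            found.append((match.group("forest"), pct, approx_stands))
    return found
```

To use it, pass the lines of your error log, for example `stand_limit_warnings(open("/var/opt/MarkLogic/Logs/ErrorLog.txt"))`, adjusting the path to your installation.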

Here's what you can check and do to lower the number of stands.

1.) If you have configured merge policy controls then check if they actually match your application usage. You can change the required settings as needed. For instance:

  • There should be no merge blackouts during ingestion, or any time there is heavy updating of your content.
  • Beginning with MarkLogic version 7, the server is able to manage merges with less free space required on your drives (1.5 times the size of your content). This is accomplished by setting the merge max size to 32768 (32GB). Although this does create more stands, this is OK on newer systems, since the server is able to use extra CPU cores in parallel.

2.) If you have configured background-io-limit then check if that is sufficient for your application usage. If needed, increase the value so that merges can make use of more IO. You should only use this setting on systems that have limited disk IO. In general you want to first set it to 200, and if the disk IO still seems to be overwhelmed, set it to 150 and so on. A setting of 100 may be too low for systems that are doing ingestion, since the merge process needs to be able to keep up with stand creation.

3.) If you are performing bulk loads then check if the in-memory settings are sufficient and can be increased. If needed, increase the required value so that in-memory stands (and as a result on-disk stands) accommodate more data, thereby decreasing the number of stands. If you do grow the in-memory caches, make sure to grow the database journal files by a corresponding amount. This will ensure that a single large transaction will be able to fit in the journals.

 

Conclusion 

If you decide to control MarkLogic's merge process, you should monitor the system for any adverse effect that it may cause and take actions accordingly. MarkLogic Server continuously assesses the state of each database and the default merge settings and the dynamic nature of merges will keep the database tuned optimally at all times. So if you are unsure - let MarkLogic handle the merges for you!

Summary

When used as a file system, GFS needs to be tuned for optimal performance with MarkLogic Server.

Recommendations

Specifically, we recommend tuning the demote_secs and statfs_fast parameters. The demote_secs parameter determines the amount of time GFS will wait before demoting a lock on a file that is not in use. (GFS uses a time-based locking system.) One of the ways that MarkLogic Server makes queries go fast is its use of memory mapped index files. When index files are stored on a GFS filesystem, locks on these memory-mapped files are demoted purely on the basis of demote_secs, regardless of use. This is because they are not accessed using a method that keeps the lock active -- the server interacts with the memory map, not direct access to the on-disk file.

When a GFS lock is demoted, pages from the memory-mapped index files are removed from cache. When the server makes another request of the memory-mapped file, GFS must acquire another lock and the requested page(s) from the on-disk file must be read back into cache. The lock reacquisition process, as well as the I/O needed to load data from disk into cache, may cause noticeable performance degradation.

Starting with MarkLogic Server 4.0-4, MarkLogic introduced an optimization for GFS. From that maintenance release forward, MarkLogic gets the status of its memory-mapped files every hour, which results in the retention of the GFS locks on those files so that they do not get demoted. Therefore, it is important that demote_secs is equal to or greater than one hour. It is also recommended that the tuning parameter statfs_fast is set to "1" (true), which makes statfs on GFS faster.

Using gfs_tool, you should be able to set the demote_secs and statfs_fast parameters to the following values:

demote_secs 3600

statfs_fast 1

While we're discussing tuning a Linux filesystem, the following Linux tuning tips are also worth noting:

  • Use the deadline elevator (aka I/O scheduler), rather than cfq, on all hosts in the cluster. This has been added to our installation requirements for RHEL. With RHEL-4, this requires the elevator=deadline option at boot time. With RHEL-5, this can be changed at any time via /sys/block/*/queue/scheduler
  • If you are running on a VM slice, then no-op I/O scheduler is recommended.
  • Set the following kernel tuning parameters:

Edit /etc/sysctl.conf:

vm.swappiness = 0

vm.dirty_background_ratio=1

vm.dirty_ratio=40

Use sudo sysctl -f to apply these changes.

  • It is very important to have at least one journal per host that will mount the filesystem. If the number of hosts exceeds the number of journals, performance will suffer. It is, unfortunately, impossible to add more journals without rebuilding the entire filesystem, so be sure to set journals up for each host during your initial build.
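The kernel tuning parameters listed above can be verified with a short script. This is a minimal sketch that parses sysctl.conf-style text and reports any of the three recommended settings that are missing or different; the function names are hypothetical, and the script only reads the configuration text rather than the values currently active in the kernel.

```python
from typing import Dict, Optional

# Recommended values from the list above.
RECOMMENDED = {
    "vm.swappiness": "0",
    "vm.dirty_background_ratio": "1",
    "vm.dirty_ratio": "40",
}


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def settings_to_fix(text: str) -> Dict[str, Optional[str]]:
    """Return each recommended key whose configured value is missing (None) or different."""
    current = parse_sysctl(text)
    return {
        key: current.get(key)
        for key, wanted in RECOMMENDED.items()
        if current.get(key) != wanted
    }


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as conf:
        print(settings_to_fix(conf.read()) or "all recommended settings present")
```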

 

Working with Red Hat

Should you run into GFS-related problems, running the following script will provide all the information that you need in order to work with the Red Hat Support Team:


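# Mount debugfs so the DLM and GFS2 lock state can be copied
mkdir /tmp/debugfs
mount -t debugfs none /tmp/debugfs
mkdir /tmp/$(hostname)-hangdata
cp -rf /tmp/debugfs/dlm/ /tmp/$(hostname)-hangdata
cp -rf /tmp/debugfs/gfs2/ /tmp/$(hostname)-hangdata

# Enable SysRq and dump the state of all tasks to the kernel log
echo 1 > /proc/sys/kernel/sysrq
echo 't' > /proc/sysrq-trigger
sleep 60

# Collect the system log, cluster status, mounts and process list
cp /var/log/messages /tmp/$(hostname)-hangdata/
clustat > /tmp/$(hostname)-hangdata/clustat.out
# Write to a separate file so the clustat output above is not overwritten
cman_tool services > /tmp/$(hostname)-hangdata/cman_tool-services.out
mount -l > /tmp/$(hostname)-hangdata/mount-l.out
ps aux > /tmp/$(hostname)-hangdata/ps-aux.out

# Bundle everything into a bzip2-compressed archive, then clean up
tar cjvf /tmp/$(hostname)-hangdata.tar.bz2 /tmp/$(hostname)-hangdata/
umount /tmp/debugfs/
rm -rf /tmp/debugfs
rm -rf /tmp/$(hostname)-hangdata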

Summary

Disk utilization is an important part of the host's ecosystem.  The results of filling the file system can have disastrous effects on server performance and data integrity.  It is very important to ensure that your host always has an appropriate amount of free disk space.

Detection

When the file system runs out of space there will be a dramatic decrease in query performance and merges will cease.  There will also be a number of entries in the ErrorLog.txt file that look like these:

SVC-FILWRT: File write error: write 'filename': No space left on device

Error in merge of forest [Forest-Name]: XDMP-MERGESPACE: Not merging due to disk space limitations, need=xxxMB, have=xxxMB
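These entries can be picked out of the error log with a simple scan. The following is a minimal sketch that assumes the message formats shown above, with need and have reported as whole numbers of MB; the function name is hypothetical.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Matches the merge-space message shown above and captures the two sizes.
MERGESPACE = re.compile(
    r"XDMP-MERGESPACE: .*need=(?P<need>\d+)MB, have=(?P<have>\d+)MB"
)


def disk_space_errors(lines: Iterable[str]) -> List[Tuple[str, Optional[int]]]:
    """Return (error code, shortfall in MB or None) for each disk-space entry."""
    results = []
    for line in lines:
        if "SVC-FILWRT" in line and "No space left on device" in line:
            results.append(("SVC-FILWRT", None))
            continue
        match = MERGESPACE.search(line)
        if match:
            shortfall = int(match.group("need")) - int(match.group("have"))
            results.append(("XDMP-MERGESPACE", shortfall))
    return results
```

The shortfall reported for XDMP-MERGESPACE is simply need minus have, which gives a first indication of how much space must be freed or added before merges can resume.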

Mitigation

The best practice is to ensure that the total physical disk space available is sufficient to store all your forest data.  If you happen into a situation where you are dangerously low on disk space the following methods can be used to correct the situation.

  1. Move/Remove any unwanted files from the file system. This might include cleaning up log files that have grown very large.
  2. Add additional storage to the host.  This may require that you move the location of forest data.  Please see Moving Forests Across Storage Devices for more details.

In the event that the data directory containing the forest data and the directory containing the MarkLogic config files are on the same partition (which is common on Windows installations), it is possible to encounter a unique situation. In this situation, if the file system is completely full, MarkLogic will be unable to write config files.  If this is the case, you will not be able to perform any task in the Admin UI.  For this situation you will need to manually move your forest data if you cannot free enough disk space to allow for writing of configuration files.  This is a special case and is not recommended for situations where the Admin UI can be used.  Please refer to "Moving Forests Across Storage Devices" for our recommended process.

Step 1. Stop MarkLogic

Step 2. Move the forest data to another location.  Be sure to maintain permissions assigned to the forest data.

Step 3. Start MarkLogic

Step 4.  Detach your forest from the DB

  • In the Admin UI navigate to Configure -> Databases -> [Your-Database] -> Forests
  • Uncheck the box next to your forest and click "OK".

Step 5. Create a new forest, specifying the new storage location

  • In the Admin UI navigate to Configure -> Forests
  • Click the "Create" tab and fill in the appropriate information

Step 6. Copy all data from the old forest directory into the new forest directory

Step 7.  Attach the new forest to the DB

  • In the Admin UI navigate to Configure -> Databases -> [Your-Database] -> Forests
  • Check the box next to the newly created forest and click "OK".

Step 8. Ensure there were no errors while mounting the new forest

  • In the Admin UI navigate to Configure -> Databases -> [Your-Database] and click the "Status" tab
  • It is also a good idea to look for any errors in the error log (/var/opt/MarkLogic/Logs/ErrorLog.txt)

Step 9. Provided there were no issues with the new forest, you can now delete the old forest

  • In the Admin UI navigate to Configure -> Forests -> [Old-Forest]
  • On the "Configure" tab select delete.
  • The old forest data will need to be deleted manually.

Database and Forest Size References:

MarkLogic Installation Guide: Disk Space Requirements 

Knowledge Base Article: Beginning in MarkLogic 7, the 3x disk space requirement can be reduced if configured and managed.

 

 

Introduction

In some situations an existing cluster node needs to be replaced. There are multiple reasons for this activity like hardware failure or hardware replacement.

In this Knowledgebase article we will outline the steps necessary to replace the node by reusing the existing cluster configuration without registering it again.

Important notes:

  • The replacement node must have the same architecture as all other nodes of the cluster (e.g., Windows, Linux, Solaris). The CPUs must also have the same number of bits (e.g., 64, 32).
  • The replacement node must have the same (or higher) count of CPU cores
  • The replacement node must have the same (or higher) allocated disk space and mount points as the old node
  • The replacement node must have the same hostname as the old node, unless the node is an AWS EC2 instance using MARKLOGIC_EC2=1 (default when using MarkLogic AMIs)

Preparation steps for re-joining a node into the cluster

  • Install and configure the operating system
    • make sure the mount points are matching the old setup
    • in case the previous storage is healthy it can be reused (forests located on it will be mounted)
  • For any non-MarkLogic data (such as XQuery modules, Deployment scripts etc.) required to run on this node, ensure these are manually zipped and copied over as part of the staging process
  • Copy over MarkLogic configuration files (/var/opt/MarkLogic/*.xml) from a backup of the old node
    • If xdqp ssl enabled is set to true, change the setting to false.  If you can’t do this through the Admin UI, you can manually update the value of xdqp-ssl-enabled to false.
    • To re-enable ssl for xdqp connections once the node has rejoined the cluster, you will need to regenerate the replacement host certificate.  Follow the instructions in the Regenerating a XDQP Host Certificates section of this article.

Downloading MarkLogic for the New Host

MarkLogic Server, and the optional MarkLogic Converters and Filters, can be downloaded from the MarkLogic Developer Community. The most recent versions can be found at the following URLs, which give you the option of downloading via either HTTPS or curl:

If the exact version you are running is not available, you may still be able to download it by getting the download link for the closest current version (8, 9, or 10) and editing the minor version number in the link.

So if you need 10.0-1, and the current available version is 10.0-2, when you choose the Download via Curl option, you will get a download link that looks like this:

https://developer.marklogic.com/download/binaries/10.0/MarkLogic-10.0-2-amd64.msi?t=SomeHashValue/1&email=myemail%40mycompany.com

Update the URL with the minor release version you need:

https://developer.marklogic.com/download/binaries/10.0/MarkLogic-10.0-1-amd64.msi?t=SomeHashValue/1&email=myemail%40mycompany.com

If you are unable to get the version you need this way, then contact MarkLogic Support.
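
The URL edit described above can also be scripted. The following is a minimal Python sketch, assuming the link follows the pattern shown in the example; the function name with_minor_version is hypothetical, and the token and email in the query string are left untouched.

```python
def with_minor_version(url: str, current: str, wanted: str) -> str:
    """Swap the release number in a download link, e.g. 10.0-2 -> 10.0-1."""
    marker = f"MarkLogic-{current}-"
    if marker not in url:
        raise ValueError(f"{marker!r} not found in URL")
    # Replace only the file-name part; the directory (10.0) and query stay as-is.
    return url.replace(marker, f"MarkLogic-{wanted}-", 1)


link = ("https://developer.marklogic.com/download/binaries/10.0/"
        "MarkLogic-10.0-2-amd64.msi?t=SomeHashValue/1&email=myemail%40mycompany.com")
print(with_minor_version(link, "10.0-2", "10.0-1"))
```

Whether the edited link actually resolves depends on the release still being hosted, which is why the article falls back to contacting MarkLogic Support.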

Rejoining the Replacement Node to the Cluster

There are two methods to rejoin a host into the cluster, depending on the availability of configuration files.

  1. Using an older set of configuration files from the node being replaced
  2. Creating a new set of configuration files from another node in the cluster

Method 1: Rejoining the Cluster With Existing Configuration Files

This procedure can only be performed if the existing configuration files (/var/opt/MarkLogic/*.xml) from the lost or old node are available; otherwise it will fail and cause a lot of problems.

  • Perform a standard MarkLogic server installation on the new target node
    • $ rpm -Uvh /path/to/MarkLogic-<version>.x86_64.rpm or yum install /path/to/MarkLogic-<version>.x86_64.rpm
    • $ rpm -Uvh /path/to/MarkLogicConverters-<version>.x86_64.rpm or yum install /path/to/MarkLogicConverters-<version>.x86_64.rpm (optional)
    • Verify local configuration settings in /etc/marklogic.conf (optional)
    • Do not start MarkLogic server
  • Create a new data directory
    • $ mkdir /var/opt/MarkLogic (default location; might already exist if this is a separate mount point)
    • Verify ownership of the data directory, daemon.daemon by default.
      • To fix: $ chown -R daemon:daemon /var/opt/MarkLogic
  • Copy an existing set of configuration files into the data directory
    • $ cp /path/to/old/config/*.xml /var/opt/MarkLogic
    • Verify ownership of the configuration files, daemon.daemon by default.
      • To fix: $ chown daemon:daemon /var/opt/MarkLogic/*.xml
  • Perform a last sanity check
    • Hostname must be the same as the old node, except for AWS EC2 nodes as mentioned above
    • Verify firewall or Security Group rules are correct
    • Verify mount points, file ownership and permissions are correct
  • Start MarkLogic
    • $ service MarkLogic start
  • Monitor the startup process

After starting the node it will reuse the existing configuration settings and assume the identity of the missing node. 

Method 2: Rejoining the Cluster With Configuration Files From Another Node

This procedure is required if no older set of configuration files is available, for example because no file backup was made of /var/opt/MarkLogic/*.xml. It requires manual editing of a configuration file.

  • Perform a standard MarkLogic server installation on the new target node
    • $ rpm -Uvh /path/to/MarkLogic-<version>.x86_64.rpm or yum install /path/to/MarkLogic-<version>.x86_64.rpm
    • $ rpm -Uvh /path/to/MarkLogicConverters-<version>.x86_64.rpm or yum install /path/to/MarkLogicConverters-<version>.x86_64.rpm (optional)
    • Verify local configuration settings in /etc/marklogic.conf (optional)
  • Start MarkLogic, and perform a normal server setup as a single node. DO NOT join the cluster now.
    • $ service MarkLogic start
    • Perform a basic setup
    • DO NOT join the host to the cluster!
  • Stop MarkLogic, and move current configuration files in /var/opt/MarkLogic to a new location
    • $ service MarkLogic stop
    • $ mv /var/opt/MarkLogic/*.xml /some/place
  • Copy a set of configuration files over from one of the other nodes
    • $ scp <othernode>:/var/opt/MarkLogic/*.xml /var/opt/MarkLogic
    • Verify ownership of the data directory, daemon.daemon by default.
      • To fix: $ chown -R daemon:daemon /var/opt/MarkLogic
  • Make note of the <host-id> for the node being recreated in hosts.xml
    • $ grep -B1 hostname /var/opt/MarkLogic/hosts.xml
  • Edit /var/opt/MarkLogic/server.xml. Note: This step is critically important to ensure correct operation of the cluster.
    • Use a UTF-8 safe editor like nano or vi
    • Update <host-id> with the value found in /var/opt/MarkLogic/hosts.xml
    • Update <license-key> value if necessary.
    • Update <licensee> value if necessary.
    • Save the changes
  • Perform a last sanity check
    • <host-id> must match the <host> defined in hosts.xml.
      • Important: host will not start if these values do not match 
    • Hostname must be the same as the old node, unless the node is an AWS EC2 instance using the configuration option MARKLOGIC_EC2=1, which is the default when using the MarkLogic provided AMIs.
    • Firewall or Security Group rules are correct
    • Mount points, ownership and permissions are correct
  • Start MarkLogic and monitor the startup process

As emphasized in the procedures, it is very important to update server.xml and change the <host-id> to match the value defined in hosts.xml and apply the correct license information. Without these changes the node may not start up, may confuse the other nodes, or it may exhibit unexpected behavior.
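
The host-id check can be automated before starting the node. The following is a minimal Python sketch, not a MarkLogic tool: it assumes only that hosts.xml contains host elements with host-id and host-name children and that server.xml contains a host-id element, and it ignores XML namespaces. The sample XML, IDs, and host names are made up and heavily simplified.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def local(tag: str) -> str:
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def host_id_for(hosts_xml: str, host_name: str) -> Optional[str]:
    """Return the host-id recorded in hosts.xml for the given host name."""
    for el in ET.fromstring(hosts_xml).iter():
        if local(el.tag) == "host":
            fields = {local(c.tag): (c.text or "").strip() for c in el}
            if fields.get("host-name") == host_name:
                return fields.get("host-id")
    return None


def server_host_id(server_xml: str) -> Optional[str]:
    """Return the first host-id found in server.xml."""
    for el in ET.fromstring(server_xml).iter():
        if local(el.tag) == "host-id":
            return (el.text or "").strip()
    return None


def ids_match(hosts_xml: str, server_xml: str, host_name: str) -> bool:
    expected = host_id_for(hosts_xml, host_name)
    return expected is not None and expected == server_host_id(server_xml)


# Simplified, made-up stand-ins for the real files (assumption).
SAMPLE_HOSTS = """<hosts><host><host-id>111</host-id>
<host-name>node1.example.com</host-name></host></hosts>"""
SAMPLE_SERVER = "<server><host-id>111</host-id></server>"
print(ids_match(SAMPLE_HOSTS, SAMPLE_SERVER, "node1.example.com"))
```

In practice the two strings would be read from /var/opt/MarkLogic/hosts.xml and /var/opt/MarkLogic/server.xml on the replacement node.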

Wrapping Up

For both methods, the startup process is the same. MarkLogic will use the configuration files to rejoin the cluster. Forests that no longer exist will automatically be recreated. Existing forests that have been mounted or copied to the correct location will be mounted as before. Forests configured for local disk failover will automatically start syncing with the online forests.  If configured, replication will start replicating the forests after the node is started. If neither local disk failover nor replication is configured, the forests can also be restored from backup.

Regenerating a XDQP Host Certificates

The first step in the process is to check the Certificate to see whether it is valid or not.  If you replaced your node using method 1, the certificate is likely to be valid.  If you replaced your node using method 2, then the certificate is likely to be invalid.

Log into a terminal on the newly replaced host, and extract the private key from /var/opt/MarkLogic/server.xml and the host's certificate from /var/opt/MarkLogic/hosts.xml:

  • $ cp /var/opt/MarkLogic/server.xml /tmp/server.key
  • Edit /tmp/server.key to remove all XML formatting
    • File should start with "-----BEGIN PRIVATE KEY-----"
    • File should end with "-----END PRIVATE KEY-----"

Now extract the certificate for the new host from /var/opt/MarkLogic/hosts.xml.

  • $ grep -A25 my-host.name /var/opt/MarkLogic/hosts.xml > /tmp/server.crt
  • Remove all the data from the file, except the certificate for the new host
    • File should start with "-----BEGIN CERTIFICATE-----"
    • File should end with "-----END CERTIFICATE-----"
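
Removing the surrounding XML by hand is error-prone, so the extraction can be sketched in Python. This is a minimal example under stated assumptions: the function name extract_pem is hypothetical, the sample XML is made up, and for hosts.xml the input must already be narrowed to the entry for the new host, because the function returns the first matching block it finds.

```python
import re
from typing import Optional


def extract_pem(text: str, label: str) -> Optional[str]:
    """Return the first PEM block with the given label, or None if absent."""
    begin = f"-----BEGIN {label}-----"
    end = f"-----END {label}-----"
    match = re.search(re.escape(begin) + r".*?" + re.escape(end), text, re.DOTALL)
    # PEM files conventionally end with a newline.
    return match.group(0) + "\n" if match else None


# Made-up stand-in for a fragment of server.xml (element name is an assumption).
sample = ("<config><key>-----BEGIN PRIVATE KEY-----\n"
          "QUJD\n"
          "-----END PRIVATE KEY-----</key></config>")
print(extract_pem(sample, "PRIVATE KEY"))
```

The same function works for the certificate by passing the label CERTIFICATE.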

Once you have the private key and the certificate, you can use openssl to compare the MD5 hashes of their moduli and see if they match.

  • $ openssl rsa -in /tmp/server.key -noout -modulus | openssl md5; openssl x509 -in /tmp/server.crt -noout -modulus | openssl md5

If the values match, STOP HERE.  The certificate is valid and does not need to be regenerated. If the values do not match, then the certificate needs to be regenerated.

Make note of the <host-id> from /var/opt/MarkLogic/server.xml.  This will be used to populate the value for the Common Name (CN) when the certificate is generated.

  • $ grep -B1 hostname /var/opt/MarkLogic/hosts.xml

Create the new self-signed certificate using the server's private key.  Typically these are set to 10 years (3650 days) by default when MarkLogic first runs, but you can choose another value if needed.  Use the <host-id> from the previous step as the CN.

  • $ sudo openssl req -key /tmp/server.key -new -x509 -days 3650 -out /tmp/new-server.crt -subj "/CN=[server-id-number]"

Compare the MD5 checksums with openssl again. This time they should match:

  • $ openssl rsa -in /tmp/server.key -noout -modulus | openssl md5; openssl x509 -in /tmp/new-server.crt -noout -modulus | openssl md5

Make a copy of hosts.xml to replace the certs, also note the host-id for use in a later step.

  • $ cp -p /var/opt/MarkLogic/hosts.xml /tmp/hosts.xml

Edit /tmp/hosts.xml and replace the old certificate for the host with the new certificate.  Find the entry with the correct <host-id> and replace the contents of its <ssl-certificate> field with the new certificate in /tmp/new-server.crt.

Replace the existing hosts.xml with our updated copy

  • $ cp -p /tmp/hosts.xml /var/opt/MarkLogic/hosts.xml

Restart MarkLogic on the node.  This can be done from any host in the cluster, using the Admin Interface, the REST Management API endpoint, or Query Console.

  • Admin Interface: In the left tree menu, click on Configure --> Hosts --> [Hostname], then select the Status tab and click Restart
  • REST Management API: $ curl --anyauth --user username:password -X POST -i --data "state=restart" -H "Content-type: application/x-www-form-urlencoded" http://localhost:8002/manage/v2/hosts/[host-name]
  • Query Console: xdmp:restart((xdmp:host("engrlab-129-179.engrlab.marklogic.com")), "To reload hosts.xml after certificate update")

Verify the changes to hosts.xml have propagated to all hosts in the cluster.  Check that the hosts.xml is now the same for the hosts in the cluster.  One way of doing this is comparing md5 checksums.

  • $ md5sum /var/opt/MarkLogic/hosts.xml
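
Once the contents of hosts.xml have been collected from every host, the comparison can be scripted. The following is a minimal Python sketch; the function names and the idea of treating the most common checksum as the reference are assumptions made for illustration, and how the file contents are gathered from each host is left out.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def md5_hex(data: bytes) -> str:
    """Same digest that md5sum prints for a file with these contents."""
    return hashlib.md5(data).hexdigest()


def hosts_out_of_sync(contents: Dict[str, bytes]) -> List[str]:
    """Return hosts whose hosts.xml differs from the most common version."""
    digests = {host: md5_hex(data) for host, data in contents.items()}
    if not digests:
        return []
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(host for host, digest in digests.items() if digest != majority)


# Made-up file contents keyed by made-up host names.
collected = {
    "node1": b"<hosts>new certificate</hosts>",
    "node2": b"<hosts>new certificate</hosts>",
    "node3": b"<hosts>old certificate</hosts>",
}
print(hosts_out_of_sync(collected))
```

An empty list means every host reports the same checksum.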

You should now be able to set xdqp ssl enabled to true in the group configurations.  Check the cluster status page in the Administrative Interface to ensure all the hosts have reconnected successfully, or review the ErrorLog files to ensure there are no SVC-SOCACC errors in the log.

Additional Notes

This article explains how to directly replace a node in a cluster by using the same host name. Another way is to add a new node to the cluster and transfer the forests, which is explained in the knowledge base article "Replacing a D-Node with local disk failover".

Some of these steps may differ, such as operating system calls or file system locations. On a different OS, the specific commands will need to be adjusted to match the environment.

Related Reading

Replacing a failed MarkLogic node in a cluster: a step by step walkthrough

Introduction

In a multiple node cluster with local disk failover configured, there may be a need to replace a server with new hardware. This article explains how to do that while preserving the failover configuration.

Sample configuration

Consider a 3-node cluster with local disk failover for database Test, and the forest assignment for the hosts looks like this:  (all forests ending with 'p' are primary and those ending with 'r' are replica)

Host A        Host B        Host C
forest a-1p   forest b-3p   forest c-5p
forest a-2p   forest b-4p   forest c-6p
forest a-3r   forest b-1r   forest c-2r
forest a-6r   forest b-5r   forest c-4r

With this configuration under normal operations, each host will have the two primary forests "open" and the replica forests "sync replicating".

Failover Example

In the event of a node failure of say, Host B, primary forests on Host B will failover to Hosts A & C as expected. The forests a-3r and c-4r are now "open" and acting as master forests. 

When Host B comes back online, the replica forests a-3r and c-4r will continue as acting masters, and forests b-3p & b-4p on Host B will now act as replicas. This state will persist until another failover event occurs or the forests are manually restarted.

Replacing a Host 

In the case where a node in the cluster needs to be physically replaced with another node, it is important to preserve the original master-replica configuration of the forests, so that there is no performance burden on a single node hosting all the primary forests.

Example: replacing Host-B with a new Host-D

The steps listed below show how to replace a node (old Host-B with new Host-D) without affecting the failover configuration:

  1. Shut down Host B and make sure forest failover was successful: forests c-4r & a-3r are "open" (acting masters).
  2. Add Host D as a node to the cluster.
  3. Create new replica forests (d-1r and d-5r) on Host D and make them replicas of the corresponding primary forests on Host A & C. 
  4. Create new primary forests 'd-3p' and 'd-4p' on Host D  (These will replace b-3p and b-4p); 
  5. Break replication between a-1p and b-1r, and between c-5p and b-5r by updating the forest configuration for the primary forests.
  6. Take forest level backup of the failed over forests ('forest a-3r' and 'forest c-4r')
  7. Restore the backups from step 6 to the new primary forests 'forest d-3p' and 'forest d-4p' on Host D
  8. Attach forests 'forest d-3p' and 'forest d-4p' to the database and make forests 'forest a-3r' and 'forest c-4r'  their replicas.

This will replace Host B with Host D, as well as preserve the previously existing primary-replica configuration among the hosts.

Host A        Host D        Host C
forest a-1p   forest d-3p   forest c-5p
forest a-2p   forest d-4p   forest c-6p
forest a-3r   forest d-1r   forest c-2r
forest a-6r   forest d-5r   forest c-4r
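
The resulting layout can be sanity-checked with a small script. The following is a minimal Python sketch that relies on this article's naming convention only (host letter, forest number, then 'p' for primary or 'r' for replica); the function name check_layout and the two rules it enforces are assumptions chosen for illustration.

```python
from typing import Dict, List

# Forest assignment after the replacement, as described in this example.
LAYOUT = {
    "Host A": ["a-1p", "a-2p", "a-3r", "a-6r"],
    "Host D": ["d-3p", "d-4p", "d-1r", "d-5r"],
    "Host C": ["c-5p", "c-6p", "c-2r", "c-4r"],
}


def check_layout(layout: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the layout looks sound."""
    problems = []
    primary_host, replica_host = {}, {}
    for host, forests in layout.items():
        for name in forests:
            number, role = name.split("-")[1][:-1], name[-1]
            (primary_host if role == "p" else replica_host)[number] = host
    # Rule 1: every primary has a replica, and it lives on a different host.
    for number, host in sorted(primary_host.items()):
        if number not in replica_host:
            problems.append(f"forest {number} has no replica")
        elif replica_host[number] == host:
            problems.append(f"forest {number} has its replica on the same host")
    # Rule 2: primaries are spread evenly, so no host carries the whole load.
    counts = {h: sum(n.endswith("p") for n in f) for h, f in layout.items()}
    if len(set(counts.values())) > 1:
        problems.append("primary forests are unevenly spread across hosts")
    return problems


print(check_layout(LAYOUT))
```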

Additional Notes

It is important to make sure that the database is quiesced before taking the forest backups, so that no ingestion or updates occur on the database. One technique is to make all of its forests 'read-only' (http://docs.marklogic.com/guide/admin/forests#id_72520) during the process and revert once complete.

Note: This example assumes a distributed master-replica configuration of a 3-node cluster. However, the same procedure works with other configurations with some careful attention to the number of forests on each host and breaking replication between the right set of hosts.

 

 

 

 

 

 

Backwards Compatibility

Newer versions of MarkLogic will support backups taken from older versions of the software.  This restore may cause a reindex of the data in order to upgrade the database to the current feature release version.  Information on backing-up/restoring can be found in the following documentation:

Database Level Backups: Backing Up and Restoring a Database

Forest Level Backups and Restores: Making Backups of a Forest, Restoring a Forest

Upgrade compatibility: Upgrades and Database Compatibility

Downgrading

MarkLogic does not support downgrading to an older version.  Therefore, backups that were taken on a newer version of MarkLogic will not be compatible with older versions of MarkLogic.  For more details please see MarkLogic Server Version Downgrades are Not Supported.

Backup and Restore Across OS Versions

Notes about Backup and Restore Operations

  • The backup files are platform specific--backups on a given platform should only be restored onto the same platform. This is true for both database and forest backups.

Platform is used to indicate OS Families, e.g. Windows, Linux and MacOS. MarkLogic supports backup and restore operations across OS version changes, e.g. from RHEL 6 to RHEL 7, but not across OS changes such as Windows to Linux.

Introduction

There have been a number of reported incidents where database replication was configured and the default Schemas database on the replica was used alongside it. Where MarkLogic's default Schemas database is used for data, replicating the Schemas database to itself on a foreign cluster will cause instability and will likely lead to a system outage that can only be resolved by breaking replication for the Schemas database.

MarkLogic's documentation (pre-ML9) on database replication warns users against this, stating that:

You cannot replicate a Master database that acts as its own schema database. When replicating a Master schema database, create a second empty schema database for the Replica schema database on the Replica cluster.

What happens if I attempt to do this?

In earlier releases of MarkLogic Server, this has been known to cause cluster outage and several support cases have been raised due to this configuration causing undesired - and unexpected - results.

In newer releases of the product (such as MarkLogic 8.0-5 and above), the Admin GUI on port 8001 and our admin APIs will now stop users from making this configuration change in the following ways:

  • In the Admin UI on port 8001, setting up replication between clusters and configuring database replication for the Schemas database should now fail with the message "Cannot setup replication for a database whose schema database is itself"
  • Using the Admin API, coupling a cluster and then calling admin:database-set-foreign-master against the Schemas database should fail, causing an ADMIN-DBREPSCHEMADBSAMEASDB exception to be thrown.

This is still an issue for MarkLogic 7 and earlier releases, so it's important to always ensure that you are using a separate database for your replica as advised in our documentation.

Best Practice: Always separate out Schemas Databases where necessary

If your application makes use of Schemas, create a completely separate schemas database for your application.  Doing this keeps your application self contained and allows you to replicate it as you would with any other database.

Summary

This article explores fragmentation policy decisions for a MarkLogic database, and how search results may be influenced by your fragmentation settings.

Discussion

Fragments versus Documents

Consider the below example.

1) Load 20 test documents in your database by running

let $doc := <test>{
for $i in 1 to 20 return <node>foo {$i}</node>
}</test>
for $i in 1 to 20
return xdmp:document-insert ('/'||$i||'.xml', $doc)

Each of the 20 documents will have a structure like so:

<test>
    <node>foo 1</node>
    <node>foo 2</node>
           .
           .
           .
    <node>foo 20</node>
</test>

2) Observe the database status: 20 documents and 20 fragments.

3) Create a fragment root on 'node' and allow the database to reindex.

4) Observe the database status: 20 documents and 420 fragments. There are now 400 extra fragments for the 'node' elements.

We will use the data with fragmentation in the examples below.
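
The fragment count in step 4 follows from simple arithmetic: each document keeps one fragment for its root, and every element matching the fragment root becomes a fragment of its own. A minimal Python sketch of that calculation (the function name is hypothetical):

```python
def fragment_count(documents: int, fragment_roots_per_document: int) -> int:
    """One root fragment per document plus one per fragment-root element."""
    return documents * (1 + fragment_roots_per_document)


# 20 documents, each with 20 <node> elements acting as fragment roots.
total = fragment_count(20, 20)
extra = total - 20  # fragments added by the fragmentation policy
print(total, extra)
```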


Fragments and cts:search counts

Searches in MarkLogic work against fragments (not documents). In fact, MarkLogic indexes, retrieves, and stores everything as fragments.

While the terms fragments and documents are often used interchangeably, all the search-related operations happen at fragment level. Without any fragmentation policy defined, one fragment is the same as one document. However, with a fragmentation policy defined (e.g., a fragment root), the picture changes. Every fragment acts as its own self-contained unit and is the unit of indexing. A term list doesn't truly reference documents; it references fragments. The filtering and retrieval process doesn't actually load documents; it loads fragments. This means a single document can be split internally into multiple fragments but they are accessed by a single URI for the document.

Since the indexes only work at the fragment level, operations that work at the level of indexing can only know about fragments.

Thus, xdmp:estimate returns the number of matching fragments:

xdmp:estimate (cts:search (/, 'foo')) (: returns 400 :)

while fn:count counts the actual number of items in the returned sequence:

fn:count (cts:search (/, 'foo')) (: returns 20 :)


Fragments and search:search counts

When using search:search, "... the total attribute is an estimate, based on the index resolution of the query, and it is not filtered for accuracy." This can be seen since


import module namespace search = "http://marklogic.com/appservices/search" at "/MarkLogic/appservices/search/search.xqy";
search:search("foo",
<options xmlns="http://marklogic.com/appservices/search">
<transform-results apply="empty-snippet"/>
</options>
)

returns

<search:response snippet-format="empty-snippet" total="400" start="1" page-length="10" xmlns:search="http://marklogic.com/appservices/search">
<search:result index="1" uri="/3.xml" path="fn:doc(&quot;/3.xml&quot;)" score="2048" confidence="0.09590387" fitness="1">
<search:snippet/>
</search:result>
<search:result index="2" uri="/5.xml" path="fn:doc(&quot;/5.xml&quot;)" score="2048" confidence="0.09590387" fitness="1">
<search:snippet/>
</search:result>
.
.
.
<search:result index="10" uri="/2.xml" path="fn:doc(&quot;/2.xml&quot;)" score="2048" confidence="0.09590387" fitness="1">
<search:snippet/>
</search:result>


Notice that the total attribute gives the estimate of the results, starting from the first result in the page, similar to the xdmp:estimate result above, and is based on unfiltered index (fragment-level) information. Thus the value of 400 is returned.

When using search:search:

  • Each result in the report provided by the Search API reflects a document -- not a fragment. That is, the units in the Search API are documents. For instance, the report above has 10 results/documents.
  • Search has to estimate the number of result documents based on the indexes.
  • Indexes are based on fragments and not documents.
  • If no filtering is required to produce an accurate result set and if each fragment is a separate document, the document estimate based on the indexes will be accurate.
  • If filtering is required or if documents aggregate multiple matching fragments, the estimate will be inaccurate. The only way to get an accurate document total in these cases would be to retrieve each document, which would not scale.
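
The gap between the estimate and the number of result documents can be reproduced with a toy model. The following Python sketch is only an illustration of the counting, not of how MarkLogic implements its indexes: each fragment is a (URI, text) pair, the estimate counts matching fragments, and the result set contains one entry per document.

```python
from typing import List, Set, Tuple

Fragment = Tuple[str, str]  # (document URI, text held by this fragment)

# 20 documents: one root fragment with no text of its own, plus 20 node fragments.
fragments: List[Fragment] = []
for d in range(1, 21):
    uri = f"/{d}.xml"
    fragments.append((uri, ""))
    fragments.extend((uri, f"foo {n}") for n in range(1, 21))


def estimate(frags: List[Fragment], word: str) -> int:
    """Index-style count: how many fragments contain the word."""
    return sum(word in text.split() for _, text in frags)


def matching_documents(frags: List[Fragment], word: str) -> Set[str]:
    """Result-style count: which documents contain the word anywhere."""
    return {uri for uri, text in frags if word in text.split()}


print(len(fragments), estimate(fragments, "foo"),
      len(matching_documents(fragments, "foo")))
```

The fragment-level count is 400 while the document-level count is 20, which mirrors the difference between the total attribute and the number of documents in the example above.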

Fragmentation and relevance

Fragmentation also has an effect on relevance.  See Fragments.


Should I use fragmentation?

Fragmentation can be useful at times, but generally it should not be used unless you are sure you need it and understand all the tradeoffs. Alternatively, you can break your document into subdocuments instead. In general, the search API is designed to work better without fragmentation in play.

SUMMARY

Some MarkLogic Server sites are installed in a 1GB network environment. At some point, your cluster growth may require an upgrade to 10GB ethernet. Here are some hints for knowing when to migrate up to 10GB ethernet, as well as some ways to work around it prior to making the move to 10GB.

General Approach

A good way to check if you need more network bandwidth is to monitor the network packet retransmission rate on each host.  To do this, use the "sar -n EDEV 5" shell command. [For best results, make sure you have an updated version of sar]

Sample results:

# sar -n EDEV 5 3
...
10:41:44 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
10:41:49 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:41:49 AM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:41:49 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
10:41:54 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:41:54 AM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:41:54 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
10:41:59 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:41:59 AM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:        IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00


Explanation of terms:

FIELD      DESCRIPTION
IFACE      LAN interface
rxerr/s    Bad packets received per second
txerr/s    Bad packets transmitted per second
coll/s     Collisions per second
rxdrop/s   Received packets dropped per second because buffers were full
txdrop/s   Transmitted packets dropped per second because buffers were full
txcarr/s   Carrier errors per second while transmitting packets
rxfram/s   Frame alignment errors on received packets per second
rxfifo/s   FIFO overrun errors per second on received packets
txfifo/s   FIFO overrun errors per second on transmitted packets

If the values of txerr/s and txcarr/s are nonzero, packets sent by this host are being dropped over the network, and the host needs to retransmit them.  By default, a host will wait for 200ms to see if there is an acknowledgment packet before taking this retransmission step. This delay is significant for MarkLogic Server and will factor into overall cluster performance.  You may use this as an indicator that it's time to upgrade (or debug) your network.
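
Checking these two columns by eye across many hosts is tedious, so the check can be scripted. The following is a minimal Python sketch that parses text in the layout shown in the sample results; the function name flag_tx_errors is hypothetical, the sample values are made up, and flagging an interface when either column is nonzero is a choice made for this example.

```python
from typing import List, Set


def flag_tx_errors(sar_output: str) -> Set[str]:
    """Return interfaces with nonzero txerr/s or txcarr/s in `sar -n EDEV` text."""
    flagged: Set[str] = set()
    columns: List[str] = []
    iface_idx = None
    for line in sar_output.splitlines():
        tokens = line.split()
        if "IFACE" in tokens:
            # Header row: remember where the interface name sits and the column names.
            iface_idx = tokens.index("IFACE")
            columns = tokens[iface_idx + 1:]
            continue
        if iface_idx is None or len(tokens) != iface_idx + 1 + len(columns):
            continue  # not a data row that matches the last header seen
        try:
            values = dict(zip(columns, map(float, tokens[iface_idx + 1:])))
        except ValueError:
            continue
        if values.get("txerr/s", 0.0) > 0 or values.get("txcarr/s", 0.0) > 0:
            flagged.add(tokens[iface_idx])
    return flagged


# Made-up sample in the same layout as the output above.
SAMPLE = """\
10:41:44 AM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s
10:41:49 AM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:41:49 AM eth0 0.00 0.20 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s
Average: lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: eth0 0.00 0.20 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""
print(sorted(flag_tx_errors(SAMPLE)))
```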

Other Considerations

10 gigabit ethernet requires special cables.  These cables are expensive and easy to break.  If a cable is bent even slightly out of specification, you will not get 10 gigabit ethernet out of it. So be sure to work with your IT department to ensure that everything is installed as per the manufacturer's specification. Once installed, double-check that you are actually getting 10GB from the installed network.

Another option is to use bonded ethernet to increase network bandwidth from 1GB to 2GB and to 4GB prior to jumping to 10GB.  A description of Bonded ethernet lies beyond the scope of this article, but your IT department should be familiar with it and be able to help you set it up.

 

The recommended way to run MarkLogic on AWS is to use the "managed" Cloud Formation template provided by MarkLogic:

https://developer.marklogic.com/products/cloud/aws

The documentation for it is here:

https://docs.marklogic.com/guide/ec2/CloudFormation

By default, the MarkLogic nodes are hidden in Private Subnets of a VPC and the only way to access them from the Internet is via the Elastic Load Balancer.

This is optimal as it distributes the load and shields the nodes from common attack vectors.

However, for some types of maintenance it may be useful, or even necessary to SSH directly into individual MarkLogic nodes.

Examples where this is necessary:

1. Configuring Huge Pages size so that it is correct for the instance size/amount of RAM: https://help.marklogic.com/Knowledgebase/Article/View/420/0/group-level-cache-settings-based-on-ram

2. Manual MarkLogic upgrade where a new AMI is not yet available (for example for emergency hotfix): https://help.marklogic.com/Knowledgebase/Article/View/561/0/manual-upgrade-for-marklogic-aws-ami

 


To enable SSH access to MarkLogic nodes you need to:

I. Create an intermediate EC2 host, commonly known as 'bastion' or 'jump' host.

II. Put it in the correct VPC and correct (public) subnet and ensure that it has public / Internet-facing IP address

III. Adjust security settings so that SSH connections to the bastion host, as well as SSH connections from the bastion to the MarkLogic nodes, are allowed, and launch the bastion instance.

IV. Additionally, you will need to configure SSH key forwarding or a similar solution so that you don't need to store your private key on the bastion host.

I. Creating the EC2 instance in AWS Console:

1. The EC2 instance needs to be in the same region as the MarkLogic Cluster so the starting console URL will be something like this (depending on the region and your account):

https://eu-west-1.console.aws.amazon.com/ec2/home?region=eu-west-1#LaunchInstanceWizard:

2. The instance OS can be any Linux of your choice and the default Amazon Linux 2 AMI is fine for this. For most scenarios the jump host does not need to be powerful so any OS that is free tier eligible is recommended:

Step1-AMI.png

3. Choose instance size. For most scenarios (including SSH for admin access), the free tier t2.micro is the most cost-effective instance:

Step2-Instance-type.png

4. Don't launch the instance just yet - go to Step 3 of the Launch Wizard ("Step 3: Configure Instance Details").

II. Put the bastion host in the correct VPC and subnet and configure public IP:

The crucial steps here are:

1. Choose the same VPC that your cluster is in. You can find the correct VPC by reviewing the resources under the Cloud Formation template section of the AWS console or by checking the details of the MarkLogic EC2 nodes.

2. Choose the correct subnet - you should navigate to the VPC section of the AWS Console, and see which of the subnets of the MarkLogic Cluster has an Internet Gateway in its route table.

3. Ensure that "Auto-assign Public IP" setting is set to "enable" - this will automatically configure a number of AWS settings so that you won't have to assign Elastic IP, routing etc. manually.

4. Ensure that you have sufficient IAM permissions to be able to create the EC2 instance and update security rules (to allow SSH traffic).

Step3-instance-details.png

III. Configure security settings so that SSH connections are allowed and launch:

1. Go to "Step 6: Configure Security Group" of the AWS Launch Wizard. By default, AWS will suggest creating "launch" security group that opens SSH incoming to any IP address. You can adjust as necessary to allow only a certain IP address range, for example.

Step6-security.png

Additionally, you may need to review the security group setting for your MarkLogic cluster so that SSH connections from bastion host are allowed.

2. Go to "Step 7: Review Instance Launch" and press "Launch". At this step you need to choose a correct SSH key pair for the region or create a new one. You will need this SSH key to connect to the bastion host.

ssh-keypair.png

3. Once the EC2 instance launches, review its details to find out the public IP address.

instance-publicIP.png

IV. Configure SSH key forwarding so that you don't have to permanently store your private SSH key on the bastion host. Please review your options and alternatives here (for example, using ProxyCommand): with key forwarding, your SSH agent is reachable from the bastion host for the duration of your session, so anyone with root access to the bastion host could use your MarkLogic private key while you are logged in.

1. Add the private key to the SSH agent:

ssh-add -K myPrivateKey.pem

2. Test the connection (with SSH agent forwarding) to the bastion host using:

ssh -A ec2-user@<bastion-IP-address>

3. Once you're connected, ssh from the bastion to a MarkLogic node:

ssh ec2-user@<MarkLogic-instance-IP-address or DNS-entry>
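
As one alternative to agent forwarding, OpenSSH 7.3 and later provide the -J (ProxyJump) option, which tunnels through the bastion while authentication still happens from your own machine. The following is a minimal Python sketch that only builds the command line; the function name jump_command and the IP addresses are placeholders, and it assumes the same ec2-user account on both hosts.

```python
import shlex
from typing import List


def jump_command(bastion: str, node: str, user: str = "ec2-user") -> List[str]:
    """Build an ssh argument list that reaches `node` through `bastion`."""
    return ["ssh", "-J", f"{user}@{bastion}", f"{user}@{node}"]


# Placeholder addresses: a public bastion IP and a private MarkLogic node IP.
argv = jump_command("203.0.113.10", "10.0.1.25")
print(shlex.join(argv))
```

The list can be passed to subprocess.run, or the printed command can be pasted into a terminal on the machine that holds the private key.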

ssh-verify.png

For strictly AWS infrastructure issues (VPC, subnets, security groups) please contact AWS support. For any MarkLogic related issues please contact MarkLogic support via:

help.marklogic.com

Introduction

This article discusses why MarkLogic Server should be started with root privileges.

Details

It is possible to install MarkLogic Server in a directory that does not require root privileges.

There's also a section in our Installation Guide (Configuring MarkLogic Server on UNIX Systems to Run as a Non-daemon User) that talks at some length about how to run MarkLogic Server as a user other than daemon on UNIX systems. While that will allow you to configure permissions for non-root and non-daemon users in terms of file ownership and actual runtime, you'll still want to be the root user to start and stop the server.

It is possible to start MarkLogic without su privileges, but this is strongly discouraged.

The parent (root) MarkLogic process is simply a restarter process. It is there simply to wait for the non-root process to exit, and if the non-root process exits abnormally for some reason, the root process will fork and exec another non-root process. The root process runs no XQuery scripts, opens no sockets, and accesses no database files.

We strongly recommend starting MarkLogic as root and letting it switch to the non-root user on its own. When the server initializes, if it is root it makes some privileged kernel calls to configure sockets, memory, and threads. For example, it allocates huge pages if any are available, increases the number of file descriptors it can use, binds any configured low-numbered socket ports, and requests the capability to run some of its threads at high priority. MarkLogic Server will function if it isn’t started as root, but it will not perform as well.

You can work around the root-user requirements for starting/stopping (and even installation/uninstallation) by creating wrapper scripts that call the appropriate script (startup, shutdown, etc.), providing sudo privileges to just the wrapper.  This helps to control and debug execution.
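One way to set up such a wrapper is sketched below. It is an illustration rather than a MarkLogic-supplied script: the ml_ctl function, the ML_CTL_DRY_RUN variable, the mladmin account and the /usr/local/sbin/ml-ctl path are all hypothetical names, and it assumes the server is controlled through /sbin/service MarkLogic as on a typical Linux install.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the MarkLogic service script.
# Assumed sudoers entry (account and path are illustrative):
#   mladmin ALL=(root) NOPASSWD: /usr/local/sbin/ml-ctl

ml_ctl() {
  local action=${1:-}
  # Only allow a fixed set of actions through the wrapper.
  case "$action" in
    start|stop|restart|status) ;;
    *) echo "usage: ml-ctl {start|stop|restart|status}" >&2
       return 2 ;;
  esac
  # With ML_CTL_DRY_RUN set, print the command instead of running it.
  if [ -n "${ML_CTL_DRY_RUN:-}" ]; then
    echo "/sbin/service MarkLogic $action"
    return 0
  fi
  # Record who asked for what; a failure to log is ignored.
  logger -t ml-ctl "user=${SUDO_USER:-${USER:-unknown}} action=$action" 2>/dev/null || true
  /sbin/service MarkLogic "$action"
}

# Run only when arguments are supplied, e.g. "sudo ml-ctl restart".
if [ "$#" -gt 0 ]; then
  ml_ctl "$@"
fi
```

Restricting the wrapper to a short list of actions keeps the sudo grant narrow, and the log line gives you a record to consult when debugging an unexpected restart.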

Further reading

Knowledgebase - Pitfalls Running Marklogic Process as Non-root User 

Summary

When attempting to send email from MarkLogic, from Ops Director, Query Console, or other application, you might encounter one of the following errors in your MarkLogic Server Error Log, or in the Query Console results pane.

  • Error sending mail: STARTTLS: 502 5.5.1 Error: command not implemented
  • Error sending mail: STARTTLS: 554 5.7.3 Unable to initialize security subsystem

This article explains what these errors mean and provides some ways to resolve them.

What these Errors Mean

These errors indicate that MarkLogic is attempting to send an SMTPS email through the relay, and the relay either does not support SMTPS, or SMTPS has not been configured correctly.

Resolving the Error

One possible cause of this error is when the smtp relay setting for MarkLogic server is set to localhost.  The error can be resolved by using the Admin Interface to update the smtp relay setting with the organizational SMTP host or relay.  That setting can be found under Configure --> Groups --> [GroupName]: Configure tab, then search for 'smtp relay'.

If this error occurs when testing the Email configuration for Ops Director, you can configure Ops Director to use SMTP instead of SMTPS by ensuring the Login and Password fields are blank.  These fields can be found under Console Settings --> Email Configuration in the Ops Director application.

Alternatively, install/configure an SMTP server with SMTPS support.

Related Reading

https://en.wikipedia.org/wiki/SMTPS

https://www.f5.com/services/resources/deployment-guides/smtp-servers-big-ip-v114-ltm-afm

Summary

When an SSL certificate is expired or out of date, it is necessary to renew the SSL certificates applied to a MarkLogic Application Server.   

The following general steps are required to apply an SSL certificate.  

  1. Create a certificate request for a server in MarkLogic
  2. Download Certificate Request and send it to certificate authority
  3. Import signed certificate into MarkLogic

Detailed Steps

Before proceeding, please note that you don't need to create a new template to renew an expired certificate, as the existing template will work.

1. Creating a certificate request - A fresh CSR can be generated from the MarkLogic Admin UI by navigating to Security -> Certificate Templates -> click [your_template] -> click the Request tab -> select the radio button applicable to an expired/out-of-date certificate. For additional information, refer to the Generating and Downloading Certificate Requests section of our Administrator's Guide.

2. Download and Send to certificate authority - The certificate template Status page will display the newly generated request. You can download it and send it to your certificate authority for signing.

3. Import signed certificate into MarkLogic - After receiving the signed certificate back from the certificate authority, you can import it from our Admin UI by navigating to Security -> Certificate Templates -> click [your_template] -> Import tab.  For additional information, refer to the Importing a Signed Certificate into MarkLogic Server section of our Administrator's Guide.

4. Verify - To verify whether the certificate has been renewed, please look at the summary of your certificate authority. The newly added certificate should appear under that certificate authority. Detailed instructions for this are available in the Viewing Trusted Certificate Authorities section of our Administrator's Guide.

If you are not able to view the certificate authority, then you may need to add the certificate as if it were a new CA. This can happen if there was a change in the CA certificate chain.

  • Click on the certificate template name and then import the certificate. You should already have this CA listed (as this was already there and only the certificate expired). However if there is a change in certificate authority then you will need to import it - you can do this by navigating in the Admin UI to Configure -> Security -> Certificate Authorities --> click on the import tab - this will be equivalent to adding a new CA certificate into MarkLogic. The CA certificate name will now appear in the list.

 

 

 

Summary

When running MarkLogic on AWS without using the Managed Cluster feature, a hostname warning may occur under certain circumstances.

Customer Managed Clusters

Customers who manage their own clusters in AWS use the /etc/marklogic.conf file to disable the MarkLogic Managed Cluster feature. This is done by setting MARKLOGIC_EC2_HOST=0 to disable all of MarkLogic's EC2 enhancements, or by setting MARKLOGIC_MANAGED_CLUSTER=0 to disable only the Managed Cluster feature. This should be done prior to starting MarkLogic for the first time on the host.

AWS Configuration Variables

SVC-SOCHN Errors

When MarkLogic is started prior to /etc/marklogic.conf being put in place, it will populate /var/local/mlcmd.conf file with some default values, including a MARKLOGIC_HOSTNAME value that is based on the current instance hostname. If this volume is used on a new instance, it's possible to end up with a value for MARKLOGIC_HOSTNAME that will no longer resolve. This will result in the following error:

2020-04-16 15:15:36.468 Warning: A valid hostname is required for proper functioning of MarkLogic Server: SVC-SOCHN: Socket hostname error: getaddrinfo ip-10-10-0-15.mydomain.com: Name or service not known

The issue does not impact the functioning of the cluster.

Resolving the Issue

After verifying that /etc/marklogic.conf has the correct entries, remove the /var/local/mlcmd.conf file, and restart the MarkLogic service on the host.
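The steps above can be sketched as a small script. This is a minimal illustration under stated assumptions: the function names are hypothetical, the check only recognizes plain VAR=0 lines (optionally prefixed with export), and the restart assumes the /sbin/service MarkLogic command is available on the host.

```shell
# Succeeds when the given conf file disables the Managed Cluster feature,
# either directly or by turning off all EC2 enhancements.
managed_cluster_disabled() {
  local conf=${1:-/etc/marklogic.conf}
  grep -Eq '^[[:space:]]*(export[[:space:]]+)?MARKLOGIC_(EC2_HOST|MANAGED_CLUSTER)=0([[:space:]]|$)' "$conf"
}

# Remove the stale generated file and restart, but only after the
# configuration check passes.
reset_mlcmd_conf() {
  if ! managed_cluster_disabled /etc/marklogic.conf; then
    echo "/etc/marklogic.conf does not disable Managed Cluster; leaving mlcmd.conf alone" >&2
    return 1
  fi
  sudo rm /var/local/mlcmd.conf
  sudo /sbin/service MarkLogic restart
}
```

Call reset_mlcmd_conf on the affected host once you have reviewed /etc/marklogic.conf by hand; the automated check is only a guard against running the removal on a host that still relies on the Managed Cluster feature.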

Further Reading

Getting Started with MarkLogic Server on AWS

Summary

It is important to have swap space configured on local disk, as recommended, on the computers on which MarkLogic runs. The general recommendation (as a minimum) is 2x physical memory allocated for swap, although that can be relaxed for Linux systems (see below).  While a properly configured MarkLogic Server installation should not need to use swap space during normal operations, the swap space is still important so that the operating system can allocate very large chunks of memory.  Even though it might seem counter-intuitive, having swap space configured appropriately actually allows the computer to run more efficiently by not needing to use its swap space.  When MarkLogic asks the operating system for a large chunk of memory (for example, when performing a large search), if there is not enough swap space available, then the operating system will have to use the swap space before it can allocate the memory, causing slow performance and possible search failures with SVC-MALLOC 'out of memory' errors. 

Linux

On Linux systems, the minimum swap space size should be set to 1x the size of RAM (if the node hosting ML has less than 32GB of RAM) or 32 GB (if the node has 32GB or more of RAM). If you have Huge Pages set up on a Linux system, the minimum swap space size on that machine should be equal to the size of your physical memory minus the size of your Huge Pages (because Linux Huge Pages are not swapped), or 32GB, whichever is lower.  For more information about huge pages, see Linux Huge Pages.
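The Linux rule above reduces to a small calculation: take physical memory minus any Huge Pages allocation, and cap the result at 32GB. The sketch below works in whole gigabytes; min_swap_gb is a hypothetical helper name, not a MarkLogic tool.

```shell
# Minimum recommended swap in GB for a Linux host.
#   $1 = physical RAM in GB
#   $2 = memory reserved for Huge Pages in GB (optional, default 0)
min_swap_gb() {
  local ram_gb=$1
  local huge_gb=${2:-0}
  # Huge Pages are never swapped, so they do not need backing swap.
  local need=$(( ram_gb - huge_gb ))
  if [ "$need" -lt 0 ]; then need=0; fi
  # The recommendation is capped at 32GB.
  if [ "$need" -gt 32 ]; then need=32; fi
  echo "$need"
}
```

For example, min_swap_gb 16 gives 16, min_swap_gb 64 gives 32, and min_swap_gb 64 48 gives 16.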

Another reason for configuring sufficient swap on a Unix system is to ensure that the MarkLogic process is not killed by the OOM (Out Of Memory) killer.  Since the MarkLogic process will be the largest on the server, the OOM killer will kill MarkLogic if the system runs out of memory.  In the event that the server has an issue causing it to thrash, having the recommended amount of swap space will allow for a more graceful shutdown.

Windows

For Windows systems we recommend configuring the server to use 2x physical memory. However, Windows systems are normally set up to grow the swap (page) file as needed. 

Solaris

For Solaris we recommend configuring the system to use 2x physical memory.

The reason for configuring 2x swap on a Solaris system is that the system calls used by MarkLogic to execute converter programs and to process fatal signals (in order to get stack traces) require enough swap space to exist to mirror the existing MarkLogic process space.  That space is not used, but it needs to be available in order to allocate the memory. 

Summary

MarkLogic Server expects the system clocks to be synchronized across all the nodes in a cluster, as well as between Primary and Replica clusters. The acceptable level of clock skew (or drift) between hosts is less than 0.5 seconds, and values greater than 30 seconds will trigger XDMP-CLOCKSKEW errors, and could impact cluster availability.

Cluster Hosts should use NTP to maintain proper clock synchronization.

Inside MarkLogic Clock Time usage

MarkLogic hosts include a precise time of day in the XDQP heartbeat messages they send to each other. When a host processes an incoming XDQP heartbeat message, it compares the time of day in the message against its own clock. If the difference is large enough, the host reports a CLOCKSKEW in the ErrorLog.
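When you have measured the offset between two hosts yourself, the thresholds from the Summary (under 0.5 seconds is acceptable, over 30 seconds triggers XDMP-CLOCKSKEW) can be applied with a helper like the one below. This is only a sketch for interpreting a measurement; the function name and the three labels are hypothetical, and it does not reproduce the server's internal check.

```shell
# Classify a measured clock offset (in seconds, may be negative or fractional).
classify_clock_skew() {
  awk -v s="$1" 'BEGIN {
    s = s + 0                 # force a numeric value
    if (s < 0) s = -s         # only the magnitude matters
    if (s < 0.5)       print "ok"
    else if (s > 30)   print "clockskew"
    else               print "drifting"
  }'
}
```

For instance, classify_clock_skew 0.2 prints ok, classify_clock_skew 12 prints drifting, and classify_clock_skew -45 prints clockskew.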

Clock Skew

MarkLogic does not thoroughly test clusters in a clock skewed configuration, as it is not a valid configuration. As a result, we do not know all of the ways that a MarkLogic Server Cluster would fail. However, there are some areas where we have noticed issues:

  • Local disk failover may not perform properly as the inter-forest negotiations regarding which forest has the most up to date content may not produce the correct results.
  • Database replication can hang
  • SSL certificate verification may fail on the time range.

If MarkLogic Server detects a clock skew, it will write a message to the error log such as one of the following:

  • Warning: Heartbeat: XDMP-CLOCKSKEW: Detected clock skew: host hostname.domain.com skewed by NN seconds
  • Warning: XDQPServerConnection::init: nnn.nnn.nnn.nnn XDMP-CLOCKSKEW: Detected clock skew: host host.domain.local skewed by NN seconds
  • Warning: Excessive clock skew detected; suggest using NTP (NN seconds skew with hostname)

If one of these lines appears in the error log, or you see repeated XDMP-CLOCKSKEW errors over an extended time period, the clock skew between the hosts in the cluster should be verified. However, do not be alarmed if this warning appears even if there is no clock skew. This message may appear on a system under load, or at the same time as a failed host comes back online. In these cases the errors will typically clear within a short amount of time, once the load on the system is reduced.

Time Sync Config

NTP is the recommended solution for maintaining system clock synchronization.

(1) NTP clients on Linux

The most common Linux NTP clients are ntpd and chrony.   Either of these can be used to ensure your hosts stay synchronized to a central NTP time source.  You can check the settings for NTP, and manually update the date if needed.

The instructions in the link below go over the process of checking the ntpd service, and updating the date manually using the ntpdate command. 
The following Server Fault article goes over the process of forcing chrony to manually update and step the time using the chronyc command.

Running the applicable command on the affected servers should resolve the CLOCKSKEW errors for the short term.

If the ntpd or chrony service is not running, you can still use the ntpdate or chronyc command to update the system clock, but you will need to configure a time service to ensure accurate time is maintained, and avoid future CLOCKSKEW errors.  For more information on setting up a time synchronization service, see the following KB article:

(2) NTP clients on Windows

Windows servers can be configured to retrieve time directly from an NTP server, or from a Primary Domain Controller (PDC) in the root of an Active Directory forest that is configured as an NTP server. The following link includes information on configuring NTP on a Windows server, as well as configuring a PDC as an NTP server.

https://support.microsoft.com/en-us/help/816042/how-to-configure-an-authoritative-time-server-in-windows-server

(3) VMWare time synchronization

If your systems are VMWare virtual machines, then you may need to take the additional step of disabling time synchronization of the virtual machine. By default the VMWare daemon will synchronize the Guest OS to the Host OS once per minute, and may interfere with ntpd settings. Through the vSphere Admin UI, you can disable time synchronization between the Guest OS and Host OS in the virtual machine settings.

Configuring Virtual Machine Options

This will prevent regular time synchronization, but synchronization will still occur during some VMWare operations, such as Guest OS boots/reboots and resuming a virtual machine, among others. To disable VMWare clock sync completely, you need to edit the .vmx file for the virtual machine to set several synchronization properties to false. Details can be found in the following VMWare Blog:

Completely Disable Time Synchronization for your VM

(4) AWS EC2 time synchronization

For AWS EC2 instances, if you are noticing CLOCKSKEW messages in a MarkLogic cluster, you may benefit from changing the clock source from the default xen to tsc.
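On Linux, the active and available clock sources are exposed under sysfs, so you can inspect them before changing anything. The helper names below are hypothetical; the sysfs path is the standard Linux location and is assumed to be present on your instance.

```shell
CLOCKSOURCE_DIR=/sys/devices/system/clocksource/clocksource0

# Print the clock source currently in use (for example xen or tsc).
current_clocksource() {
  cat "${1:-$CLOCKSOURCE_DIR}/current_clocksource"
}

# Succeeds when tsc is offered by the kernel on this instance.
tsc_available() {
  grep -qw tsc "${1:-$CLOCKSOURCE_DIR}/available_clocksource"
}

# To switch at runtime (the change is lost on reboot), run as root:
#   echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
```

A runtime change does not survive a reboot; a persistent change is usually made with the clocksource=tsc kernel boot parameter, so check your distribution's boot loader documentation before relying on it.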

Other sources for Clock Skew

(1) Overloaded Host leading to Clock Skew

If, for some reason, a long time passes between when an XDQP heartbeat message is encoded on the sending host and when it is decoded on the receiving host, it will be interpreted as a CLOCKSKEW. Below are some of the situations that can lead to a CLOCKSKEW.

  • If a sending host is overloaded enough that heartbeat messages are taking a long time to be sent, it could be reported as a transient CLOCKSKEW by the receiver.
  • If a receiving host is overloaded enough that a long time elapsed between sending time and processing time, it can be reported as a transient CLOCKSKEW.

If you see a CLOCKSKEW message in ErrorLog combined with other messages (Hung messages, Slow Warning) then Server is likely overloaded and thrashing. Messages reporting broken XDQP connections (Stopping XDQPServerConnection) are a good indication that a host is overloaded and hung for a while, so much that other hosts disconnected.

(2) XDQP Thread start fail leading to Clock Skew

When MarkLogic starts up, it tries to raise the limit on the number of processes per user to at least 16384. If MarkLogic is not started as root, it can only raise the soft limit (for the number of processes per user) up to the hard limit, which could cause XDQP thread startup to fail. You can check the current setting with the shell command ulimit -u and make sure the number of processes per user is at least 16384.
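The check can be scripted as shown below. nproc_limit_ok is a hypothetical helper; it accepts the output of ulimit -u (soft limit) or ulimit -Hu (hard limit), both of which may report the word unlimited.

```shell
# Succeeds when a processes-per-user limit is at least 16384 (or unlimited).
nproc_limit_ok() {
  local limit=$1
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -ge 16384 ]
}

# Report on the soft and hard limits of the current shell.
for kind in soft:-u hard:-Hu; do
  value=$(ulimit "${kind#*:}")
  if nproc_limit_ok "$value"; then
    echo "${kind%%:*} limit $value: ok"
  else
    echo "${kind%%:*} limit $value: below 16384"
  fi
done
```

Run it as the user that owns the MarkLogic process, since the limits are per user. If the hard limit is below 16384, a server that is not started as root cannot raise its soft limit far enough.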

Further Reading

Introduction

This article gives a brief summary of the MarkLogic Telemetry feature available in MarkLogic Server version 9.

What is Telemetry used for?

Telemetry is a communication channel between customer's MarkLogic Server instances and the MarkLogic Support team. When enabled, historical configuration, system log and performance data is efficiently collected for immediate access by the Support team who can begin working on your support incident. Having immediate access to this critical system data will often lead to quicker diagnostics and resolution of your logged support incidents. 

When Telemetry is enabled, MarkLogic Server collects data locally and periodically uploads it encrypted and anonymised to a secure cloud storage. Data collected locally follows MarkLogic encryption settings and can be reviewed at any time. Telemetry has very low impact on the server performance as it does not require any communication between nodes and it does not depend on any database or index settings. Telemetry does require some local disk space and an SSL connection (Port 443) to access *telemetry.services.marklogic.com.

What is Captured and What is Not

Telemetry data is only collected from:

Telemetry neither collects nor sends application specific logs or customer data.

How to enable

Telemetry can be enabled at any time after MarkLogic 9 is installed through the Admin-UI, Admin-API or Rest interfaces. It is recommended that you enable Telemetry in order to have data uploaded and available before an incident is reported to MarkLogic Technical Support.  The following script is an example of how to enable Telemetry from Query Console with recommended settings for all nodes in a cluster:

Telemetry will be enabled during run time (doesn't require a restart) and starts uploading as soon as some data is collected and a connection to *telemetry.services.marklogic.com is established. All configuration settings can be changed at any time and are not dependent on other log level settings. Currently the following data types are configurable:

  • Configuration files reflect MarkLogic cluster settings over time
  • ErrorLog will only contain system related information; Application level logging, which may contain Personally Identifiable Information, are not included in the system ErrorLog files captured by the Telemetry feature.
  • Metering (performance) data holds information about cluster, host, and forest status and application feature metrics

In addition, Telemetry supports uploading a Support Request on demand to the secure cloud storage. Uploading a Support Request is independent of all configured Telemetry settings as long as a connection to *.telemetry.services.marklogic.com over SSL can be established.

Who has access

Telemetry data is stored in secured cloud storage using the Cluster-ID as identifier. A Cluster-ID is a number randomly generated during a MarkLogic installation. Access to the data is restricted and requires an open Support Ticket with a provided Cluster-ID. Data will be accessed and downloaded only by the Support Team for the period of time a Support Ticket is open. As soon as the ticket is closed all downloaded data will be destroyed. Data uploaded to the cloud storage will be held for a few months until it is deleted.

Further reading

More details can be found in the Telemetry (Monitoring MarkLogic Guide) in our documentation.

  

Introduction

XQuery modules can be imported from other XQuery modules in MarkLogic Server. This article describes how modules are resolved in MarkLogic when they are imported in XQuery.

Details

How modules are imported in code

Modules can be imported by using two approaches-

--by providing relative path

import module namespace m = "http://example.edu/example" at "example.xqy";

--Or by absolute path

import module namespace m = "http://example.edu/example" at "/example.xqy";

 

How MarkLogic resolves the path and loads the module

If the path starts with a slash, it is a non-relative path and MarkLogic takes it as is. If it doesn't, it is a relative path and is first resolved relative to the URI of the current module to obtain a non-relative path. 
 
Path in hand, MarkLogic always starts by looking in the Modules directory. This is a security measure, as we want to make sure that the MarkLogic-created modules are the ones chosen. In general, users should NOT put their modules there. Doing so creates issues on upgrade, and if they open up permissions on the directory to ease deployment, it creates a security hole. 
 
Then, depending on whether the appserver is configured to use a modules database or the filesystem, we interpret the non-relative path in terms of the appserver root either on the file system or in the Modules database. 
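The first step, turning an import path into a non-relative path, can be illustrated with a few lines of shell. This is a simplified model for reasoning about import failures, not MarkLogic's implementation: resolve_module_path is a hypothetical name, and the sketch does not normalize segments such as "..".

```shell
# Resolve an import path against the URI of the importing module.
#   $1 = URI of the current module, e.g. /app/lib/main.xqy
#   $2 = path given in the "at" clause of the import
resolve_module_path() {
  local current=$1
  local import=$2
  case "$import" in
    /*) echo "$import" ;;                    # leading slash: taken as is
    *)  echo "${current%/*}/$import" ;;      # relative: sibling of current module
  esac
}
```

With the two imports shown earlier, a module at /app/lib/main.xqy resolves "example.xqy" to /app/lib/example.xqy and "/example.xqy" to /example.xqy. The resulting path is then looked up in the Modules directory first and, after that, under the app server root.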

 

Debugging module path issue

To debug this, you can also enable the module caching trace events, which show how MarkLogic resolves the paths. Enter "module" as the name of the event under Diagnostics > Events and a list of module caching events will be added. These will give you the working details of how module resolution is happening, and should provide enough information to resolve the issue.

Be aware that diagnostic traces can fill up your ErrorLog.txt file very fast, so be sure to turn them off as soon as you no longer need them.

 

Performance Hints

1. Be sure that your code does not rely on dynamically-created modules. Although these may be convenient at times, they will make overall performance suffer. This is because every time a module changes, the internal modules cache is invalidated and must be re-loaded from scratch -- which will tend to hurt performance.

2. If you are noticing a lot of XDMP-DEADLOCK messages in your log, be sure your modules are not mixing any update statements within what should be a read-only query. The XQuery parser looks for updates anywhere in the modules stack -- including imports -- and if it finds one, it assumes that any URI that is gathered by the queries might potentially be updated. Thus, if the query matches 10 URIs, it will put a write lock on them, and if it matches 100000 URIs, it will lock all of them as well, and performance will suffer. To prevent this, be sure to isolate updates in their own transactions via xdmp:eval() or xdmp:spawn().

 

 

Summary

There are a number of options for transferring data between MarkLogic Server clusters. The best option for your particular circumstances will depend on your use case.

Details

Database Backup and Restore

To transfer the data between two independent clusters, you may use a database backup and restore procedure, taking advantage of MarkLogic Server's facility to make a consistent backup of a database.

Note: the backup directory path that you use must exist on all hosts that serve any forests in the database. The directory you specify can be an operating system mounted directory path, it can be an HDFS path, or it can be an S3 path. Further information on using HDFS and S3 storage with MarkLogic is available in our documentation:

Further information regarding backup and restore may be found in our documentation and Knowledgebase:

Database Replication

Database Replication is another method you might choose to use to transfer content between environments. Database Replication will allow you to maintain copies of forests on databases in multiple MarkLogic Server clusters. Once the replica database in the replica cluster is fully synchronized with its master, you may break replication between the two and then go on to use the replica cluster/database as the master.

Note: to enable Database Replication, a license key that includes Database Replication is required. You would also need to ensure that all hosts are: running the same maintenance release of MarkLogic Server; using the same type of Operating System; and Database Replication is correctly configured.

Also note that before MarkLogic server version 9.0-7, indexing information was not replicated over the network between the Master and Replica databases and is instead regenerated by the Replica database.

Starting with MarkLogic Server version 9.0-7, index data is also replicated from the Master to the Replica, but the server does not automatically check whether both sides have the same index settings. The following Knowledgebase article contains further information on this:

Further details on Database Replication and how it can be configured, may be found in our documentation:

MarkLogic Content Pump (mlcp)

Depending on your specific requirements, you may also like to make use of the MarkLogic Content Pump (mlcp), which is a command line tool for getting data out of and into a MarkLogic Server database. Using mlcp, you can export documents and metadata from a database, import documents and metadata to a database, or copy documents and metadata from one database to another.

If required, you may use mlcp to extract a consistent database snapshot, forcing all documents to be read from the database at a consistent point in time:

Note: the version of mlcp you use should be the same as the most recent version of MarkLogic Server that will be used in the transfer.

Also note that mlcp should not be run on a host that is currently running MarkLogic Server, as the Server assumes it has the entire machine available to it, including the CPU and disk I/O capacity.

Further information regarding mlcp is available in our documentation:

Further Information

Related Knowledgebase articles that you may also find useful:

Problem Statement

You have an application running on a particular cluster (the source cluster), devcluster, and you wish to port that application to a new cluster (the target cluster), testcluster. Porting the application can be divided into two tasks: configuring the target cluster and copying the code and data. This article is only about porting the configuration.

In an ideal world, the application is managed in an "infrastructure as code" manner: all of the configuration information about that cluster is codified in scripts and payloads stored in version control and able to be "replayed" at will. (One way to assure that this is the case is to configure testing for the application in a CI environment that begins by using the deployment scripts to configure the cluster.)

But in the real world, it's all too common for some amount of "tinkering" to have been performed in the Admin UI or via ad hoc calls to the REST Management API (RMA). And even if that hasn't happened, it's not generally possible to be certain that's the case, so you still have to worry that it might have happened.

Migrating the application

The central theme in doing this "by hand" is that RMA payloads are re-playable. That is, the payload you GET for the properties of a resource is the same as the payload that you PUT to update the properties of that resource.

If you were going to migrate an application by hand, you'd proceed along these lines.

Determine what needs to be migrated

An application consists (more or less by definition) of one or more application servers. Application servers have databases associated with them (those databases may have additional database associations). Databases have forests.

A sufficiently complex application might have application servers divided into different groups of hosts.

Applications may also have users (for example, each application server has a default user; often, but not always, "nobody").

Users, in turn, have roles, and roles may have roles and privileges. Code may have amps that use privileges.

That covers most of the bases, but beware that apps can have additional configuration that should be reviewed: security artifacts (certificates, external securities, protected paths or collections, etc.), mime types, etc.

Get Source Configuration

Using RMA, you can get the properties of all of these resources:

  • Application servers

    Hypothetically, the App-Services application server.

curl --anyauth -u admin:admin \
   "http://localhost:8002/manage/v2/servers/App-Services/properties?group-id=Default"
  • Groups

    Hypothetically, the Default group.

curl --anyauth -u admin:admin \
   http://localhost:8002/manage/v2/groups/Default/properties
  • Databases

    Hypothetically, the Documents database.

curl --anyauth -u admin:admin \
   http://localhost:8002/manage/v2/databases/Documents/properties
  • Users

    Hypothetically, the ndw user.

curl --anyauth -u admin:admin \
   http://localhost:8002/manage/v2/users/ndw/properties
  • Roles

    Hypothetically, the app-admin role.

curl --anyauth -u admin:admin \
   http://localhost:8002/manage/v2/roles/app-admin/properties
  • Privileges

    Hypothetically, the app-writer execute privilege.

curl --anyauth -u admin:admin \
   "http://localhost:8002/manage/v2/privileges/app-writer/properties?kind=execute"

And the create-document URI privilege.

curl --anyauth -u admin:admin \
   "http://localhost:8002/manage/v2/privileges/create-document/properties?kind=uri"
  • Amps

    Hypothetically, my-amped-function in /foo.xqy in the Modules
    database using the namespace http://example.com/.

curl --anyauth -u admin:admin \
   "http://localhost:8002/manage/v2/amps/my-amped-function/properties?modules-database=Modules&document-uri=/foo.xqy&namespace=http://example.com"

Create Target Configuration

Some of the properties of a MarkLogic resource may be references to other resources. For example, an application server refers to databases and a role can refer to a privilege. Consequently, if you just attempt to POST all of the property payloads, you may not succeed. The references can, in fact, be circular so that no sequence will succeed.

The easiest way to get around this problem is to simply create all of the resources using minimal configurations: Create the forests (make sure you put them on the right hosts and configure them appropriately). Create the databases, application servers, roles, and privileges. Create the amps. If you need to create other resources (security artifacts, mime types, etc.) create those.

Finally, PUT the property payloads you collected from the source cluster onto the target cluster. This will update the properties of each application server, database, etc. to be the same as the source cluster.
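The GET-then-PUT replay can be sketched with two small functions. This is an illustration under stated assumptions: the function names, the SRC_HOST, TGT_HOST and ML_AUTH variables, and the JSON file naming are hypothetical, and the sketch only covers resources whose properties endpoint needs no extra query parameters (servers, privileges and amps need the parameters shown earlier).

```shell
# Hypothetical host names and credentials; adjust for your clusters.
SRC_HOST=${SRC_HOST:-devcluster}
TGT_HOST=${TGT_HOST:-testcluster}
ML_AUTH=${ML_AUTH:-admin:admin}

# Build the properties endpoint for a resource, e.g. databases/Documents.
properties_url() {
  local host=$1 kind=$2 name=$3
  printf 'http://%s:8002/manage/v2/%s/%s/properties\n' "$host" "$kind" "$name"
}

# GET the properties of one resource from the source cluster as JSON.
export_properties() {
  local kind=$1 name=$2
  curl --silent --show-error --fail --anyauth -u "$ML_AUTH" \
       -H 'Accept: application/json' \
       -o "$kind-$name.json" \
       "$(properties_url "$SRC_HOST" "$kind" "$name")"
}

# PUT the saved payload onto the already-created resource on the target.
import_properties() {
  local kind=$1 name=$2
  curl --silent --show-error --fail --anyauth -u "$ML_AUTH" \
       -X PUT -H 'Content-Type: application/json' \
       -d @"$kind-$name.json" \
       "$(properties_url "$TGT_HOST" "$kind" "$name")"
}
```

A typical run would be export_properties databases Documents followed by import_properties databases Documents. Keeping the intermediate JSON files also gives you something to commit to version control, which moves the application a step closer to the "infrastructure as code" ideal described above.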

Related Reading

MarkLogic Documentation - Scripting Cluster Management

MarkLogic Knowledgebase - Transferring data between MarkLogic Server clusters

MarkLogic Knowledgebase - Best Practices for exporting and importing data in bulk

MarkLogic Knowledgebase - Deployment and Continuous Integration Tools

Summary:

MarkLogic allows the use of SSL certificates to be used when securing application servers.  This article explains some common issues seen when importing certificates, as well as methods to troubleshoot problems.

Importing a certificate into MarkLogic:

The general procedure for creating and importing a certificate into MarkLogic can be found in the docs here:  http://docs.marklogic.com/guide/admin/SSL#id_42684

For a certificate to be successfully imported, the public key of the signed certificate must match a public key contained in the Certificate Template.  MarkLogic will create a new public/private key pair for each Certificate Request that is generated within a Certificate Template.

Troubleshooting:

If you are having an issue where MarkLogic is not accepting the signed certificate you should first verify that your certificate is in PEM format.  If this is not the case, you can use openssl to convert your format to PEM.  Below are examples of how to convert between various formats using openssl.

Convert a DER file to PEM: openssl x509 -inform der -in certificate.cer -out certificate.pem

Convert a P7B file to PEM: openssl pkcs7 -print_certs -in certificate.p7b -out certificate.cer

Convert a PKCS#12 file to PEM: openssl pkcs12 -in keyStore.pfx -out keyStore.pem -nodes

If you are still experiencing issues when attempting to import a signed certificate, you should ensure that the public keys for the certificate request and signed certificate match.  This public key should also match with the key contained in the certificate template.

Use the following commands to extract the public key from the certificate request and signed certificate.

Certificate Request: openssl req -in request.csr -pubkey

Signed Certificate: openssl x509 -in certificate.crt -pubkey
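Rather than comparing the two outputs by eye, the comparison can be scripted. The sketch below uses the same openssl subcommands with -noout added, so that only the public key is printed; same_public_key is a hypothetical helper name and both files are assumed to be in PEM format.

```shell
# Succeeds when the certificate request and the signed certificate
# carry the same public key.
#   $1 = certificate request (PEM), $2 = signed certificate (PEM)
same_public_key() {
  local csr=$1 cert=$2
  local from_csr from_cert
  from_csr=$(openssl req -in "$csr" -noout -pubkey) || return 2
  from_cert=$(openssl x509 -in "$cert" -noout -pubkey) || return 2
  [ "$from_csr" = "$from_cert" ]
}
```

For example, same_public_key request.csr certificate.crt && echo "keys match". A return status of 2 means openssl could not read one of the files, which often points back to a format problem rather than a key mismatch.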

To obtain the public keys held in the certificate template, use the following XQuery script.  Note that this script must be run against the Security database by a user with admin rights.  The output will also display private key information.  If you need to provide the output to Support, please remove all data in the <pki:private-key> elements.

xquery version "1.0-ml";
import module namespace pki = "http://marklogic.com/xdmp/pki"
at "/MarkLogic/pki.xqy";

let $template-id := pki:template-get-id(pki:get-template-by-name("INSERT-TEMPLATE-NAME"))

return
cts:search(fn:doc(),
cts:element-value-query(xs:QName("pki:template-id"), fn:string($template-id), "exact"))

The output of this script will contain various <pki:public-key> elements.  One of these public keys needs to match with the public key contained in your signed certificate.
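The comparison described above can be scripted once the outputs have been saved as text. The following is a minimal Python sketch, not part of MarkLogic or openssl: the function names are hypothetical, and it assumes that every key appears as a PEM block between BEGIN PUBLIC KEY and END PUBLIC KEY lines, as in the output of the openssl -pubkey commands above. If the <pki:public-key> elements use a different encoding, the pattern would need to be adjusted.

```python
import re

# Matches one PEM-encoded public key block, including its delimiter lines.
PEM_PUBLIC_KEY = re.compile(
    r"-----BEGIN PUBLIC KEY-----(.*?)-----END PUBLIC KEY-----", re.DOTALL
)


def extract_public_keys(text):
    """Return the base64 body of every PEM public key in text, whitespace removed."""
    return ["".join(body.split()) for body in PEM_PUBLIC_KEY.findall(text)]


def certificate_matches_template(certificate_text, template_text):
    """True if the certificate's public key is one of the template's keys."""
    certificate_keys = extract_public_keys(certificate_text)
    template_keys = set(extract_public_keys(template_text))
    return bool(certificate_keys) and certificate_keys[0] in template_keys
```

Here certificate_text would hold the saved output of the openssl x509 command, and template_text the saved output of the XQuery script. Line wrapping differences are ignored because whitespace is stripped before comparing.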

Summary

Sometimes, following a manual merge, a number of deleted fragments (usually a small number) are left behind after the merge completes. In a system that is undergoing steady updates, the number of deleted fragments will go up and down, but never reach zero.

 

Options

There are a couple of approaches to resolve this issue:

  1.  If you have access to the Query Console, you should run xdmp:merge() with an explicit timestamp (e.g. the return value of xdmp:request-timestamp()). This will cause the server to discard all deleted fragments.

  2.  If you do not have access to the Query Console, just wait an hour and do the merge again from the Admin GUI.

 

Explanation

The hour window was added to avoid XDMP-OLDSTAMP errors that had cropped up in some of our internal stress testing, most commonly for replica databases, but also causing transaction retries for non-replica databases.

We've done some tuning of the change since then (e.g. not holding on to the last hour of deleted fragments after a reindex), and we may do some further tuning so this is less surprising to people.

 

Note

The explanation above applies to new MarkLogic 7 installations. After an upgrade from a release prior to MarkLogic 7, this solution might not work, because a different approach is required to split single large stands into 32GB stands. Please read more in the following knowledge base article: Migrating to MarkLogic 7 and understanding the 1.5x disk rule (rather than 3x).

Summary

The MarkLogic Server monitoring dashboard provides a way to monitor disk usage, which is a key monitoring metric. Comparing the disk usage shown on the monitoring dashboard with the disk space reported by the filesystem (for example, using df -h) reveals a difference between the two. This article describes these differences and the reasons behind them.

 

Details

To understand how to use Monitoring dashboard Disk Usage, see our documentation at https://docs.marklogic.com/guide/monitoring/dashboard#id_60621

If you add all disk usage metrics (Fast Data, Large Data, Forest Data, Forest Reserve, Free) and compare the sum with the space on your disk (using df -h or other commands), you will see a difference between the two values.

This difference exists mainly for two reasons:
1. The Monitoring history dashboard displays disk space usage (in MB and GB) excluding forest journal sizes.
2. On Linux, around 5% of the filesystem is reserved by default, both to prevent serious problems if the filesystem fills up and for the filesystem's own purposes, for example keeping backups of its internal data structures.

 

An example

Consider the example below for a host running RHEL 7 with 100GB of disk space on the filesystem, for one database and one forest.

Disk usage as shown by the Monitoring dashboard:
Free              92.46 GB    98.17%
Forest Reserve     1.14 GB     1.21%
Forest Data        0.57 GB     0.60%
Large Data         0.02 GB     0.02%

The total from the Monitoring dashboard is around 94.19 GB. When we add the size of the journals (around 1GB in this case) and the OS reserve space (5%), the total comes to approximately 100GB, which is the total capacity of the disk in this example.
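The arithmetic above can be checked with a few lines of Python. This is only a sketch that re-adds the figures quoted in the example; the variable and function names are hypothetical, and the journal size and 5% reserve are the approximate values stated in the text rather than measured values.

```python
# Figures copied from the dashboard example above, in GB.
dashboard_gb = {
    "Free": 92.46,
    "Forest Reserve": 1.14,
    "Forest Data": 0.57,
    "Large Data": 0.02,
}

JOURNALS_GB = 1.0            # approximate journal size quoted in the example
DISK_CAPACITY_GB = 100.0     # filesystem size quoted in the example
OS_RESERVE_FRACTION = 0.05   # default Linux reserve quoted in the article


def reconcile(dashboard, journals_gb, capacity_gb, reserve_fraction):
    """Return (dashboard total, total after adding journals and OS reserve)."""
    dashboard_total = sum(dashboard.values())
    os_reserve_gb = capacity_gb * reserve_fraction
    return dashboard_total, dashboard_total + journals_gb + os_reserve_gb


dashboard_total, accounted_gb = reconcile(
    dashboard_gb, JOURNALS_GB, DISK_CAPACITY_GB, OS_RESERVE_FRACTION
)
print(f"dashboard total: {dashboard_total:.2f} GB")
print(f"with journals and OS reserve: {accounted_gb:.2f} GB")
```

The second figure lands slightly above 100 GB because the journal size and the reserve percentage are both rounded estimates.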

 

On the other hand, consider the disk usage shown by the df -h command for the filesystem:

Filesystem              Size  Used  Avail  Use%  Mounted on
/dev/mapper/Data1-Vol1  99G   2.1G  92G    3%    /myspace

Adding the default 5% OS reserve for Linux gives a total size for this filesystem of slightly more than 99GB, i.e. approximately 100GB.

Items of Note

  • The Dashboard:Disk Space uses KB/MB/GB, which means 1 KB = 1000 B, not KiB/MiB/GiB where 1 KiB = 1024 B.
  • The actual disk usage for forests (including journal sizes) can be confirmed by checking the output of the command below from the file system:
    • du --si /MarkLogic_Data/Forests/*
      • --si prints sizes in human readable format using KB/MB/GB (powers of 1000)
      • -h also prints human readable sizes, but uses KiB/MiB/GiB (powers of 1024); with GNU du the last of the two flags given takes effect, so use --si on its own here

Conclusion

The metrics on the Monitoring dashboard differ from the filesystem's disk usage because monitoring history does not include journal sizes or the OS reserve space in its report.

 

Useful Links:

https://docs.marklogic.com/guide/monitoring/dashboard#id_60621

http://serverfault.com/questions/315181/df-says-disk-is-full-but-it-is-not

http://www.walkernews.net/2011/01/22/why-the-linux-df-command-shows-lesser-free-disk-space/

Understanding Forest State Transitions While Putting Forest in Flash-backup mode

When we transition a forest into flash-backup mode, the forest is unmounted and then remounted in read-only mode so no updates can be made. During that process, the forest goes into "start closing" state for a short while (less than a second). During this time, new queries/updates are rejected with a retry exception and running queries are allowed to continue running.

After "start closing", the process enters a "finish closing" state. At this point, all currently running queries will throw a retry exception. Transactions that are in-flight when a forest enters flash backup mode will be retried until they either succeed (when the forest remounts as read-only in the case of read transactions, or when it remounts with read/write in the case of update transactions), or they hit the timeout limit.

New transactions are continually retried until they either succeed (when the forest comes back up read-only in the case of read transactions, or when it comes back up read/write in the case of update transactions), or timeout.

During flash-backup, if nested transactions are taking place, MarkLogic will attempt to retry only the transaction that receives the exception, because it's possible that the exception applies only to that transaction. In the case where a forest is closing, MarkLogic will throw one of the following exceptions to indicate the state of the forest at the time:

  • XDMP-FORESTNOT
  • XDMP-FORESTMNT
  • XDMP-UPDATESNOTALLOWED

In the case of nested transactions - and for the above three exceptions - the transaction will not process the exception but instead will pass it up the stack. The net result is that the three exceptions will cause the outer transaction to retry rather than the inner transaction, releasing its hold on the forest and allowing it to close. The product has been designed to work in this way to prevent the flash-backup process from being held up by any nested transactions that could be in-flight at the time.
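The "pass it up the stack" behaviour can be pictured with a small toy model. The Python below is purely illustrative and does not reflect MarkLogic internals: the class, the function names, and the retry limit are all hypothetical. It only shows an inner level that declines to handle the three forest-closing exceptions, so that the retry happens at the outer level.

```python
# Toy model only: these names are hypothetical, not MarkLogic APIs.

FOREST_CLOSING_CODES = {
    "XDMP-FORESTNOT",
    "XDMP-FORESTMNT",
    "XDMP-UPDATESNOTALLOWED",
}


class RetryableError(Exception):
    def __init__(self, code):
        super().__init__(code)
        self.code = code


def run_with_retry(work, level, log, max_attempts=3):
    """Run work(); retry at this level unless an inner level sees a closing forest."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except RetryableError as err:
            if level == "inner" and err.code in FOREST_CLOSING_CODES:
                # The inner transaction does not handle these; pass them up.
                raise
            log.append((level, err.code, attempt))
    raise TimeoutError(f"{level} transaction gave up after {max_attempts} attempts")


def demo():
    log = []
    state = {"forest_open": False}

    def inner_work():
        if not state["forest_open"]:
            raise RetryableError("XDMP-FORESTMNT")
        return "committed"

    def outer_work():
        try:
            return run_with_retry(inner_work, "inner", log)
        finally:
            # Pretend the forest finishes remounting before the next attempt.
            state["forest_open"] = True

    return run_with_retry(outer_work, "outer", log), log


result, log = demo()
print(result, log)
```

In this model the only retry recorded in the log belongs to the outer level, which mirrors the description above: the inner transaction lets go, and the outer transaction is the one that is retried.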

If you can recreate the same condition, enabling the following diagnostic trace events should provide a wealth of useful information for deeper analysis of the underlying issue:

  • Forest State
  • Forest Label
  • Forest Constructor
  • Forest Destructor
  • Forest Startup
  • Forest Shutdown
  • Forest Mount
  • Forest Unmount
  • Forest Open
  • Forest Close

If you are unfamiliar with diagnostic trace events, more information is available in this Knowledgebase article

Summary

An XDMP-DBDUPURI error will occur if the same URI occurs in multiple forests of the same database. This article explains how this condition can occur and describes a number of strategies to help prevent and fix it.

Under normal operating conditions, duplicate URIs are not allowed to occur, but there are ways that programmers and administrators can bypass the server safeguards. Since duplicate URIs are considered a form of corruption, any query that encounters one will fail and post an error similar to the following:

XDMP-DBDUPURI: URI /foo.xml found in forests Library06 and Library07

We will begin by exploring the different ways that duplicate URIs can be created. Once we understand how this situation can occur, we will discuss how to prevent it from happening in the first place.  We will also discuss ways to resolve the XDMP-DBDUPURI error when it does occur.

How Administrators Can Cause Duplicate URIs

There are several administrative actions that can result in duplicate URIs:

1. By detaching a forest from its parent database (for administrative purposes - e.g., backup, restore) while allowing updates to continue on the database. If an update is committed to a URI that exists on the detached forest, the database will create a new URI on a different forest. When the forest is re-attached to the database, you will have duplicates of these URIs.

2. By detaching a forest from database-1 and then attaching it to database-2. Database-2 may already have some of the URIs that the new forest contains, including directory URIs such as "/".

3. By doing a forest restore from forest-a to forest-b, where the database that contains forest-b already has some URIs that also exist on forest-a.
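The first scenario above can be reproduced in a toy model. The Python below is a sketch only: a "database" is just a dictionary of forests, the helper names are hypothetical, and nothing here calls MarkLogic. It shows how an update that arrives while a forest is detached ends up creating a second copy of the URI.

```python
from collections import defaultdict

# Toy model: a database maps attached forest names to {uri: content}.


def insert(database, uri, content, default_forest):
    """Update the URI where it already lives, else create it in default_forest."""
    for documents in database.values():
        if uri in documents:
            documents[uri] = content
            return
    database[default_forest][uri] = content


def find_duplicate_uris(database):
    """Return {uri: [forest names]} for URIs present in more than one forest."""
    locations = defaultdict(list)
    for forest, documents in sorted(database.items()):
        for uri in documents:
            locations[uri].append(forest)
    return {uri: forests for uri, forests in locations.items() if len(forests) > 1}


database = {"Library06": {"/foo.xml": "v1"}, "Library07": {}}

detached = database.pop("Library06")             # forest detached for maintenance
insert(database, "/foo.xml", "v2", "Library07")  # update arrives meanwhile
database["Library06"] = detached                 # forest re-attached

print(find_duplicate_uris(database))
```

While the forest is detached, the model cannot see that /foo.xml already exists, so the update creates it elsewhere. Had the forest stayed attached, the same insert would have updated the existing copy in place.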

Prevention

To prevent case #1: Instead of detaching the forest to perform administrative operations, put the forest in read-only mode. You can do this by setting 'updates-allowed' to 'read-only' in the forest settings. This will let the database know that a given URI exists, but will disallow updates on it, thus preventing any duplicates from being created.

Case #2 can be prevented by not using forest attach/detach for content migration between databases.  There are other alternatives such as replication.

The best way to avoid case #3 is by using database restore rather than forest restore. If you must use forest restore, make sure to use an Admin API script that double-checks that any given forest backup is being restored to the corresponding restore target. Be sure to test your script thoroughly before deploying to production.

How Programmers Can Create Duplicate URIs

There are several ways that programmers can create duplicate URIs:

1. By using an xdmp:eval() to insert content with one or more forests set in the database option. We normally check whether a URI exists in all forests before inserting, but xdmp:eval bypasses that safeguard.

2. By using the OUTPUT_FAST_LOAD option in the MapReduce connector.

3. By loading content with the database 'locking' option set to 'off.'

Prevention

To prevent case #1, avoid using 'place keys' (specifying a forest in the database option) during document inserts. This will allow the database to decide where the document goes and thereby prevent duplicates. You can also use the API xdmp:document-assign() to figure out where xdmp:document-insert() would place that URI, and then pass that value in the xdmp:eval(). For example, with the if-eval function below, you can either use a hardcoded forest name:

    xquery version "1.0-ml";

    (: Evaluate $xquery in a different transaction against the given forest :)
    declare function local:if-eval(
        $xquery as xs:string,
        $vars as item()*,
        $forest as xs:unsignedLong
    ) {
        xdmp:eval(
            $xquery,
            $vars,
            <options xmlns="xdmp:eval">
                <isolation>different-transaction</isolation>
                <database>{$forest}</database>
            </options>
        )
    };

    local:if-eval("xdmp:document-insert('/foo1.xml', <foo>1</foo>)", (), xdmp:forest("Sciam"))

Or you can call it using the output of the xdmp:document-assign() function, which prevents duplicate URIs:

    (: Ask the server which forest it would choose for the URI being inserted :)
    let $uri := "/foo1.xml"
    let $forest :=
        let $forests := xdmp:database-forests(xdmp:database())
        let $index := xdmp:document-assign($uri, count($forests))
        return $forests[$index]
    return
        local:if-eval("xdmp:document-insert('/foo1.xml', <foo>1</foo>)", (), xs:unsignedLong($forest))
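The reason this approach avoids duplicates is that the forest is derived from the URI itself, so the same URI is always routed to the same forest. The Python sketch below illustrates that idea only; it uses an MD5 hash as a stand-in and is an assumption for illustration, since the article does not describe the algorithm that xdmp:document-assign() actually uses.

```python
import hashlib


def assign_forest(uri, forest_names):
    """Pick a forest for uri deterministically (stand-in for xdmp:document-assign)."""
    digest = hashlib.md5(uri.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(forest_names)
    return forest_names[index]


forests = ["forest_00", "forest_01", "forest_02"]
first = assign_forest("/foo1.xml", forests)
second = assign_forest("/foo1.xml", forests)
print(first == second)
```

Note that the result depends on the number of forests, which is consistent with the caveat later in this article that fast loading is only safe when the number of forests does not change.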

To prevent case #2, use the default settings for ContentOutputFormat when using the MarkLogic Connector for Hadoop. Here is the explanation from the documentation:

To prevent duplicate URIs, the MarkLogic Connector for Hadoop defaults to a slower protocol for ContentOutputFormat when it detects the potential for updates to existing content. In this case, MarkLogic Server manages the forest selection, rather than the MarkLogic Connector for Hadoop. This behavior guarantees unique URIs at the cost of performance.

You may override this behavior and use direct forest updates by doing the following:

  • Set mapreduce.marklogic.output.content.directory. This guarantees all inserts will be new documents. If the output directory already exists, it will either be removed or cause an error, depending on the value of mapreduce.marklogic.output.content.cleandir.
  • Set mapreduce.marklogic.output.content.fastload to true. When fastload is true, the MarkLogic Connector for Hadoop always optimizes for performance, even if duplicate URIs are possible.

You can safely set mapreduce.marklogic.output.content.fastload to true if the number of forests in the database will not change while the job runs, and at least one of the following is true:

  • Your job only creates new documents. That is, you are certain that the URIs do not exist in any document or property fragments in the database.
  • The URIs output with ContentOutputFormat may already be in use, but both these conditions are true:
    • The in-use URIs were not originally inserted using forest placement.
    • The number of forests in the database has not changed since initial insertion.
  • You set mapreduce.marklogic.output.content.directory.

For case #3, be sure to use either the 'fast' or the 'strict' locking option on your target database when loading content. From the documentation:

[This option] Specifies how robust transaction locking should be. When set to strict, locking enforces mutual exclusion on existing documents and on new documents. When set to fast, locking enforces mutual exclusion on existing and new documents. Instead of locking all the forests on new documents, it uses a hash function to select one forest to lock. In general, this is faster than strict. However, for a short period of time after a new forest is added, some of the transactions need to be retried internally. When set to off, locking does not enforce mutual exclusion on existing documents or on new documents; only use this setting if you are sure all documents you are loading are new (a new bulk load, for example), otherwise you might create duplicate URIs in the database.

It is OK to use the 'off' setting only if performing a new bulk load onto a fresh database.

Repairing Duplicate URIs

Once you encounter duplicate URIs, you will need to delete them as soon as possible. Here are scripts that will help you do the job:

1. The first script helps you to view the document singled out in the error message:

    (: Script for viewing duplicate document/properties fragments :)
    xquery version "1.0-ml";

    let $doc := "/" (: DUPLICATE URI :)
    let $forest-a-name := "forest_00"
    let $forest-b-name := "forest_01"
    let $query :=
        'xquery version "1.0-ml";
        declare variable $URI as xs:string external;
        (xdmp:document-properties($URI),fn:doc($URI))'
    let $options-a := <options xmlns="xdmp:eval"><database>{xdmp:forest($forest-a-name)}</database></options>
    let $options-b := <options xmlns="xdmp:eval"><database>{xdmp:forest($forest-b-name)}</database></options>
    let $results-a := xdmp:eval($query,(xs:QName("URI"),$doc),$options-a)
    let $results-b := xdmp:eval($query,(xs:QName("URI"),$doc),$options-b)
    return
        (
            fn:concat("RESULTS FROM : ",$forest-a-name), $results-a,
            fn:concat("RESULTS FROM : ",$forest-b-name), $results-b
        )

 

2. The second script allows you to delete a duplicate document or property:

    (: Delete the duplicate documents :)
    xquery version "1.0-ml";

    let $doc := "/" (: DUPLICATE URI :)
    (: BAD FOREST :)
    let $forest-name := "forest_00"
    let $query :=
        'xquery version "1.0-ml";
         declare variable $URI as xs:string external;
         xdmp:document-delete($URI)'
    let $options := <options xmlns="xdmp:eval"><database>{xdmp:forest($forest-name)}</database></options>
    return xdmp:eval($query,(xs:QName("URI"),$doc),$options)

 

3. This script helps you delete a duplicate directory:

    (: Script for deleting duplicate directory fragments :)
    xquery version "1.0-ml";

    let $doc := "/" (: DUPLICATE URI :)
    let $forest-name := "forest_00"
    let $query :=
        'xquery version "1.0-ml";
         declare variable $URI as xs:string external;
         xdmp:node-delete(xdmp:document-properties($URI))'
    let $options := <options xmlns="xdmp:eval"><database>{xdmp:forest($forest-name)}</database></options>
    return xdmp:eval($query,(xs:QName("URI"),$doc),$options)

 

4. If you need to find duplicate URIs, this script will list each document URI that occurs more than once, together with its frequency:

    (: Script for finding duplicate documents. :)
    xquery version "1.0-ml";

    for $uri at $i in cts:uris ((), ('frequency-order', 'descending', 'document'))
    let $freq := cts:frequency ($uri)
    where $freq > 1
    return ($uri||': '||$freq)

Introduction

There are two ways of leveraging SSDs that can be used independently or simultaneously.

Fast Data Directory

In the forest configuration for each forest, you can configure a Fast Data Directory. The Fast Data Directory is designed for fast filesystems such as SSDs with built-in disk controllers. The Fast Data Directory stores the forest journals and as many stands as will fit onto the filesystem; if the forest never grows beyond the size of the Fast Data Directory, then the entire forest will be stored in that directory. If there are multiple forests on the same host that point to the same Fast Data Directory, MarkLogic Server divides the space equally between the different forests.

See Disk Storage.

Tiered Storage (licensed feature)

MarkLogic Server allows you to manage your data at different tiers of storage and computation environments, with the top-most tier providing the fastest access to your most-critical data and the lowest tier providing the slowest access to your least-critical data. As data ages and becomes less updated and queried, it can be migrated to less expensive and more densely packed storage devices to make room for newer, more frequently accessed and updated data.

See Tiered Storage.

 

Introduction

Several customers have contacted support with questions regarding the use of a tool such as CoRB to "post-process" a large amount of data which they have already stored in MarkLogic.

Here's a brief CoRB tutorial based on some of the questions the support team has been asked - made available as a support KnowledgeBase article in the hope that it might be useful to other customers.

What is CoRB and when would I need to use it?

Start by looking here for the README

CoRB is an open source Java application (available here on github). It's a popular tool for anyone wanting to select a group of candidate documents based on specific criteria (a specific forest, a date range, a collection, any cts:query) and pass them to a secondary module that transforms those documents and updates them in place in their forest on disk.

As MarkLogic is a great repository for unstructured and semi-structured data, being able to revisit documents and perform bulk updates on them can be very useful.

This article offers a simple getting started guide so you can test CoRB out in a development environment and get an idea as to how it could help you to manage your data.

Prerequisites

As CoRB is a Java application, you'll need to ensure you have a JRE installed.

For your convenience, we have provided a zip file containing all the necessary files for you to get up and running, but you may want to replace the xcc.jar with the version that matches your server; look in our maven repository for the one that matches your server version.

Using CoRB: a step-by-step walkthrough

1. Create an XDBC Server with the following values:

root: /
port: 9999
modules: Modules
database: Documents

2. Create some sample data in an empty database

I'm using 'Documents' for this example. Ensure this database has the URI lexicon enabled and make sure it's selected as the content source in Query Console:

for $i in 1 to 2000
return 
xdmp:document-insert(concat($i, ".xml"), 
    element doc {
        element id {$i},
        element created {fn:current-dateTime()}
    }
)

3. CoRB requires 2 modules to function:

  • a module to select the candidate URIs to process
  • a module to process each doc with a matching candidate URI

4. A simple "select" module (get-uris.xqy):

xquery version "1.0-ml";

let $uris := cts:uris('', 'document')
return (count($uris), $uris)

5. A simple processor module which adds an <updated> element to each document with a timestamp (transform-docs.xqy):

xquery version "1.0-ml";

declare variable $URI as xs:string external;

xdmp:node-insert-child(doc($URI)/doc, element updated {fn:current-dateTime()} )

6. Download and unpack corb.zip (attached)

7. From a command prompt, cd to the folder where corb.zip was unpacked and run corb.bat (Windows users) or ./corb.sh (Linux/Solaris/OS X users)

8. You should see logging to stdout.

On completion, you should see a line like this:

INFO: completed all tasks 2000/2000, 159 tps, 0 active threads

9. Examine a document in the database to ensure you see the <updated> element:

<doc>
  <id>1038</id>
  <created>2012-06-28T12:16:10.739+01:00</created>
  <updated>2012-06-28T12:16:23.812+01:00</updated>
</doc>
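The division of labour in the walkthrough above (one module selects URIs, a second module is run once per URI by a pool of worker threads) can be pictured with the following Python sketch. It is a toy model and not how CoRB is implemented: the in-memory dictionary stands in for the database, and the function names and thread count are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

# In-memory stand-in for the 2000 sample documents created in step 2.
documents = {f"{i}.xml": {"id": i} for i in range(1, 2001)}


def select_uris():
    """Plays the role of get-uris.xqy: return the count followed by the URIs."""
    uris = sorted(documents)
    return len(uris), uris


def transform(uri):
    """Plays the role of transform-docs.xqy: add an 'updated' timestamp."""
    documents[uri]["updated"] = datetime.now(timezone.utc).isoformat()
    return uri


def run_job(thread_count=4):
    """Fetch the candidate URIs, then process each one on a worker thread."""
    total, uris = select_uris()
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        completed = sum(1 for _ in pool.map(transform, uris))
    return completed, total


completed, total = run_job()
print(f"completed all tasks {completed}/{total}")
```

Returning the count ahead of the URIs mirrors the select module in step 4, which lets the job report progress as completed/total.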

CoRB Questions and Answers

Q: Can we call and run CoRB from an XQuery module? As it's command line based, can you write some XQuery that executes the CoRB batch file via the command line? Is there some other way we could invoke this?

A: Unfortunately there's no way to execute CoRB from an XQuery Module. CoRB is a Java application and - at the time of writing - there's no mechanism in the server to create separate Java (or command line) processes.

If you needed a way of running CoRB at intervals (say: hourly, daily, weekly etc), you could explore using the Windows Task Scheduler or in Unix variants: cron. You could adapt the URI query to look (for example) for documents that have been altered within the last X number of hours and run a CoRB job against just those documents.

However, it may be the case that you can achieve the effect you need much more effectively by using a combination of triggers and spawning tasks using MarkLogic's Task Server.

While a discussion of scheduled tasks is outside the realm of this article, you can read up about the MarkLogic task server here. For some open code offering an example of how a task such as rebalancing data evenly across forests can be achieved, you can look here.

Q: Can we make CoRB run across a cluster? If so: how would we go about configuring this?

A: The CoRB readme has a section called "Writing a Custom URI Module" which shows the general structure of the URI query - which uses a call to the cts:uris function:

http://docs.marklogic.com/5.0doc/docapp.xqy#search.xqy?start=1&cat=all&query=cts:uris

The fifth argument you can pass to a call to cts:uris is a list of Forest IDs, so one way - and possibly the simplest - would be to provide a modified URI module which is restricted in that it only returns fragment URIs for that particular (local) host.

Attached is an example custom URI Module (local-cts-uris.xqy) that uses a call to xdmp:host() to get the host ID for that connection and then returns only the forest IDs for that host. As with all code provided in these articles, the usual disclaimer applies: test thoroughly on a development cluster or on a test database to make sure it does what you need before using it in a production environment.

You'll also need to provide a database name (currently it's set to xdmp:database("YOUR_DATABASE_HERE")) for safety.

The provided module can be used as a basis for your own URI module; you could run CoRB on every host in your cluster (making sure CoRB is configured to connect only to the local host, xcc://usr:pass@localhost:port, on each instance in the cluster). That way each instance of CoRB would only be responsible for the documents stored in forests local to the host.

Another way would be to hand write specific URI modules to restrict each host to only process URIs for a given group of forest ids. For example, you might just want to have 2 instances of CoRB running, each one being responsible for a specific number of forests (these could equally span over multiple hosts)

Another way would be to write some XQuery to generate these separate URI modules for each host in the cluster, which you could achieve by looping through all hosts (xdmp:hosts()) in the cluster and then using xdmp:save to write out the URI module containing the generated cts:uris query with the corresponding forest ids. You might want to do this if you needed to introduce complex URI queries (rather than a simple "catch all" process as discussed earlier) while always ensuring you were targeting specific forest ids.
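The partitioning idea behind all three options is the same: group the forests by the host that stores them, and give each CoRB instance one group. The Python sketch below shows only that grouping step; the forest and host names are made up for illustration, and in a real cluster the layout would be looked up on the server rather than hardcoded.

```python
from collections import defaultdict

# Hypothetical forest-to-host layout used purely for illustration.
forest_hosts = {
    "Documents-1": "host-a",
    "Documents-2": "host-a",
    "Documents-3": "host-b",
    "Documents-4": "host-b",
}


def forests_by_host(layout):
    """Group forest names by the host that stores them."""
    groups = defaultdict(list)
    for forest, host in sorted(layout.items()):
        groups[host].append(forest)
    return dict(groups)


for host, forests in forests_by_host(forest_hosts).items():
    print(f"{host}: restrict the URI module to {', '.join(forests)}")
```

Because every forest appears in exactly one group, no document is processed by more than one CoRB instance.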

Introduction

MarkLogic Server is a highly scalable, high performance Enterprise NoSQL database platform. Configuring a MarkLogic cluster to run as virtual machines follows tuning best practices associated with highly distributed, high performance database applications. Avoiding resource contention and oversubscription is critical for maintaining the performance of the cluster. The objective of this guide is to provide a set of recommendations for configuring virtual machines running MarkLogic for optimal performance. This guide is organized into sections for each computing resource, and provides a recommendation along with the rationale for that particular recommendation. The contents of this guide are intended for best practice recommendations and are not meant as a replacement for tuning resources based on specific usage profiles. Additionally, several of these recommendations trade off performance for flexibility in the virtualized environment.

General

Recommendation: Use the latest version of Virtual Hardware

The latest version of Virtual Hardware provides performance enhancements and maximums over older Virtual Hardware versions. Be aware that you may have to update the host, cluster or data center. For example, ESXi 7.0 introduces virtual hardware version 17, but VMs imported or migrated from older versions may not be automatically upgraded.

Recommendation: Use paravirtualized device drivers in the guest operating system

Paravirtualized hardware provides advanced queuing and processing off-loading features to maximize Virtual Machine performance. Additionally, paravirtualized drivers provide batching of interrupts and requests to the physical hardware, which provides optimal performance for resource intensive operations.

Recommendation: Keep VMware Tools up to date on guest operating systems

VMware Tools provides guest OS drivers for paravirtual devices that optimize the interaction with VMkernel and offload potentially processor-intensive tasks such as packet segmentation.

Recommendation: Disable VMWare Daemon Time Synchronization of the Virtual Machine

By default the VMware daemon will synchronize the Guest OS to the Host OS (Hypervisor) once per minute, and may interfere with ntpd or chronyd settings. Through the vSphere Admin UI, you can disable time synchronization between the Guest OS and Host OS in the virtual machine settings.

VMWare Docs: Configuring Virtual Machine Options

Recommendation: Disable Time Synchronization during VMWare operations

Even when daemon time synchronization is disabled, time synchronization will still occur during some VMware operations, such as Guest OS boots/reboots and resuming a virtual machine, among others. Disabling VMware clock sync completely requires editing the .vmx file for the virtual machine to set several synchronization properties to false. Details can be found in the following VMware blog:

VMWare Blog: Completely Disable Time Synchronization for your VM

Recommendation: Use the noop scheduler for VMWare instances rather than deadline

The NOOP scheduler is a simple FIFO queue and uses the minimal amount of CPU/instructions per I/O to accomplish the basic merging and sorting functionality to complete the I/O.

Red Hat KB: IO Scheduler Recommendations for Virtualized Hosts

Recommendation: Remove any unused virtual hardware devices

Each virtual hardware device (Floppy disks, CD/DVD drives, COM/LPT ports) assigned to a VM requires interrupts on the physical CPU; reducing the number of unnecessary interrupts reduces the overhead associated with a VM.

Processor

Socket and Core Allocation

Recommendation: Only allocate enough vCPUs for the expected server load, keeping in mind the general recommendation is to maintain two cores per forest.

Rationale: Context switching between physical CPUs for instruction execution on virtual CPUs creates overhead in the hypervisor.

Recommendation: Avoid oversubscription of physical CPU resources on hosts with MarkLogic Virtual Machines. Ensure proper accounting for hypervisor overhead, including interrupt operations, storage network operations, and monitoring overhead, when allocating vCPUs.

Rationale: Oversubscription of physical CPUs can cause contention for processor-intensive operations in MarkLogic. Proper accounting will ensure adequate CPU resources are available for both the hypervisor and any MarkLogic Virtual Machines.

 

Memory

General

Recommendation: Set up memory reservations for MarkLogic Virtual Machines.

Rationale: Memory reservations guarantee the availability of Virtual Machine memory when leveraging advanced vSphere functions such as the Distributed Resource Scheduler (DRS). Creating a reservation reduces the likelihood that MarkLogic Virtual Machines will be vMotioned to an ESX host with insufficient memory resources.

Recommendation: Avoid combining MarkLogic Virtual Machines with other types of Virtual Machines.

Rationale: Overcommitting memory on a single ESX host can result in swapping, causing significant performance overhead. Additionally, memory optimization techniques in the hypervisor, such as Transparent Page Sharing, rely on Virtual Machines running the same operating systems and processes.

Swapping Optimizations

Recommendation: Configure VM swap space to leverage host cache when available.

Rationale: During swapping, leveraging the ESXi host's local SSD for swap will likely be substantially faster than using shared storage. This is unavailable when running a Fault Tolerant VM or using vMotion, but will provide a performance improvement for VMs in an HA cluster.

Huge / Large Pages

Recommendation: Configure Huge Pages in the guest operating system for Virtual Machines.

Rationale: Configuring Huge Pages in the guest operating system for a Virtual Machine prioritizes swapping of other memory first.

Recommendation: Disable Transparent Huge Pages in Linux kernels.

Rationale: The transparent Huge Page implementation in the Linux kernel includes functionality that provides compaction. Compaction operations are system level processes that are resource intensive, potentially causing resource starvation to the MarkLogic process. Using static Huge Pages is the preferred memory configuration for several high performance database platforms including MarkLogic Server.

 

Disk

General

Recommendation: Use Storage IO Control (SIOC) to prioritize MarkLogic VM disk IO.

Rationale: Several operations within MarkLogic require prioritized, consistent access to disk IO for consistent operation. Implementing a SIOC policy will help guarantee consistent performance when resources are contentious across multiple VMs accessing disk over shared links.

Recommendation: When possible, store VMDKs with MarkLogic forests on separate aggregates and LUNs.

Rationale: Storing data on separate aggregates and LUNs will reduce disk seek latency when IO intensive operations are taking place – for instance multiple hosts merging simultaneously.

Disk Provisioning

Recommendation: Use Thick Provisioning for MarkLogic data devices.

Rationale: Thick provisioning prevents oversubscription of disk resources.  This will also prevent any issues where the storage appliance does not automatically reclaim free space, which can cause writes to a LUN to fail.

NetApp Data ONTAP Discussion on Free Space with VMware

SCSI Adapter Configuration

Recommendation: Allocate separate SCSI adapters for guest operating system files and for database storage. Additionally, add a storage adapter per tier of storage being used when configuring MarkLogic (for example, an isolated adapter with a virtual disk for the fast data directory).

Rationale: Leveraging two SCSI adapters provides additional queuing capacity for high IOPS virtual machines. Isolating IO also allows tuning of data sources to meet specific application demands.

Recommendation: Use paravirtualized SCSI controllers in Virtual Machines.

Rationale: Paravirtualized SCSI controllers reduce management overhead associated with operation queuing.

Virtual Disks versus Raw Device Mappings

Recommendation: Use Virtual Disks rather than Raw Device Mappings.

Rationale: VMFS provides optimized block alignment for virtual machines. Ensuring that MarkLogic VMs are placed on VMFS volumes with sufficient IO throughput and dedicated physical storage reduces management complexity while optimizing performance.

Multipathing

Recommendation: Use round robin multipathing for iSCSI, FCoE, and Fibre Channel LUNs.

Rationale: Multipathing allows the usage of multiple storage network links; using round robin ensures that all available paths will be used, reducing the possibility of storage network saturation.

vSphere Flash Read Cache

Recommendation: Enabling vSphere Flash Read Cache can enhance database performance. When possible, a Cache Size of 20% of the total database size should be configured.

Rationale: vSphere Flash Read Cache provides read caching for commonly accessed blocks. MarkLogic can take advantage of localized read cache for many operations including term list index resolution. Additionally, offloading read requests from the backing storage array reduces contention for write operations.
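The 20% sizing guideline is simple arithmetic. A minimal sketch follows; the function name and the choice of gigabytes as the unit are assumptions made for illustration.

```python
def flash_read_cache_gb(database_size_gb: float, fraction: float = 0.20) -> float:
    """Suggested cache size: a fraction (20% by default) of the total database size."""
    if database_size_gb < 0:
        raise ValueError("database size must be non-negative")
    return database_size_gb * fraction


# Example: total database size of 500 GB
print(flash_read_cache_gb(500))
```

For a 500 GB database this suggests a cache of roughly 100 GB.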

 

Network                                                                                                          

General

Recommendation: Use a dedicated physical NIC for MarkLogic cluster communications and a separate NIC for application communications. If multiple NICs are unavailable, use separate VLANs for cluster and application communication.

Rationale: Separating communications ensures optimal bandwidth is available for cluster communications while spreading the networking workload across multiple CPUs.

Recommendation: Use dedicated physical NICs for vMotion traffic on ESXi hosts running MarkLogic. If additional physical NICs are unavailable, move vMotion traffic to a separate VLAN.

Rationale: Separating vMotion traffic onto separate physical NICs, or at the very least a VLAN, reduces overall network congestion while providing optimal bandwidth for cluster communications. Additionally, NIOC policies can be configured to ensure resource shares are provided where necessary.

Recommendation: Use dedicated physical NICs for IP storage if possible. If additional physical NICs are unavailable, move IP storage traffic to a separate VLAN.

Rationale: Separating IP storage traffic onto separate physical NICs, or at the very least a VLAN, reduces overall network congestion while providing optimal bandwidth for cluster communications. Additionally, NIOC policies can be configured to ensure resource shares are provided where necessary.

Recommendation: Use Network I/O Control (NIOC) to prioritize MarkLogic intra-cluster communication traffic.

Rationale: Since MarkLogic is a shared-nothing architecture, guaranteeing consistent network communication between nodes in the cluster provides consistent and optimal performance.

 

Network Adapter Configuration

Recommendation: Use enhanced vmxnet virtual network adapters in Virtual Machines.

Rationale: Enhanced vmxnet virtual network adapters can leverage both Jumbo Frames and TCP Segmentation Offloading to improve performance. Jumbo Frames allow for an increased MTU, reducing TCP header transmission overhead and CPU load. TCP Segmentation Offloading allows packets up to 64KB to be passed to the physical NIC for segmentation into smaller TCP packets, reducing CPU overhead and improving throughput.

Jumbo Frames

Recommendation: Use jumbo frames with MarkLogic Virtual Machines, ensuring that all components of the physical network support jumbo frames and are configured with an MTU of 9000.

Rationale: Jumbo frames increase the payload size of each TCP/IP frame sent across a network. As a result, the number of packets required to send a set of data is reduced, reducing overhead associated with the header of a TCP/IP frame. Jumbo frames are advantageous for optimizing the utilization of the physical network. However, if any components of the physical network do not support jumbo frames or are misconfigured, large frames are broken up and fragmented causing excessive overhead.
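To make the packet-count reduction concrete, the sketch below compares the number of packets needed to send the same data at an MTU of 1500 and at an MTU of 9000. It assumes 40 bytes of IPv4 and TCP headers per packet with no options and ignores Ethernet framing; these figures and the function names are illustrative assumptions, not measurements.

```python
import math

# Assumed per-packet header cost: 20-byte IPv4 header + 20-byte TCP header, no options.
IP_TCP_HEADER_BYTES = 40


def packets_needed(data_bytes: int, mtu: int) -> int:
    """Number of packets needed to carry data_bytes of payload at the given MTU."""
    payload_per_packet = mtu - IP_TCP_HEADER_BYTES
    return math.ceil(data_bytes / payload_per_packet)


def header_overhead_bytes(data_bytes: int, mtu: int) -> int:
    """Total bytes spent on IP/TCP headers for the transfer."""
    return packets_needed(data_bytes, mtu) * IP_TCP_HEADER_BYTES


one_mib = 1024 * 1024
for mtu in (1500, 9000):
    print(mtu, packets_needed(one_mib, mtu), header_overhead_bytes(one_mib, mtu))
```

Under these assumptions, sending 1 MiB takes 719 packets at an MTU of 1500 and 118 packets at an MTU of 9000.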

 


Analyzing Resource Contention                                                                               


Processor Contention

Virtual Machines with CPU utilization above 90% and CPU ready metrics of 20% or higher are contending for CPU. In practice, any CPU ready time at all is a sign of CPU contention.

The key metric for processor contention is %RDY.
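The processor thresholds can be expressed as a small classification function. This is a sketch only: the level names ("high", "some", "none") are assumptions chosen for illustration, and the inputs are expected as percentages taken from your monitoring tool.

```python
def cpu_contention_level(cpu_util_pct: float, cpu_ready_pct: float) -> str:
    """Classify CPU contention from utilization and %RDY (both in percent)."""
    # Utilization above 90% together with %RDY of 20% or higher: clear contention.
    if cpu_util_pct > 90 and cpu_ready_pct >= 20:
        return "high"
    # Any CPU ready time at all points to some contention.
    if cpu_ready_pct > 0:
        return "some"
    return "none"


for util, ready in [(95, 25), (60, 3), (40, 0)]:
    print(util, ready, cpu_contention_level(util, ready))
```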

Memory Contention

Interpreting the metrics for memory contention requires an understanding of VMware memory management techniques.

. Transparent Page Sharing
  . Enabled by default in the hypervisor
  . Deduplicates memory pages between virtual machines on a single host

. Balloon Driver
  . Leverages the guest operating system's internal swapping mechanisms
  . Implemented as a high-priority system process that balloons to consume memory, forcing the operating system to swap older pages
  . Indicated by the MEMCTL/VMMEMCTL metric

. Memory Page Compression
  . Only enabled when memory becomes constrained on the host
  . Breaks large pages into smaller pages, then compresses the smaller pages
  . Can generate up to a 2:1 compression ratio
  . Increases processor load when reading and writing pages
  . Indicated by the ZIP metric

. Swapping
  . Hypervisor-level swapping
  . Swapping usually happens to the vswp file allocated for the VM
    . Storage can be with the VM
    . Storage can be in a custom area, for instance a local disk on the ESXi host
  . Can use SSD Host Cache for faster swapping, but performance is still very poor
  . Indicated by the SWAP metric

Free memory metrics less than 6% or memory utilization above 94% indicate VM memory contention.
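The memory indicators can be checked in a similar way. The sketch below assumes the metrics have already been collected into a dictionary keyed by the metric names mentioned above (MEMCTL, ZIP, SWAP), and it treats any non-zero value as a sign that the corresponding technique is active; the dictionary layout and function names are hypothetical.

```python
from typing import Dict, List

# Metric name -> memory management technique it indicates (names from the list above).
RECLAMATION_METRICS = {
    "MEMCTL": "ballooning",
    "ZIP": "memory page compression",
    "SWAP": "hypervisor swapping",
}


def active_reclamation(metrics: Dict[str, float]) -> List[str]:
    """Return the techniques whose metric is non-zero (assumed to mean 'active')."""
    return [label for name, label in RECLAMATION_METRICS.items()
            if metrics.get(name, 0) > 0]


def memory_contended(free_pct: float) -> bool:
    """Free memory below 6% (equivalently, utilization above 94%)."""
    return free_pct < 6.0


sample = {"MEMCTL": 512.0, "ZIP": 0.0, "SWAP": 0.0}
print(active_reclamation(sample), memory_contended(free_pct=4.5))
```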


Disk Contention

Disk contention exists if the value for kernelLatency exceeds 4ms or deviceLatency exceeds 15ms. Device latencies greater than 15ms indicate an issue with the storage array, potentially an oversubscription of LUNs being used by VMFS or RDMs on the VM, or a misconfiguration in the storage processor. Additionally, a queueLength counter greater than zero may indicate a less than optimal queue size set for an HBA or queuing on the storage array.
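The disk thresholds can be gathered into one check that returns a list of findings. This is a minimal sketch: the function name and message wording are assumptions, and the counter values are expected to come from your own monitoring source.

```python
from typing import List

KERNEL_LATENCY_LIMIT_MS = 4.0
DEVICE_LATENCY_LIMIT_MS = 15.0


def disk_contention_findings(kernel_latency_ms: float,
                             device_latency_ms: float,
                             queue_length: int) -> List[str]:
    """Compare disk counters against the thresholds described above."""
    findings = []
    if kernel_latency_ms > KERNEL_LATENCY_LIMIT_MS:
        findings.append("kernelLatency above 4 ms: disk contention")
    if device_latency_ms > DEVICE_LATENCY_LIMIT_MS:
        findings.append("deviceLatency above 15 ms: check the storage array")
    if queue_length > 0:
        findings.append("queueLength above zero: check HBA queue size or array queuing")
    return findings


for finding in disk_contention_findings(5.2, 9.0, 3):
    print(finding)
```

An empty list means none of the three thresholds was exceeded.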

Network Contention

Dropped packets, indicated by the droppedTx and droppedRx metrics, usually point to a network bottleneck or misconfiguration.

High latency for a virtual switch configured with load balancing can indicate a misconfiguration in the selected load balancing algorithm. Particularly if using the IP-hash balancing algorithm, check the switch to ensure all ports on the switch are configured for EtherChannel or 802.3ad. High latency may also indicate a misconfiguration of jumbo frames somewhere along the network path. Ensure all devices in the network have jumbo frames enabled.

References                                                                                                                                        

VMware Performance Best Practices for vSphere 5.5 - http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.5.pdf

VMware Resource Management - http://pubs.vmware.com/vsphere-51/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-51-resource-management-guide.pdf

Summary

Below are some recommendations for packages that should be installed on all Linux hosts (RHEL/SUSE/CentOS) when installing MarkLogic Server, along with a brief description of what each package does and why we recommend installing it.

Recommended Packages

glibc.i686
Any Unix-like operating system needs a C library: the library which defines the system calls and other basic facilities such as open, malloc, printf, exit, etc. The GNU C Library is used as the C library in the GNU systems and most systems with the Linux kernel. RHEL 6.0 requires this to be installed for a dependency to be met on installation - libc.so.6(GLIBC_2.4). http://www.gnu.org/software/libc/
gdb
GDB, the GNU Project debugger, allows you to see what is going on inside another program while it executes - or what another program was doing at the moment it crashed. http://sources.redhat.com/gdb/
redhat-lsb
The Linux Standards Base (LSB) is an attempt to develop a set of standards that will increase compatibility among Linux distributions. The redhat-lsb package provides utilities needed for LSB Compliant Applications. It also contains requirements that will ensure all components required by the LSB that are provided by Red Hat Linux are installed on the system. http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.1_Technical_Notes/redhat-lsb.html.  If you are manually installing MarkLogic on Amazon Linux 2, system-core-lsb is installed by default, but requires a soft link to be created from /etc/redhat-lsb to /etc/system-lsb:  ln -s /etc/system-lsb /etc/redhat-lsb
pstack
pstack - print a stack trace of running processes http://linuxcommand.org/man_pages/pstack1.html
sysstat
The sysstat utilities are a collection of performance monitoring tools for Linux. These include sar, sadf, mpstat, iostat, nfsiostat, cifsiostat, pidstat and sa tools. http://sebastien.godard.pagesperso-orange.fr/
libltdl  
Libtool provides a small library, called libltdl, that aims at hiding the various difficulties of dlopening libraries from programmers. It consists of a few headers and small C source files that can be distributed with applications that need dlopening functionality. On some platforms, whose dynamic linkers are too limited for a simple implementation of libltdl services, it requires GNU DLD, or it will only emulate dynamic linking with libtool’s dlpreopening mechanism. https://www.gnu.org/software/libtool/manual/html_node/Using-libltdl.html

Further Reading

Supported Platforms

In MarkLogic Server 5.0, database replication is compatible with local-disk failover, while flexible replication is compatible with both local- and shared-disk failover.

In MarkLogic Server 4.2, flexible replication is compatible with both local- and shared-disk failover.

Summary

Each node in a cluster communicates with all of the other nodes in the cluster at periodic intervals. This periodic communication, known as a heartbeat, circulates key information about host status and availability between the nodes in a cluster. Through this mechanism, the cluster determines which nodes are available and communicates configuration changes with other nodes in the cluster. If a node goes down for some reason, it will stop sending heartbeat packets to the other nodes in the cluster.

Cluster Heartbeat

The cluster uses the heartbeat to determine if a node in the cluster is down. A heartbeat message from a given node contains its view of the current state of the cluster at the moment the heartbeat was generated. The determination of a down node is based on a vote from each node in the cluster. In order to vote a node out of the cluster, there must be a quorum of nodes voting to remove it.

A quorum occurs if more than 50% of the total number of nodes in the cluster (including any nodes that are down) vote the same way. Each host votes based on how long it has been since it last received a heartbeat from the other node. If a quorum of the nodes in the cluster determines that a node is down, then that node is disconnected from the cluster. The wait time for a host to be disconnected from the cluster is typically considerably longer than the time for restarting a host, so restarts should not cause hosts to be disconnected from the cluster (and therefore they should not cause forests to fail over).
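The quorum rule (more than 50% of all nodes, counting nodes that are down) reduces to a single comparison. The sketch below only illustrates the arithmetic; it is not MarkLogic's implementation, and the function name is an assumption.

```python
def has_quorum(matching_votes: int, total_nodes: int) -> bool:
    """True when more than 50% of all nodes (including down nodes) vote the same way."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    # Integer comparison avoids floating-point rounding at exactly 50%.
    return 2 * matching_votes > total_nodes


print(has_quorum(2, 3))  # 2 of 3 nodes agree
print(has_quorum(2, 4))  # 2 of 4 nodes agree: exactly half is not a quorum
```

This is one reason an odd number of nodes is convenient: in a three-node cluster, the two surviving nodes can still form a quorum when one node goes down.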

There are group configuration parameters to determine how long to wait before removing a node (for details, see XDQP Timeout, Host Timeout, and Host Initial Timeout Parameters).

Each node in the cluster continues listening for the heartbeat from the disconnected node to see if it has come back up, and if a quorum of nodes in the cluster are getting heartbeats from the node, then it automatically rejoins the cluster.

The heartbeat mechanism allows the cluster to recover gracefully from things like hardware failures or other events that might make a host unresponsive. This occurs automatically, without any human intervention; machines can go down and automatically come back up without requiring intervention from an administrator.

Hosts with Content Forests

If the node that goes down hosts content in a forest, then the database to which that forest belongs will go offline until the forest either comes back up or is detached from the database. 

If you have failover enabled and configured for the forest whose host is removed from the cluster, the forest will attempt to fail over to a secondary host (that is, one of the secondary hosts will attempt to mount the forest). Once that occurs, the database will come back online.

For shared-disk failover, there is an additional failover criterion that could prevent a forest from failing over. The forest's label file is updated regularly by the host that is managing the forest. To avoid corrupting data on the shared file system, the label file's time stamp is checked before failover, and the forest will not fail over while it is still being actively managed. This can happen when a host is isolated from the other nodes in the cluster but can still access the forest data (on shared disk).

Tips for Handling Failover on a Busy Cluster

This Knowledgebase Article contains a good discussion of how to handle failover that occurs frequently because your cluster hosts are sometimes too busy to respond in a timely manner. The section on "Improving the situation" contains step-by-step instructions for group and database settings that are tuned for a very busy cluster.