Knowledgebase

Introduction

This article discusses the "Stand s has n fragments" messages that may appear in the error log or system log files. These messages can appear at different log levels (Notice, Warning, Error, Critical, Alert, and Emergency): the severity rises as the number of fragments in a single stand increases, reflecting the increasing risk.

Fragment counts and their corresponding Log levels:

 In MarkLogic 8 and MarkLogic 9, the fragment count thresholds within a single stand for the log levels are:  

  • At around 84 million fragments, MarkLogic Server will report this with a Notice level log message
  • At around 109 million fragments, MarkLogic Server will report this with a Warning level log message
  • At around 134 million fragments, MarkLogic Server will report this with an Error level log message
  • At around 159 million fragments, MarkLogic Server will report this with a Critical level log message
  • At around 184 million fragments, MarkLogic Server will report this with an Alert level log message
  • At around 209 million fragments, MarkLogic Server will report this with an Emergency level log message

At 256 million fragments your data may be at risk of becoming corrupted due to integer overflow. The log level reflects the risk and is intended to get your attention at higher stand fragment counts.
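The thresholds above can be sketched as a simple lookup. This is a minimal illustration only: the numbers are the approximate figures quoted in this article for MarkLogic 8 and 9, and the function name `expected_log_level` is hypothetical, not part of any MarkLogic API.

```python
# Approximate per-stand fragment thresholds quoted above (MarkLogic 8 and 9).
# Listed from highest to lowest so the first match is the most severe level.
THRESHOLDS = [
    (209_000_000, "Emergency"),
    (184_000_000, "Alert"),
    (159_000_000, "Critical"),
    (134_000_000, "Error"),
    (109_000_000, "Warning"),
    (84_000_000, "Notice"),
]


def expected_log_level(fragment_count):
    """Return the approximate log level for a stand, or None below the Notice threshold."""
    for minimum, level in THRESHOLDS:
        if fragment_count >= minimum:
            return level
    return None


if __name__ == "__main__":
    print(expected_log_level(213_404_541))
```

Because the article describes the thresholds as "around" these values, treat the result as a rough guide rather than an exact prediction of what the server will log.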

Emergency level log entries

Consider an example Error Log entry where the following information is observed:

2015-06-20 10:13:39.746 Emergency: Stand /space/Data/Forests/App-Services/00000fae has 213404541 fragments.

At all levels, the messages should be monitored and managed, but at the Emergency level, you will need to take corrective action soon.  
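For monitoring, a log line of this shape can be picked apart with a regular expression. The sketch below assumes the message format matches the example entry above exactly; the pattern and the function name `parse_stand_message` are illustrative, not an official log grammar.

```python
import re

# Pattern modelled on the example entry above; other log formats are not handled.
PATTERN = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<level>\w+): "
    r"Stand (?P<stand>\S+) has (?P<fragments>\d+) fragments"
)


def parse_stand_message(line):
    """Return a dict with timestamp, level, stand path and fragment count, or None."""
    match = PATTERN.match(line)
    if match is None:
        return None
    info = match.groupdict()
    info["fragments"] = int(info["fragments"])
    return info


if __name__ == "__main__":
    example = (
        "2015-06-20 10:13:39.746 Emergency: Stand "
        "/space/Data/Forests/App-Services/00000fae has 213404541 fragments."
    )
    print(parse_stand_message(example))
```

A monitoring script could feed each ErrorLog line through this function and alert on the extracted level or fragment count.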

Corrective Actions

Note that it is the number of fragments in a stand that is important, not the number of fragments in a forest.  The actions that you take should act to decrease the size of stands in a forest. 

Some of the actions you can take:

  • If not already configured, MarkLogic databases should be configured with a merge-max-size value smaller than the current forest size (databases created in MarkLogic 7 or MarkLogic 8 have a default value of 32GB).
  • If merge-max-size is already configured for the database, decrease the value of this setting.

Summary

Occasionally, you might see an "Invalid Database Online Event" error in your MarkLogic Server Error Log. This article will help explain what this error means, as well as provide some ways to resolve it.

What the Error Means

The XDMP-INVDATABASEONLINEEVENT error means that something went wrong during the database online trigger event. Many situations can trigger this event, such as a server restart or a change in the configuration of any database. In most cases this error is harmless and purely informational.

Resolving the Error

We often see this error when the user id that is baked into the database online event created by CPF is no longer valid; the net effect is that CPF's restart handling is not functioning. We believe reinstalling CPF should fix this issue.

If reinstalling CPF does not resolve this error, you will want to further analyze and debug the code that is invoked by the restart trigger.

 

 

 

Details:

Upon boot of CentOS 6.3, MarkLogic users may encounter the following warning:

:WARNING: at fs/hugetlbfs/inode.c:951 hugetlb_file_setup+0x227/0x250() (Not tainted)

MarkLogic 6.0 and earlier have not been certified to run on CentOS 6.3. This message appears because MarkLogic uses a resource that has been deprecated in CentOS 6.3. The message can be ignored, as it will not cause any issues with MarkLogic performance. Although this example specifically points out CentOS 6.3, this message could potentially occur in other MarkLogic/Linux combinations.

Introduction

Some customers have reported seeing kernel level messages like this in their /var/log/messages file:

Jan 31 17:41:46 ml-c1-u3 kernel: [17467686.201893] TCP: Possible SYN flooding on port 7999. Sending cookie

This may also be seen as part of the output from a call to dmesg and could possibly follow a stack trace, for example:

[<ffffffff810d3d27>] ? audit_syscall_entry+0x1d7/0x200
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
possible SYN flooding on port 7999. Sending cookies.
possible SYN flooding on port 7999. Sending cookies.

What does it mean?

The tcp_syncookies configuration is likely enabled on your system.  You can check for this by viewing the contents of /proc/sys/net/ipv4/tcp_syncookies

$ cat /proc/sys/net/ipv4/tcp_syncookies
1

If the value returned is 1 (as per the example above), then tcp_syncookies is enabled for this host.
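The same check can be scripted. The sketch below reads the file shown above and treats any non-zero value as "enabled"; that interpretation and the function names are assumptions made for illustration.

```python
from pathlib import Path

SYNCOOKIES_PATH = Path("/proc/sys/net/ipv4/tcp_syncookies")


def syncookies_enabled(raw_value):
    """Interpret the file contents: 0 means disabled, a non-zero integer means enabled."""
    return int(raw_value.strip()) != 0


def read_syncookies(path=SYNCOOKIES_PATH):
    """Return True or False, or None when the file cannot be read (e.g. non-Linux hosts)."""
    try:
        return syncookies_enabled(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_syncookies())
```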

Possible SYN flooding

A SYN flood is a form of denial-of-service attack in which an attacker sends a succession of SYN requests to a target's system in an attempt to consume enough server resources to make the system unresponsive to legitimate traffic.

Source: Wikipedia https://en.wikipedia.org/wiki/SYN_flood

You would expect to see evidence of a SYN flood when a "flood" of TCP SYN messages is sent to the host and the kernel's SYN-ACK replies are not followed by ACK messages from the client. Under normal operation, the kernel acknowledges each incoming SYN with a SYN-ACK, and the client then completes the exchange with an ACK. This pattern is known as the three-way handshake; its goal is to firmly establish communication on both the server and the client.

In the event of a real attack, a SYN flood will most likely originate from a fake IP address; during an attack, the client performing the "flood" is not waiting for the SYN-ACK response back from the server it is attacking.

Under normal operation (i.e. without SYN cookies), a TCP connection is kept half-open after the first SYN is received, because of the handshake mechanism used to establish TCP connections. There is a limit to how many half-open connections the kernel can maintain at any given time, and exhausting that limit is what characterises the problem as an attack.

The term half-open refers to TCP connections whose state is out of synchronization between the two communicating hosts, possibly due to a crash of one side. A connection which is in the process of being established is also known as embryonic connection.

Source: Wikipedia https://en.wikipedia.org/wiki/TCP_half-open

If SYN cookies are enabled, then the kernel doesn't track half-open connections. Instead, it relies on the sequence number in the subsequent ACK datagram to confirm that the ACK follows a SYN and a SYN-ACK, which establishes full communication between client and server. Because half-open connections are not tracked, SYN floods are no longer a problem.

In the case of MarkLogic, this message can appear if the rate of incoming messages is perceived by the kernel as being unusually high. This would not be indicative of a real SYN flooding attack, but to the TCP/IP stack the traffic exhibits the same characteristics, so the kernel responds by reporting a possible (false) attack.

Notes from the kernel documentation

See the section of the kernel documentation for tcp_syncookies - BOOLEAN for some further information regarding this feature:

The syncookies feature attempts to protect a socket from a SYN flood attack. This should be used as a last resort, if at all. This is a violation of the TCP protocol, and conflicts with other areas of TCP such as TCP extensions. It can cause problems for clients and relays. It is not recommended as a tuning mechanism for heavily loaded servers to help with overloaded or misconfigured conditions. For recommended alternatives see tcp_max_syn_backlog, tcp_synack_retries, and tcp_abort_on_overflow.

Further down, they state:

Note, that syncookies is fallback facility. It MUST NOT be used to help highly loaded servers to stand against legal connection rate. If you see SYN flood warnings in your logs, but investigation shows that they occur because of overload with legal connections, you should tune another parameters until this warning disappear. See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

Source: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

Tuning on a MarkLogic Server

Any dmesg output indicating "possible SYN flooding on port 7999" may appear in tandem with very heavy XDQP (TCP) traffic within a MarkLogic cluster - this link provides further detail in relation to a similar scenario with Apache HTTP server. You can tune your TCP settings to try to avoid SYN Flooding error messages, but SYN flooding can also be a symptom of a system under resource pressure. 

If a MarkLogic Server instance sees SYN flooding messages on a system that is otherwise healthy, and the messages occur because of normal and expected MarkLogic Server communications, you may want to increase the backlog (tcp_max_syn_backlog) or adjust some of the other settings (such as tcp_synack_retries or tcp_abort_on_overflow). However, if SYN flooding messages only occur on a system that is under resource pressure, then solving the resource issue should be the focus.

How to disable SYN cookies

You can disable syncookies by adding the following line to /etc/sysctl.conf:

# disable TCP SYN Flood Protection
net.ipv4.tcp_syncookies = 0

Also note that the new setting will take effect only after a host reboot, unless the file is reloaded first (for example with sysctl -p).


Introduction

This article will show you how to add a Fast Data Directory (FDD) to an existing forest.

Details

The fast data directory stores transaction journals and stands. When the directory becomes full, larger stands will be merged into the data directory. Once the size of the fast data directory approaches its limit, then stands are created in the data directory.

Although it is not possible to add an FDD path to a currently-existing forest, it is possible to do the following:

1. Destroy an existing forest configuration (while preserving the data)

2. Re-create a forest with the same name & data, with an FDD added

 

The queries below illustrate steps one and two of the process. Note that you can also do this with Admin UI.

The query below will delete the forest configurations but not data.

Preparation:

1. Schedule a downtime window for this procedure (DO NOT DO THIS ON A LIVE PRODUCTION SYSTEM)

2. Ensure that all ingestion and merging has stopped

3. To be on the safe side, take a backup of the forest before applying this in Production

4. Detach the forest before running these queries


1) Use the following API to Delete an existing forest configuration

NOTE: make sure to set the $delete-data parameter to fn:false().

admin:forest-delete(
$config as element(configuration),
$forest-ids as xs:unsignedLong*,
$delete-data as xs:boolean (: pass fn:false() to keep the forest data :)
) as element(configuration)


2) Use the following API to create a new forest  pointing to the old data directory which includes the configured FDD:

admin:forest-create(
$config as element(configuration),
$forest-name as xs:string,
$host-id as xs:unsignedLong,
$data-directory as xs:string?,
[$large-data-directory as xs:string?],
[$fast-data-directory as xs:string?]
) as element(configuration)



Here's an example query that uses these APIs:

xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()

(: preserve some path values from the old forest :)

let $forest-name := "YOUR_FOREST_NAME"

let $new-fast-data := "YOUR_NEW_FAST_DATA_DIR"

let $forest-id := admin:forest-get-id($config, $forest-name)

let $old-data := admin:forest-get-data-directory($config, $forest-id)

let $old-large-data := admin:forest-get-large-data-directory($config, $forest-id)

(: the parentheses keep the variables above in scope for both steps :)
return (
(: step 1: delete the forest configuration only; fn:false() keeps the data on disk :)
admin:save-configuration(admin:forest-delete(
$config, $forest-id,
fn:false())),

(: step 2: re-create the forest over the old directories, adding the FDD :)
let $config1 := admin:get-configuration()
return
admin:save-configuration(admin:forest-create(
    $config1,
    $forest-name,
    xdmp:host(),
    $old-data,
    $old-large-data,
    $new-fast-data
))
)

You can create and attach the forest in a single transaction. The same procedure is also possible using the Admin UI (as two separate transactions), i.e. by deleting only the forest configuration and not its data.

After attaching the forest, re-index the database so that data migrates to the FDD. Note that the sample query needs to be executed on the host where the forest resides.


 

 

Introduction

MarkLogic has shipped with a ReST API since MarkLogic 7.

In MarkLogic 8 the ReST API was vastly expanded, allowing ways for MarkLogic Database administrators to manage almost all common MarkLogic administration tasks over an HTTP connection to MarkLogic's ReST endpoints.

This Knowledgebase article will cover some examples of common administration tasks and will show some working examples to give you a taste of what can be done if you're using the latest version of MarkLogic Server.

While there are a significant number of examples throughout our extensive documentation in this area, many of these make use of cURL. In this Knowledgebase article, we're going to use XQuery calls to demonstrate how the payloads are structured.

Creating a backup using a call to the ReST API (XQuery)

In the example code below, we demonstrate a call that will perform a backup of the Documents forest which places the backup in the /tmp directory.

Running the query in the above code example will return a response (in JSON format) containing a job ID for the requested task:

{
"job-id": "4903378997555340415", 
"host-name": "yourhostnamehere"
}

The next example will demonstrate a status check for a given job ID

Query the status of an active or recent job

The above query will return a response that would look like this:

{
"job-id": "4903378997555340415", 
"host-name": "yourhostnamehere", 
"status": "completed"
}


Summary

On Internet Explorer 9 and Internet Explorer 10, the Application Services UI should be run in Compatibility Mode.

Details:

When using the Application Services UI in Internet Explorer 9 or Internet Explorer 10, you may notice some minor UI bugs. These minor UI bugs occur only within MarkLogic Application Services, NOT within applications built with it. These UI bugs can be avoided if you run IE 9 or IE 10 in compatibility view.

Instructions on how to configure compatibility modes in IE 9 or IE 10: 

1. Press ALT-T to bring up the Tools menu
2. On the Tools menu, click 'Compatibility View Settings' 
3. Add the domain to the list of domains to render in compatibility view.

Introduction

A question that customers frequently ask is for advice on managing backups outside the standard XQuery APIs or the web interface provided by MarkLogic.

This Knowledgebase article demonstrates two approaches to allow you to integrate the backup of a MarkLogic database into your dev-ops workflow by allowing such processes to be scripted or managed outside the product.

Creating a backup using the ReST API

You can use the ReST API to perform a database backup and to check on the status at any given time.

The examples listed below use XQuery to make the calls to the ReST API over HTTP, but you could adapt them to work with cURL; examples of that approach are also given.

The process

Here is an example that demonstrates a backup of the Documents database:

Running this should give you a job id as part of the response (in this example, we're using JSON to format the response but this can easily be changed by modifying the headers elements in the above sample to return application/xml instead):

{"job-id":"8774639830166037592", "host-name":"yourhostnamehere"}

Below is an example that demonstrates checking for the status of a given backup with the job-id given in the first step:

Example: using cURL (instead of XQuery)

Adapting the above examples so they work from cURL instead, you can generate a call that looks like this:

curl -s -X POST  --anyauth -u username:password --header "Content-Type:application/json" -d '{"operation": "backup-database", "backup-dir": "/tmp/backup", "journal-archiving": true, "include-replicas": true}'  http://localhost:8002/manage/v2/databases/Documents\?format\=json

And to check on the status, the cURL payload could be modified to look like this:

{"operation": "backup-status", "job-id" : "8774639830166037592","host-name": "yourhostnamehere"}
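The two payloads above can also be assembled programmatically. The following is a minimal sketch that only builds the URL and JSON bodies shown in the cURL example; it does not send anything, and the helper names (`backup_request`, `status_payload`, `job_id_from_response`) are hypothetical. Host, port and option values are taken from the cURL call above and would need adjusting for a real cluster.

```python
import json


def backup_request(database, backup_dir, host="localhost", port=8002):
    """Build the URL and JSON body for the backup-database call shown above."""
    url = f"http://{host}:{port}/manage/v2/databases/{database}?format=json"
    payload = {
        "operation": "backup-database",
        "backup-dir": backup_dir,
        "journal-archiving": True,
        "include-replicas": True,
    }
    return url, json.dumps(payload)


def job_id_from_response(body):
    """Extract the job id from the JSON response returned by the backup call."""
    return json.loads(body)["job-id"]


def status_payload(job_id, host_name):
    """Build the JSON body for the follow-up backup-status call."""
    return json.dumps(
        {"operation": "backup-status", "job-id": job_id, "host-name": host_name}
    )


if __name__ == "__main__":
    url, body = backup_request("Documents", "/tmp/backup")
    print(url)
    print(body)
```

The resulting URL and body can be handed to whichever HTTP client your workflow already uses, with the same authentication options as the cURL example.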


Summary

Customers using the MarkLogic AWS Cloud Formation Templates may encounter a situation where someone has deleted an EBS volume that stored MarkLogic data (mounted at /var/opt/MarkLogic). Because the volume and its associated data are no longer available, the host is unable to rejoin the cluster.

Getting the host to rejoin the cluster can be complicated, but it will typically be worth the effort if you are running an HA configuration with Primary and Replica forests.

This article details the procedures to get the host to rejoin the cluster.

Preparing the New Volume and New Host

The easiest way to create the new volume is using a snapshot of an existing host's MarkLogic data volume.  This saves the work of manually copying configuration files between hosts, which is necessary to get the host to rejoin the cluster.

In the AWS EC2 Dashboard:Elastic Block Store:Volumes section, create a snapshot of the data volume from one of the operational hosts.

Next, in the AWS EC2 Dashboard:Elastic Block Store:Snapshots section, create a new volume from the snapshot in the correct zone and note the new volume id for use later.

(optional) Update the name of the new volume to match the format of the other data volumes

(optional) Delete the snapshot

Edit the Auto Scaling Group with the missing host to bring up a new instance, by increasing the Desired Capacity by 1

This will trigger the Auto Scaling Group to bring up a new instance. 

Attaching the New Volume to the New Instance

Once the instance is online and startup is complete, connect to the new instance via ssh.

Ensure MarkLogic is not running, by stopping the service and checking for any remaining processes.

  • sudo service MarkLogic stop
  • pgrep -la MarkLogic

Remove /var/opt/MarkLogic if it exists, and is mounted on the root partition.

  • sudo rm -rf /var/opt/MarkLogic

Edit /var/local/mlcmd and update the volume id listed in the MARKLOGIC_EBS_VOLUME variable to the volume created above.

  • MARKLOGIC_EBS_VOLUME="[new volume id],:25::gp2::,*"

Run mlcmd to attach and mount the new volume to /var/opt/MarkLogic on the instance

  • sudo /opt/MarkLogic/mlcmd/bin/mlcmd init-volumes-from-system
  • Check that the volume has been correctly attached and mounted

Remove contents of /var/opt/MarkLogic/Forests (if they exist)

  • sudo rm -rf /var/opt/MarkLogic/Forests/*

Run mlcmd to sync the new volume information to the DynamoDB table

  • sudo /opt/MarkLogic/mlcmd/bin/mlcmd sync-volumes-to-mdb

Configuring MarkLogic With Empty /var/opt/MarkLogic

If you did not create your volume from a snapshot as detailed above, complete the following steps.  If you created your volume from a snapshot, then skip these steps, and continue with Configuring MarkLogic and Rejoining Existing Cluster

  • Start the MarkLogic service, wait for it to complete its initialization, then stop the MarkLogic service:
    • sudo service MarkLogic start
    • sudo service MarkLogic stop
  • Move the configuration files out of /var/opt/MarkLogic/
    • sudo mv /var/opt/MarkLogic/*.xml /secure/place (using default settings; destination can be adjusted)
  • Copy the configuration files from one of the working instances to the new instance
    • Configuration files are stored here: /var/opt/MarkLogic/*.xml
    • Place a copy of the xml files on the new instance under /var/opt/MarkLogic

Configuring MarkLogic and Rejoining Existing Cluster

Note the host-id of the missing host found in /var/opt/MarkLogic/hosts.xml

  • For example, if the missing host is ip-10-0-64-14.ec2.internal
    • sudo grep "ip-10-0-64-14.ec2.internal" -B1 /var/opt/MarkLogic/hosts.xml

  • Edit /var/opt/MarkLogic/server.xml and update the value for host-id to match the value retrieved above
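As an alternative to grep, the host-id lookup can be sketched with an XML parser. The sample document below is hypothetical: its layout (a host-id element immediately before host-name inside each host element) is inferred from the `grep -B1` command above, and the namespace URI and id values are placeholders. The lookup itself ignores namespaces, so it does not depend on that URI being right.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of hosts.xml; real files contain more elements per host.
SAMPLE_HOSTS_XML = """<hosts xmlns="http://marklogic.com/xdmp/hosts">
  <host>
    <host-id>1111111111111111111</host-id>
    <host-name>ip-10-0-64-14.ec2.internal</host-name>
  </host>
  <host>
    <host-id>2222222222222222222</host-id>
    <host-name>ip-10-0-64-15.ec2.internal</host-name>
  </host>
</hosts>"""


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_host_id(xml_text, host_name):
    """Return the host-id recorded for host_name, or None if the host is not listed."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if local_name(element.tag) != "host":
            continue
        fields = {local_name(child.tag): (child.text or "").strip() for child in element}
        if fields.get("host-name") == host_name:
            return fields.get("host-id")
    return None


if __name__ == "__main__":
    print(find_host_id(SAMPLE_HOSTS_XML, "ip-10-0-64-14.ec2.internal"))
```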

Start MarkLogic and view the ErrorLog for any issues

  • sudo service MarkLogic start; sudo tail -f /var/opt/MarkLogic/Logs/ErrorLog.txt

You should see messages about forests synchronizing (if you have local disk failover enabled, with replicas) and changing states from wait or async replication to sync replication.  Once all the forests are either 'open' or 'sync replicating', then your cluster is fully operational with the correct number of hosts.

At this point you can fail back to the primary forests on the new instances to rebalance the workload for the cluster.

If you disabled the 'xdqp ssl enabled' setting as part of these procedures, you can now re-enable it by setting the value to true on the Group Configuration page.

Update the Userdata In the Auto Scaling Group

To ensure that the correct volume will be attached if the instance is terminated, the Userdata needs to be updated in a Launch Configuration.

Copy the Launch Configuration associated with the missing host.

Edit the details

  • (optional) Update the name of the Launch Configuration
  • Update the User data variable MARKLOGIC_EBS_VOLUME and replace the old volume id with the id for the volume created above.
    • MARKLOGIC_EBS_VOLUME="[new volume id],:25::gp2::,*"
  • Save the new Launch Configuration

Edit the Auto Scaling Group associated with the new node

Change the Launch Configuration to the one that was just created and save the Auto Scaling Group.

Next Steps

Now that normal operations have been restored, it's a good opportunity to ensure you have all the necessary database backups, and that your backup schedule has been reviewed to ensure it meets your requirements.

Backup/Restore settings for Local Disk Failover

When configuring backups for a database, the 'include replica forests' setting is important in order to handle forest failover events. When 'include replica forests' is set to 'true', both the master and the replica forests will be included in the database backup.

This KB article will go over an example failover scenario, and will show how a scheduled backup/restore works with different 'include replica forests' and 'journal archiving' settings.

Scenario

Consider a 3 node cluster with hosts Host-A, Host-B and Host-C; and a database 'backup-test' with the following forest assignments: (forests ending with 'p' are primary and those ending with 'r' are replica).  Under normal conditions, the primary forests will be in 'open' state, and the replica forests will be in the 'sync replicating' state.

Host A                          Host B                          Host C
forest-1p (open)                forest-2p (open)                forest-3p (open)
forest-3r (sync replicating)    forest-1r (sync replicating)    forest-2r (sync replicating)


Failover and Forest states

Now consider what happens when Host-A goes offline. When Host-A's primary forests complete failover, their replica forests will take over. The forest state layout will then be as follows:

Host A                  Host B              Host C
forest-1p (disabled)    forest-2p (open)    forest-3p (open)
forest-3r (disabled)    forest-1r (open)    forest-2r (sync replicating)

Backup Examples: 

When 'Include replica Forests' is false and 'Journal Archiving' is true

forest-1p is disabled, and the corresponding replica forest-1r is now open because of the failover. In this case a backup task will not succeed during this time because replica forests have not been configured for backups. The following 'Warning' level message will be logged:

Warning: Not backing up database backup-test because first forest master forest-1p is not available, and replica backups aren't enabled

When Host-A is brought up again, the forest states will be

forest-1p - sync replicating
forest-1r - open

At this time, backups will succeed and because journal archiving is enabled, journals will be written to the backup data.

However, you will not be able to do a 'point in time restore' using journal archiving. When the configured master is not the acting master and backup is not enabled for replicas, the following error occurs when a restore to a point in time is attempted:

Operation failed with error message: xdmp:database-restore((xs:unsignedLong("5138652658926200166"), "/space/20160927-1125008228810", xs:dateTime("2016-09-27T11:06:21-07:00"), fn:true(), ()) -- Unable to restore replica forest forest-1r because the master forest forest-1p is not also restored, or is not acting master. Check server logs.

To get past this, the forests need to be failed back so that the 'configured master' is the same as the 'acting master'.

When 'Include replica forests' is true and 'Journal Archiving' is true

In this case, backups will succeed when forests are failed over to their replica forests because replica forests are configured for backups. And, because journal archiving is enabled, journals will be also written to the backup data.

Even in this case, as in the previous one, point-in-time restore will not work until the forests are failed back.
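The behaviour described in these two cases can be summarised as a small decision model. This is a simplified sketch of the rules as stated in this article, not of MarkLogic's actual implementation; the class and function names are hypothetical, and the state of forest-3r after Host-A returns is assumed to be 'sync replicating'.

```python
from dataclasses import dataclass


@dataclass
class ForestPair:
    """One configured master forest and its replica, with their current states."""
    master: str
    master_state: str   # "open", "sync replicating", or "disabled"
    replica: str
    replica_state: str


def is_available(state):
    return state != "disabled"


def backup_will_run(pairs, include_replicas):
    """Every configured master must be available, unless replicas are included in backups."""
    for pair in pairs:
        if is_available(pair.master_state):
            continue
        if include_replicas and is_available(pair.replica_state):
            continue
        return False
    return True


def point_in_time_restore_possible(pairs):
    """Requires each configured master to also be the acting master (state 'open')."""
    return all(pair.master_state == "open" for pair in pairs)


# Forest states while Host-A is offline (from the failover table in this article).
failed_over = [
    ForestPair("forest-1p", "disabled", "forest-1r", "open"),
    ForestPair("forest-2p", "open", "forest-2r", "sync replicating"),
    ForestPair("forest-3p", "open", "forest-3r", "disabled"),
]

# Forest states after Host-A is brought up again, before failing back.
host_a_back = [
    ForestPair("forest-1p", "sync replicating", "forest-1r", "open"),
    ForestPair("forest-2p", "open", "forest-2r", "sync replicating"),
    ForestPair("forest-3p", "open", "forest-3r", "sync replicating"),
]

if __name__ == "__main__":
    print(backup_will_run(failed_over, include_replicas=False))
    print(backup_will_run(failed_over, include_replicas=True))
    print(point_in_time_restore_possible(host_a_back))
```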

Related documentation

MarkLogic Administrator's Guide: Backing up and Restoring a Database Following Local Disk Failover 

MarkLogic Administrator's Guide: Restoring Databases with Journal Archiving

MarkLogic Knowledgebase Article: Understanding the role of journals in relation to backup and restore journal archiving

MarkLogic Knowledgebase Article: Database backup / restore and local disk failover

Before executing significant operational procedures on production systems, such as

  • Production Go Live events;
  • Major version Upgrades;
  • Adding/removing nodes to a cluster;
  • Deploying a new application or an application upgrade;
  • ...

MarkLogic recommends:

  • Thorough testing of any operational procedures on non production systems.
  • Opening a ticket with MarkLogic Technical Support to give them a heads up, along with any useful collateral that would help expedite diagnostics of issues if any occur, such as
    • The finalized plan & timeline or schedule of the operational procedure
    • A support dump, taken before the operational procedure, in order to record the configuration of the system ahead of time. This may come in handy if an incident occurs, as we may want to know the actual changes that had been made. You can create a MarkLogic Server support dump from our Admin UI by selecting the 'Support' tab; select scope=cluster, detail=status only, destination=browser -> save output to disk. Attach the support dump to the ticket as a file, either as an email attachment or by uploading it through our support portal.
    • A few days of error logs from before the operational event so that we can determine whether artifacts in the error logs are new or whether they existed prior to the event.
    • You can alternatively turn Telemetry on before the event and force an upload of the support dump & error logs.
    • Any architecture or design details of the system that you are able to share.
  • Please make sure that all individuals who are responsible for the event and who may need to contact the MarkLogic Technical Support team are registered MarkLogic Support contacts. They can register for an account per instructions available at https://help.marklogic.com/marklogic/AccountRequest.  They will want to register before the event as ONLY registered support contacts can create a ticket with MarkLogic Technical Support. We do not want registration and entitlement verification to get in the way of the ability to work on an urgent production issue.
  • Review the MarkLogic Support Handbook - http://www.marklogic.com/files/Mark_Logic_Support_Handbook.pdf. The following sections in the "HOW TO RECEIVE SUPPORT SERVICES" chapter of the handbook are useful to be acquainted with before an incident occurs
    • Section: What to do Prior to Logging a Service Request 
    • Section: Working with Support
    • Section: Escalation Process
    • Section: Understanding Case Priority and Response Time Targets
  • For urgent issues (production outages), remember that you can raise an urgent incident per the instructions in the support handbook; MarkLogic takes urgent incidents seriously, as every urgent issue results in a text message being sent to every support engineer, engineering management and the senior executive at MarkLogic. 
  • Enable Debug level logging so that any issues that arise can be more easily diagnosed.  Debug level logging does not have any noticeable impact on system performance.

Summary

In some cases it is required to change the default environment variables of a MarkLogic Server installation or configuration which are predefined in /etc/sysconfig/MarkLogic

Recommendation

The standard MarkLogic installation includes a file, located at /etc/sysconfig/MarkLogic, that contains all the environment variables required for a successful service start. As this file is part of the MarkLogic installation package, it will be replaced or changed during a MarkLogic upgrade without any notification. Any direct customizations to the file will be overwritten and lost, which can result in various problems after the restart that follows a MarkLogic upgrade.

To prevent any issues with upgrading MarkLogic Server, it is recommended to place all customizations into a separate file located at /etc/marklogic.conf. As this file isn't part of the default MarkLogic installation package, it has to be created manually, either by copying /etc/sysconfig/MarkLogic to /etc/marklogic.conf or by creating a blank text file at /etc/marklogic.conf. Only add the variables to be changed or added. All custom environment variables added in this file will override the ones defined in /etc/sysconfig/MarkLogic.

These changes will also survive any MarkLogic Server upgrade, as /etc/marklogic.conf isn't touched during the upgrade process.
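The override behaviour can be pictured as a simple merge in which values from /etc/marklogic.conf win over the packaged defaults. The sketch below only models that precedence for plain KEY=VALUE lines; the variable names and values in the sample text are illustrative placeholders, and the real files are shell fragments read by the MarkLogic service scripts rather than by code like this.

```python
def parse_env_file(text):
    """Collect KEY=VALUE pairs, ignoring blank lines, comments and an optional 'export'."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def effective_settings(defaults_text, overrides_text):
    """Start from the packaged defaults, then let the custom file override them."""
    merged = parse_env_file(defaults_text)
    merged.update(parse_env_file(overrides_text))
    return merged


# Illustrative contents only; not copied from a real installation.
SAMPLE_DEFAULTS = "MARKLOGIC_DATA_DIR=/var/opt/MarkLogic\nMARKLOGIC_USER=daemon\n"
SAMPLE_OVERRIDES = "# custom settings kept across upgrades\nMARKLOGIC_DATA_DIR=/data/MarkLogic\n"

if __name__ == "__main__":
    print(effective_settings(SAMPLE_DEFAULTS, SAMPLE_OVERRIDES))
```

Only the variable that appears in the custom file changes; everything else keeps the packaged default, which is why the custom file should contain just the settings you want to differ.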


Best Practice for Adding an Index in Production

Summary

It is sometimes necessary to remove or add an index to your production cluster. For a large database with more than a few GB of content, reindexing can be a time- and resource-intensive process that affects query performance while the server is reindexing. This article points out some strategies for avoiding some of the pain points associated with changing your database configuration on a production cluster.

Preparing your Server for Production

In general, high performance production search implementations run with tight controls on the automatic features of MarkLogic Server. 

  • Re-indexer disabled by default
  • Format-compatibility set to the latest format
  • Index-detection set to none.
  • On a very large cluster (several dozen or more hosts), consider running with expunge-locks set to none
  • On large clusters with insufficient resources, consider bumping up the default group settings
    • xdqp-timeout: from 10 to 30
    • host-timeout: from 30 to 90

The xdqp and host timeouts will prevent the server from disconnecting prematurely when a data-node is busy, possibly triggering a false failover event. However, these changes will affect the legitimate time to failover in an HA configuration. 

Preparing to Re-index

When an index configuration must be changed in production, you should:

  • First, index-detection should be set back to automatic,
  • Then, the index configuration change should be made,
  • Finally, the reindexer should be enabled during off-hours to reindex the content.

Reindexing works by reloading all the URIs that are affected by the index change. This process tends to create lots of new and deleted fragments, which then need to be merged. Given that reindexing is very CPU and disk I/O intensive, the re-indexer-throttle can be set to 3 or 2 to minimize the impact of the reindex.

After the Re-index

After the re-index has completed, it is important to return to the old settings by disabling the reindexer and setting index-detection back to none.
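Returning to the production settings might look like the sketch below, again assuming a placeholder database named "Documents".

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: "Documents" stands in for your content database. :)
let $db     := xdmp:database("Documents")
let $config := admin:get-configuration()
(: Disable the reindexer and stop automatic index detection again :)
let $config := admin:database-set-reindexer-enable($config, $db, fn:false())
let $config := admin:database-set-index-detection($config, $db, "none")
return admin:save-configuration($config)
```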

If you're reindexing over several nights or weekends, be sure to allow some time for the merging to complete. So for example, if your regular busy time starts at 5AM, you may want to disable the reindexer at around midnight to make sure all your merging is completed before business hours.

By following the above recommendations, you should be able to complete a large re-index without any disruption to your production environment.

Summary

MarkLogic Server can ingest and query all sorts of data, such as XML, text, JSON, binary, generic, etc. There are some things to consider when choosing to simply load data "as-is" vs. doing some degree of data modeling or data transformation prior to ingestion.

Details

Loading data "as-is" can minimize time and complexity during ingest or document creation. That can, however, sometimes mean more complex, slower performing queries. It may also mean more storage space intensive indexing settings.

In contrast, doing some degree of data transformation prior to ingestion can sometimes result in dramatic improvements in query performance and storage space utilization due to reduced indexing requirements.

An Example

A simple example will demonstrate how a data model can affect performance. Consider the data model used by Apple's iTunes:

<plist version="1.0">
<dict>
  <key>Major Version</key><integer>10</integer>
  <key>Minor Version</key><integer>1</integer>
  <key>Application Version</key><string>10.1.1</string>
  <key>Show Content Ratings</key><true/>
  <dict>
    <key>Track ID</key><integer>290</integer>
    <key>Name</key><string>01-03 Good News</string>
          …
  </dict>
</dict>
 

Note the multiple <key> sibling elements, at multiple levels - where both levels are named the same thing (in this case, <dict>). Let's say you wanted to query a document like this for "Application Version." In this case, time will be spent performing index resolution for the encompassing element (here, <key>). Unfortunately, because there are multiple sibling elements all sharing the same element name, all of those sibling elements will need to be retrieved and then evaluated to see which of them actually match the given query criteria. Consider a slightly revised data model, instead:

 

<iTunesLibrary version="1.0">
<application>
  <major-version>10</major-version>
  <minor-version>1</minor-version>
  <app-version>10.1.1</app-version>
  <show-content-ratings>true</show-content-ratings>
  <tracks>
    <track-id>290</track-id>
    <name>01-03 Good News</name>
          …
  </tracks>
</application>

Here, we only need to query, and therefore retrieve and evaluate, the single <app-version> element, instead of performing multiple retrievals and evaluations as in the previous example data model.
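The difference shows up directly in the path expressions needed to read the application version. The sketch below assumes the two sample documents were loaded at the hypothetical URIs /itunes/plist.xml and /itunes/library.xml.

```xquery
xquery version "1.0-ml";

(: Original model: every <key> sibling is a candidate; the value is
   whichever element follows the matching key. :)
fn:doc("/itunes/plist.xml")
  /plist/dict/key[. eq "Application Version"]
  /following-sibling::*[1]/fn:string(),

(: Revised model: the element name alone identifies the value. :)
fn:doc("/itunes/library.xml")
  /iTunesLibrary/application/app-version/fn:string()
```

Both expressions return "10.1.1" for the sample data, but only the first has to test every <key> element to find it.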

At Scale

Although this is a simple example, when processing millions or even billions of records, eliminating small processing steps can have a significant performance impact.

BEST PRACTICES FOR EXPORTING AND IMPORTING DATA IN BULK

Handling large amounts of data can be expensive in terms of both computing resources and runtime. It can also sometimes result in application errors or partial execution. In general, if you’re dealing with large amounts of data as either output or input, the most scalable and robust approach is to break up that workload into a series of smaller and more manageable batches.

Of course there are other available tactics. It should be noted, however, that most of those other tactics will have serious disadvantages compared to batching. For example:

  • Configuring time limit settings through Admin UI to allow for longer request timeouts - since you can only increase timeouts so much, this is best considered a short term tactic for only slightly larger workloads.
  • Eliminating resource bottlenecks by adding more resources – often easier to implement compared to modifying application code, though with the downside of additional hardware and software license expense. Like increased timeouts, there can be a point of diminishing returns when throwing hardware at a problem.
  • Tuning queries to improve your query efficiency – this is actually a very good tactic to pursue, in general. However, if workloads are sufficiently large, even the most efficient implementation of your request will eventually need to work over subset batches of your inputs or outputs.

For more detail on the above non-batching options, please refer to XDMP-CANCELED vs. XDMP-EXTIME.

WAYS TO EXPORT LARGE AMOUNTS OF DATA FROM MARKLOGIC SERVER

1.    If you can’t break up the data into a series of smaller batches, use xdmp:save from Query Console to write the full results to the desired path on your file system. For details, see xdmp:save.

2.    If you can break up the data into a series of smaller batches:

            a.    Use batch tools like MLCP, which can export bulk output from MarkLogic server to flat files, a compressed ZIP file, or an MLCP database archive. For details, see Exporting Content from MarkLogic Server.

            b.    Reduce the size of the desired result set until it saves successfully, then save the full output in a series of batches.

            c.    Page through result set:

                               i.     If dealing with documents, cts:uris is excellent for paging through a list of URIs. Take a look at cts:uris for more details.

                               ii.     If using Semantics

                                             1.    Consider exporting the triples from the database using the Semantics REST endpoints.

                                             2.    Take a look at the URL parameters start and pageLength. These parameters can be set on your SPARQL request to return the results in batches. See GET /v1/graphs/sparql for further details.
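Paging through URIs with cts:uris, as suggested above, can be sketched as follows. This assumes the URI lexicon is enabled on the database; $start and $page-size are illustrative variables, with $start being the empty string for the first page and the last URI of the previous page afterwards.

```xquery
xquery version "1.0-ml";

(: Illustrative paging variables, not part of any MarkLogic API. :)
let $start     := ""
let $page-size := 1000
(: Ask for one extra URI because cts:uris includes $start itself :)
let $uris := cts:uris($start, ("ascending", fn:concat("limit=", $page-size + 1)))
(: Drop the starting URI on every page after the first :)
let $page := if ($start eq "") then $uris else $uris[. ne $start]
return fn:subsequence($page, 1, $page-size)
```

Feeding the last URI returned back in as $start yields the next page without re-reading earlier results.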

WAYS TO IMPORT LARGE AMOUNTS OF DATA INTO MARKLOGIC SERVER

1.    If you’re looking to update more than a few thousand fragments at a time, you'll definitely want to use some sort of batching.

             a.     For example, you could run a script in batches of, say, 2000 fragments, by using a predicate such as [1 to 2000] and filtering out fragments that already have your newly added element. You could also look into using batch tools like MLCP.

             b.    Alternatively, you could split your input into smaller batches, then spawn each of those batches to jobs on the Task Server, which has a configurable queue. See:

                            i.     xdmp:spawn

                            ii.    xdmp:spawn-function

2.    Alternatively, you could use an external/community developed tool like CoRB to batch process your content. See Using Corb to Batch Process Your Content - A Getting Started Guide

3.    If using Semantics and querying triples with SPARQL:

              a.    You can make use of the LIMIT keyword to further restrict the result set size of your SPARQL query. See The LIMIT Keyword.

              b.    You can also use the OFFSET keyword for pagination. This keyword can be used with the LIMIT and ORDER BY keywords to retrieve different slices of data from a dataset. For example, you can create pages of results with different offsets. See The OFFSET Keyword.
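Combining LIMIT, OFFSET, and ORDER BY to fetch one page of triples might look like the sketch below. The page size, page number, and the catch-all triple pattern are illustrative only.

```xquery
xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

(: Illustrative values: fetch the third page of 100 results. :)
let $page-size := 100
let $page      := 3
return sem:sparql(
  fn:concat(
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } ",
    (: A stable ORDER BY keeps pages consistent between requests :)
    "ORDER BY ?s ?p ?o ",
    "LIMIT ", $page-size, " OFFSET ", ($page - 1) * $page-size))
```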

Introduction

MarkLogic Server delivers performance at scale, whether we're talking about large amounts of data, users, or parallel requests. However, people do run into performance issues from time to time. Most of those performance issues can be found ahead of time via well-constructed and well-executed load testing and resource provisioning.

There are three main aspects to load testing against and resource provisioning for MarkLogic:

  1. Building your load testing suite
  2. Examining your load testing results
  3. Addressing hot spots

Building your load testing suite

The biggest issue we see with problematic load testing suites is unrepresentative load. The inaccuracy can be in the form of missing requests, missing query inputs, unanticipated query inputs, unanticipated or underestimated data growth rates, or even a population of requests that's skewed towards different load profiles compared to production traffic. For example, a given load test might heavily exercise query performance, only to find in production that ingest requests represent the majority of traffic. Alternatively, perhaps one kind of query represents the bulk of a given load test, when in reality that kind of query is dwarfed by the number of invocations of a different kind of query.

Ultimately, to be useful, a given load test needs to be representative of production traffic. Unfortunately, the less representative a load test is, the less useful it will be.

Examining your load testing results

Beginning with version 7.0, MarkLogic Server ships a Monitoring History dashboard, visible from any host in your cluster at port 8002/history. The Monitoring History dashboard illustrates usage of resources such as CPU, RAM, and disk I/O, at both the cluster and individual host levels. It also illustrates the occurrence of read and write locks over time. It's important to get a handle on both resource and lock usage in the course of your load test, as both will limit the performance of your application, but the way to address those performance issues depends on which class of usage is most prevalent.

Addressing hot spots

By having a representative load test and closely examining your load testing results, you'll likely find hot spots or slow performing parts of your application. MarkLogic Server's Monitoring History allows you to correlate resource and lock usage over time against the workload being submitted by your load tests. Once you find a hot spot, it's worthwhile examining it more closely by either running those requests in isolation, or at larger scales. For example, you could run 4x and 16x the number of parallel requests, or 4x and 16x the number of inputs to an individual request - both of which will give you an idea of how the suspect requests scale in response to increased load.

Once you've found a hot spot - what should you do about it? Well, that ultimately depends on the kind of usage you're seeing in your cluster's Monitoring History. If it's clear that your suspect requests are running into a resource bound (for example, 100% utilization of CPU/RAM/disk I/O/etc.), then you'll either need to provision more of that limiting resource (either through more machines, or more powerful machines, or both), or reduce the amount of load on the system provisioned as-is. It may also be possible to re-architect the suspect request to be more efficient with regard to its resource usage.

Alternatively, you may find that your system is not, in fact, seeing a resource bound, and that there appear to be plenty of spare CPU cycles, free RAM, and low amounts of disk I/O. If you're seeing poor performance in that situation, it's almost always the case that you'll instead see large spikes in the number of read/write locks taken as your suspect requests work through the system. Provisioning more hardware resources may help to some small degree in the presence of read/write locks, but what really needs to happen is that the requests need to be re-architected to use as few locks as possible, and preferably to run completely lock free.


Introduction

While there are many different ways to define schemas in MarkLogic Server, one should be aware of both the location strategy the server will use (defined here: http://docs.marklogic.com/guide/admin/schemas), as well as the different locations in which your particular schema may reside.

Schema Location

Schemas can reside in either the Schemas database defined for your content database, or within the server's Config directory.  If there is no explicit schema map defined, the server will use the following schema location strategy:

1) If the XQuery program explicitly references a schema for the namespace in question, MarkLogic Server uses this reference.
2) Otherwise, MarkLogic Server searches the schema database for an XML schema document whose target namespace is the same as the namespace of the element that MarkLogic Server is trying to type.
3) If no matching schema document is found in the database, MarkLogic Server looks in its Config directory for a matching schema document.
4) If no matching schema document is found in the Config directory, no schema is found.

There can sometimes be issues with step #2 when there are multiple schema documents in the schema database whose target namespace matches the namespace of the element that MarkLogic Server is trying to type. In that situation, it would be best to explicitly define a default schema mapping. Schema maps can be defined through the Admin API or the Admin User Interface. Be aware that you can define schema mappings either at the group level (in which case the mapping applies to all application servers in the group) or at the individual application server level.

Best Practices

Now that we know how the server locates schemas and where schemas can potentially reside, what are the best practices?

In general, it's best to localize your schema impacts as narrowly as possible. For example, instead of using a single Schemas database or the server's one and only Config directory, it would instead be better to define a specific Schemas database that would be used for the relevant content database. Similarly, unless you know you need a defined schema mapping to apply to every application server in a group, it would instead be better to define your schema mappings at the application server level as opposed to the group level.

Summary

Although not exhaustive, this article lists some best practices for the use of MarkLogic Server with Amazon's VPC.

Details

  1. Nodes within a MarkLogic cluster need to communicate with one another directly, without the presence of a load balancer in-between them.
  2. Whether in the context of a VPC or not, before attempting to join a node to a cluster, one should verify that each node is able to ping or ssh to the other. If you're not able to ping or ssh from one machine to another, then issues seen during a MarkLogic cluster join are very likely to be localized to the network configuration and should be diagnosed at the network layer.
  3. The following items should be double-checked when using VPCs:
    1. If a private subnet is used for any MarkLogic instance, that subnet needs access to the public internet for the following situations:
      1. If Managed Cluster support is used, MarkLogic requires access to AWS services which require outbound connectivity to the internet (at minimum to the AWS service web sites).
      2. If foreign clusters are used, then MarkLogic needs to connect to all hosts in the foreign cluster.
      3. If Amazon S3 is used, then MarkLogic needs to communicate with the S3 public web services.
    2. It is assumed that the creator of the VPC has properly configured all subnets on which MarkLogic is to be installed to have outbound internet access. There are many ways that private subnets can be configured to communicate outbound to the public internet. NAT instances are one example [AWS VPC NAT]. Another option is using DirectConnect to route outbound traffic through the organization's internet connection.
    3. All subnets which host instances running MarkLogic in the same cluster need to be able to communicate via port 7999.
    4. Inbound ssh connectivity is required for command line administration of each server requiring port 22 to be accessible from either a VPN or a public subnet.
    5. With regard to application traffic (as opposed to intra-cluster traffic as seen during cluster joining) connectivity to the MarkLogic server(s) needs to be open to whatever applications for which it is required. Application traffic can be sent through an internal or external load balancer, a VPN, direct access from applications in the same subnet or routing through another subnet.

Introduction

Problems can occur when trying to explicitly search (or not search) parts of documents when using a global configuration approach to include and exclude elements.

Global Approach

Including and excluding elements in a document using a global configuration approach can lead to unexpected results that are complex to diagnose. The global approach requires positions to be enabled in your index settings, which expands the disk space requirements of your indexes and may result in greater processing time for your position-dependent queries. It may also require adjustments to your data model to avoid unintended includes or excludes, and may require changes to your queries in order to limit the number of positions used.

If circumstances dictate that you must instead use the less preferred global configuration approach, you can read more about including/excluding elements in word queries here: http://docs.marklogic.com/guide/admin/wordquery#id_77008

Recommended Approach

In general, it's better to define specific fields, which are a mechanism designed to restrict your query to portions of documents based on elements. You can read more about fields here: http://docs.marklogic.com/guide/admin/fields

 

 

Introduction

Backing up multiple databases simultaneously may make some of the backups fail with error XDMP-FORESTOPIN.

 

Details

While configuring a scheduled backup, one can also choose to back up the associated auxiliary databases, such as Security, Schemas, and Triggers. Generally, all the content databases share these auxiliary databases, so an issue may arise when more than one scheduled backup tries to back up the same auxiliary database. When two backups try to back up the same auxiliary database at the same time, one backup will fail with an XDMP-FORESTOPIN error. Generally, this error occurs when the system attempts to start one forest operation (backup, restore, remove, clear, etc.) while another, exclusive operation is already in progress, for example, starting a new backup while a previous backup is still in progress.

 

Recommendations

One should be extra cautious when configuring scheduled backups that include auxiliary databases. If you really want to back up the auxiliary databases together with the content database, pay special attention to the schedule and ensure that no two backups of the same auxiliary database can overlap.

Because most applications don't make frequent changes to their auxiliary databases, MarkLogic recommends scheduling backups for them separately, instead of selecting them together with the content databases.

Introduction

In MarkLogic 8, support for native JSON and server side JavaScript was introduced.  We discuss how this affects the support for XML and XQuery in MarkLogic 8.

Details

In MarkLogic 8, you can absolutely use XML and XQuery. XML and XQuery remain central to MarkLogic Server now and into the future. JavaScript and JSON are complementary to XQuery and XML. In fact, you can even work with XML from JavaScript or JSON from XQuery.  This allows you to mix and match within an application—or even within an individual query—in order to use the best tool for the job.

See also:

Server-side JavaScript and JSON vs XQuery and XML in MarkLogic Server

XQuery and JavaScript interoperability

Introduction

Sometimes you may find that there are one or more tasks that are taking too long to complete or are hogging too many server resources, and you would like to remove them from the Task Server.  This article presents a way to cancel active tasks in the Task Server.

Details

To cancel active tasks in the Task Server, you can browse to the Admin UI, navigate to the Status tab of the Group's Task Server, and cancel the tasks. However, this may get tedious if there are many tasks to be terminated.

As an alternative, you can use the server monitoring built-ins to programmatically find and cancel the tasks. The documentation for the MarkLogic Server API includes information on all the built-in functions you will need (refer to http://docs.marklogic.com/xdmp/server-monitoring).

Sample Script

Here is a sample script that cancels tasks based on the path to the module that is being executed:

xquery version "1.0-ml";
(: Cancel every Task Server request on this host that is running the given module :)
let $host-id := xdmp:host()
let $host-task-server-id := xdmp:host-status($host-id)//*:task-server/*:task-server-id/text()
let $task-server-status := xdmp:server-status($host-id,$host-task-server-id)
let $task-server-requests := $task-server-status/*:request-statuses
(: Loop, because more than one running request may match the module path :)
for $scheduled-task-request in $task-server-requests/*:request-status[*:request-text = "/PATH/TO/SCHEDULED/TASK/MODULE.XQY"]/*:request-id/text()
return
   xdmp:request-cancel($host-id,$host-task-server-id,$scheduled-task-request)

Summary

MarkLogic stores all signed Certificates, private keys, and Certificate Authority Certificates inside the Security Database. The Security Database also stores Users, Passwords, Roles, Privileges, and many other authentication-related configurations. When setting up a DR (Disaster Recovery) Cluster, many administrators prefer to replicate the Security Database to the DR Cluster to avoid re-configuring it with the same Users, Roles, Privileges, etc.

Security Database replication presents design challenges and issues when accessing Application Servers on the DR Cluster.

  • Certificates installed in the Master Cluster Security Database are replicated to the DR Cluster Security Database. However, those replicated Certificates are not useful to the DR Cluster, since Signed Certificates are typically tied to a single host (exceptions include SAN and Wildcard Certificates).
  • At the same time, we are not able to install new Signed Certificates on the DR Cluster, because the replicated Security Database is read-only.

This article discusses the different aspects of the above problem and provides a solution.

Configuration: Security Database replicated to DR Cluster

For the purposes of this article, we will consider a 3 node Master Cluster coupled to a 3 node DR Cluster, where the Security DB is replicated from the Master to the DR Cluster. We will also have an Application Server on the Master Cluster configured with the "DemoTemp1" Certificate Template.

[Screenshots: Master Cluster hosts; DR Cluster hosts]

Issues in DR Cluster.

Certificate Authentication based on CN field 

When client browsers connect to the application server using HTTPS, they check to make sure your SSL Certificate matches the host name in the address bar. There are three ways for browsers to find a match:

  1.    The host name (in the address bar) exactly matches the Common Name (CN) in the certificate's Subject.
  2.    The host name matches a Wildcard Common Name. For example, www.example.com matches the common name *.example.com.
  3.    The host name is listed in the Subject Alternative Name field.

The most common form of SSL name matching is the first option -  SSL client compares server name to the Common Name in the server's certificate. 

Because the Temporary Signed Certificates carry the CN of the Master Cluster nodes, host name verification against an Application Server on the DR Cluster will fail when that server uses the MarkLogic-generated Temporary Signed Certificate.

Certificate Requests

When we attach a Template to any Application Server and generate a certificate request, MarkLogic Server generates a Temporary Signed Certificate for every node of the Cluster that is in the Application Server's Group.

[Screenshots: Certificate Template status on the Master Cluster and on the DR Cluster]

To replace the Temporary Signed Certificate with a Certificate signed by a third party, we need to generate certificate requests. You can generate certificate requests in MarkLogic for all nodes using the Request button under "Needed Certificate Request" on the Certificate Template "Status" tab.

  • On the Master Cluster, MarkLogic will generate 3 Certificate requests, each with a CN field matching one of the 3 nodes. All 3 new Certificate requests are stored internally in the Security Database.
  • On the DR Cluster, clicking Request will result in an error, since the DR Cluster has a replicated Security Database that is in a read-only ("open replica") state, i.e. Security Database updates are not allowed.

Pending Certificate Requests

Each Certificate request is intended for a specific node, because the request originator incorporates the client FQDN into the Certificate CN field when the request is generated. MarkLogic Server uses the hostname (which in most cases matches your FQDN) as the CN field value in the Certificate request.

Certificate requests generated on the Master Cluster are stored in the Security Database, which is replicated to the DR Cluster Security Database (when Security DB replication is configured). However, these requests are not relevant to the DR Cluster, as their CN fields contain the FQDNs of the Master Cluster nodes.

[Screenshots: Certificate Template status after the request, Master and DR Clusters]

Solution

To install Signed Certificates intended for the DR Cluster, where Certificate CN field matches the FQDN of DR Cluster, we will need to install the DR cluster's Signed Certificates on the Master Cluster.  That certificate will then be replicated to the DR Cluster through the normal database replication of the Security database. 

Step 1. Generate Certificate Request (intended for DR nodes).

Generate the Certificate requests using XQuery in Query Console against the Security database on the Master Cluster itself, but use the FQDNs of the DR/Replica Cluster nodes as the values. For example, for the first node in the DR Cluster, "engrlab-130-026.engrlab.marklogic.com", you would run the query below from Query Console on any node of the Master Cluster against the Security Database. Change the FQDN value for each node and run the query a total of 3 times.

xquery version "1.0-ml";
import module namespace pki = "http://marklogic.com/xdmp/pki" at "/MarkLogic/pki.xqy";

(: Arguments: template id, common name, DNS name, IP address (none) :)
pki:generate-certificate-request(
  pki:template-get-id(pki:get-template-by-name("DemoTemp1")),
  "engrlab-130-026.engrlab.marklogic.com",
  "engrlab-130-026.engrlab.marklogic.com",
  ())

Step 2. Download Certificate Request and Get them Signed.

We should now be able to see the Certificate requests for every node (Master as well as DR nodes) on the Certificate Template Status tab in both the Master Cluster and DR Cluster Admin GUIs. Download them and get them signed by your preferred Certificate Authority.

[Screenshots: Certificate Template status after the Query Console requests, Master and DR Clusters]

Step 3. Install All Signed Certificates (for Master + DR Nodes) on Master Cluster 

Install all Signed Certificates (including those intended for the Replica Cluster) using the Certificate Template Import tab of the Master Cluster Admin GUI. If we try to install Certificates on the DR/Replica Cluster from the Admin GUI, we will get the error "XDMP-FORESTNOT: Forest Security not available: open replica". The Application Server on the DR Cluster will find the appropriate Certificate for each node from the list of all Certificates. The screenshots below show the status of the Certificate Template on the Master Cluster and on the DR Cluster (both should be identical).

[Screenshots: final Certificate Template status, Master and DR Clusters]

Step 4. Importing Pre-Signed Cert where Keys are generated outside of MarkLogic.

Please read "Import pre-signed Certificate and Key for MarkLogic HTTPS App Server" to import a Certificate and Key generated outside of MarkLogic. For our purposes, we will need to import the Certificates (and their respective Keys) for both Clusters (Master as well as DR/Replica) from Query Console on the Master Cluster itself.

Further Reading

Summary

Each node in MarkLogic Server Cluster has a hostname, a human-readable nickname corresponding to the network address of the device. MarkLogic retrieves the hostname from underlying operating system during installation. On Linux, we can retrieve platform hostname value by running "$ hostname" from a shell prompt. 

$ hostname

129-089.engrlab.marklogic.com

In most environments, the hostname is the same as the platform's Fully-Qualified-Domain-Name (FQDN). However, there are scenarios where the hostname differs from the FQDN. In such environments you would use the FQDN (engrlab-129-089.engrlab.marklogic.com) to connect to the platform instead of the hostname.

$ ping engrlab-129-089.engrlab.marklogic.com

PING engrlab-129-089.engrlab.marklogic.com (172.18.129.89) 56(84) bytes of data.

64 bytes from engrlab-129-089.engrlab.marklogic.com (172.18.129.89): icmp_seq=1 ttl=64 time=0.011 ms

When a Certificate is installed to a Certificate Template in an environment where the hostname and FQDN differ, MarkLogic uses the CN field of the installed Certificate to look for a matching hostname in the cluster. However, since the CN field (reflecting the FQDN) does not match the hostname known to MarkLogic, MarkLogic does not assign the installed Certificate to any specific host in the Cluster.

Subject: C=US, ST=NJ, L=Princeton, O=MarkLogic, OU=Eng, CN=engrlab-129-089.engrlab.marklogic.com

Installing Certificates in this scenario results in the installed Certificate not replacing the Temporary Certificate, and the Temporary Certificate will still be used with HTTPS App Server instead of the installed Certificates.

This article details different solutions to address this issue. 

Solution:

1) Host Name change

By default, MarkLogic picks up the hostname value presented by the underlying operating system. However, we can always change the hostname string stored in MarkLogic Server after installation using the Admin API function admin:host-set-name ( http://docs.marklogic.com/admin:host-set-name ).

Changing the hostname in MarkLogic (to reflect the FQDN) will not affect the underlying Platform/OS hostname values, but will allow MarkLogic to find the correct host for the installed Certificate (CN field = hostname), and thus to link the installed Certificate to a specific host in the Cluster.
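A hostname change of this kind can be sketched as below, using the example host from this article. It assumes the host is currently known to MarkLogic as "129-089.engrlab.marklogic.com"; adjust both names to your environment.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config  := admin:get-configuration()
(: Look up the host by the name MarkLogic currently has on record :)
let $host-id := admin:host-get-id($config, "129-089.engrlab.marklogic.com")
(: Store the FQDN so that it matches the CN of the signed Certificate :)
let $config  := admin:host-set-name(
                  $config, $host-id, "engrlab-129-089.engrlab.marklogic.com")
return admin:save-configuration($config)
```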

2) XQuery code linking Installed Cert to specific Host

You can also use the XQuery code below from Query Console against the Security DB (as content source) to update the Certificate XML files in the Security DB, linking the installed Certificate to a specific host.

Please change the Certificate Template name and host name in the XQuery below to reflect the values from your environment.

 

Also note that the above will not replace or overwrite the Temporary Certificate. However, the App Server will start using the installed Certificate from this point on instead of the Temporary Certificate. You can also delete the now unused Temporary Certificate file from Query Console without any negative effect.

3) Certificate with Subject Alternative Name (SAN Cert)

You can also ask your IT department (or Certificate issuer) to provide a Certificate with a Subject Alternative Name (subjectAltName) that matches MarkLogic's understanding of the host. During the installation of the Certificate, MarkLogic will look for alternative names and link the Certificate to the correct host based on the Subject Alternative Name field.

 

Further Reading

 

Introduction: When you may need to change the state of forests

In most cases, all forests in your MarkLogic cluster will be configured to allow all (any) updates to be made.

If we consider running the following example in Query Console:
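A check of this kind can be sketched as follows. This is an illustration only, assuming a forest named "Documents" and using admin:forest-get-updates-allowed to read the configured state.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: a forest named "Documents" exists in this cluster. :)
admin:forest-get-updates-allowed(
  admin:get-configuration(),
  xdmp:forest("Documents"))
```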

In the majority of cases, calling the above function should return "all", indicating that the forest is in a state to allow incoming queries to read data from the forest and to allow queries to update content (and to add new content) into that forest.

At any given time, a forest can be configured to be in one of four different states:

  • all
  • read-only
  • delete-only
  • flash-backup

You may want to change the state of the forests in a given database for several reasons:

  • read-only: to run your application in maintenance mode, where data can be read but no data on disk can be changed
  • delete-only: when you are migrating data from a legacy database or removing data from a given forest
  • flash-backup: when you need to quiesce all forests in a given database for long enough to make a file-level backup of the forest data

Forest states explained

Sample state management module

Below is an example template for modifying the state of all forests in a given database:

Further reading

Forest States
http://docs.marklogic.com/guide/admin/forests#id_43487
Setting Forests to "read only"
http://docs.marklogic.com/guide/admin/forests#id_72520
Setting Forests to "delete only"
http://docs.marklogic.com/guide/admin/forests#id_20932

Introduction

This article discusses some of the issues you should think about when preparing to change the IP address of a MarkLogic Server host.

Detail: 

If the hostnames stay the same, then changing IP addresses should not have any adverse side effects since none of the default MarkLogic Server settings require an IP address.

Here are some caveats:

  1. Make sure there are no application servers whose 'address' setting refers to an IP address that will no longer exist or be accessible after the change.
  2. Similarly, make sure there are no external (to MarkLogic Server) dependencies on the original IP addresses.
  3. Make sure you allow some time (on the order of minutes) for the updated DNS records to propagate across the DNS servers before bringing up MarkLogic Server.
  4. Make sure the hosts themselves are reachable via the standard Unix channels (ping, ssh, etc.) before starting MarkLogic Server.
  5. Make sure you test this in a non-production environment before you implement it in production.
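
As a rough illustration of caveat 3, the following Python sketch checks that a hostname already resolves to its new IP address before MarkLogic Server is started. The function name resolves_to, the hostname ml-node-1 and the address 10.0.0.5 are hypothetical placeholders, and the injectable resolver parameter is only an illustration choice so the check can be exercised without a live DNS server.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return True if hostname currently resolves to expected_ip.

    resolver defaults to the system lookup; it can be swapped out
    (for example in tests) with any callable taking a hostname.
    """
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False
```

Calling resolves_to("ml-node-1", "10.0.0.5") on each host, and waiting until it returns True everywhere, is one way to confirm that name resolution has caught up before the server is restarted.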

Introduction

If you have an existing MarkLogic Server instance running on EC2, there may be circumstances where you need to change the size of available storage.

This article discusses approaches to ensure a safe increase in the amount of available storage for your EC2 instances without compromising MarkLogic data integrity.

This article assumes that you have started your cluster using the CloudFormation templates provided by MarkLogic.

The recommended method (I.) is to shut down the cluster, do the resize using snapshots and start again. If you wish to avoid downtime an alternative procedure (II.) using multiple volumes and rebalancing is described below.

In both procedures we are recommending a single, large EBS volume as opposed to multiple smaller ones because:

1. Larger EBS volumes have faster IO as described by the Amazon EBS Volume types at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html

2. You have to keep enough spare capacity on every single volume to allow for merges.  MarkLogic disk space requirements are described in our Installation Guide.
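
A quick sanity check of the spare-capacity requirement can be sketched as follows. It uses the guideline, quoted later in this article, of at least 1.5x the planned total forest size; the function name and the GiB unit are illustrative assumptions, not part of any MarkLogic or AWS tooling.

```python
import math


def minimum_volume_gib(planned_forest_gib, factor=1.5):
    """Smallest whole-GiB volume size that leaves room for merges.

    factor=1.5 reflects the "at least 1.5x of the planned total
    forest size" guideline; adjust it if your requirements differ.
    """
    return math.ceil(planned_forest_gib * factor)
```

For example, forests planned to hold 200 GiB in total would call for a volume of at least minimum_volume_gib(200) GiB.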

I. Resizing using AWS snapshots

This is the recommended method. This procedure follows the same steps as official Amazon AWS documentation, but highlights MarkLogic specific steps. Please review AWS Documentation in detail before proceeding:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html

1. Make sure that you have an up to date backup of your data and a working restore plan.

2. Stop the MarkLogic cluster by going to AWS Console -> CloudFormation -> Actions -> Update Stack

3. Click through the pages and leave all other settings intact, but change the Nodes setting (the value to use is covered in the MarkLogic EC2 documentation linked below), then review and confirm the stack update. This will stop the cluster.

This is also covered in the MarkLogic EC2 documentation:

https://docs.marklogic.com/guide/ec2/managing#id_59478

4. Create a snapshot of the volume to resize.

5. Create a new volume from the snapshot.

Ensure that the new volume is sufficiently large to cover MarkLogic disk space requirements (generally at least 1.5x of the planned total forest size).

6. Detach the old volume.

7. Attach the newly expanded volume.

Steps 4-7 are exactly as covered in the AWS documentation and have no MarkLogic-specific parts.

8. Restart MarkLogic cluster, by going to AWS Console -> CloudFormation -> Actions -> Update Stack and changing Nodes to the original setting.

9. Connect to the machine using SSH and resize the file system to match the new volume size. This is covered in the AWS documentation; the commands are:

- resize2fs for ext3 and ext4

- xfs_growfs for xfs

10. The new volume will have a different ID. You need to update the CloudFormation template so that the data volumes are retained and remounted when the cluster or nodes are restarted. The easiest way is to use the mlcmd shell script provided by MarkLogic. While still connected over SSH, run the following:

/opt/MarkLogic/bin/mlcmd sync-volumes-to-mdb

This will synchronise the EBS volume id with the CloudFormation template.

At this point the procedure is complete and you can delete the old EBS volume. Once you have verified that everything is working correctly, you can also delete the snapshot created in step 4.

II. Resizing with no downtime, using MarkLogic Rebalancing

This method avoids cluster downtime, but it is slightly more complicated than procedure I, and rebalancing takes additional time and adds load to the cluster while it runs. In most cases procedure I takes far less time to complete; however, the cluster is down for the duration. With this procedure the cluster can serve requests at all times.

This procedure follows the same steps as official Amazon AWS documentation where possible, but highlights MarkLogic specific steps. Please review AWS Documentation in detail before proceeding:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html

The procedure is described in more detail in the MarkLogic Server on Amazon EC2 Guide at https://docs.marklogic.com/guide/ec2/managing#id_81403

1. Create a new volume.

Ensure that the new volume is sufficiently large to cover MarkLogic disk space requirements (generally at least 1.5x of the planned total forest size).

2. Attach the volume to the EC2 instance. Take note of the EC2 device mount point, for example /dev/sdg, and check what it maps to in Linux and in Red Hat: https://docs.marklogic.com/guide/ec2/managing#id_17077

3. SSH into the instance and execute the /opt/MarkLogic/bin/mlcmd init-volumes-from-system command to create a filesystem for the volume and update the Metadata Database with the new volume configuration. The init-volumes-from-system command will output a detailed report of what it is doing. Note the mount directory of the volume from this report.

4. Once the volume is attached and mounted to the instance, log into the Administrator Interface on that host and create a forest or forests, specifying host name of the instance and the mount directory of the volume as the forest Data Directory. For details on how to create a forest, see Creating a Forest in the Administrator's Guide.

5. Once the status of the new forest is set to "open", attach the new forest(s) to the database and retire all the forest(s) on the old volume. If you only have 1 data volume then this includes forests for Schemas, Security, Triggers, Modules etc. It is possible to script this part using XQuery, JS or REST:

https://docs.marklogic.com/admin:forest-create

This will trigger rebalancing - database fragments will start to move to the new forests. This process will take several hours or days, depending on the size of data and the Admin UI will show you an estimate.

The Admin UI for this is covered here: https://docs.marklogic.com/guide/admin/forests#id_93728

and here is more information on rebalancing: https://docs.marklogic.com/guide/admin/database-rebalancing#id_87979

6. Once the old forest(s) have 0 fragments in them you can detach them and delete the old forest(s). The migration to a new volume is complete.

7. Optionally, remove the old volume. If your original volume was data only, it should be empty after this procedure and you can:

a) unmount the volume in Linux

b) delete the volume in AWS EC2 console

c) issue /opt/MarkLogic/bin/mlcmd sync-volumes-to-mdb. This will preserve the new volume mappings in the Cloud Formation template and the volumes will be preserved and remounted when nodes are restarted or even terminated.

Introduction

A common use case in many business applications is to find whether or not an element exists in a document. This article provides ways to find such documents and explains points to consider when designing a solution.

 

Solution

In general, the existence of an element in a document can be checked by using the XQuery below.

cts:element-query(xs:QName('myElement'),cts:and-query(()))

Note the empty cts:and-query construct here. An empty cts:and-query is used to fetch all fragments.

Hence, running the search query below will bring back all the documents having the element "myElement".

 

Wrapping the query in cts:not-query will bring back all the documents *not* having the element "myElement".

 

A search using cts:not-query is only guaranteed to be accurate if the underlying query being negated is accurately resolved from the indexes. Hence, to check for the existence of a specific XPath, you need to index that XPath.
For example, if you want to find documents having /path/1/A (and not /path/2/A), you can create a field index for the path /path/1/A and then use it in your query instead.

 

Things to remember

1.) Use unique element names within a single document, i.e. try not to use the same element name in multiple places within a document if the occurrences have different meanings for your use case. Either give them different element names or put them under different namespaces to remove any ambiguity. For example, if you have an element "table" in two places in a single document, you can put them under different namespaces such as html:table & furniture:table, or you can name them differently such as html_table & furniture_table.

2.) If element names are unique within a document, you don't need to create additional indexes. If element names are not unique within a document and you are interested in only a specific XPath, create path (field) indexes on those XPaths and use them in your not-query.

 

Introduction

MarkLogic Server has shipped with full support for the W3C XML Schema specification and schema validation capabilities since version 4.1 (released in 2009).

These features allow for the validation of complete XML documents or elements within documents against an existing XML Schema (or group of Schemas), whose purpose is to define the structure, content, and typing of elements within XML documents.

You can read more about the concepts behind XML Schemas and MarkLogic's support for schema based validation in our documentation:

https://docs.marklogic.com/guide/admin/schemas

Caching XML Schema data

In order to ensure the best possible performance at scale, all user created XML Schemas are cached in memory on each individual node within the cluster using a portion of that node's Expanded Tree Cache.

Best practices when making changes to pre-existing XML Schemas: clearing the Expanded Tree Cache

When you redeploy a revised XML Schema to an existing schema database, MarkLogic can sometimes refer to an older, cached version of the schema data associated with a given document.

Therefore, whenever you deploy a new or revised version of a Schema that you maintain, it may be necessary, as a best practice, to clear the cache in order to evict all cached data stored for older versions of your schemas.

If you don't clear the cache, you may sometimes get references to the old, cached schema and, as a result, you may get errors like:

XDMP-LEXVAL (...) Invalid lexical value

You can clear all data stored in the Expanded Tree Cache in two ways:

  1. By restarting the MarkLogic service on every host in the cluster. This will automatically clear the cache, but it may not be practical on production clusters.
  2. By calling the xdmp:expanded-tree-cache-clear() function on each host in the cluster. You can run the function in Query Console or via a REST endpoint; you will need a user with admin rights to clear the cache.

An example script has been provided that demonstrates the use of XQuery to execute the call to clear the Expanded Tree Cache against each host in the cluster:

Please contact MarkLogic Support if you encounter any issues with this process.

Summary

XDMP-ODBCRCVMSGTOOBIG can occur when a non-ODBC process attempts to connect to an ODBC application server. Two common causes are an HTTP application that has been accidentally configured to point to the ODBC port, or a load balancer that is sending HTTP health checks to an ODBC port. There are a number of common error messages that can indicate whether this is the case.

Identifying Errors and Causes

One method of determining the cause of an XDMP-ODBCRCVMSGTOOBIG error is to take the size value and convert it to characters. For example, given the following error message:

2019-01-01 01:01:25.014 Error: ODBCConnectionTask::run: XDMP-ODBCRCVMSGTOOBIG size=1195725856, conn=10.0.0.101:8110-10.0.0.103:54736

The size, 1195725856, can be converted to the hexadecimal value 47 45 54 20, which in turn corresponds to the ASCII characters "GET ". In other words, the first four bytes of an HTTP GET request have been read as a message size, so what we see is a GET request being run against the ODBC application server.

Common Errors and Values

Error                                    Hexadecimal    Characters
XDMP-ODBCRCVMSGTOOBIG size=1195725856    47 45 54 20    "GET "
XDMP-ODBCRCVMSGTOOBIG size=1347769376    50 55 54 20    "PUT "
XDMP-ODBCRCVMSGTOOBIG size=1347375956    50 4F 53 54    "POST"
XDMP-ODBCRCVMSGTOOBIG size=1212501072    48 45 4C 50    "HELP"
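
The conversion from size to characters can be sketched in Python as follows. The helper name decode_odbc_size is hypothetical; it simply treats the logged size as a 32-bit big-endian integer and reads its four bytes as ASCII.

```python
def decode_odbc_size(size):
    """Interpret a logged 32-bit size value as four ASCII characters."""
    raw = size.to_bytes(4, byteorder="big")
    # Show non-printable bytes as '.' so the result is always readable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    for size in (1195725856, 1347769376, 1347375956, 1212501072):
        print(size, repr(decode_odbc_size(size)))
```

Running the same helper on a size value that is not listed in the table may reveal which other protocol or command is reaching the ODBC port.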

Conclusion

XDMP-ODBCRCVMSGTOOBIG errors do not affect the operation of MarkLogic Server, but they can fill the error logs with clutter. Determining that the errors are caused by an HTTP request to an ODBC port can help to identify the root cause, so that the issue can be resolved.

Summary

In MarkLogic Server v7.0-2, the tokenizer keys, for languages where MarkLogic provides generic language support, were removed so that they now all use the same key. For example, Greek falls into this class of languages. This change was made as part of an optimization for languages in which MarkLogic Server has advanced stemming and tokenization support.  

Stemmed searches that include characters from languages that do not have advanced language support, performed on MarkLogic Server v7.0-2 or later releases, against content loaded on a version previous to v7.0-2, may not return the expected results.

Resolution

In order to successfully run these stemmed searches, you can either:

  • Reindex the database; or
  • Reinsert the affected documents (i.e. the documents that contain characters in languages for which MarkLogic Server only has generic language support).

If these are not possible in your environment, you can always run the query unstemmed.

An Example

The following example demonstrates the issue:

  1. On MarkLogic Server version 7.0-1, insert a document (test.xml) that contains the Greek character 'ε'.
  2. Run this query 
    xdmp:estimate( cts:search( doc('test.xml'), 'ε')),
    cts:contains( doc('test.xml'), 'ε')
  3. The query will return the correct results: 1, true
  4. Upgrade MarkLogic Server to version 7.0-3 or later and run the query again
  5. The query will return incorrect results: 0, false 
  6. Reindex the database and re-run the query
  7. The query will return the correct result once again.
     

Introduction

This Knowledgebase article outlines the procedure to enable HTTPS on an AWS Elastic Load Balancer (ELB) using Route 53 or an external supplier as the DNS provider and with an AWS generated certificate.

The AWS Certificate Manager (ACM) automatically manages and renews the certificate and this certificate will be accepted by all current browsers without any security exceptions.

The downside is that you do need control over your Hosted DNS name entry - either through Route 53 or through another provider.

Prerequisites

  1. MarkLogic AWS Cluster
  2. An AWS Route 53 hosted Domain or similar externally hosted Domain; the procedure described in this article assumes that Route 53 is being used, however where possible we have tried to detail the changes needed and these should also be applicable for another external DNS provider.

Procedure

  1. Click on your hostname in Route 53 to edit it

  2. Create a new Alias Record Set to point to your Elastic Load Balancer.

  3. In the Record Set entry on the right hand side, enter an Alias name for your ELB host, select Alias and from the Alias Target select the ELB load balancer to use, then click the Create button to update the Route 53 entry.

  4. It can take a little while for AWS to propagate the DNS update throughout the network, but once it is available it is worth checking that you are able to reach your MarkLogic cluster using the new address.

  5. Once the Route 53 entry is updated and available, you will need to request a new certificate through ACM. If you have other certificates already in ACM, you can select Request a certificate

Otherwise select Get Started with Provision Certificates and select Request a public certificate

  6. Enter your required Certificate domain name and click Next:

Note: This should match your DNS Alias name entry created in Step 3.

In addition, you can add further records such as a "Wildcard" entry. This is particularly useful if you want to use the same certificate for multiple hostnames, e.g. if you have clusters identified by versions such as ml9.[yourdomain].com & ml10.[yourdomain].com

  7. Select DNS as the Validation Method and click "Review"

  8. Before confirming and proceeding, check that the hostnames are correct, as certificates with invalid hostnames will not be usable.

  9. To complete validation, AWS will require you to add random CNAME entries to the DNS record to confirm that you are the owner. If you are using Route 53, this is as simple as selecting each entry in turn (the number will vary depending on the number of domain name entries you specified in step 6) and clicking "Create record in Route 53". Once all entries have been created, click Continue

  10. If the update is successful a Success message is displayed

  11. If your DNS hostname is provided by an external provider, you will need to download the entries using the "Export DNS configuration to a file" link and provide this information to your DNS provider to make the necessary updates.

The file is a simple CSV file and specifies one or more CNAME entries that need to be created with the required name and values. Once the AWS DNS validation process detects that these changes have been made, the certificate creation process will be completed automatically.

Domain Name,Record Name,Record Type,Record Value
marklogic.[yourdomain].com,_c3949adef7f9a61dd6865a13e65acfdb.marklogic.[yourdomain].com.,CNAME,_7ec4e5ce2cf31212e20ce68d9d0ab9fd.kirrbxfjtw.acm-validations.aws.
*.[yourdomain].com,_9b2138934ee9bbe8562af4c66591d2de.[yourdomain].com.,CNAME,_924153c45d53922d31f7d254a216aed0.kirrbxfjtw.acm-validations.aws.

  12. Once the Certificate has been validated by either of the methods in Steps 9 or 11, the certificate will be marked as Issued and be available for the Load Balancer to use.

  13. Configure the ELB for HTTPS and the new AWS generated Certificate
  14. Edit the ELB Listeners and change the Cipher

  15. (Optional) For production environments it is recommended to allow TLSv1.2 only

  16. Next select the Certificate and repeat Steps 15 and 16 for each listener that you want to secure.

  17. From the ACM available certificates select the newly generated certificate for this domain and click Save

  18. Save the Listeners updates and ensure the update was successful.

  19. You should now be able to access your MarkLogic cluster securely over HTTPS using the AWS generated certificate.
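
When the DNS is hosted externally, the exported validation file from the procedure above has to be turned into CNAME records by hand. The following Python sketch shows one way to read it; the column names are taken from the sample file in this article, while the function name and the idea of printing the records for a DNS administrator are illustrative assumptions.

```python
import csv
import io

# Sample content copied from the exported file shown in this article.
SAMPLE = """Domain Name,Record Name,Record Type,Record Value
marklogic.[yourdomain].com,_c3949adef7f9a61dd6865a13e65acfdb.marklogic.[yourdomain].com.,CNAME,_7ec4e5ce2cf31212e20ce68d9d0ab9fd.kirrbxfjtw.acm-validations.aws.
*.[yourdomain].com,_9b2138934ee9bbe8562af4c66591d2de.[yourdomain].com.,CNAME,_924153c45d53922d31f7d254a216aed0.kirrbxfjtw.acm-validations.aws.
"""


def validation_records(csv_text):
    """Return (record name, record type, record value) for each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Record Name"], row["Record Type"], row["Record Value"])
        for row in reader
    ]


if __name__ == "__main__":
    for name, record_type, value in validation_records(SAMPLE):
        print(f"{name} {record_type} {value}")
```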

Introduction

HAProxy (http://www.haproxy.org/) is a free, fast and reliable solution offering high availability, load balancing and proxying for TCP and HTTP-based applications.

MarkLogic 8 (8.0-8 and above) and MarkLogic 9 (9.0-4 and above) include improvements to allow you to use HAProxy to connect to MarkLogic Server.

MarkLogic Server supports balancing application requests using both the HAProxy TCP and HTTP balancing modes depending on the transaction mode being used by the MarkLogic application as detailed below:

  1. For single-statement auto-commit transactions running on MarkLogic version 8.0-7 and earlier or MarkLogic version 9.0-3 and earlier, only TCP mode balancing is supported. This is because the SessionID cookie and transaction id (txid) are only generated as part of a multi-statement transaction.
  2. For multi-statement transactions, or for single-statement auto-commit transactions running on MarkLogic version 8.0-8 and later or MarkLogic version 9.0-4 and later, both TCP and HTTP balancing modes can be configured.

Refer to the sections "Understanding Transactions in MarkLogic Server" and "Single vs. Multi-statement Transactions" in the MarkLogic documentation to determine whether your application is using single or multi-statement transactions.

Note: Attempting to use HAProxy in HTTP mode with single-statement transactions prior to MarkLogic versions 8.0-8 or 9.0-4 can lead to unpredictable results.

Example configurations

The following example configurations detail only the parameters relevant to enabling load balancing of a MarkLogic application; for details of all parameters that can be used, please refer to the HAProxy documentation.

TCP mode balancing

The following configuration is an example of how to balance requests to a 3-node MarkLogic application using the "roundrobin" balance algorithm, with stickiness based on the client's source IP address (the stick-table and stick on src lines). The health of each node is checked by a TCP probe to the application server every 1 second.

backend app
mode tcp
balance roundrobin
stick-table type ip size 200k expire 30m
stick on src
default-server inter 1s
server app1 ml-node-1:8012 check id 1
server app2 ml-node-2:8012 check id 2
server app3 ml-node-3:8012 check id 3

HTTP mode balancing

The following configuration is an example of how to balance requests to a 3-node MarkLogic application using the "roundrobin" balance algorithm, with session persistence based on the "SessionID" cookie inserted by the MarkLogic server.

The health of each node is checked by issuing an HTTP GET request to the MarkLogic health check port and checking for the "Healthy" response.

backend app
mode http
balance roundrobin
cookie SessionID prefix nocache
option httpchk GET / HTTP/1.1\r\nHost:\ monitoring\r\nConnection:\ close
http-check expect string Healthy
server app1 ml-node-1:8012 check port 7997 cookie app1
server app2 ml-node-2:8012 check port 7997 cookie app2
server app3 ml-node-3:8012 check port 7997 cookie app3
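
To see what the HTTP health check above does, the same probe can be sketched in Python. The port 7997, the Host value "monitoring" and the "Healthy" string come from the configuration above; the function names, the one-second timeout and the decision to treat any connection error as unhealthy are assumptions made for illustration.

```python
import http.client


def is_healthy(body):
    """Mirror 'http-check expect string Healthy': look for the string."""
    return "Healthy" in body


def probe(host, port=7997, timeout=1.0):
    """Send the same GET / request HAProxy sends and check the reply."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(
            "GET", "/",
            headers={"Host": "monitoring", "Connection": "close"},
        )
        body = conn.getresponse().read().decode("utf-8", "replace")
        return is_healthy(body)
    except (OSError, http.client.HTTPException):
        # Unreachable or misbehaving node: treat as unhealthy.
        return False
    finally:
        conn.close()
```

Calling probe("ml-node-1") from the load balancer host is a simple way to confirm that a node would pass the check before enabling the backend.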

Summary

MarkLogic Server organizes Trusted Certificate Authorities (CA) by Organization Name.  Trusted Certificate Authorities are the issuers of digital certificates, which in turn are used to certify the public key on behalf of the named subject as given in the certificate.  These certificates are used in the authentication process by:

  1. A MarkLogic Application Server configured to use SSL (HTTPS).
  2. Any Web Client which is making a connection to a MarkLogic Application Server over HTTPS (in the case of SSL Client Authentication).

Example Scenarios

Consider the following example:

$openssl x509 -in CA.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 18345409437988140316 (0xfe97fcaf8a61b51c)
    Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=US, ST=CA, L=San Carlos, O=MarkLogic Corporation, OU=Engineering, CN=MarkLogic CA
        Validity
            Not Before: Nov 30 04:08:31 2015 GMT
            Not After : Nov 29 04:08:31 2020 GMT
        Subject: C=US, ST=CA, L=San Carlos, O=MarkLogic Corporation, OU=Engineering, CN=MarkLogic CA

In this example, based on the Subject field of the trusted CA certificate, the CA certificate will be listed under the organization name "MarkLogic Corporation" (O=MarkLogic Corporation) in MarkLogic's list of Certificate Authorities.

You can view the full list of currently configured Trusted Certificate Authorities by logging into the MarkLogic administration Application Server (on port 8001) and viewing the status page: Configure -> Security -> Certificate Authorities

Trusted CA Certificate without Organization name (O=)

In some cases, there are legitimate Trusted CA Certificates which do not contain any further information about the Organization responsible for the certificate.

The example below shows a sample self signed root CA (DemoLab CA) which highlights this scenario:

$openssl x509 -in DemoLabCA.pem  -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 12836463831212471403 (0xb22447d80f91b46b)
    Signature Algorithm: sha1WithRSAEncryption
        Issuer: CN=DemoLab CA
        Validity
            Not Before: Nov 30 05:23:13 2015 GMT
            Not After : Nov 29 05:23:13 2020 GMT
        Subject: CN=DemoLab CA

If this certificate were loaded into MarkLogic, no name would appear in the list of Certificate Authorities provided through the administration Application Server at Configure -> Security -> Certificate Authorities

In the case of the above example, it would be difficult to use a certificate validated by DemoLab CA (and to use DemoLab CA as our Trusted Certificate Authority), as MarkLogic will only list certificates that are associated with an Organization.
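
Before loading a CA certificate, it can be useful to check whether its Subject contains an Organization at all. The Python sketch below works on the Subject text as printed by the openssl commands above; the function name is hypothetical, and the simple comma split assumes that no field value itself contains a comma.

```python
def subject_organization(subject):
    """Return the O= value from an openssl-style Subject, or None."""
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key == "O":
            return value
    return None


if __name__ == "__main__":
    print(subject_organization(
        "C=US, ST=CA, L=San Carlos, O=MarkLogic Corporation, "
        "OU=Engineering, CN=MarkLogic CA"
    ))
    print(subject_organization("CN=DemoLab CA"))
```

A result of None indicates a certificate like DemoLab CA, which would not appear in the Certificate Authorities list.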

Solution

To work around this issue, we can configure MarkLogic to use the certificate through some scripting with Query Console.

1) Loading the CA using Query Console

Start by using a call to pki:insert-trusted-certificates to load the Trusted CA into MarkLogic.  The sample Query Console code below demonstrates this process (Please ensure this query is executed against the Security database)

Make a note of the id value returned by MarkLogic. It is an unsigned long (xs:unsignedLong) that can be used later to retrieve the certificate.

2) Attach Trusted CA with "SSL Client Certificate Authorities" using Query Console

The next step is to associate the certificate that we just inserted from our filesystem (DemoLabCA.pem) with a given MarkLogic Application Server. Once this is done, any client connecting to that application server over SSL will be presented with the certificate and DemoLab CA will be used to match the certificate using the Common Name value (Common Name eq "DemoLab CA")

3) Verify attached Trusted CA for Client Certificate Authorities

Executing the above code should return the same identifier (for the Trusted CA) as was returned by the code executed in step 1. Additionally, we can see that our Application Server (DemoAppServer) is now configured to expect an SSL Client Certificate Authority signed by DemoLab CA.

Introduction

MarkLogic Server is engineered to scale out horizontally by easily adding forests and nodes. Be aware, however, that when adding resources horizontally, you may also be introducing additional demand on the underlying resources.

Details

On a single node, you will see some performance improvement from adding forests, due to increased parallelization. There is a point of diminishing returns, though, where the number of forests can overwhelm the available resources such as CPU, RAM, or I/O bandwidth. Internal MarkLogic research (as of April 2014) shows the sweet spot to be around six forests per host (assuming modern hardware). Note that there is a hard limit of 1024 primary forests per database, and it is a general recommendation that the total number of forests should not grow beyond 1024 per cluster.

At cluster level, you should see performance improvements in adding additional hosts, but attention should be paid to any potentially shared resources. For example, since resources such as CPU, RAM, and I/O bandwidth would now be split across multiple nodes, overall performance is likely to decrease if additional nodes are provisioned virtually on a single underlying server. Similarly, when adding additional nodes to the same underlying SAN storage, you'll want to pay careful attention to making sure there's enough I/O bandwidth to accommodate the number of nodes you want to connect.

More generally, additional capacity above a bottleneck generally exacerbates performance issues. If you find your performance has actually decreased after horizontally scaling out some part of your stack, it is likely that a part of your infrastructure below the part at which you made changes is being overwhelmed by the additional demand introduced by the added capacity.

Summary

MarkLogic Application Servers will keep a connection open after completing and responding to a request, waiting for another new request, until the Keep Alive timeout expires. However, there is an exception where the connection will close regardless of timeout settings: when the content is larger than 1 MB. This article is intended to provide further insight into connection close with respect to payload size.

HTTP Header

Content-Length

In general, Application Servers communicating in HTTP send the Content-Length header as part of their response HTTP Headers to indicate how many bytes of data the client application should expect to receive. For example

HTTP/1.1 200 OK
Content-type: application/sparql-results+json; charset=UTF-8
Server: MarkLogic
Content-Length: 1264
Connection: Keep-Alive
Keep-Alive: timeout=5

This requires Application Servers to know the length of the entire response data before the very first bytes (the response HTTP headers) are put on the wire. For small amounts of data, calculating the content length is fast. For large amounts of content, the calculation may be time consuming; in the extreme case, the client finds the server unresponsive due to the delay in calculating the entire response length. Additionally, the server may need to bring the entire content into a memory buffer, putting further burden on server resources.

Chunked-encoding

To allow servers to begin transmitting dynamically-generated content before knowing the total size of that content, HTTP 1.1 supports chunked encoding. This technique is widely used in music and video streaming and in other industries. Chunked encoding eliminates the need to know the entire content length before sending a portion of the data, thus making the server look more responsive.

At the time of this writing, MarkLogic Server (v8.0-6 and earlier releases) does not support chunked encoding. However, do look for this feature in future releases of MarkLogic Server.

Connection Close

In MarkLogic Server v7 and v8, MarkLogic Server closes the connection after transmitting content greater than 1MB, which allows MarkLogic to avoid calculating the content length in advance. The client will not see a Content-Length header for larger (>1MB) content in the HTTP response from MarkLogic. Instead, it will receive a Connection: close header in the HTTP response. After sending the entire content, MarkLogic Server will terminate the connection to indicate to the client that the end of the content has been reached.

Closing the existing connection for content larger than 1MB is an exception to the Keep-Alive configuration. This may result in unexpected behavior on clients that rely on MarkLogic Server respecting the Keep-Alive configuration, so this behavior should be accounted for when designing the client application's connection pool.

Client applications may have to send a TCP SYN again to establish a new connection for the subsequent request, which adds the overhead of a TCP 3-way handshake before the next request can be sent. However, in the context of the data transfer for a larger payload (>1MB), where many more round trips are involved in the overall communication, the overhead of the TCP 3-way handshake is nominal.
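The behavior described above can be sketched from the client's side. The following Python fragment is a minimal illustration, not MarkLogic or client-library code: the function names `response_framing` and `connection_reusable` are hypothetical, and the rules are the general HTTP/1.1 ones (chunked transfer, Content-Length, or read until the server closes the connection).

```python
def response_framing(headers):
    """Return how a client can tell where an HTTP/1.1 response body ends."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if "chunked" in h.get("transfer-encoding", ""):
        return "chunked"            # body is a series of length-prefixed chunks
    if "content-length" in h:
        return "content-length"     # body is exactly this many bytes
    return "read-until-close"       # body ends when the server closes the socket


def connection_reusable(headers):
    """True if the client may send its next request on the same connection."""
    h = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if response_framing(headers) == "read-until-close":
        return False
    return h.get("connection", "keep-alive") != "close"


# Hypothetical header sets for illustration only.
small = {"Content-Length": "512", "Connection": "keep-alive"}
large = {"Connection": "close"}  # the case the article describes for >1MB responses

print(response_framing(small), connection_reusable(small))
print(response_framing(large), connection_reusable(large))
```

A connection pool built on a check like this would discard the connection after a large response rather than returning it to the pool.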

Summary

CSV files are a very common data exchange format. They are often used as an export format for spreadsheets, databases or any other application. Depending on the application, you might be able to change the delimiter character to a hash (#), an asterisk (*), etc. One of the common default delimiters is the tab character. Content Pump supports reading and loading such CSV files.

Detail

The Content Pump -delimiter option defines which delimiter will be used to split the columns. Defining a tab as the value for the delimiter option on the command line isn't straightforward.

Loading tab-delimited data files with Content Pump can result in an error message like the following:

mlcp>bin/mlcp.sh IMPORT -host localhost -port 9000 -username admin -password secret -input_file_path sample.csv -input_file_type delimited_text -delimiter '    ' -mode local
13/08/21 15:10:20 ERROR contentpump.ContentPump: Error parsing command arguments: 
13/08/21 15:10:20 ERROR contentpump.ContentPump: Missing argument for option: delimiter
usage: IMPORT [-aggregate_record_element <QName>]
... 

Depending on the command line shell, a tab needs to be escaped so that the shell passes it through correctly:

On bash shell, this should work: -delimiter $'\t'
On Bourne shell, this should work: -delimiter 'Ctrl+V followed by tab'
An alternative would be to use: -delimiter \x09

If none of these work, another approach you can try is to use the -options_file /path/to/options-file parameter. The options file can contain all of the same parameters as the command line does. The benefit of using an options file is that the command line is simpler and characters are interpreted as intended. The options file contains multiple lines, where the first line is always the action (IMPORT, EXPORT, etc.), followed by pairs of lines. In each pair, the first line is the option name and the second line is the value for that option.

A sample could look like the following:

IMPORT
-host
localhost
-port
9000
-username
admin
-password
secret
-input_file_path
/path/to/sample.csv
-delimiter
' '
-input_file_type
delimited_text


Make sure the file is saved in UTF-8 format to avoid any parsing problems. To define a tab as the delimiter, place a real tab between single quotes (i.e. '<tab>').

To use this options file with mlcp, execute the following command:

Linux, Mac, Solaris:

mlcp>bin/mlcp.sh -options_file /path/to/sample.options

Windows:

mlcp>bin/mlcp.bat -options_file /path/to/sample.options

The options file can take any parameter which mlcp understands. It is important that the action command is defined on the first line. It is also possible to use both command line parameters and the options file. Command line parameters take precedence over those defined in the options file.
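Because a literal tab is easy to lose in an editor, one option is to generate the options file with a small script. The following Python sketch writes the sample file shown earlier; the helper name `build_mlcp_options` and the output location are assumptions made for illustration, and the quoting of the tab follows this article's convention ('<tab>').

```python
import os
import tempfile


def build_mlcp_options(action, options):
    """Build options-file text: the action first, then name/value line pairs."""
    lines = [action]
    for name, value in options:
        lines.append("-" + name)   # option name on its own line
        lines.append(value)        # option value on the following line
    return "\n".join(lines) + "\n"


options_text = build_mlcp_options("IMPORT", [
    ("host", "localhost"),
    ("port", "9000"),
    ("username", "admin"),
    ("password", "secret"),
    ("input_file_path", "/path/to/sample.csv"),
    ("input_file_type", "delimited_text"),
    ("delimiter", "'\t'"),         # a real tab character between single quotes
])

# Save as UTF-8, as recommended above. The temporary directory is only for the demo.
path = os.path.join(tempfile.mkdtemp(), "sample.options")
with open(path, "w", encoding="utf-8", newline="\n") as handle:
    handle.write(options_text)
print(path)
```

The resulting file can then be passed to mlcp with -options_file as shown above.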

Introduction:

MarkLogic Server allows you to set up an alerting application to notify users when new content is available that matches a predefined query. This can be achieved through the Alerting API with the Content Processing Framework (CPF). CPF is designed to keep state for documents, so it is easy to use CPF to keep track of when a document in a particular scope is created or updated, and then perform some action on that document. However, although alerting works for document inserts and updates, it does not occur for document deletes. You will have to create a custom CPF pipeline to catch the delete through an appropriate status transition.

Details

To achieve alerting for document deletes, you will have to write your own custom pipeline with a status transition to handle deletes. For example:

<status-transition>
   <annotation>custom delete action</annotation>
   <status>deleted</status>
   <priority>5000</priority>
   <always>true</always>
   <default-action>
       <module>/custom-delete-action.xqy</module>
   </default-action>
</status-transition>

The higher 'priority' value and 'always' = true indicate that the custom pipeline takes precedence over the default Status Change Handling pipeline when handling document deletes. In the action module, you can then write your custom code for alerting.

Note: By default, when a document is deleted, the on-delete pre-commit trigger is fired and it calls the action in the Status Change Handling pipeline (if enabled) for ‘delete’ status transition. It is recommended that you do not modify this pipeline as it can cause compatibility problems in future upgrades and releases of MarkLogic server.

Introduction

If you're looking at the MarkLogic Admin UI on port 8001, you may have noticed that the status page for a given database displays the last backup dateTime.

We have been asked in the past how this gets computed so the same check can be performed using your own code.

This Knowledgebase article will show examples that utilise XQuery to get this information and will explore the possibility of retrieving this using the MarkLogic ReST API.

XQuery: How does the code work?

The simple answer is in the forest status for each of the forests in the database (note these values only appear if you have created a backup already).  For the sake of these examples, let's say we have a database (called "test") which contains 12 forests (test-1 to test-12).  We can get the backup status for these using a call to our ReST API:

http://localhost:8002/manage/LATEST/forests/test-1?view=status&format=html

In the results returned, you should see something like this:

last-backup : 2016-02-12T12:30:39.916Z datetime
last-incr-backup : 2016-02-12T12:37:29.085Z datetime

In generating that status page, what the MarkLogic code does is to create an aggregate: a database doesn't contain documents in MarkLogic; it contains forests and those forests contain documents.

Continuing the example above (with a database called "test" containing 12 forests), if I request the forest status for all forests in the database "test" and select the forest names using XPath, in my case I would see:

<forest-name xmlns="http://marklogic.com/xdmp/status/forest">test-1</forest-name>
[...]
<forest-name xmlns="http://marklogic.com/xdmp/status/forest">test-12</forest-name>

Our admin UI is interrogating each forest in turn for that database and finding out the metrics for the last backup.  So to put that into context, if we select the last-backup element from each of those forest statuses, this gives us:

<last-backup xmlns="http://marklogic.com/xdmp/status/forest">2016-02-12T12:30:39.946Z</last-backup>
[...]
<last-backup xmlns="http://marklogic.com/xdmp/status/forest">2016-02-12T12:30:39.925Z</last-backup>

The code (or the status report) doesn't want values for all 12 forests; it just wants the time the last forest completed the backup (because that's the real time the backup completed), so our code is running a call to fn:max over those values. This gives us the max value (as these are all xs:dateTimes, it's finding the most recent date), which in the case of this example is:

2016-02-12T12:30:39.993Z

The same is true for the last incremental backup (note that all we're changing here is the XPath to get to the correct element, last-incr-backup). So we can get the max value for this by getting the most recent time across all forests.

This would give us 2016-02-12T12:37:29.161Z
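As a rough illustration of the aggregation described above (take the most recent per-forest value), here is a minimal Python sketch. It is not the Admin UI's code: the function names are hypothetical, and the mapping of timestamps to particular forests is invented for the example; only the timestamp values come from this article.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse a timestamp such as 2016-02-12T12:30:39.946Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def latest_backup(timestamps):
    """Return the most recent timestamp string, or None if there are none."""
    # Forests that have never been backed up report no value; skip those.
    present = [t for t in timestamps if t]
    if not present:
        return None
    return max(present, key=parse_timestamp)


# Hypothetical per-forest values (which forest reported which time is assumed).
last_backup_by_forest = {
    "test-1": "2016-02-12T12:30:39.946Z",
    "test-7": "2016-02-12T12:30:39.993Z",
    "test-12": "2016-02-12T12:30:39.925Z",
}

print(latest_backup(last_backup_by_forest.values()))
```

The same function applies unchanged to the last-incr-backup values.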

Using the ReST API

The ReST API also allows you to get this information but you'd need to jump through a few hoops to get to it:

The ReST API status for a given database would give you the names of all the forests attached to that database:

http://localhost:8002/manage/LATEST/databases/test

And from there you could GET the information for all of those forests:

http://localhost:8002/manage/LATEST/forests/test-1?view=status&format=html
[...]
http://localhost:8002/manage/LATEST/forests/test-12?view=status&format=html

Once you'd got all those values, you could do what MarkLogic's admin code does and get the max values for them, although at this stage it might make more sense to write a custom endpoint (an XQuery module) that returns this information. You could then make a call to that module to get the aggregates (e.g.):

http://[server]:[port]/[modulename.xqy]?db=test

This would return the database status for any given parameter-name that is passed in.

 

Problem:

When searching for matches using OR'ed word-queries where there are overlapping matches (i.e. one query contains the text of another query), the results of a cts:highlight query are not as desired.

 

For example:

 

let $p := <p>From the memoirs of an accomplished artist</p>
let $query :=
  cts:or-query((
    cts:word-query("accomplished artist"),
    cts:word-query("memoirs of an accomplished artist")
  ))
return cts:highlight($p, $query, <m>{$cts:text}</m>)

 

 The desired outcome of this would be:

               <p>From the <m>memoirs of an accomplished artist</m> </p>

 Whereas, the actual results are:

                <p>From the <m>memoirs of an </m> <m>accomplished artist</m></p>

 

This behavior is by design and the results are expected, because cts:highlight breaks up overlapping areas into separate matches.

The cts:highlight built-in variables $cts:queries and $cts:action help in understanding how this works, and can also be used to work around the problem.

  $cts:queries --> returns the matching queries for each of the matched texts.

  $cts:action --> can be used with xdmp:set to specify what should happen next

  • "continue" - (default) Walk the next match. If there are no more matches, return all evaluation results.
  • "skip" - Skip walking any more matches and return all evaluation results
  • "break" - Stop walking matches and return all evaluation results

For example, replacing the return statement in the original query with the following:

return
  cts:highlight($p, $query,
    <m>{$cts:text,
        <number-of-matches>{count($cts:queries)}</number-of-matches>,
        <matched-by>{$cts:queries}</matched-by>}</m>)

==>

<p>From the
  <m>memoirs of an
    <number-of-matches>1</number-of-matches>
    <matched-by>
      <cts:word-query xmlns:cts="http://marklogic.com/cts">
        <cts:text xml:lang="en">memoirs of an accomplished artist</cts:text>
      </cts:word-query>
    </matched-by>
  </m>
  <m>accomplished artist
    <number-of-matches>2</number-of-matches>
    <matched-by>
      <cts:word-query xmlns:cts="http://marklogic.com/cts">
        <cts:text xml:lang="en">memoirs of an accomplished artist</cts:text>
      </cts:word-query>
      <cts:word-query xmlns:cts="http://marklogic.com/cts">
        <cts:text xml:lang="en">accomplished artist</cts:text>
      </cts:word-query>
    </matched-by>
  </m>
</p>

 

These results give us a better understanding of how the text is being matched. We can see that "accomplished artist" is matched by both word-queries ('accomplished artist' and 'memoirs of an accomplished artist'), whereas "memoirs of an" is matched only by the longer one; this is why cts:highlight reports two separate matches.
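The splitting of overlapping areas can be pictured with a toy model. The Python sketch below is not how MarkLogic implements cts:highlight; it simply cuts the text at every phrase boundary and records which phrases cover each piece. The function name `split_overlaps` is hypothetical, and it only considers the first occurrence of each phrase.

```python
def split_overlaps(text, phrases):
    """Split matched text at phrase boundaries; list the phrases covering each piece."""
    spans = []
    for phrase in phrases:
        start = text.find(phrase)          # first occurrence only (toy model)
        if start != -1:
            spans.append((start, start + len(phrase), phrase))

    # Every span start or end is a potential cut point.
    cuts = sorted({edge for start, end, _ in spans for edge in (start, end)})

    segments = []
    for lo, hi in zip(cuts, cuts[1:]):
        covering = [p for start, end, p in spans if start <= lo and hi <= end]
        if covering:                       # skip gaps that no phrase covers
            segments.append((text[lo:hi], covering))
    return segments


text = "From the memoirs of an accomplished artist"
phrases = ["accomplished artist", "memoirs of an accomplished artist"]

for piece, covering in split_overlaps(text, phrases):
    print(repr(piece), len(covering), covering)
```

In this model the first piece is covered by one phrase and the second by two, which mirrors the number-of-matches values in the output above.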

To work around this problem, we can insert a small piece of code: 

 

let $p := <p>From the memoirs of an accomplished artist</p>
let $query :=
  cts:or-query((
    cts:word-query("accomplished artist"),
    cts:word-query("memoirs of an accomplished artist")
  ))
return cts:highlight($p, $query,
  if (count($cts:queries) gt 1)
  then xdmp:set($cts:action, "continue")
  else
    let $matched-text := <x>{$cts:queries}</x>/cts:word-query/cts:text/data(.)
    return <m>{$matched-text}</m>
)

 

==>

 

<p>From the <m>memoirs of an accomplished artist</m></p>

 

 

Please note that this solution relies on assumptions about what's inside the or-query, but this example could be modified to handle other overlapping situations.

 

   

 




      Introduction

      In the Scalability, Availability & Failover Guide, the node communication section describes a quorum as more than 50% of the nodes in a cluster. Is it possible for a database to be available for reads and writes, even if a quorum of nodes is not available in the cluster?

      The answer is yes, there are configurations and sequences of events that can lead to forests remaining online when there are fewer than 50% of the hosts being online.

       

      Details

      If a single forest in a database is not available, the database is not accessible.  It is also true that as long as all of a database's forests are available in the cluster, the database will be available for reads and writes regardless of any quorum issues. 

      Of course, the Security database must also be available in the cluster for the cluster to function.

      Forest Availability

      In the simplest case, if a forest is not configured with either local disk failover or shared disk failover, then as long as the forest's host is online and exists in the cluster, the forest will be available regardless of any quorum issues. 

      For forests configured for local disk failover, the sequence of events is important:

      In response to a host failure that makes an "open" forest inaccessible, the forest will failover to the configured forest replica as long as a quorum exists and the configured replica forest was in the "sync replicating" state.  In this case, the configured replica forest will transition to the "open" state; the configured replica forest becomes the acting master forest and is available to the database for both reads and writes.

      Additionally, an "open" forest will not go offline in response to another host being evicted from the cluster.

      However, once a quorum is lost, forest failovers will no longer occur.
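The conditions above can be summarised in a few lines. This Python sketch is only a restatement of the rules given in this article (quorum means more than 50% of hosts; failover needs a quorum and a replica in the "sync replicating" state); the function names are hypothetical and this is not MarkLogic code.

```python
def has_quorum(online_hosts, total_hosts):
    """A quorum is strictly more than 50% of the hosts in the cluster."""
    return online_hosts * 2 > total_hosts


def can_fail_over(online_hosts, total_hosts, replica_state):
    """Failover to the replica needs a quorum and a fully synchronised replica."""
    return has_quorum(online_hosts, total_hosts) and replica_state == "sync replicating"


print(can_fail_over(2, 3, "sync replicating"))   # one of three hosts lost
print(can_fail_over(1, 3, "sync replicating"))   # quorum already lost
```

Note that an already "open" forest is not affected by these checks; they only decide whether a new failover can happen.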

      Conclusion

      Depending on how your forests are distributed in the cluster and depending on the order of host failures, it is possible that a database can remain online even when there is no longer a quorum of hosts in the cluster.

      Of course, databases with many forests spread across many hosts typically can't stay online if you lose quorum because some forest(s) will become unavailable.

      Recommendation

      Even though it is possible to have a functioning cluster with less than a quorum of hosts online, you should not architect your high availability solution to depend on it.

      Summary

      This article discusses what happens when you backup or restore your database after a local disk failover event on one of the database forests.

      Introduction

      MarkLogic Server provides high availability in the event of a data node failure. Data node failures can include operating system crashes, MarkLogic Server restarts, power failures, or persistent system failures (for example, hardware failures). With forest-level failover enabled and configured, a machine that hosts a forest can go down and the MarkLogic Server cluster automatically recovers from the outage and continues to process queries without any immediate action needed by an administrator. In MarkLogic Server, if a forest becomes unavailable, then the entire database to which this forest is attached becomes unavailable for further query operations. Without failover, such a failure requires manual intervention by an administrator to either reconfigure the forest to another host or to remove the forest from the configuration. With failover, you can configure the forest to automatically switch to a replica forest on a different host. MarkLogic Server failover provides for high availability and maintains data and transactional integrity in the event of a data node failure.

      The failover scenarios are well documented on our developer web site.

      Local Disk Failover

      You can configure a forest on another host to serve as a replica forest, which will take over when the primary (master) forest's host goes offline. Local-disk failover allows you to create one or more replica forests for each primary forest. Replica forests contain exactly the same data as the primary forest and are kept transactionally consistent. 

      It is helpful to use the following terms to refer to the forest configurations and states:

      • Configured Master is the forest which is originally configured as the primary forest.
      • Configured Replica is a forest on another host that is configured as a replica forest of the primary. 
      • Acting Master is the forest that is serving as the master forest, regardless of the configuration.
      • Acting Replica is the forest that is serving as the replica forest, regardless of the configuration.

      Database Backup when a forest is failed over

      If you attempt to take a database backup or perform a database restore when one of the forests of the database has failed over to its replica (i.e. the Configured Replica is serving as Acting Master), it may result in XDMP-FORESTNOTOPEN or XDMP-HOSTDOWN errors.

      When a database backup takes place, by default, everything associated with the database gets backed up. You can also choose to back up individual forests (only the forests selected while configuring the backup are backed up).

      A replica forest will only be backed up when the 'Include replica forests' option is enabled.  If you have not configured the backup to include replica forests, then the replica forest will not be backed up even if it is the acting master. If the Configured Master is also not available, then neither forest will be backed up. In this circumstance, you may see a message in the error logs similar to "Warning: Not backing up database test because first forest master is not available, and replica backups aren't enabled."

      Restore when a forest is failed over

      Restores will fail if executed when a forest is failed over (i.e. the Configured Replica is serving as Acting Master). In this circumstance, you may see a message in the error logs similar to "Operation failed with error message. Check server logs." or "XDMP:HOSTDOWN".

      How to detect if a forest is failed over

      In the Admin UI:

      1. Click the Forests icon in the left tree menu;
      2. Click the Summary tab;
      3. You see the configured replica in open state; (This indicates that the Configured Replica is serving as Acting Master).
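The same check can be expressed as a simple rule over forest state data. The Python sketch below is an illustration only: the dictionary layout (`name`, `configured_role`, `state`) is an assumed shape for data you have already collected, not a MarkLogic API response, and the state shown for test_P is an example value.

```python
def failed_over_forests(forests):
    """Return names of configured replicas that are currently in the open state."""
    return [
        forest["name"]
        for forest in forests
        # A configured replica in the "open" state is serving as Acting Master.
        if forest["configured_role"] == "replica" and forest["state"] == "open"
    ]


# Hypothetical snapshot, using the forest names from the log excerpt in this article.
snapshot = [
    {"name": "test_P", "configured_role": "master", "state": "sync replicating"},
    {"name": "test_R", "configured_role": "replica", "state": "open"},
]

print(failed_over_forests(snapshot))
```

An empty result would indicate that no configured replica is acting as master in the snapshot.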

      At the time of the failover event, you may see messages in the Error Log similar to:
      2013-10-03 12:49:53.873 Info: Disconnecting from domestic host rh6v-intel64-9.marklogic.com in cluster 16599165797432706248 because it has not responded for 30 seconds.
      2013-10-03 12:49:53.873 Info: Disconnected from host rh6v-intel64-9.marklogic.com
      2013-10-03 12:49:53.873 Info: Unmounted forest test_P
      2013-10-03 12:49:53.875 Info: Forest test_R assuming the role of master with new precise time 13808297938747190
      2013-10-03 12:49:53.875 Debug: Recovering undo on forest test_R
      2013-10-03 12:49:53.875 Debug: Recovered undo at endTimestamp 13807844927734200 minQueryTimestamp 0 on forest test_R

      Revert back from the failover state:

      When the configured master is the acting replica, this is considered the "failover state".  In order to revert back, you must either restart the acting master forest or restart the host on which the acting master forest is locally mounted. After restarting, the Configured Master will automatically resume its role as Acting Master if its host is online. To check the status of the forests, see the Forests Summary tab in the Admin Interface. 


      Conclusion 

      For backup and restore to work correctly, clusters configured with local disk failover must have no forests in a failed over state. If a cluster is configured with local disk failover, and if some of its forests are failed over to their local disk replicas, the conditions causing the fail over must be resolved, and the cluster must be returned to the original forest configuration before backup and restore operations may resume.

      INTRODUCTION

      From the documentation:

      Queries on a Replica database must run at a timestamp that lags the current cluster commit timestamp due to replication lag. Each forest in a Replica database maintains a special timestamp, called a Non-blocking Timestamp, that indicates the most current time at which it has complete state to answer a query. As the Replica forest receives journal frames from its Master, it acknowledges receipt of each frame and advances its nonblocking timestamp to ensure that queries on the local Replica run at an appropriate timestamp. Replication lag is the difference between the current time on the Master and the time at which the oldest unacknowledged journal frame was queued to be sent to the Replica.

      To read more:

      http://tinyurl.com/7zwq4l2

      SCENARIO

      Consider the following customer scenario:

      • The storage the database resides on at one site fails.
      • This requires the customer to run for a period of time on a single site.
      • The storage / MarkLogic server are recovered at the site where the failure occurred.
      • The customer needs to re-establish replication between the two sites

      QUESTIONS AND ANSWERS

      Q: Should we tune the lag limit to suit our application?

      A: We have found in our own performance testing that increasing the lag limit beyond the default is typically not helpful.

      When the master has a sustained rate of updates, a large lag limit causes it to run quickly ahead of the replica, then stall for an extended period of time until the replica catches up. This pattern repeats over and over and gives inconsistent performance on the master.

      A smaller lag limit causes the master to suspend updates more frequently but for shorter periods of time, resulting in more consistent perceived performance.

      Q: Is there any option to restore the replica database to a point in time from a backup of the master database & re-initiate replication from that point onwards?

      A: It's fine to restore a backup to the failed system when it comes back online and before configuring replication in the reverse direction.

      Q: Is there a limit to how old a backup of the replica database can be (e.g. can a replica be restored from months back in comparison to the master) and will it still sync back to the master without issue? And does this depend on what journal data is available?

      A: There is no limit to how old a backup can be; the system will calculate all the deltas and apply them.

      Q: Are there any documented API built-ins for any of these things?

      A: Indeed; all the replication information is available through a call to xdmp:forest-status()

      xdmp:forest-status( 
        xdmp:database-forests( 
          xdmp:database("MyDatabase"), 
          fn:true()))

      For further information:

      http://tinyurl.com/d6vbpk4

      Q: Can you also advise if the replication lag limit mentioned in section 1.2.5 and the related possibility of transactions stalling on the master database applies during the bulk replication phase?

      A: As long as the replica's forests are in "open replica" state, the replica will respond to queries at any commit timestamp it is able to support irrespective of whether replication is lagged.

      A new feature in MarkLogic 5 is an application server setting for multi-version concurrency control (by default this is set to contemporaneous - meaning it will run from the latest timestamp that any query has committed - irrespective of whether there are still transactions in-flight).

      Conversely, if nonblocking is chosen (i.e. if you create an application server to query a replica database and you set multi-version concurrency control to nonblocking), the server will choose the last timestamp where all pending transactions are known to have successfully committed.

      If you wish to evaluate a query against a replica database you can use xdmp:database-nonblocking-timestamp() to determine the most current query timestamp that will not block.

      Introduction

      Database Replication replicates fragments/documents from a source database to a target database. You may see different database sizes (even when active fragment counts are the same) between Master and Replica databases. This article provides an overview of the variables and reasons behind such observations.

      Database Replication:

      Database Replication operates at the forest level by copying journal frames from a forest in the Master database and replaying them on a corresponding forest in the foreign replica database. In other words, this means that when Journal frames are replayed in the replica database, the same group of documents in a single stand of the master database, does not necessarily reside in the same stand on the replica database - i.e. the distribution of fragments within stands are different between the master and replicas. 

      Also note that Master and Replica forests can be distributed differently across hosts in each cluster. Even when they are distributed identically (Master DB forest name to Replica DB forest name), you could still see a different number of stands between them.

      Database Size, Deleted Fragment and Merge:

      The current database size depends on a number of factors, such as the number of documents, the indexes, and the deleted fragments in each stand. The number of deleted fragments in any stand itself depends on the merge policy, the background merge process, the processing cycles available, the Linux memory configuration, memory usage at any given time, and the application usage pattern.

      Conclusion:

      The Master cluster and the Replica cluster are separate entities. Although connected, they operate independently. The Replica database on the target cluster provides data consistency. However, how the data is spread across stands, including the retention of deleted fragments, will differ between the Master and Replica clusters. Hence you may see different sizes between Master and Replica databases, even where the active fragments are the same.

      Introduction

      If your MarkLogic Server has its logging level set to "Debug", it's common to see a chain of 'Detecting' and 'Detected' messages that look like this in your ErrorLogs:

      2015-01-27 11:11:04.407 Debug: Detected indexes for database Documents: ss, fp, fcs, fds, few, fep, sln
      2015-01-27 11:11:04.407 Debug: Detecting compatibility for database Documents
      2015-01-27 11:11:04.407 Debug: Detected compatibility for database Documents

      This message will appear immediately after forests are unmounted and subsequently remounted by MarkLogic Server.

      What would cause the forests to be unmounted and remounted?

      • Heavy network activity leading to a cluster (XDQP) "Heartbeat" timeout
      • Changes made to forest configuration or indexes
      • Any incident that may cause a "Hung" message

      What are "Hung" messages?

      Whenever you see a "Hung" message it's very often indicative of a loss of connection to the IO subsystem (especially the case when forests are mounted on network attached storage rather than local disk). Hung messages are explained in a little more detail in this Knowledgebase article:
      https://help.marklogic.com/Knowledgebase/Article/View/35/0/hung-messages-in-the-errorlog

      What do the "Detected" messages mean and what can I do about them?

      Whenever you see a group of "Detecting" messages:

      2015-01-14 13:06:26.016 Debug: Detecting indexes for database XYZ

      There was an event where MarkLogic chose to (or was required to) attempt to unmount and remount forests (and the event may also be evident in your ErrorLogs).

      The detecting index message will occur soon after a remount, indicating that MarkLogic Server is examining forest data to check whether any reindexing work is required for all databases available to the node which have Forests attached:

      2015-01-14 13:06:26.687 Debug: Detected indexes for database XYZ: ss, wp, fp, fcs, fds, ewp, evp, few, fep

      The "Detected indexes" line indicates that the scan has completed and the database has been identified as having been configured with a number of indexes. For the line above, these are:

      ss    stemmed searches
      wp    word positions
      fp    fast phrase searches
      fcs   fast case sensitive searches
      fds   fast diacritic sensitive searches
      ewp   element word positions
      evp   element value positions
      few   fast element word searches
      fep   fast element phrase searches

      From this list, we are able to determine which indexes were detected.  These messages will occur after every remount if you have index detection set to automatic in the database configuration.
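When scanning many log files, it can help to expand these abbreviations automatically. The following Python sketch parses a "Detected indexes" line and maps the codes using only the table above; the function name and regular expression are assumptions made for illustration, and any code not listed in this article is returned unchanged.

```python
import re

# Abbreviations taken from the table in this article; other codes may exist.
INDEX_NAMES = {
    "ss": "stemmed searches",
    "wp": "word positions",
    "fp": "fast phrase searches",
    "fcs": "fast case sensitive searches",
    "fds": "fast diacritic sensitive searches",
    "ewp": "element word positions",
    "evp": "element value positions",
    "few": "fast element word searches",
    "fep": "fast element phrase searches",
}

DETECTED = re.compile(r"Detected indexes for database (?P<db>.+?): (?P<codes>.*)$")


def describe_detected_indexes(line):
    """Return (database, [index descriptions]) or None if the line does not match."""
    match = DETECTED.search(line)
    if match is None:
        return None
    codes = [c.strip() for c in match.group("codes").split(",") if c.strip()]
    # Unknown abbreviations are passed through as-is rather than guessed.
    return match.group("db"), [INDEX_NAMES.get(code, code) for code in codes]


line = ("2015-01-14 13:06:26.687 Debug: Detected indexes for database XYZ: "
        "ss, wp, fp, fcs, fds, ewp, evp, few, fep")
print(describe_detected_indexes(line))
```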

      Every time the forest is remounted, in addition to a recovery process (where the Journals are scanned to ensure that all transactions logged were safely committed to on-disk stands), there are a number of other tests the server will do. These are configured with three options at database level:

      • format compatibility
      • index detection
      • expunge locks

      By default, these three settings are configured with the "automatic" setting (in MarkLogic 7), so if you have logging set to "Debug" level, you'll know that these options are being worked through on remount:

      2015-01-14 13:06:26.016 Debug: Detecting indexes for database XYZ (represents the task for "automatic" index detection where the reindexer checks for configuration changes)
      2015-01-14 13:06:26.687 Debug: Detecting compatibility for database XYZ (represents the task for "automatic" format compatibility where the on-disk stand format is detected)

      These default values may change across releases of MarkLogic Server. In MarkLogic 8, expunge locks is set to none but the other two are still set to automatic.

      Can these values be changed safely and what happens if I change these?

      Unmounting / remounting times can be made much shorter by configuring these settings away from automatic, but there are some caveats involved. If you upgrade to a future release of the product, the on-disk stand format may change. For now it is still 5.0 (even in MarkLogic 8), so setting format compatibility to 5.0 should cause the "Detecting compatibility" messages to disappear and speed up remount times.

      The same is true for disabling index detection but it's important to note that changing index settings on the database will no longer cause the reindexer to perform any checks on remount; in this case you would need to enable this for changes to database index settings to be reindexed.

      Summary

      This article will provide steps to debug applications using the Alerting API that are not triggering an alert.

      Details

      1) Check that all required components are present in the database where alerting is set up: config, actions, rules.   Run the attached script 'getalertconfigs.xqy' through the Query Console and review the output.  

      2) As documented in our Search Developer's Guide, test the alert manually with alert:invoke-matching-actions(). 

      Example:

      alert:invoke-matching-actions("my-alert-config-uri", 
            <doc>hello world</doc>, <options/>)

      3) Use the rule's query to test against the database to check that the expected documents are returned by the query.

      Take the query text from the rule and run it through Query Console using a cts:search() on the database.  This will confirm whether the expected documents are a positive match.  If the documents are returned and no alert is triggered, then further debugging will be needed on the configuration or the query may need to be modified.

      Introduction 

      Division operations involving integer or long datatypes may generate XDMP-DECOVRFLW in MarkLogic 7. This is the expected behavior but it may not be obvious upon initial inspection.  

      For example, two queries with similar input values, executed in Query Console on a Linux or Mac machine running MarkLogic 7, give the following results:

      1. This query returns the correct result

      let $estimate := xs:unsignedLong("220")

      let $total := xs:unsignedLong("1600")

      return $estimate div $total * 100

      ==> 13.75

      2. This query returns the XDMP-DECOVRFLW error

      let $estimate := xs:unsignedLong("227")

      let $total := xs:unsignedLong("1661")

      return $estimate div $total * 100

      ==> ERROR : XDMP-DECOVRFLW: (err:FOAR0002)

      Details

      The following defines relevant behaviors in MarkLogic 7 and previous releases.

      • In MarkLogic 7, if all the operands involved in a div operation are integer, long, or integer sub-types, then the resulting value of the div operation is stored as xs:decimal.
      • In versions prior to MarkLogic 7, if an xs:decimal value was large enough to occupy all digits, it was implicitly cast to xs:double for further operations. Beginning with MarkLogic 7, implicit casting no longer occurs in this situation.
      • xs:decimal can accommodate 18 digits as a datatype.
      • In MarkLogic 7 on Linux and Mac, an xs:decimal can occupy all digits depending on the actual value (227 div 1661 = 0.1366646598434677905), so all 18 digits are occupied.
      • MarkLogic 7 on Windows does not perform division with full decimal precision (227 div 1661 produces 0.136664659843468); as a result, not all 18 digits are occupied.
      • MarkLogic 7 generates an overflow exception (FOAR0002) when an operation is performed on an xs:decimal that is already at full decimal precision.

      In the example above, multiplying the result by 100 gives an error on Linux/Mac, while it succeeds on Windows.

      Recommendations:

      We recommend using xs:double for all division-related operations, so that the resulting value is explicitly cast to a larger datatype.

      For example, both of the following return results:

      xs:double($estimate) div $total * 100

      $estimate div $total * xs:double(100)
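      Put together with the failing inputs from the second query, the recommendation might look like the sketch below. The variable names come from the examples above; rounding to two places is an illustrative addition, not part of the original recommendation.

```xquery
xquery version "1.0-ml";

let $estimate := xs:unsignedLong("227")
let $total := xs:unsignedLong("1661")

(: Casting one operand to xs:double makes the division and the
   multiplication evaluate as xs:double instead of xs:decimal. :)
let $percent := xs:double($estimate) div $total * 100

(: Optional: round the percentage to two places for display. :)
return fn:round-half-to-even($percent, 2)
```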


       

       

       

      Context:

      The 'maintain last modified' and 'maintain directory last modified' options for a database in the Admin UI, when turned on, add properties to every document inserted in the database. When those properties no longer need to be retained, you may need to remove the property fragments of all the documents in the database.

      Problem:

      Turning these options off for a database ensures that properties will not be created for new documents. However, existing document properties are not removed by turning these settings off.

      Solution:

      To delete existing document properties, the following query can be used:

       

      xdmp:node-delete(xdmp:document-properties("your-document-uri"))

       

      Please make sure that 'maintain last modified' and 'maintain directory last modified' options are turned off for the database, so that the property fragment does not get recreated for the document.
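      To apply the same deletion across many documents, a batched variant along the following lines is one possible approach. It is a sketch under stated assumptions: the batch size is arbitrary, the query is meant to be rerun until it reports zero, and it deliberately skips directory fragments, since directories are themselves stored as properties fragments. Note that deleting a properties fragment removes every property on that document, including any written by other features, and that scanning all properties fragments can be slow on a large database.

```xquery
xquery version "1.0-ml";
declare namespace prop = "http://marklogic.com/xdmp/property";

(: Illustrative batch size; rerun the query until it returns 0. :)
let $batch-size := 1000

(: With no arguments, xdmp:document-properties() returns the properties
   documents of every document in the database that has one. :)
let $targets :=
  (for $props in xdmp:document-properties()
   (: Skip directories, which are represented by a prop:directory property. :)
   where fn:empty($props//prop:directory)
   return $props)[1 to $batch-size]

return (
  for $props in $targets
  return xdmp:node-delete($props),
  (: Number of properties fragments deleted in this run. :)
  fn:count($targets)
)
```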

       

       

      Introduction

      This KB article lists some available tools for continuous integration and automated deployment with MarkLogic Server.

      Deployment

      Roxy is an open source utility for configuring and deploying MarkLogic applications. Using Roxy, you can define your app servers, databases, forests, groups, tasks, etc. in local configuration files. Roxy can then remotely create, update, and remove those settings from the command line.
      https://github.com/marklogic/roxy

      The MarkLogic Admin API and REST API can be used to script server configurations.
      https://docs.marklogic.com/guide/admin-api
      https://docs.marklogic.com/REST/management

      MLSound is an open source tool for deploying and bootstrapping projects into MarkLogic 8.
      https://github.com/miguelrgonzalez/mlsound

      Integration Testing

      Roxy includes a unit testing component that allows tests to be written in XQuery and run from the UI or from the command line.

      Another useful open source unit testing tool that is available for writing XQuery unit tests is ‘xray’ - https://github.com/robwhitby/xray

      Implementation Specific Tools

      There are also implementation-specific tools to help with check-in, build, and deployment:

      Java

      ml-gradle is a Gradle plugin that supports a number of tasks pertaining to deploying an application to MarkLogic Server and interacting with other features of MarkLogic via a Gradle build file.

      Node.js

      mlnodetools is a collection of command-line tools used to simplify the administration of MarkLogic Server.

      Python

      MarkLogic Python API aims to provide complete coverage of what's in the MarkLogic REST API in idiomatic Python.

      Jenkins
      Jenkins is often used with MarkLogic Server for building deployable artifacts, staging build artifacts, running automated tests, and deploying said artifacts. Jenkins has great REST endpoints that make it easy to get / put job configurations, and enable / disable jobs from scripts.

      Jenkins provides a driver to the continuous integration / continuous delivery process that can integrate with other tools. In combination with Roxy, it can be used to run a bootstrap/deploy module/unit test on code check-in.

      One pipeline example used with Jenkins is to:
      1) Pull from Git
      2) Deploy to DEV with Roxy
      3) Run xray unit testing and Sonar vulnerability scans
      4) Email a report of the success/failure
      5) Kick off job to deploy to another environment

      The most important best practice here is to make sure Jenkins runs primarily on a host other than a MarkLogic host.

       

      SUMMARY

      This article will help MarkLogic Administrators to monitor the health of their MarkLogic cluster. By studying the attached scripts, you will learn how to find out which hosts are down and which forests have failed over, enabling you to take the necessary recovery actions.

      Initial Setup

      On a separate Linux host (not a member of the cluster), download the file attachments from this article, making sure that they all reside within the same directory.

      Here is a general description of each file:

      cluster-name.conf - Example configuration file used by the scripts. Configures information for monitoring one MarkLogic cluster.

      ml-ck-for-life.sh - A very simple, low-load check that all the nodes of a cluster are up and running.

      ml-ck-for-health.sh - A more detailed check for essential cluster functionality with alerting (paging and/or emails to DBAs) if warranted. This script relies on at least one external XQuery file (mon-report-failed-over-forests.xqy) and makes use of the REST MGMT API as well as REST XQuery requests.

      mon-report-failed-over-forests.xqy - External XQuery file used by ml-ck-for-health.sh

       

      Preparing the CONF File for Use on Your Cluster

      Before running the scripts, the cluster-name.conf needs to be customized for your specific cluster. Start by changing the file name to match the name of your cluster, e.g.,

      $ mv cluster-name.conf some-other-name.conf

      Where "some-other-name" is the actual name of the cluster, or of the application that is hosted on that cluster.

      Next, you will need to customize some of the variables inside the CONF file itself. Here are the contents of the cluster-name.conf file, as downloaded:

      CLUSTER_NAME="CLUSTER-NAME"
      CLUSTER_NODES=( node1.my-company.com node2.my-company.com node3.my-company.com )
      # MarkLogic Credentials for the REST Management port - 8002
      USER_PW_MGMT=rest-manager-user:re-manager-password
      # MarkLogic Credentials for the XQuery eval port - 8000
      USER_PW_XQ=user-name:user-password
      UNIX_USER=unix-user-name
      PAGE_ADDRESSES=ml.alert.page@my-company.com
      MAIL_ADDRESSES=ml.alert.mail@my-company.com

      ---------  end of listing ---------

      For CLUSTER_NAME, provide the cluster-name listed in the cluster's /var/opt/MarkLogic/clusters.xml file.

      For CLUSTER_NODES, write in the host-names for each node in your cluster.

      For USER_PW_MGMT, provide the user-name and password for the REST MANAGEMENT user, in the format name:password.

      For USER_PW_XQ, provide the user-name and password for the user who will execute the XQuery scripts, in the format name:password.

      The UNIX_USER is a local Unix username with the correct rwx access rights for this directory.

      The PAGE_ADDRESSES and MAIL_ADDRESSES are the alert email addresses that will be notified whenever there is a failover event.

      Periodicity

      The script ml-ck-for-health.sh was created with the idea that it would be run repeatedly at a fixed interval to keep tabs on system health. For example, it can be invoked from a cron job. An interval of 5 to 120 minutes is a good candidate range. Ten minutes is a good interval if you would like to be woken up (on average) within 5 minutes of a failover event.

      Setting up SSH Passwordless Login

      Section (6), FOREST STATUS CHANGE, of the monitoring script ml-ck-for-health.sh requires ssh access to the cluster hosts, because this section greps through the MarkLogic Server ErrorLogs. To enable this part of the script to run without prompting the user, "ssh passwordless login" should be set up between the monitoring host and all the cluster hosts. There are many examples of how to do this on the internet, for example: http://www.tecmint.com/ssh-passwordless-login-using-ssh-keygen-in-5-easy-steps/ Alternatively, this monitoring section can be commented out.

      Also regarding section (6), the "grep" command is set up to search the latest 10 minutes of the ErrorLog. If this script is configured to run less often than every 10 minutes, the "grep" command line should be adapted to cover the desired period between script runs.

      Example Usage

      You are now ready to execute the failover monitoring scripts! Here is how you would execute them:


      $ ./ml-ck-for-health.sh some-other-name.conf MY-CLUSTER-NAME

      $ ./ml-ck-for-life.sh some-other-name.conf

      [where "some-other-name" and MY-CLUSTER-NAME are your actual CONF and cluster-name, as described above]

      Monitoring Multiple Clusters

      Given a monitoring machine with a directory of cluster configuration files in the style of cluster-name.conf, those configuration files can be iterated through to monitor a suite of clusters from a single monitoring machine. It should be fairly easy to build a custom shell script to iterate through the various cluster CONF files.

      Final Thoughts and Limitations

      Please be aware that the ml-ck-for-health.sh script is only partially implemented. In particular, the Replication Lag and Replication Failure sections are left as exercises for the user.

      This script is being presented as a backup, lowest common denominator monitoring solution. For a more complete solution, you should explore other options, such as Splunk or Nagios.

       

       

       

      Summary:

      After adding or removing a forest and its corresponding replica forest in a database, we have seen instances where the Rebalancer does not properly distribute the documents among the existing and newly added forests.

      In these instances, XDMP-HASHLOCKINGRETRY debug-level messages are reported repeatedly in the error logs. The messages look something like:

      2016-02-11 18:22:54.044 Debug: Retrying HTTPRequestTask::handleXDBCRequest 83 because XDMP-HASHLOCKINGRETRY: Retry hash locking. Forests config hash does not match.

      2016-02-11 18:22:54.198 Debug: Retrying ForestRebalancerTask::run P_initial_p2_01 50 because XDMP-HASHLOCKINGRETRY: Retry hash locking. Forests config hash does not match.

      Diagnosing

      Gather statistics about the rebalancer to see the number of documents being scheduled. If you run the attached script "rebalancer-preview.xqy" in Query Console on your MarkLogic Server cluster, it will produce rebalancer statistics in tabular format.

      • Note that you must first change the database name (YourDatabaseNameOnWhichNewForestsHaveBeenAdded) on the 3rd line of the XQuery script “rebalancer-preview.xqy”:

      declare variable $DATABASE as xs:string := xdmp:get-request-field("db", "YourDatabaseNameOnWhichNewForestsHaveBeenAdded");

      If you are experiencing this issue, the newly added forests will show zero in the "Total to be moved" column of the generated HTML page.

      Resolving

      Perform a cluster-wide restart in order to get past this issue. The restart is required to reload all of the configuration files across the cluster. The rebalancer will also check to see if additional rebalancing work needs to occur. The rebalancer should then work as expected, and the XDMP-HASHLOCKINGRETRY messages should no longer appear in the logs. If you run the rebalancer-preview.xqy script again, the statistics should now show the number of documents scheduled to be moved.

      You can also validate the rebalancer status from the Database Status page in the Admin UI.

      The XDMP-HASHLOCKINGRETRY rebalancer issue has been fixed in the latest MarkLogic Server releases. However, the rebalancer-preview.xqy script can still be used to help diagnose other perceived issues with the Rebalancer.

       

      Search fundamentals

       

      Difference between cts:contains and fn:contains

       1) fn:contains is a substring match, whereas cts:contains performs query matching

       2) cts:contains can therefore use general queries and stemming, whereas fn:contains cannot

       

      For example:

      Example.xml

      <test>daily running makes you fit</test>

      • fn:contains(fn:doc("Example.xml"), "ning")

        true

      • cts:contains(fn:doc("Example.xml"), "ning")

        false

      • fn:contains(fn:doc("Example.xml"), "ran")

        false

      • cts:contains(fn:doc("Example.xml"), "ran")

        true

       

       

      Note:

      The cts:contains examples check the document against a cts:word-query. Stemming reduces words down to their root, allowing for smaller term lists. With stemmed searches enabled, "ran" and "running" share the stem "run", so the query for "ran" matches; "ning" is not a word in the document, so that query does not match.

      1) Words from different languages are treated differently, and will not stem to the same root word entry from another language.

      2) Nouns will not stem to verbs, and vice versa. For example, the word "runner" will not stem to "run".
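      The comparison above can be tried without loading a document by using an in-memory node, as in the sketch below. It assumes stemming is available on your installation; the explicit "stemmed" option is added here so the result does not depend on the database's default stemming setting.

```xquery
xquery version "1.0-ml";

(: In-memory copy of Example.xml, so nothing needs to be loaded. :)
let $doc := <test>daily running makes you fit</test>

return (
  (: Substring matching: looks for the characters anywhere in the text. :)
  fn:contains($doc, "ning"),
  fn:contains($doc, "ran"),

  (: Query matching: whole words, compared by stem. :)
  cts:contains($doc, cts:word-query("ning", "stemmed")),
  cts:contains($doc, cts:word-query("ran", "stemmed"))
)
```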


      Introduction

      MarkLogic Server provides a variety of disaster recovery (DR) facilities, including full backup, incremental backup, and journal archiving, that can be combined with other MarkLogic features to create a complete disaster recovery strategy. This paper shows some examples of how these features can be combined. It is not comprehensive, nor does it reflect features offered only in the latest releases.

      Details

      This article will cover three perspectives. First, a quick overview of the metrics used by businesses to measure the quality of their Disaster Recovery strategies will be covered. Next, an overview of how to combine the features that MarkLogic offers in various categories will be given.

      More?: High Availability and Disaster Recovery features, High Availability & Disaster Recovery datasheet, Scalability, Availability, and Failover Guide

      Disaster Recovery Criteria

      In order to configure MarkLogic Server to perform well in Disaster Recovery situations, we should first define what parameters we will use to measure each possible approach. For most situations, these four measures are used: 

      Long Term Retention Policy (LTR): Long Term Retention Policy can be driven by any number of business, regulatory, and other criteria. It is included here because MarkLogic's backup files are often a key part of an LTR strategy.

      Recovery Point Objective (RPO): The requirement for how up-to-date the database has to be post-recovery with respect to its state immediately before the incident that required recovery.

      Recovery Time Objective (RTO): The requirement for the time elapsed between the incident and the recovery to the RPO.

      Cost: The storage cost, the computational resource cost, and the operations cost of the overall deployment strategy.

      Flexible Replication Features

      Flexible replication can be used to support LTR objectives but is generally not useful for Disaster Recovery.

      More? Flexible Replication Guide

      Platform Support Features

      Flash backup provides a way to leverage backup features of your deployment platform while maintaining transaction integrity. Platform specific solutions can often achieve RPO and RTO targets that would be impossible through other means.

      More? Flash Backup

      High Availability Features

      Forest replication provides recovery from host failures.

      More? Scalability, Availability, and Failover Guide

      Disaster Recovery Features

      Database Replication

      Database Replication is the process of maintaining copies of forests on databases in multiple MarkLogic Server clusters.

      More? Understanding Database Replication

      Backups

      Of all your backup options, full backups restore the quickest, but take the most time to back up and possibly the most storage space. Each full backup is a backup set in that it contains everything you need to restore to the point of the backup.

      Full backups with journal archiving allow restores to a point after the backup, but the journal archive grows in an unbounded way with the number of transactions, and replaying the journals to get to your recovery point takes time proportional to the number of transactions in the journal archive, so over time, this becomes less efficient.

      With full + incremental backups, a backup set is a full backup plus the incremental backups taken after that full backup. Incremental backups are quick to back up, but take longer to restore, and over time the backup set gets larger and larger, so it may end up consuming more backup space than a full backup alone (depending on your backup retention policy).

      Full + incremental backups with journal archiving have the same characteristics as incremental backups, except that you can roll forward from the most recent incremental. With this strategy, the journal archive doesn't grow in an unbounded way because the archive is purged when you take the next incremental backup. Note that if your RPO is between incremental backups, you must also enable a merge timestamp by setting the merge timestamp to a negative value (see below).

      More?: Administrator’s Guide to Backing Up and Restoring a Database, How does "point-in-time" recovery work with Journal Archiving?

      Forest Merge Configurations

      Forest merges recover the disk space occupied by deleted documents. A negative merge timestamp delays that permanent deletion. If we want incremental backups to contain all the fragments that were deleted since the last incremental backup then we want to set the delay to a period greater than the incremental backup period. This requires more disk space for the incremental backups and also requires additional space in the live database, but provides the most flexibility.

      Setting retain-until-backup on a given database (through the Admin UI or through an API call) has a similar effect by telling the server to keep the deleted fragments until a full backup or an incremental backup completes. Many clients choose to use both the negative merge timestamp and retain until backup options together.

      More?: admin:database-set-merge-timestamp, admin:database-set-retain-until-backup
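      As a rough illustration of the API route, the sketch below turns on retain-until-backup with the Admin API. The database name "Documents" is a hypothetical placeholder; substitute your own. The merge timestamp is not set here because its value depends on your incremental backup period.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()

(: Hypothetical database name; replace with the database being backed up. :)
let $db := xdmp:database("Documents")

(: Keep deleted fragments until the next full or incremental backup completes. :)
let $config := admin:database-set-retain-until-backup($config, $db, fn:true())

return admin:save-configuration($config)
```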

      Conclusion

      Planning to meet a Long Term Retention (LTR) policy, a Recovery Point Objective (RPO), a Recovery Time Objective (RTO), and a Cost goal is a key part of developing an overall MarkLogic deployment plan. MarkLogic offers a wealth of tools that can complement each other when they are properly coordinated. As is clear from this article, the choices are many, broad, and interrelated.

      Introduction

      In the more recent versions of MarkLogic Server, there are checks in place to prevent the loading of invalid documents (such as documents with multiple root nodes).  However, documents loaded in earlier versions of MarkLogic Server can now result in duplicate URI or duplicate document errors being reported.

      Additionally, under normal operating conditions, a document/URI is saved in a single forest. If the load process somehow gets compromised, users may see issues such as duplicate URIs (i.e. the same URI in different forests) and duplicate documents (i.e. the same document/URI in the same forest).

      Resolution

      If the XDMP-DBDUPURI (duplicate URI) error is encountered, refer to our KB article "Handling XDMP-DBDUPURI errors" for procedures to resolve.

      If you do not see XDMP-DBDUPURI errors but running fn:doc() on a document returns multiple nodes, then it could be a case of duplicate documents in the same forest. To check that the problem is actually duplicate documents, run either xdmp:describe(fn:doc(...)) or fn:count(fn:doc(...)). If these return more than one document, e.g. xdmp:describe(fn:doc("/testdoc.xml")) returns (fn:doc("/testdoc.xml"), fn:doc("/testdoc.xml")) or fn:count(fn:doc("/testdoc.xml")) returns 2, then the problem is duplicate documents in the same forest (and not duplicate URIs).
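      Both checks can be run together in Query Console, as in this small sketch; the URI is the example one used above and should be replaced with the URI under investigation.

```xquery
xquery version "1.0-ml";

(: Example URI from the text above; replace with the suspect document's URI. :)
let $uri := "/testdoc.xml"
let $docs := fn:doc($uri)

return (
  (: A count greater than 1 indicates duplicate documents for this URI. :)
  fn:count($docs),
  (: Shows each node returned for the URI. :)
  xdmp:describe($docs)
)
```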

      To fix duplicate documents, the document will need to be reloaded.

      Introduction

      This article discusses how the case sensitivity of a search term affects the search score, and therefore the final order of search results, for a secondary query that uses cts:boost-query and a weight. A case-insensitive word term is treated as the lowercase word term, so there is no difference in frequencies or scores between a case-insensitive search term (in any case) and a lowercase search term, whether the lowercase term is used with the "case-sensitive" option or with neither option. If neither "case-sensitive" nor "case-insensitive" is present, the text of the search term is used to determine case sensitivity.

      Understanding relevance score

      In MarkLogic, search results are returned in relevance order. The most relevant results come first in the result sequence and the least relevant come last.
      More details on the relevance score and its calculation are available at https://docs.marklogic.com/guide/search-dev/relevance

      One of the many ways to control the relevance score is to use a secondary query to boost it: https://docs.marklogic.com/guide/search-dev/relevance#id_30927 . This article uses examples of secondary queries that boost relevance scores to show the impact of the text case (upper, lower, or unspecified) of search terms on the relevance score and on the order of the results returned.

      A few examples to understand this scenario

      Consider a few scenarios in which the queries below use cts:boost-query and a weight to boost results containing the word "washington" in the returned results.

      Example 1: Search with lowercase search term and option for case not specified

      Query1:
      xquery version "1.0-ml";
      declare namespace html = "http://www.w3.org/1999/xhtml";

      for $hit in
        cts:search(
          fn:doc()/test,
          cts:boost-query(
            cts:element-word-query(xs:QName("test"), "George"),
            cts:element-word-query(xs:QName("test"), "washington", (), 10.0)
          )
        )
      return element hit {
        attribute score { cts:score($hit) },
        attribute fit { cts:fitness($hit) },
        attribute conf { cts:confidence($hit) },
        $hit
      }


      Results for Query1:
      <hit score="28276" fit="0.9393904" conf="0.2769644">
      <test>Washington, George... </test>
      </hit>
      ...
      ...
      <hit score="16268" fit="0.7125317" conf="0.2100787">
      <test>George washington was the first President of the United States of America...</test>
      </hit>
      ...

       

      Example 2: Search with lowercase search term and case-sensitive option

      Query2:
      xquery version "1.0-ml";
      declare namespace html = "http://www.w3.org/1999/xhtml";

      for $hit in
        cts:search(
          fn:doc()/test,
          cts:boost-query(
            cts:element-word-query(xs:QName("test"), "George"),
            cts:element-word-query(xs:QName("test"), "washington", ("case-sensitive"), 10.0)
          )
        )
      return element hit {
        attribute score { cts:score($hit) },
        attribute fit { cts:fitness($hit) },
        attribute conf { cts:confidence($hit) },
        $hit
      }


      Results for Query2:
      <hit score="28276" fit="0.9393904" conf="0.2769644">
      <test>Washington, George... </test>
      </hit>
      ...
      ...
      <hit score="16268" fit="0.7125317" conf="0.2100787">
      <test>George washington was the first President of the United States of America...</test>
      </hit>
      ...

       

      Example 3: Search with uppercase search term and the case-insensitive option (only the cts:boost-query is shown; the rest of the query is the same as in the queries above)

      Query3:

      cts:boost-query(cts:element-word-query(xs:QName("test"),"George" ),
      cts:element-word-query(xs:QName("test"),"Washington",("case-insensitive"), 10.0) )

      Results for Query3:
      <hit score="28276" fit="0.9393904" conf="0.2769644">
      <test>Washington, George... </test>
      </hit>
      ...
      ...
      <hit score="16268" fit="0.7125317" conf="0.2100787">
      <test>George washington was the first President of the United States of America...</test>
      </hit>
      ...


      Clearly, the above queries produce the same scores, with the same fitness and confidence values. This is because a case-insensitive word term is treated as the lowercase word term, so there can be no difference in the frequencies of those two terms (any-case/case-insensitive and lowercase/case-sensitive), and therefore no difference in scoring. Thus there is no difference in the scores of the results for Query3 and Query2.
      For cases where case sensitivity is not specified, the text of the search term is used to determine case sensitivity. For Query1, the text of the search term contains no uppercase letters, so it is treated as "case-insensitive".

       

      Now let us take a look at a query with an uppercase search term and the case-sensitive option.

      Example 4: Search with uppercase search term and the case-sensitive option (only the cts:boost-query is shown; the rest of the query is the same as in the queries above)

      Query4:

      cts:boost-query(cts:element-word-query(xs:QName("test"),"George" ),
      cts:element-word-query(xs:QName("test"),"Washington",("case-sensitive"), 10.0) )

      Results for Query4:
      <hit score="44893" fit="0.9172696" conf="0.3489831">
      <test>Washington, George was the first... </test>
      </hit>
      ...
      ...
      <hit score="256" fit="0.0692672" conf="0.0263533">
      <test>George washington was the first President of the United States of America...</test>
      </hit>
      ...

       

      As we can clearly see, the scores have changed for the results of Query4, and so the final order of results has changed as well.


      Conclusion:

      When using a secondary query with cts:boost-query and a weight to boost certain search results, it is important to understand the impact of the case of the search text on the result sequence. A case-insensitive word term is treated as the lowercase word term, so there can be no difference in the frequencies of any-case/case-insensitive and lowercase/case-sensitive search terms, and therefore no difference in scoring. For a search term that contains uppercase letters and uses the "case-sensitive" option, scores are boosted as expected in comparison with a "case-insensitive" search. If neither "case-sensitive" nor "case-insensitive" is present, the text of the search term is used to determine case sensitivity: if the text contains no uppercase letters, the term is treated as "case-insensitive"; if it contains uppercase letters, it is treated as "case-sensitive".
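      To compare the four combinations side by side on your own data, a sketch such as the following can be used. The helper function local:top-score is hypothetical (it is not part of any MarkLogic API) and simply returns the score of the first hit for a given boosting term and option list; the element name and terms are those used in the examples above.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: score of the top hit for a boosting term and options. :)
declare function local:top-score(
  $term as xs:string,
  $options as xs:string*
) as xs:integer?
{
  for $hit in
    cts:search(
      fn:doc()/test,
      cts:boost-query(
        cts:element-word-query(xs:QName("test"), "George"),
        cts:element-word-query(xs:QName("test"), $term, $options, 10.0)
      )
    )[1]
  return cts:score($hit)
};

(
  local:top-score("washington", ()),                  (: as in Query1 :)
  local:top-score("washington", "case-sensitive"),    (: as in Query2 :)
  local:top-score("Washington", "case-insensitive"),  (: as in Query3 :)
  local:top-score("Washington", "case-sensitive")     (: as in Query4 :)
)
```

      Based on the behavior described above, the first three values would be expected to match each other, with only the fourth differing.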

       

      Background

      MarkLogic Server includes element level security (ELS), an addition to the security model that allows you to specify security rules on specific elements within documents. Using ELS, parts of a document may be concealed from users who do not have the appropriate roles to view them. ELS can conceal the XML element (along with properties and attributes) or JSON property so that it does not appear in any searches, query plans, or indexes - unless accessed by a user with appropriate permissions.

      ELS protects XML elements or JSON properties in a document using a protected path, where the path to an element or property within the document is protected so that only roles belonging to a specific query roleset can view the contents of that element or property. You specify that an element is part of a protected path by adding the path to the Security database. You also then add the appropriate role to a query roleset, which is also added to the Security database.

      ELS uses query rolesets to determine which elements will appear in query results. If a query roleset does not exist with the associated role that has permissions on the path, the role cannot view the contents of that path.

      Notes:

      1. A user with admin privileges can access documents with protected elements by using fn:doc to retrieve documents (instead of using a query). However, to see protected elements as part of query results, even a user with admin privileges will need to have the appropriate role(s).
      2. ELS applies to both XML elements and JSON properties; so unless spelled out explicitly, 'element' refers to both XML elements and JSON properties throughout this article.

      You can read more about how to configure Element Level Security here, and can see how this all works at this Element Level Security Example.

      Node-update

      One of the commonly used document level capabilities is 'update'. Be aware, however, that document level update is too powerful to be used with ELS permissions as someone with document level update privileges could update not only a node, but also delete the whole document. Consequently, a new document-level capability - 'node-update' - has been introduced. 'node-update' offers finer control when combined with ELS through xdmp:node-replace and xdmp:node-delete functions as they can be used to update/delete only the specified nodes of a document (and not the document itself in its entirety).
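      As a minimal sketch of how 'node-update' might be used, the two statements below insert a document that grants a role the 'read' and 'node-update' capabilities, then replace a single node in it. The role name "els-editor", the URI, and the document content are hypothetical; the role must already exist, and any protected paths on the affected elements still apply.

```xquery
xquery version "1.0-ml";

(: "els-editor" is a hypothetical role that must already exist. :)
xdmp:document-insert(
  "/els/example.xml",
  <employee><name>Ann</name><salary>100</salary></employee>,
  (xdmp:permission("els-editor", "read"),
   xdmp:permission("els-editor", "node-update")))
;

xquery version "1.0-ml";

(: Intended to be run as a user holding only the "els-editor" role:
   the node is replaced, but the document as a whole is left in place. :)
xdmp:node-replace(
  fn:doc("/els/example.xml")/employee/salary,
  <salary>120</salary>)
```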

      Document-level vs Element-level security

      Unlike at the document-level:

      • 'update' and 'node-update' capabilities are equivalent at the element-level. However, at the document-level, a user who only has the 'node-update' capability on a document cannot delete that document. In contrast, the 'update' capability allows that user to delete the document
      • 'Read', 'insert' and 'update' are checked separately at the element level i.e.:
        • read operations - only permissions with 'read' capability are checked
        • node update operations - only permissions with 'node-update' (update) capability are checked
        • node insert operations - only permissions with  'insert' capability are checked

      Note: read, insert, update and node-update can all be used at the element-level i.e., they can be part of the protected path definition.

      Permissions:

      Document-level:

      1. update: A node can be updated by any user that has an 'update' capability at the document-level
      2. node-update:  A node can be updated by any user with a 'node-update' capability as long as they have sufficient privileges at the element-level

      Element-level:

      1. If a node is protected but no 'update/node-update' capabilities are explicitly granted to any user, that node can be updated by any user as long as they have 'update/node-update' capabilities at the document-level
      2. If any user is explicitly granted 'update/node-update' capabilities to that node at the element level, only that specific user is allowed to update/delete that node. Other users who are expected to have that capability must be explicitly granted that permission at the element level

      How does node-replace/node-delete work?

      When a node-replace/node-delete is called on a specific node:

      1. The user trying to update that node must have at least a 'node-update' (or 'update') capability to all the nodes up until (and including) the root node
      2. None of the descendant nodes of the node being replaced/deleted can be protected by different roles. If they are protected:
        1. 'node-delete' isn’t allowed as deleting this node would also delete the descendant node which is supposed to be protected
        2. 'node-replace' can be used to update the value (text node) of the node but replacing the node itself isn’t allowed

      Note: If a caller has the 'update' capability at the document level, there is no need to do element-level permission checks since such a caller can delete or overwrite the whole document anyway.
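
      The permission rules above can be summarized as a small decision function. The following Python sketch is only an illustrative model of the rules as described in this article, not MarkLogic's implementation: the function name, its parameters and the grants dictionary are hypothetical, and the rule about protected descendant nodes is deliberately left out.

```python
from typing import Dict, List, Set


def can_update_node(
    doc_capabilities: Set[str],
    user_roles: Set[str],
    paths_root_to_node: List[str],
    element_update_grants: Dict[str, Set[str]],
) -> bool:
    """Return True if this simplified model would allow a node update.

    paths_root_to_node: path of every node from the root down to the target.
    element_update_grants: protected path -> roles explicitly granted
        'update'/'node-update' on that path (hypothetical structure).
    """
    # Document-level 'update' can overwrite or delete the whole document,
    # so element-level checks are skipped.
    if "update" in doc_capabilities:
        return True
    # Without at least document-level 'node-update', no node can be updated.
    if "node-update" not in doc_capabilities:
        return False
    # Every node from the root down to the target must be updatable.
    for path in paths_root_to_node:
        granted_roles = element_update_grants.get(path, set())
        # No explicit grant: the document-level capability is sufficient.
        # An explicit grant restricts updates to the roles that hold it.
        if granted_roles and not (granted_roles & user_roles):
            return False
    return True


if __name__ == "__main__":
    grants = {"/order/price": {"pricing-editor"}}
    target = ["/order", "/order/price"]
    print(can_update_node({"node-update"}, {"clerk"}, target, grants))
    print(can_update_node({"node-update"}, {"pricing-editor"}, target, grants))
    print(can_update_node({"update"}, {"clerk"}, target, grants))
```

      In this model a 'clerk' with only document-level 'node-update' is refused on /order/price because another role holds an explicit grant there, while the same user with document-level 'update' is allowed without any element-level check.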

      Takeaways:

      1. 'node-update' was introduced to offer finer control with ELS, in contrast to the document level 'update'
      2. 'update' and 'node-update' permissions behave the same at element-level, but differently at the document-level
        1. At document-level, 'update' is more powerful as it gives the user the permission to delete the entire document
        2. All permissions talk to each other at document-level. In contrast, permissions are checked independently at the element-level
          1. At the document level, an update permission allows you to read, insert to and update the document
          2. At the element level, however, read, insert and update (node-update) are checked separately
            1. For read operations, only permissions with the read capability are checked
            2. For node update operations, only permissions with the node-update capability are checked
            3. For node insert operations, only permissions with the insert capability are checked (this is true even when compartments are used).
      3. Can I use ELS without document level security (DLS)?
        1. ELS cannot be used without DLS
        2. Consider DLS the outer layer of defense, whereas ELS is the inner layer - you cannot get to the inner layer without passing through the outer layer
      4. When to use DLS vs ELS?
        1. ELS offers finer control on the nodes of a document and whether to use it or not depends on your use-case. We recommend not using ELS unless it is absolutely necessary as its usage comes with serious performance implications
        2. In contrast, DLS offers better performance and works better at scale - but is not an ideal choice when you need finer control as it doesn’t allow node-level operations 
      5. How does ELS performance scale with respect to different operations?
        1. Ingestion - depends on the number of protected paths
          1. During ingestion, the server inspects every node for ELS to do a hash lookup against the names of the last steps from all protected paths
          2. For every protected path that matches the hash, the server does a full test of the node against the path - the higher the number of protected paths, the higher the performance penalty
          3. While the hash lookup is very fast, the full test is comparatively much slower - and the corresponding performance penalty increases when there are a large number of nodes that match the last steps of the protected paths
            1. Consequently, we strongly recommend avoiding the use of wildcards at the leaf-level in protected paths
            2. For example: /foo/bar/* has a huge performance penalty compared to /foo/*/bar
        2. Updates - as with ingestion, ELS performance depends on the number of protected paths
        3. Query/Search - in contrast to ELS ingestion or update, ELS query performance depends on the number of query rolesets
          1. Because ELS query performance depends on the number of query rolesets, the concept of Protected PathSet was introduced in 9.0-4
          2. A Protected PathSet allows OR relationships between permissions on multiple protected paths that cover the same element
          3. Because query performance depends on the number of relevant query rolesets, it is highly recommended to use helper functions to obtain the query rolesets of nodes configured with element-level security
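
      The ingestion cost described above (a fast lookup on the last step of each protected path, followed by a slower full test for every candidate match) can be illustrated with a toy model. This Python sketch is an assumption-laden simplification: the function names are hypothetical, a dictionary stands in for the server's hash lookup, and the real path matching is internal to MarkLogic.

```python
from collections import defaultdict
from typing import Dict, List


def index_by_last_step(protected_paths: List[str]) -> Dict[str, List[str]]:
    """Group protected paths by their last step (stand-in for the hash lookup)."""
    index: Dict[str, List[str]] = defaultdict(list)
    for path in protected_paths:
        last_step = path.rsplit("/", 1)[-1]
        index[last_step].append(path)
    return index


def count_full_tests(node_names: List[str], protected_paths: List[str]) -> int:
    """Count how many slow full path tests the ingested nodes would trigger."""
    index = index_by_last_step(protected_paths)
    # A wildcard last step matches the name of every node.
    wildcard_paths = len(index.get("*", []))
    total = 0
    for name in node_names:
        total += len(index.get(name, [])) + wildcard_paths
    return total


if __name__ == "__main__":
    # Element names of the nodes in one ingested document.
    nodes = ["foo", "bar", "a", "b", "c"]
    print(count_full_tests(nodes, ["/foo/bar/*"]))  # one full test per node
    print(count_full_tests(nodes, ["/foo/*/bar"]))  # only 'bar' nodes are tested
```

      With a leaf-level wildcard every node becomes a candidate for the full test, whereas with the wildcard in a middle step only nodes named 'bar' are tested.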

      Further Reading

      Introduction

      Some customers have reported problems when attempting to access the Configuration Manager application. In the past, this has been attributed to part of the upgrade process failing for some reason (for example: a port required by MarkLogic already being used) or, in some cases, to a default database having been removed by the customer at some previous stage.

      XDMP-ARGTYPE Error

      If you see this error when you attempt to access the Configuration Manager:

      500 Internal Server Error XDMP-ARGTYPE XDMP-ARGTYPE: (err:XPTY0004) fn:concat( "could not initialize management plugins with scope: ", $reut:PLUGIN-SCOPE, ": ", xdmp:quote($e)) -- arg1 is not of type xs:anyAtomicType?

      Resolving the error

      Ensure you have an Extensions database configured by doing the following:

      • Log into the MarkLogic Admin interface on port 8001 - http://[your-host]:8001/
      • Under the "Databases" box, ensure a database called Extensions is listed

      If it does not exist, download and run the script attached to this article (create-extensions-db.xqy).

      Summary

      Does MarkLogic provide encryption at rest?

      MarkLogic 9

      MarkLogic 9 introduces the ability to encrypt 'data at rest' - data that is on media (on disk or in the cloud), as opposed to data that is being used in a process. Encryption can be applied to newly created files, configuration files, or log files. Existing data files can be encrypted by triggering a merge or re-index of the data.

      For more information about using Encryption at Rest, see Encryption at Rest in the MarkLogic Security Guide.

      MarkLogic 8 and Earlier releases

      MarkLogic 8 does not provide support for encryption at rest for its own forests.

      Using Amazon S3 Encryption For Backups

      If you are hosting your data locally, would like to back up to S3 remotely, and your goal is that there cannot possibly exist unencrypted copies of your data outside your local environment, then you could backup locally and store the backups to S3 with AWS Client-Side encryption. MarkLogic does not support AWS Client-Side encryption, so this would need to be a solution outside MarkLogic.

      See also: MarkLogic documentation: S3 Storage.

      See also: AWS: Protecting Data Using Encryption.

      Introduction

      Here we compare XDBC servers and the Enhanced HTTP server in MarkLogic 8.

      Details

      XDBC servers are still fully supported in MarkLogic Server version 8. You can upgrade existing XDBC servers without making any changes and you can create new XDBC servers as you did in previous releases.

      The Enhanced HTTP Server is an additional feature on HTTP servers which is protocol and binary transport compatible with XCC clients, as long as you use the xcc.httpcompliant=true system property.

      The XCC protocol is actually just HTTP, but the details of how to handle body, headers, responses, etc., are "built in" to the XCC client libraries and the XDBC server. The HTTP server in MarkLogic 8 now shares the same low-level code and can dispatch XCC-like requests.

      Introduction

      This article talks about best practices for use of external proxies vs using rewriter rules in the Enhanced HTTP server.

      Details

      Whether to use external proxies or rewriter rules in the Enhanced HTTP application server is an application design tradeoff, not dissimilar to using a single HTTP application server with an XQuery rewriter or endpoint that can dynamically dispatch to different databases and modules (using eval-in). The Enhanced HTTP server does this type of dispatching much more efficiently, but the concept is similar, with the same pros and cons.

      It is mostly an application and business management issue—by sharing the same port you share the same server configuration (authentication, server settings) and the "outside world" only sees one port, so configuring port-based security on firewalls, routers, or load balancers is more difficult.

      Summary

      A forest reindex timeout error may occur when there are transactions holding update locks on documents for an extended period of time. A reindexer process is started as a result of a database index change or a major MarkLogic Server upgrade.  The reindexer process will not complete until after update locks are released.

      Example error text seen in the MarkLogic Server ErrorLog.txt file:

      XDMP-FORESTERR: Error in reindex of forest Documents: SVC-EXTIME: Time limit exceeded

      Detail

      Long running transactions can occur if MarkLogic Server is participating in a distributed transaction environment. In this case transactions are managed through a Resource Manager. Each transaction is executed in a two phase commit. In the first phase, the transaction will be prepared for a commit or a rollback. The actual commit or rollback will occur in the second phase. More details about XA transactions can be found in the Application Developer's Guide - Understanding Transactions in MarkLogic Server.

      In a situation where the Resource Manager gets disconnected between the two phases, all transactions may be left in a "prepare" state within MarkLogic Server. The Resource Manager maintains transaction information and will clean up transactions left in "prepare" state after a successful reconnect. In the rare case where this doesn't happen, all transactions left in "prepare" state will stay in the system until they are cleaned up manually. The method to manually intervene is described in the XCC Developer's Guide - Heuristically Completing a Stalled Transaction.

      In order for an XA transaction to take place, it needs to prepare the execution for the commit. If updates are being made to pre-existing documents, update locks are held against the URIs for those documents. When reindexing is occurring during this process, the reindexer will wait for these locks to be released before it can successfully reindex the new documents. Because the reindexer is unable to complete due to these pending XA transactions, the hosts in the cluster are unable to finish the reindexing task and will eventually throw a timeout error.

      Mitigation

      To avoid this kind of reindexer timeout, it is recommended that the database is checked for outstanding XA transactions in "prepare" state before starting a reindexing process. There are two ways to verify if the database has outstanding transactions in "prepare" state:

      • In the Admin UI, navigate to each forest of the database and review the status page; or
      • Run the following XQuery code (in Query Console):

        xquery version "1.0-ml"; 
        declare namespace fo = "http://marklogic.com/xdmp/status/forest";   

        for $f in xdmp:database-forests(xdmp:database()) 
        return    
          xdmp:forest-status($f)//fo:transaction-coordinator[fo:decision-state = 'prepare']

      In the case where there are transactions in the "prepare" state, a roll-back can be executed:

      • In the Admin UI, click on the "rollback" link for each transaction; or
      • Run the following XQuery code (in Query Console):

        xquery version "1.0-ml"; 
        declare namespace fo = "http://marklogic.com/xdmp/status/forest";

        for $f in xdmp:database-forests(xdmp:database()) 
        return    
          for $id in xdmp:forest-status($f)//fo:transaction-coordinator[fo:decision-state = 'prepare']/fo:transaction-id/fn:string()
          return
            xdmp:xa-complete($f, $id, fn:false(), fn:false())

      Introduction

      Query Console is an interactive web-based query development tool for writing and executing ad-hoc queries in XQuery, Server-Side JavaScript, SQL and SPARQL. Query Console enables you to quickly test code snippets, debug problems, profile queries, and run administrative XQuery scripts.  Query Console uses workspaces to assist users with organizing queries.  A user can have multiple workspaces, and each workspace can have multiple queries.

      Issue

      In MarkLogic Server v9.0-11, v10.0-3 and earlier releases, users may experience delays, lag or latency between when a key is pressed on the keyboard and when it appears in the Query Console query window.  This typically happens when there are a large number of queries in one of the user's workspaces.

      Workaround

      A workaround to improve performance is to reduce the number of queries in each workspace.  The same number of queries can be managed by increasing the number of workspaces and reducing the number of queries in each workspace.  We suggest keeping no more than 30 queries in a workspace to avoid these latency issues.  

      The MarkLogic Development team is looking to improve the performance of Query Console, but at the time of this writing, this performance issue has not yet been resolved. 

      Further Reading

      Query Console User Guide

      Introduction

      Users of Java based batch processing applications, such as CoRB, XQSync, mlcp and the hadoop connector may have seen an error message containing "Premature EOF, partial header line read". Depending on how exceptions are managed, this may cause the Java application to exit with a stacktrace or to simply output the exception (and trace) into a log and continue.

      What does it mean?

      The premature EOF exception generally occurs when the connection to an application server is lost while the XCC driver is in the process of reading a result set. This can happen in a few possible scenarios:

      • The host became unavailable due to a hardware issue, segfault or similar issue;
      • The query timeout expired (although this is much more likely to yield an XDMP-EXTIME exception with a "Time limit exceeded" message);
      • Network interruption - a possible indicator of a network reliability problem such as a misconfigured load balancer or a fault in some other network hardware.

      What does the full error message look like?

      An example:

      INFO: completed 5063408/14048060, 103 tps, 32 active threads
      Feb 14, 2013 7:04:19 AM com.marklogic.developer.SimpleLogger logException
      SEVERE: fatal error
      com.marklogic.xcc.exceptions.ServerConnectionException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
      [Session: user=admin, cb={default} [ContentSource: user=admin, cb={none} [provider: address=localhost/127.0.0.1:8223, pool=0/64]]]
      [Client: XCC/4.2-8]
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:116)
        at com.marklogic.xcc.impl.SessionImpl.submitRequest(SessionImpl.java:268)
        at com.marklogic.developer.corb.Transform.call(Unknown Source)
        at com.marklogic.developer.corb.Transform.call(Unknown Source)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
      Caused by: java.io.IOException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
        at com.marklogic.http.HttpHeaders.nextHeaderLine(HttpHeaders.java:283)
        at com.marklogic.http.HttpHeaders.parseResponseHeaders(HttpHeaders.java:248)
        at com.marklogic.http.HttpChannel.parseHeaders(HttpChannel.java:297)
        at com.marklogic.http.HttpChannel.receiveMode(HttpChannel.java:270)
        at com.marklogic.http.HttpChannel.getResponseCode(HttpChannel.java:174)
        at com.marklogic.xcc.impl.handlers.EvalRequestController.serverDialog(EvalRequestController.java:68)
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:78)
        ... 11 more
      2013-02-14 07:04:19.271 WARNING [12] (AbstractRequestController.runRequest): Cannot obtain connection: Connection refused

      Configuration / Code: things to try when you first see this message

      One possible cause of errors like this is a JVM garbage collection pause that lasts long enough to exceed the server timeout setting. If this is the case, try adding the -XX:+UseConcMarkSweepGC Java option.

      Setting the "keep-alive" value to zero for the affected XDBC application server disables socket pooling, so sockets are not re-used, and may help to prevent this condition from arising. Disabling keep-alive is not expected to have a significant negative impact on performance, although thorough testing is nevertheless advised.

      Summary

      Here we discuss various methods for sharing metering data with Support:  telemetry in MarkLogic 9 and exporting monitoring data.

      Discussion

      Telemetry

      In MarkLogic 9, enabling telemetry collects, encrypts, packages, and sends diagnostic and system-level usage information about MarkLogic clusters, including metering, with minimal impact to performance. Telemetry sends information about your MarkLogic Servers to a protected and secure location where it can be accessed by the MarkLogic Technical Support Team to facilitate troubleshooting and monitor performance.  For more information see Telemetry.

      Meters database

      If telemetry is not enabled, make sure that monitoring history is enabled and data has been collected covering the time of the incident.  See Enabling Monitoring History on a Group for more details.  

      Backup of Meters database

      A backup of the full Meters database will provide all the available raw data and is very useful, but is often very large and difficult to transfer, so an export of a defined time range is often requested.

      Exporting data

      One of the attached scripts can be used in lieu of a Meters database backup. They will provide the raw metering XML files from a defined period of time and can be reloaded into MarkLogic and used with the standard tools.

      exportMeters.xqy

      This XQuery export script needs to be executed in Query Console against the Meters database and will generate zip files stored in the defined folder for the defined period of time.

      Variables for start and end times, batch size, and output directory are set at the top of the script.

      get-raw.sh

      This bash version uses MLCP to perform a similar export, but requires an XDBC server and an MLCP installation. By default the script creates output in a subdirectory called meters-export. See the attached script for details. An example command line is:

      ./get-raw.sh localhost admin admin "2018-04-12T00:00:00" "2018-04-14T00:00:00"

      Introduction

      To avoid index bloat, MarkLogic only records positions in its indexes for words once for word-query fields. When word positions are necessary to accurately match element-word queries, they are normally used from the word-query field. When elements are excluded from the word query field, words under those elements are not indexed - so their positions are not recorded. In MarkLogic 7.0-5 and 8.0-1, a code change was included to avoid false negatives resulting from an element-word query expecting positions from words in elements descended from excluded elements. This code change was to not use positions from the word-query field for element-word searches if the word-query field has exclusions.

      Implications

      Unfortunately, this solution can sometimes result in false positives - which is captured in 7.0-5 bug #33207 and 8.0-1 bug #32686 (you can read more about both of these bugs in our Fixed Bugs Report). Consequently, a follow-up refinement was shipped in 7.0-5.1 & 8.0-2 to allow the affected queries to be fully resolvable via indexes. To take advantage of this update, three changes are required:

      1) Upgrade to 7.0-5.1 or later, or 8.0-2 or later

      2) Database index settings must be updated to tell MarkLogic Server to use positions in this scenario and therefore avoid the previously seen false positives. There are two changes that could be made. Either:

      2a. The element in the element-word query must be explicitly included in the word-query field

      ...or:

      2b. All the word-query excluded elements must be configured as phrase-around elements.

      3) After the relevant database index settings are updated and the upgrade has been applied, a reindex must be performed

      If these changes are made, positions in the word-query field should then be used, which should then ultimately result in the elimination of false positives.

      Introduction

      A "fast data directory" is configurable for each forest, and can be set to a directory built on a fast file system, such as one using SSDs. Refer to Using a mix of SSD and spinning drives. If configured, MarkLogic Server will try to direct as many writes and seeks to the Fast Data Directory (FDD) as it can. As such, it will try to put as many on-disk stands as possible onto the FDD. Frequently updated documents tend to reside in the smaller stands and thus are more likely to reside on the FDD.

      This article attempts to explain how you should account for the FDD when sizing disk space for your MarkLogic Server.

      Details

      Forest journals will be placed on the fast data directory.

      Each time an automatic merge is performed, MarkLogic Server will attempt to save the results onto the forest's fast data directory. If there is not sufficient space on the FDD, MarkLogic Server will use the forest's primary data directory. To preserve space for future small stands, MarkLogic Server is conservative in deciding whether to put the merge destination stands on the FDD, which means that even if there is enough available space, it may store the result in the forest's regular data directory. For more details, refer to the Fundamentals of Resource Consumption white paper.

      It is also important to know when the Fast Data Directory is not used. Stands created by manually triggered merges are not stored on the fast data directory, but in the forest's primary data directory. Manual merges can be executed by calling the xdmp:merge function or from within the Admin UI. Forest migration and backup restores also do not put stands in the fast data directory.

      Conclusion

      MarkLogic Server maintains some disk space in the FDD for checkpoints and journaling. However, since the Fast Data Directory is not used in some procedures, we should not count the size of the FDD when sizing the disk space needed for forest data.

      Introduction

      The Performance Considerations section of the Loading Content Into MarkLogic Server documentation states:

      "When you load content, MarkLogic Server performs updates transactionally, locking documents as needed and saving the content to disk in the journal before the transaction commits. By default, all documents are locked during an update and the journal is set to preserve committed transactions, even if the MarkLogic Server process ends unexpectedly."

      There are two types of locking which are specified at the database level:

      • Fast locking employs a hashed locking scheme (based on the URI) where each fragment URI has a designated forest, so the lock created during the insert is restricted only to that forest.
      • Setting up a database with "strict" locking will force the coordination of an update lock across all forests in the database (and across the cluster) until the insert has taken place.

      Fast locking has been the default setting for newly created MarkLogic databases since MarkLogic 5 (released October 2011).

      When should I use strict locking?

      If at any point in your code you specify the forest into which a document or fragment is inserted (using a technique commonly referred to as in-forest evaluation), setting that database's locking to "strict" is definitely the safest choice. If your code always allows the server to determine the target forest for the document/fragment, you're perfectly safe using fast locking.

      In the situation where two different people create the same document (with the same URI) and where fast locking was taking place, this would result in:

      • A transaction culminating in an insert into a given forest (as assigned by the ML node servicing the request) for the first fragment
      • An "update" transaction (in the same forest) where the first fragment is then marked as deleted
      • A new fragment takes place of the first fragment to complete the second transaction

      Subsequent merges would then remove the stand entry for the first fragment (now deleted/replaced by the subsequent transaction)

      The fast option would not create a dangerous race condition unless your application would allow two different people to insert a document with the same URI into two different forests as two separate transactions and where URI assignment is handled by your XQuery/application layer; if the code responsible for making those transactions were to inadvertently assign the same URI to two different forests in a cluster, this could cause a problem that strict locking would guard against. If your application always allows MarkLogic to assign the forest for the document, there is no danger whatsoever in keeping to the server default of "fast" locking.
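
      The difference between the two locking modes can be pictured with a toy model. The Python sketch below is not how MarkLogic assigns or locks forests internally: the SHA-256 based assignment, the function names and the forest names are all assumptions made purely for illustration. It only shows why two transactions that explicitly place the same URI in different forests do not contend for the same lock under "fast" locking, but do under "strict" locking.

```python
import hashlib
from typing import List, Optional, Set


def designated_forest(uri: str, forests: List[str]) -> str:
    """Deterministically map a URI to one forest (illustrative hash only)."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return forests[int.from_bytes(digest[:8], "big") % len(forests)]


def forests_to_lock(
    uri: str,
    forests: List[str],
    locking: str,
    explicit_forest: Optional[str] = None,
) -> Set[str]:
    """Return the forests whose URI lock an insert would take in this model."""
    if locking == "strict":
        # Strict locking coordinates the lock across every forest.
        return set(forests)
    # Fast locking only locks the forest that receives the document.
    return {explicit_forest or designated_forest(uri, forests)}


if __name__ == "__main__":
    forests = ["forest-1", "forest-2", "forest-3"]
    uri = "/orders/1.xml"
    # Server-assigned placement: both writers lock the same forest.
    print(forests_to_lock(uri, forests, "fast") == forests_to_lock(uri, forests, "fast"))
    # Application-assigned placement into different forests: no shared lock.
    a = forests_to_lock(uri, forests, "fast", "forest-1")
    b = forests_to_lock(uri, forests, "fast", "forest-2")
    print(a.isdisjoint(b))
    # Strict locking: the two writers always share a lock.
    a = forests_to_lock(uri, forests, "strict", "forest-1")
    b = forests_to_lock(uri, forests, "strict", "forest-2")
    print(a.isdisjoint(b))
```

      When the two lock sets are disjoint, neither transaction waits for the other, which is the situation in which the same URI could end up in two forests.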

      Additionally - consider what kind of failover your system is using. When using fast journaling with local disk replication, the journal disk write needs to fail on both master and replica nodes in order for data loss to occur - so there's no need for strict in this scenario. In contrast, strict journaling should be used with shared-disk failover, as data loss is possible if using fast journaling and a single node fails before the OS flushes the buffer to disk.

      Is there a performance implication in switching to strict locking?

      Fast locking will be faster than strict locking, but the size of the performance penalty depends on a number of factors: the number of forests in a given database, the number of nodes across which the database forests are spread, and the speed at which all nodes in the cluster can coordinate a transaction across the cluster (Network/IO) will all have some (potentially minimal) impact.

      If the conditions of your application suit, we recommend staying with the default of fast locking on all your databases.

      There may be reasons for using 'strict' locking - especially if you are considering loading documents using in-forest-evaluation in your code.

      Further reading

      https://docs.marklogic.com/guide/ingestion/performance

      Summary

      There are situations where the SVC-DIRREM, SVC-DIROPEN and SVC-FILRD errors occur on backups to an NFS mounted drive. This article explains how this condition can occur and describes a number of recommendations to avoid such errors.

      Under normal operating conditions, with proper mounting options for a remote drive, MarkLogic Server does not expect to report SVC-xxxx errors.  Most likely, these errors are a result of improper NFS disk mounting or other IO issues.

      We will begin by exploring methods to narrow down the server which has the disk issue and then list some things to look into in order to identify the cause.

      Error Log and Sys Log Observation

      The following errors are typical MarkLogic Error Log entries seen during an NFS Backup that indicate an IO subsystem error.   The System Log files may include similar messages.

              Error: SVC-DIRREM: Directory removal error: rmdir '/Backup/directory/path': {OS level error message}

              Error: SVC-DIROPEN: Directory open error: opendir '/Backup/directory/path': {OS level error message}

              Error: Backup of forest 'forest-name' to 'Backup path' SVC-FILRD: File read error: open '/Backup/directory/path': {OS level error message}

      These SVC- error messages include the {OS level error message} retrieved from the underlying OS platform using the generic C runtime strerror() function.  These messages are typically something like "Stale NFS file handle" or "No such file or directory".

      If only a subset of hosts in the cluster are generating these types of errors ...

      You should compare the problem host's NFS configuration with the rest of the hosts in the cluster to make sure all of the configurations are consistent.

      • Compare nfs versions (rpm -qa | grep -i nfs)
      • Compare nfs configurations (mount -l -t nfs, cat /etc/mtab, nfsstat)
      • Compare platform version (uname -mrs, lsb_release -a) 
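
      Once the mount option strings have been collected from each host (for example, copied from the output of the mount command above), comparing them by eye is error-prone. The following Python sketch shows one way to diff two option strings; the function names and the sample "problem host" options are hypothetical, and the reference string is the set of mount settings that this article recommends.

```python
from typing import Dict, Tuple

ABSENT = "<absent>"  # option not present on this host
FLAG = "<flag>"      # option present without a value, e.g. 'hard'


def parse_mount_options(options: str) -> Dict[str, str]:
    """Parse 'rw,hard,vers=3' into {'rw': FLAG, 'hard': FLAG, 'vers': '3'}."""
    parsed: Dict[str, str] = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else FLAG
    return parsed


def diff_mount_options(reference: str, candidate: str) -> Dict[str, Tuple[str, str]]:
    """Return {option: (reference value, candidate value)} for every mismatch."""
    ref = parse_mount_options(reference)
    cand = parse_mount_options(candidate)
    differences: Dict[str, Tuple[str, str]] = {}
    for key in sorted(set(ref) | set(cand)):
        left = ref.get(key, ABSENT)
        right = cand.get(key, ABSENT)
        if left != right:
            differences[key] = (left, right)
    return differences


if __name__ == "__main__":
    recommended = "rw,bg,hard,nointr,noac,tcp,vers=3,timeo=300,rsize=32768,wsize=32768,actimeo=0"
    # Hypothetical options copied from a host that reports SVC- errors.
    problem_host = "rw,bg,soft,tcp,vers=3,timeo=600,rsize=32768,wsize=32768"
    for option, (expected, actual) in diff_mount_options(recommended, problem_host).items():
        print(f"{option}: expected {expected}, found {actual}")
```

      The same function can be used to compare a healthy host against a problem host by passing the healthy host's options as the reference.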

      NFS mount options 

      MarkLogic recommends the NFS Mount settings - 'rw,bg,hard,nointr,noac,tcp,vers=3,timeo=300,rsize=32768,wsize=32768,actimeo=0'

      • Vers=3 :  Must have NFS client version v3 or above
      • TCP : NFS must be configured to use TCP instead of default UDP
      • NOAC : To improve performance, NFS clients cache file attributes. Every few seconds, an NFS client checks the server's version of each file's attributes for updates. Changes that occur on the server in those small intervals remain undetected until the client checks the server again. The noac option prevents clients from caching file attributes so that applications can more quickly detect file changes on the server.
        • In addition to preventing the client from caching file attributes, the noac option forces application writes to become synchronous so that local changes to a file become visible on the server immediately. That way, other clients can quickly detect recent writes when they check the file's attributes.
        • Using the noac option provides greater cache coherence among NFS clients accessing the same files, but it exacts a significant performance penalty. As such, judicious use of file locking is encouraged instead. The DATA AND METADATA COHERENCE section of the nfs(5) man page contains a detailed discussion of these trade-offs.
        • NOTE: The noac option is a combination of the generic option sync, and the NFS-specific option actimeo=0.
      • ACTIMEO=0 : Using actimeo sets all of acregmin, acregmax, acdirmin, and acdirmax to the same value (here, 0). If this option is not specified, the NFS client uses the default value for each of these options.
      • NOINTR : Selects whether to allow signals to interrupt file operations on this mount point. If neither option is specified (or if nointr is specified), signals do not interrupt NFS file operations. If intr is specified, system calls return EINTR if an in-progress NFS operation is interrupted by a signal.
        • Using the intr option is preferred to using the soft option because it is significantly less likely to result in data corruption.
        • The intr / nointr mount option is deprecated after kernel 2.6.25. Only SIGKILL can interrupt a pending NFS operation on these kernels, and if specified, this mount option is ignored to provide backwards compatibility with older kernels.
      • BG : If the bg option is specified, a timeout or failure causes the mount command to fork a child which continues to attempt to mount the export. The parent immediately returns with a zero exit code. This is known as a "background" mount.
      • HARD (vs soft) : Determines the recovery behavior of the NFS client after an NFS request times out. If neither option is specified (or if the hard option is specified), NFS requests are retried indefinitely. If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.
        • Note: A so-called "soft" timeout can cause silent data corruption in certain cases. As such, use the soft option only when client responsiveness is more important than data integrity. Using NFS over TCP or increasing the value of the retrans option may mitigate some of the risks of using the soft option. 
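
      When comparing hosts, it can help to check each host's configured mount options against the recommended string mechanically. The following is a minimal Python sketch, not a MarkLogic tool: the function names are hypothetical, and it assumes you pass in the option string as configured (for example from /etc/fstab). The options reported by the running kernel may be spelled differently (for example proto=tcp rather than tcp), so treat the result as a hint only.

```python
# Recommended options as quoted in this article.
RECOMMENDED = "rw,bg,hard,nointr,noac,tcp,vers=3,timeo=300,rsize=32768,wsize=32768,actimeo=0"


def parse_mount_options(options: str) -> dict:
    """Split a comma-separated mount option string into a {name: value} dict.

    Flag options such as 'hard' map to True; key=value options keep the
    value as a string.
    """
    parsed = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        parsed[name] = value if sep else True
    return parsed


def diff_mount_options(actual: str, recommended: str = RECOMMENDED) -> dict:
    """Return the recommended options that are missing or set differently.

    The result maps option name -> (recommended value, actual value or None).
    """
    want = parse_mount_options(recommended)
    have = parse_mount_options(actual)
    return {
        name: (value, have.get(name))
        for name, value in want.items()
        if have.get(name) != value
    }


if __name__ == "__main__":
    # Hypothetical option string from a problem host, for illustration only.
    print(diff_mount_options("rw,bg,soft,tcp,vers=3,timeo=300,rsize=32768,wsize=32768"))
```

      An empty result means every recommended option is present with the recommended value; extra options on the host (such as soft in the example) are not reported and still need a manual review.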

      Issue persists => Further debugging 

      If the issue persists after checking the NFS configuration and implementing the MarkLogic recommended NFS mount settings, you will need to debug the NFS connection during an issue period.    You should enable rpcdebug for NFS on the hosts showing the NFS errors, and then analyze the resulting syslogs during a period that is experiencing the issues:

              rpcdebug -m nfs -s all

       The resulting logs may give you additional information to help understand what the source of the failures is.

       

      Introduction

      It has long been possible to store binary files in MarkLogic. In the MarkLogic 5 release in 2011, binary support was enhanced to allow for even more control over binary files.

      The purpose of this Knowledgebase article is not to cover MarkLogic's binary support in depth but to demonstrate a technique for retrieving a list of URIs for binary files which are managed in a MarkLogic Database.

      Retrieving a list of binary document URIs from MarkLogic Server

      The following code will use a call to cts:uris to get back a list of all URIs pointing to binary documents for a given MarkLogic database; note that this example assumes that you have the uri lexicon enabled in your database:

      Further reading

      People often want fine-grained entitlement control in the applications they build on top of MarkLogic Server. This article discusses two options and their performance implications.

      Best Practice

      Often, we'll see people attempt an implementation using MarkLogic users and roles. While MarkLogic Server can easily handle a large number of roles in total, you'll run into scalability and performance issues if you have a large number of roles per user. Additionally, you'll want to minimize the number of updates to documents in your Security database as every update requires Security caches to be re-validated, thus incurring a performance penalty.

      Instead, for a more scalable and performant solution, you will want to build your entitlements into your documents at the application level, then query those entitlement values with element range indexes on the elements containing those entitlement values.

      Summary

      When attempting to start MarkLogic Server on older versions of Linux (unsupported platforms), a "Floating Point Exception" may prevent the server from starting.

      Example of the error text from system messages:

      kernel: MarkLogic[29472] trap divide error rip:2ae0d9eaa80f rsp:7fffd8ae7690 error:0

      Detail

      Older Linux distributions will, by default, ship older libraries.  When a software product such as MarkLogic Server is built using a newer version of gcc, it is possible that it will fail to execute correctly on older systems.  We have seen this in cases where the glibc library is out of date and lacks certain symbols that were added in newer versions. Refer to the RedHat bug that explains this issue: https://bugzilla.redhat.com/show_bug.cgi?id=482848

      The recommended solution is to upgrade to a newer version of your Linux distribution.  While you may be able to resolve the immediate issue by only upgrading the glibc library, it is not recommended.

      Introduction

      Attached to this article is an XQuery module, "appserver-status.xqy", which will generate a report on all requests currently "in-flight" across all application servers in your cluster.

      Usage

      Run this in Query Console (be sure to display results as HTML output). It will generate an HTML table showing all requests currently "in-flight" across all application servers in your cluster. For any transaction taking over 60 seconds, it provides extra detail to help understand and identify bottlenecks where specific modules (or tasks) may be having an adverse effect on the overall performance of the cluster.

      The information generated by this module can be used in conjunction with any ticket opened with the support team where assistance is required to better understand and resolve performance issues relating to specific modules. This module could also be used in a situation where DBAs want to perform routine health checks on their cluster to find and identify slow running queries.

      Introduction

      At the time of this writing (MarkLogic 9), MarkLogic Server cannot perform spherical queries, as the geospatial indexes do not support a true 3D coordinate system.  In situations where cylindrical queries are sufficient, you can create a 2D geospatial index and a separate range index on an altitude value. An "and-query" with these indexes would result in a cylindrical query.

      Example

      Consider the following sample document structure:

      Configure these 2 indexes for your content database:

      1. A Geospatial Element Pair index with the latitude localname set to ‘lat’, the longitude localname set to ‘long’ and the parent localname set to ‘location’
      2. A Range element index on the localname ‘alt’ with the int scalar type

      Assuming you have data in your content database matching above document structure, this query:

      will return all documents whose location points fall within the cylinder centered at 37.655983, -122.425525, with a radius of 1000 miles and an altitude of less than 4 miles.
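
      To make the "and" of the two constraints concrete, here is a plain Python sketch of the same cylinder test, evaluated outside MarkLogic. It is an illustration only: the function names and the Earth radius constant are assumptions, the haversine formula is a stand-in for whatever distance calculation the server uses, and results near the edge of the circle may therefore differ slightly from the server's.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def in_cylinder(lat, lon, alt,
                center=(37.655983, -122.425525),
                radius_miles=1000,
                max_alt=4):
    """True when the point is inside the 2D circle AND below the altitude cap."""
    inside_circle = great_circle_miles(lat, lon, center[0], center[1]) <= radius_miles
    below_cap = alt < max_alt
    return inside_circle and below_cap
```

      The circle test plays the role of the 2D geospatial index and the altitude comparison plays the role of the range index on ‘alt’; a document matches only when both hold.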

      Note that in MarkLogic Server 9 geospatial region match was introduced, so the above technique can be extended beyond cylinders.

      Introduction

      The MarkLogic Monitoring History dashboard (http://localhost:8002/history/) is probably the easiest way to gather monitoring history data, but almost all of the information available within the monitoring dashboard is also available over our ReST APIs:

      Application Server Status details

      Information on Application Servers can be found at https://docs.marklogic.com/REST/GET/manage/v2/servers and here's an example for getting detailed metrics - http://localhost:8002/manage/v2/servers?group-id=Default&view=metrics&format=xml

      For Application Server status information - https://docs.marklogic.com/REST/GET/manage/v2/servers@view=status and here's an example with detailed metrics http://localhost:8002/manage/v2/servers?view=status&group-id=Default&format=xml&fullrefs=true

      To access status information for a specific Application Server (for example, the TaskServer), you can get the current status by adding the name to the URI - http://localhost:8002/manage/v2/servers/TaskServer?group-id=Default&view=status&format=xml

      You can also get the configuration information for a given application server (for example: "Admin") over the ReST API - http://localhost:8002/manage/v2/servers/Admin/properties?group-id=Default&format=xml
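
      The URLs above all follow the same pattern, so they can be assembled programmatically. The following Python sketch only builds the URL strings; the helper name manage_url is hypothetical, localhost and port 8002 are taken from the examples in this article, and fetching the URLs (which requires authentication) is deliberately left out.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def manage_url(resource: str,
               name: Optional[str] = None,
               suffix: Optional[str] = None,
               params: Optional[Dict[str, str]] = None,
               host: str = "localhost",
               port: int = 8002) -> str:
    """Build a /manage/v2 URL like the ones quoted in this article.

    resource: collection such as "servers"
    name:     optional resource name, e.g. "TaskServer"
    suffix:   optional trailing path segment, e.g. "properties"
    params:   query string parameters, kept in the order given
    """
    path = "/manage/v2/" + resource
    if name:
        path += "/" + quote(name, safe="")
    if suffix:
        path += "/" + suffix
    query = urlencode(params or {})
    return f"http://{host}:{port}{path}" + ("?" + query if query else "")


if __name__ == "__main__":
    print(manage_url("servers", "TaskServer",
                     params={"group-id": "Default", "view": "status", "format": "xml"}))
```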

      Database and Forest status details

      For databases and forests, you can similarly use the endpoints for /databases or /forests:

      Database level examples include:

      Forest level examples include:

      Introduction

      When configuring database replication, it is important to note that the Connect Forests by Name field is true by default. This works great because, when new forests of the same name are later added to the Master and Replica databases, they will be automatically configured for Database Replication.

      The issue

      The problem arises when you use replica forest names that do not match the original Master forest names. In that case, you may find that failover events cause forests to get stuck in the Wait Replication state. The usual methods of failing back to the designated masters will not work - restarting the replicas will not work, and neither will shutting down cluster/removing labels/restarting cluster.

      Resolution

      In this case, the way to fix the issue is to set Connect Forests by Name to false, and then you must manually connect the Master forests on the local cluster to the Replica forests on the foreign cluster, as described in the documentation: Connecting Master and Replica Forests with Different Names.

      It is worth noting that, starting with MarkLogic 7, you are also allowed to rename the replica forests. Once you rename the replica forests to the same name as the forest name of the designated master database (e.g., the Security database should have a Security forest in both the master and replica), then they will be automatically configured for Database Replication, as expected.

      MarkLogic default Group Level Cache and Huge Pages settings

      The table below shows the default (and recommended) group level cache settings based on a few common RAM configurations for the 9.0-9.1 release of MarkLogic Server:

      Total RAM (MB) | List Cache (MB) | Compressed Tree Cache (MB) | Expanded Tree Cache (MB) | Triple Cache (MB) | Triple Value Cache (MB) | Default Huge Page Ranges
      8192 (8GB) | 1024 (1 partition) | 512 (1 partition) | 1024 (1 partition) | 512 (1 partition) | 1024 (2 partitions) | 1280 to 1994
      16384 (16GB) | 2048 (1 partition) | 1024 (2 partitions) | 2048 (1 partition) | 1024 (2 partitions) | 2048 (2 partitions) | 2560 to 3616
      24576 (24GB) | 3072 (1 partition) | 1536 (2 partitions) | 3072 (1 partition) | 1536 (2 partitions) | 3072 (4 partitions) | 3840 to 4896
      32768 (32GB) | 4096 (2 partitions) | 2048 (3 partitions) | 4096 (2 partitions) | 2048 (3 partitions) | 4096 (6 partitions) | 5120 to 6176
      49152 (48GB) | 6144 (2 partitions) | 3072 (4 partitions) | 6144 (2 partitions) | 3072 (4 partitions) | 6144 (8 partitions) | 7680 to 8736
      65536 (64GB) | 8064 (3 partitions) | 4032 (6 partitions) | 8064 (3 partitions) | 4096 (6 partitions) | 8192 (11 partitions) | 10080 to 11136
      98304 (96GB) | 12160 (4 partitions) | 6080 (8 partitions) | 12160 (4 partitions) | 6144 (8 partitions) | 12160 (16 partitions) | 15200 to 16256
      131072 (128GB) | 16384 (6 partitions) | 8192 (11 partitions) | 16384 (6 partitions) | 8192 (11 partitions) | 16384 (22 partitions) | 20480 to 21020
      147456 (144GB) | 18432 (6 partitions) | 9216 (12 partitions) | 18432 (6 partitions) | 9216 (12 partitions) | 18432 (24 partitions) | 23040 to 24096
      262144 (256GB) | 32768 (9 partitions) | 16384 (11 partitions) | 32768 (9 partitions) | 16128 (22 partitions) | 32256 (32 partitions) | 40320 to 42432

      Note that these values are safe to use for MarkLogic 7 and above.

      For all the databases that ship with MarkLogic Server, the Huge Pages ranges on this table will cover the out-of-the-box configuration. Note that adding more forests will cause the second value in the range to increase.

      From MarkLogic Server 9.0-7 and above

      In the 9.0-7 release and above (and all versions of MarkLogic 10), automatic cache sizing was introduced; this setting is usually recommended.

      Maximum group level cache settings

      Assuming a Server configured with 256GB RAM (and above), these are the maximum sizes for the three main group level caches and will utilise 180GB (184320MB) per host for the Group Level Caches:

      • Expanded Tree Cache - 73728 (72GB) (with 9 partitions of 8GB each)
      • List Cache - 73728 (72GB) (with 9 partitions of 8GB each)
      • Compressed Tree Cache - 36864 (36GB) (with 11 partitions of roughly 3GB each)

      We have found that configuring 4GB partitions for the Expanded Tree Cache and the List Cache generally works well in most cases; for this you would set the number of partitions to 18.

      For the Compressed Tree Cache, the number of partitions can be set to 22.

      Important note

      The maximum number of configurable partitions is 32.

      Each cache partition should be no more than 8192 MB.
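
      The two limits above translate into simple arithmetic. The sketch below is a hypothetical Python helper (not a MarkLogic API) that computes the smallest partition count that keeps each partition at or below a chosen size; the constants are the limits stated in this article. It gives a lower bound only, and the counts recommended elsewhere in this article for the Compressed Tree Cache are intentionally higher than this minimum.

```python
import math

MAX_PARTITIONS = 32       # limit stated in this article
MAX_PARTITION_MB = 8192   # limit stated in this article


def partitions_needed(cache_mb, target_partition_mb=MAX_PARTITION_MB):
    """Smallest partition count that keeps each partition at or below the target size."""
    if target_partition_mb > MAX_PARTITION_MB:
        raise ValueError("partition size exceeds 8192 MB")
    count = math.ceil(cache_mb / target_partition_mb)
    if count > MAX_PARTITIONS:
        raise ValueError("more than 32 partitions would be required")
    return count


if __name__ == "__main__":
    print(partitions_needed(73728))        # 8GB partitions for a 72GB cache
    print(partitions_needed(73728, 4096))  # 4GB partitions for a 72GB cache
```

      For the 72GB caches this gives 9 partitions at 8GB each and 18 partitions at 4GB each, matching the figures quoted in this article.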

      Introduction

      MarkLogic Server has a notion of groups, which are sets of similarly configured hosts within a cluster.

      Application servers (and their respective ports) are scoped to their parent group.

      Therefore, you need to make sure that the host and its exposed port to which you're trying to connect both exist in the group where the appropriate application server is defined. For example, if you attempt to connect to a host defined in a group made up of d-nodes, you'll only see application servers and ports defined in the d-nodes group. If the application server you actually want is in a different group (say, e-nodes), you'll get a connection error, instead.

      Questions

      Can I use any xdmp builtins to show which application servers are linked to particular groups?

      The code example below should help with this:

      Problem:

      The errors 'XDMP-MODNOTFOUND - Module not found' and 'XDMP-NOPROGRAM - Server unable to build program from request' may occur when the requested XQuery document does not exist or the user does not have the right permissions on the module.

      Solution:

      When either of these errors is encountered, the first step would be to check if the requested XQuery module is actually present in the modules database. Make sure the document URI matches the 'root' of the relevant app-server.

      The 'Modules' field of the app-server configuration specifies the name of the database in which this app-server locates the XQuery application code (if it is not set to 'File-system'). When it is set to a specific database, then only documents in that database whose URIs begin with the specified root directory are executable. For example, if the 'root' of the app-server is set to "/codebase/xquery/", then only documents in the database whose URIs start with "/codebase/xquery/" are executable.

      If set to 'File-system' make sure the requested module exists in the location specified in the 'root' directory of the app-server. 

      Defining a 'File-system' location is often used on single node DEV systems but is not recommended in a clustered environment. To keep the deployment of code simple, it is recommended to use a Modules database in a clustered production system.

      Once you have made sure that the module does exist, the next step is to check if the user has the right permissions to execute the module. More often than not, the error is caused by a permissions issue.

      (i) Check app-server privileges

      The 'privilege' field in the app-server configuration, when set, specifies the execute privilege required to access the server. Only users who are assigned this privilege can access the server and the application code. Absence of this privilege may cause the XDMP-NOPROGRAM error.

      Make sure the user accessing the app-server has the specified privileges. This can be checked by using sec:user-privileges() (this should be run against the Security database).

      The documentation here - http://docs.marklogic.com/guide/admin/security#id_63953 contains more detailed information about privileges.

      (ii) Check permission on the requested module

      The user trying to access the application code/modules is required to have the 'execute' permission on the module. Make sure all the xquery documents have 'read' and 'execute' permissions for the user trying to access them. This can be verified by executing the following query against your 'modules' database:

                       xdmp:document-get-permissions("/your-xqy-module")

      This returns a list of permissions on the document, with the capability that each role has, in the format below:

                    <sec:permission xmlns:sec="http://marklogic.com/xdmp/security">
                    <sec:capability>execute</sec:capability>
                    <sec:role-id>4680733917602888045</sec:role-id>
                    </sec:permission>
                    <sec:permission xmlns:sec="http://marklogic.com/xdmp/security">
                    <sec:capability>read</sec:capability>
                    <sec:role-id>4680733917602888045</sec:role-id>
                    </sec:permission>

      You can then map the role-ids to their role names as below: (this should be done against the Security database)

                    import module namespace sec="http://marklogic.com/xdmp/security" at "/MarkLogic/security.xqy";
                    sec:get-role-names((4680733917602888045))

      If you see that the module does not have execute permission for the user, the required permissions can be added as below: (http://docs.marklogic.com/xdmp:document-add-permissions)

                   xdmp:document-add-permissions("/document/uri.xqy",
                     (xdmp:permission("role-name", "read"),
                      xdmp:permission("role-name", "execute")))

       

       

           

       

       

       

      Introduction

      Recent exploits in the TLS protocol such as POODLE, FREAK, LogJam, and SLOTH have rendered TLSv1.0 and SSLv3 largely obsolete.  Additionally, standards councils such as PCI (Payment Card Industry) and NIST (National Institute of Standards & Technology) are moving to disallow the use of these protocols.

      This article will describe the MarkLogic configuration changes needed to harden a MarkLogic HTTP Application Server so that only secure versions of TLS are used and where clients attempting to connect with TLSv1.0 or earlier protocols are rejected.

      Note: Since this article was first written, MarkLogic Server has added an administrator function to disable individual SSL and TLS protocol versions. If you are still running MarkLogic version 8.0-5 or earlier, you can continue to use the solution outlined below. Users of MarkLogic 9 or later should instead use the new AppServer Set SSL Disabled Protocols function to control which SSL and TLS protocol versions are available.

      Configuration

      The TLS protocol versions accepted and the Cipher suites selected are controlled by the specification list set in the "SSL Ciphers" field on the HTTP App Server Configuration panel:

      The format of the specification list follows the OpenSSL format as described in the OpenSSL Cipher suite documentation and comprises one or more colon-separated (":") cipher strings which control which cipher suites are enabled or disabled.

      The default specification used by MarkLogic enables ALL ciphers except those that are considered to be of LOW encryption strength and places them in order of @STRENGTH:

      ALL:!LOW:@STRENGTH

      While sufficient for many needs, the default setting still allows cipher negotiations that are no longer considered secure, as well as weak signature algorithms such as MD2 and MD5. The following cipher specification string enhances security by only permitting AES and Triple DES (3DES) ciphers while at the same time disabling MD2 and MD5 signature algorithms.

      ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD2:!MD5
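
      When reviewing a long specification list, it can be easier to see it broken into its parts. The following Python sketch splits a list on the colon separator and groups the elements by prefix. It is a reading aid only: the function name is hypothetical, it handles only the forms used in this article (plain cipher strings, "!" exclusions and "@" ordering directives), and it does not ask OpenSSL which cipher suites the list actually selects.

```python
def split_cipher_spec(spec):
    """Group the colon-separated elements of an OpenSSL-style cipher list.

    Returns a dict with three lists:
      enabled  - plain cipher strings such as "ECDH+AESGCM"
      excluded - elements prefixed with "!" (prefix removed)
      ordering - elements prefixed with "@" (prefix removed)
    """
    groups = {"enabled": [], "excluded": [], "ordering": []}
    for element in spec.split(":"):
        element = element.strip()
        if not element:
            continue
        if element.startswith("!"):
            groups["excluded"].append(element[1:])
        elif element.startswith("@"):
            groups["ordering"].append(element[1:])
        else:
            groups["enabled"].append(element)
    return groups


if __name__ == "__main__":
    default_spec = "ALL:!LOW:@STRENGTH"
    print(split_cipher_spec(default_spec))
```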

      PCI DSS 3.2 & NIST SP 800-52 compliance

      At this stage, while the MarkLogic HTTP Application Server is now using stronger security, it will still permit a client to connect using TLSv1.0. To comply with PCI DSS 3.2, sites must stop using TLSv1.0 by 30th June 2018, while NIST SP 800-52 requires that sites use TLSv1.1 at a minimum, with a recommendation to use TLSv1.2 where possible.

      TLSv1.2 and browser support

      For TLSv1.2, older browsers should be upgraded to current versions.

      Making these changes may require users accessing your application to upgrade older browsers such as Firefox < 27.0 or Internet Explorer < 11.0 as these versions do not support TLSv1.2 by default.

      The MarkLogic App Server utilizes OpenSSL, which does not explicitly support enabling or disabling a specific TLS protocol version; however, by disabling all the cipher suites associated with a particular version you effectively get the same outcome.

      SSLv3, TLSv1.0 & TLSv1.1 share the same common ciphers, so adding "!SSLv3" to the cipher specification will cause all client connection attempts using any of these protocols to fail.

      ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD2:!MD5:!SSLv3

      Testing using the OpenSSL s_client utility shows that attempts to connect using TLSv1.0 fail with SSL alert 40 indicating no common cipher was available.

      openssl s_client -connect 192.168.99.100:8010 -debug -tls1
      CONNECTED(00000003)
      ..
      140735283961936:error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure:s3_pkt.c:1472:SSL alert number 40
      140735283961936:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake failure:s3_pkt.c:656:

      Connecting using TLSv1.2, by contrast, is successful.

      openssl s_client -connect 192.168.99.100:8010 -debug -tls1_2
      CONNECTED(00000003)
      ...
      ---
      New, TLSv1/SSLv3, Cipher is AES256-GCM-SHA384
      Server public key is 2048 bit
      Secure Renegotiation IS supported
      Compression: NONE
      Expansion: NONE
      No ALPN negotiated
      SSL-Session:
      Protocol : TLSv1.2
      Cipher : AES256-GCM-SHA384

      Further reading

      On MarkLogic Security Certification

      Introduction: getting more information about the bugs fixed between releases

      As a general recommendation, we encourage customers to keep the server up-to-date with patch releases in any case.

      If you would like a list of some of the published bugs that were addressed between two releases of the server (for example: 5.0-3 and 5.0-4.1), you can perform the following steps:

      - Log into the support portal at http://help.marklogic.com
      - Click on the "Fixed bugs" icon to take you to the bugtrack list
      - Select 5.0-3 in the From: dropdown box
      - Select 5.0-4.1 in the To: dropdown box
      - Click 'Show' to generate an HTML table or View PDF to export the results in a PDF document

      Step one: login

      Provide your credentials and use the form on the left-hand side to log in to access the support portal

      Log into the support portal

      Step two: select the "Fixed bugs" link from the icons on the page

      Select 'Fixed Bugs' to go to the bugtrack list

      Step three: select the release 'range' from the two dropdown lists on the Fixed Bugs page

      Use the Show button to update the page or download the list in PDF format as required

      Select the versions from the 'From' and 'To' lists to generate the report

      Introduction

      In Amazon Web Services, AMIs have unique ids based on their region. There will be many cases when you want to use multiple regions (for example: maintenance of two clusters in separate geographical regions). Below is an example of how to find the list of current AMIs.

      Log in to Amazon Web Services

      Example image showing the AWS Login Page

      Find your MarkLogic instance on Amazon AWS Marketplace

      Example image showing the MarkLogic 8 HVM in Amazon's Marketplace

      For example: https://aws.amazon.com/marketplace/pp/B00U36DS6Y

      Click continue

      Example Continue button

      View the table

      Choose the version of MarkLogic Server that you're planning to use from the version dropdown.

      Image of a table showing all AMI IDs available for this item in the AWS Marketplace

      You will see a table containing a list of all current regions and the corresponding AMI ID for our instances for each available region.

      Further reading

      Summary

      MarkLogic Server has several different features that can help manage data across multiple database instances. Those features differ from each other in several important ways - this article will focus on high-level distinctions and will provide pointers to other materials to help you decide which of these features could work best for your particular use case.

       Details

      Backup/Restore - database backup and restore operations in MarkLogic Server provide consistent database-level views of your data. Propagating data from one instance to another via backup/restore involves a MarkLogic administrator using a completed backup from the source instance as the restore archive on the destination instance. You can read more about Backup/Restore here: http://docs.marklogic.com/guide/admin/backup_restore.

      Flexible Replication - can be used to maintain copies of data on multiple MarkLogic Servers. Unlike backup/restore (which relies on taking a consistent, database level view of the data at a particular timestamp), Flexible Replication creates a copy of a document in another database and keeps that copy in sync (possibly with some time-lag/latency) with the original in the course of normal operations. You can read more about Flexible Replication here: http://docs.marklogic.com/guide/flexrep/rep_intro. Do note that:

      • Flexible Replication is asynchronous. Asynchronous Replication refers to a configuration in which the Master does not wait for confirmation that the update has been received by the Replica before sending further updates.
      • Flexible Replication does not use the same transaction boundaries on the replica as on the master. For example, 10 documents might be inserted in a single transaction on a Flexible Replication master. Those 10 documents will eventually be inserted on a Flexible Replication replica, but there is no guarantee that the replica instance will also use a single transaction to do so.

      Database Replication - is used to maintain copies of data on multiple MarkLogic Servers. Database Replication creates a copy of a document in another database and keeps that copy in sync (possibly with some time-lag/latency) with the original in the course of normal operations. You can read more about Database Replication here: http://docs.marklogic.com/guide/database-replication/dbrep_intro. Note that:

      a. Database Replication is, like Flexible Replication, asynchronous.

      b. In contrast to Flexible Replication, Database Replication operates by copying journal frames from the Master database and replaying the transactions described by those journal frames on the foreign Replica database.

      XA Transactions - MarkLogic Server can participate in distributed transactions by acting as a Resource Manager in an XA/JTA transaction. If there are multiple MarkLogic Server instances participating as XA resources in a given XA transaction, then it's possible to use that XA transaction as a synchronized means of replicating data across those multiple MarkLogic instances. You can read more about XA Transactions in MarkLogic Server here: http://docs.marklogic.com/guide/xcc/concepts#id_57048.

      Introduction

      Upgrading individual MarkLogic instances and clusters is generally very easy to do and in most cases requires very little downtime. In most cases, shutting down the MarkLogic instance on each host in turn, uninstalling the current release, installing the updated release and restarting each MarkLogic instance should be all you need to be concerned about...

      However, unanticipated problems do sometimes come to light and the purpose of this Knowledgebase article is to offer some practical advice as to the steps you can take to ensure the process goes as easily as possible - this is particularly important if you're planning an upgrade between major releases of the product.

      Prerequisites

      While the steps outlined under the process heading below offer practical advice as to what to do to ensure your data is safeguarded (by recommending that backups are taken prior to upgrading), another very useful step would be to ensure you have your current configuration files backed up.

      Each host in a MarkLogic cluster is configured using parameters which are stored in XML Documents that are available on each host. These are usually relatively small files and will zip up to a manageable size.

      If you cd to your "Data" directory (on Linux this is /var/opt/MarkLogic; on Windows this is C:\Program Files\MarkLogic\Data and on OS X this is /Users/{username}/Library/Application Support/MarkLogic), you should see several xml files (assignments, clusters, databases, groups, hosts, server).

      Whenever MarkLogic updates any of these files, it creates a backup using the same naming convention used for older ErrorLog files (_1, _2 etc). We recommend backing up all configuration files before following the steps under the next heading.
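
      One way to take that configuration backup is to zip every XML file at the top level of the Data directory. The sketch below is a hypothetical Python helper, not a MarkLogic utility; it assumes the configuration files (and their numbered backups) all carry an .xml extension, and the example paths are illustrative only. Run it with an account that can read the Data directory.

```python
import zipfile
from pathlib import Path


def backup_config_files(data_dir, archive_path):
    """Zip every top-level *.xml file in data_dir and return the names stored.

    Subdirectories (for example forest data) are not descended into.
    """
    data_dir = Path(data_dir)
    stored = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for xml_file in sorted(data_dir.glob("*.xml")):
            if xml_file.is_file():
                # Store just the file name so the archive has a flat layout.
                archive.write(xml_file, arcname=xml_file.name)
                stored.append(xml_file.name)
    return stored


if __name__ == "__main__":
    # Linux default Data directory named in this article; the archive path is an assumption.
    print(backup_config_files("/var/opt/MarkLogic", "/tmp/marklogic-config-backup.zip"))
```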

      Process

      1) Take a backup for each database in your cluster

      2) Turn reindexing off for each database in your cluster

      3) Starting with the node hosting your Security and Schemas forests, uninstall the MarkLogic maintenance release currently installed on your cluster, then install the latest maintenance release in that feature release (for example, if you're currently running version 8.0-5, you'll want to update to the latest available MarkLogic 8 maintenance release - at the time of this writing, it is 8.0-8.1).

      4) Start up the host in your cluster hosting your Security and Schemas forests, then the remaining hosts in the cluster.

      5) Access the Admin UI on the node hosting your Security and Schemas forests and accept the license agreement, either for just that host (Accept button) or for all of the hosts in the cluster (Accept for Cluster button). If you choose the Accept for Cluster button, a summary screen appears showing all of the hosts in the cluster. Click the Accept for Cluster button to confirm acceptance (all of the hosts must be started in order to accept for the cluster). If you accepted the license just for the one host in the previous step, you must go to all of the Admin Interface for all of the other hosts and accept the license for each host before each host can operate.

      6) If you're upgrading across feature releases, you may now repeat steps #3-5 until you reach the desired feature and maintenance release on your cluster (for example, if trying to upgrade to the major feature release MarkLogic 9, after installing 8.0-latest, you'll repeat steps 3-5 for version 9.0-latest).

      7) After you've finished upgrading across all the relevant feature releases, re-enable reindexing for each database in your cluster.

      For more details, please go through Section 6.4: “Upgrading a Cluster to a New Maintenance Release of MarkLogic Server” of “Scalability, Availability, and Failover” guide available here: http://developer.marklogic.com/pubs/.

      If you've got database replication in place across both a master and replica cluster, then be aware that:

      1) You do not need to break replication between the clusters

      2) You should plan to upgrade both the master cluster and replica cluster. If you upgrade just the master, connectivity between the two clusters will stop due to different XDQP versions. 

      3) If the Security database isn't replicated, then there shouldn't be anything special you need to do other than upgrade the two clusters.

      4) If the security database is replicated, do the following:

      • Upgrade the Replica cluster and run the upgrade scripts. This will update the Replica's Security database to indicate that it is current. It will also do any necessary configuration upgrades.
      • Upgrade the Master cluster and run the upgrade scripts. This will update the Master's Security database to indicate that it is current. It will also do any necessary configuration upgrades.

      For more information, see Updating Clusters Configured with Database Replication.

      Back-out Plan

      MarkLogic does not support restoring a backup made on a newer version of MarkLogic Server onto an older version of MarkLogic Server. Your Back-out plan will need to take this into consideration.

      See the section below for recommendations on how this should be handled.

      Further reading

      Backing out of your upgrade: steps to ensure you can downgrade in an emergency

      Product release notes

      The "Upgrade Support" section of the release notes.

      All known incompatibilities between releases

      The "Upgrading from previous releases" section of the documentation

      MarkLogic Support Fixed Bug List

      Introduction

      spell:suggest() and spell:suggest-detailed() aren't simply looking for character differences between the provided strings and the strings in your dictionaries - they're also factoring in differences in the resulting phonetics represented by these strings.

      Detail

      There is an undocumented option that can be passed along to increase the phonetic-distance threshold (which is 1, by default). For example, consider the following:

      xquery version "1.0-ml";

      spell:suggest-detailed(
        ('customDictionary.xml'),
        'acknowledgment',
        <options xmlns="http://marklogic.com/xdmp/spell">
          <phonetic-distance>2</phonetic-distance>
        </options>
      )

      =>

      <spell:suggestion original="acknowledgment"
          dictionary="customDictionary.xml"
          xmlns:xml="http://www.w3.org/XML/1998/namespace"
          xmlns:spell="http://marklogic.com/xdmp/spell">
        <spell:word distance="9" key-distance="2" word-distance="45"
            levenshtein-distance="1">acknowledgement</spell:word>
      </spell:suggestion>

      Note that the option "distance-threshold" corresponds to "distance" in the result, and "phonetic-distance" corresponds to "key-distance."

      Also note that increasing the phonetic-distance may cause spell:suggest() and spell:suggest-detailed() to use significantly more CPU. Metaphones are short keys, so a larger distance may match a very large fraction of the dictionary, which would then mean each of those matches would need to be checked in the distance algorithms.
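
      To make the levenshtein-distance="1" attribute in the result above concrete, here is a small Python sketch of the classic edit-distance calculation. It only illustrates that one attribute; it is not MarkLogic's implementation and says nothing about how "distance" or "key-distance" are derived.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# One inserted "e" separates the two spellings.
print(levenshtein("acknowledgment", "acknowledgement"))  # 1
```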

      Background

      A database consists of one or more forests. A forest is a collection of documents (mostly XML trees, thus the name), implemented as a physical directory on disk. Each forest holds a set of documents and all their indexes. 

      When a new document is loaded into MarkLogic Server, the server puts this document in an in-memory stand and writes the action to an on-disk journal to maintain transactional integrity in case of system failure. After enough documents are loaded, the in-memory stand will fill up and be flushed to disk, written out as an on-disk stand. As more documents are loaded, they go into a new in-memory stand. At some point this in-memory stand fills up as well, and the in-memory stand gets written as yet another new on-disk stand.

      To read a single term list, MarkLogic must read the term list data from each individual stand and unify the results. To keep the number of stands to a manageable level where that unification isn't a performance concern, MarkLogic runs merges in the background. A merge takes some of the stands on disk and creates a new singular stand out of them, coalescing and optimizing the indexes and data, as well as removing any previously deleted fragments.
      Each forest has its own in-memory stand and set of on-disk stands. Loading and indexing content is a largely parallelizable activity so splitting the loading effort across forests and potentially across machines in a cluster can help scale the ingestion work.
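
      The flush-and-merge cycle described above can be pictured with a toy Python model. This is a sketch for intuition only: the class name, the capacity counted in documents, and the choice to merge all on-disk stands at once are assumptions made for illustration (a real merge takes only some of the stands and works on fragments and indexes, not URI lists).

```python
class Forest:
    """Toy model of one forest: an in-memory stand plus on-disk stands."""

    def __init__(self, in_memory_capacity):
        self.capacity = in_memory_capacity  # hypothetical limit, in documents
        self.in_memory = []                 # URIs in the in-memory stand
        self.on_disk = []                   # on-disk stands, each a list of URIs
        self.deleted = set()                # URIs marked deleted, awaiting a merge

    def load(self, uri):
        """Add a document to the in-memory stand, flushing when it is full."""
        self.in_memory.append(uri)
        if len(self.in_memory) >= self.capacity:
            self.flush()

    def delete(self, uri):
        """Mark a document as deleted; it stays on disk until a merge."""
        self.deleted.add(uri)

    def flush(self):
        """Write the in-memory stand out as a new on-disk stand."""
        if self.in_memory:
            self.on_disk.append(self.in_memory)
            self.in_memory = []

    def merge(self):
        """Combine the on-disk stands into one, dropping deleted documents."""
        survivors, removed = [], set()
        for stand in self.on_disk:
            for uri in stand:
                if uri in self.deleted:
                    removed.add(uri)
                else:
                    survivors.append(uri)
        self.on_disk = [survivors] if survivors else []
        self.deleted -= removed
```

      Loading five documents into Forest(in_memory_capacity=2) leaves two on-disk stands and one document in memory; after a delete and a merge, a single on-disk stand remains without the deleted document.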

      Deletions and Multi-Version Concurrency Control (MVCC)

      What happens if you delete or change a document? If you delete a document, MarkLogic marks the document as deleted but does not immediately remove it from disk. The deleted document will be removed from query results based on its deletion markings, and the next merge of the stand holding the document will bypass the deleted document when writing the new stand. MarkLogic treats any changed document like a new document, and treats the old version like a deleted document.

      This approach is known in database circles as MVCC, which stands for Multi-Version Concurrency Control.
      In an MVCC system changes are tracked with a timestamp number which increments for each transaction as the database changes. Each fragment gets its own creation-time (the timestamp at which it was created) and deletion-time (the timestamp at which it was marked as deleted, starting at infinity for fragments not yet deleted).

      For a request that doesn't modify data the system gets a performance boost by skipping the need for any URI locking. The query is viewed as running at a certain timestamp, and throughout its life it sees a consistent view of the database at that timestamp, even as other (update) requests continue forward and change the data.
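
      The timestamp rule can be written down in a few lines of Python. This is a minimal sketch, assuming a fragment is visible when its creation-time is at or before the query timestamp and its deletion-time is strictly after it; the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

INFINITY = math.inf


@dataclass
class Fragment:
    uri: str
    created: float             # timestamp at which the fragment was created
    deleted: float = INFINITY  # infinity until the fragment is marked deleted


def visible(fragment, query_timestamp):
    """True if a query running at query_timestamp sees this fragment."""
    return fragment.created <= query_timestamp < fragment.deleted


# An update at timestamp 20 replaces the version created at timestamp 10.
old = Fragment("/doc.xml", created=10, deleted=20)
new = Fragment("/doc.xml", created=20)

print(visible(old, 15), visible(new, 15))  # True False
print(visible(old, 20), visible(new, 20))  # False True
```

      A query that started at timestamp 15 keeps seeing the old version for its whole life, which is why it needs no locks.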

      Updates and Deadlocks

      An update request, because it isn't read-only, has to use read/write locks to maintain system integrity while making changes. Read-locks block for write-locks; write-locks block for both read and write-locks. An update has to obtain a read-lock before reading a document and a write-lock before changing (adding, deleting, modifying) a document. Lock acquisition is ordered, first-come first-served, and locks are released automatically at the end of a request.
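
      The blocking rules in the paragraph above amount to a small compatibility check. The Python sketch below is illustrative only (the function name and string lock modes are hypothetical) and ignores ordering and lock promotion.

```python
def must_wait(requested, held_by_others):
    """Return True if a lock request on a URI has to block.

    requested       -- "read" or "write"
    held_by_others  -- lock modes other requests currently hold on the same URI
    """
    held = set(held_by_others)
    if requested == "read":
        return "write" in held   # read-locks block only for write-locks
    if requested == "write":
        return bool(held)        # write-locks block for read- and write-locks
    raise ValueError("unknown lock mode: %r" % (requested,))
```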

      In any lock-based system you have to worry about deadlocks, where two or more updates are stalled waiting on locks held by the other. In MarkLogic deadlocks are automatically detected with a background thread. When the deadlock happens on the same host in a cluster, the update farthest along (with the most locks) wins and the other update gets restarted. When it happens on different hosts, because lock count information isn't in the wire protocol, both updates start over. MarkLogic differentiates queries from updates using static analysis. Before running a request, it looks at the code to determine if it includes any calls to update functions. If so, it's an update. If not, it's a query. Even if at execution time the update doesn't actually invoke the updating function, it still runs as an update.

      For the most part, locking is not under the control of the user. The one exception is the xdmp:lock-for-update($uri) call, which requests a write-lock on a document URI without actually having to issue a write and, in fact, without the URI even having to exist.

      When a request potentially touches millions of documents (such as sorting a large data set to find the most recent items), a query request that runs lock-free will outperform an update request that needs to acquire read-locks and write-locks. In some cases you can speed up the query work by isolating the update work to its own transactional context. This technique only works if the update doesn't have a dependency on the outer query, but that turns out to be a common case. For example, let's say you want to execute a content search and record the user's search string to the database for tracking purposes. The database update doesn't need to be in the same transactional context as the search itself, and would slow things down if it were. In this case it's better to run the search in one context (read-only and lock-free) and the update in a different context. See the xdmp:eval() and xdmp:invoke() functions for documentation on how to invoke a request from within another request and manage the transactional contexts between the two.

      Document Lifecycle

      Let's track the lifecycle of a document from first load to deletion until the eventual removal from disk. A document load request acquires a write-lock for the target URI as part of the xdmp:document-load() function call. If any other request is already doing a write to the same URI, our load will block for it, and vice versa.

      At some point, when the full update request completes successfully (without any errors that would implicitly cause a rollback), the actual insertion work begins, processing the queue of update work orders. MarkLogic starts by parsing and indexing the document contents, converting the document from XML to a compressed binary fragment representation. The fragment gets added to the in-memory stand. At this point the fragment is considered a nascent fragment, a term you'll see sometimes on the administration console status pages. Being nascent means it exists in a stand but hasn't been fully committed. (On a technical level, nascent fragments have creation and deletion timestamps both set to infinity, so they can be managed by the system while not appearing in queries prematurely.) If you're doing a large transactional insert you'll accumulate a lot of nascent fragments while the documents are being processed. They stay nascent until they've been committed.

      Once the fragment is placed into the in-memory stand, the request is ready to commit. It obtains the next timestamp value, journals its intent to commit the transaction, and then makes the fragment available by setting the creation timestamp for the new fragment to the transaction's timestamp. At this point it's a durable transaction, replayable in event of server failure, and it's available to any new queries that run at this timestamp or later, as well as any updates from this point forward (even those in progress). As the request terminates, the write-lock gets released.

      Our document lives for a time in the in-memory stand, fully queryable and durable, until at some point the in-memory stand fills up and gets written to disk. Our document is now in an on-disk stand. Sometime later, based on merge algorithms, the on-disk stand will get merged with some other on-disk stands to produce a new on-disk stand. The fragment will be carried over, its tree data and indexes incorporated into the larger stand. This might happen several times.

      At some point a new request makes a change to the document, such as with an xdmp:node-replace() call. The request making the change first obtains a read-lock on the URI when it first accesses the document, then promotes the read-lock to a write-lock when executing the xdmp:node-replace() call. If another write-lock were already present on the URI from another executing update, the read-lock would have blocked until the other write-lock released. If another read-lock were already present, the lock promotion to a write-lock would have blocked. Assuming the update request finishes successfully, the work runs similar to before: parsing and indexing the document, writing it to the in-memory stand as a nascent fragment, acquiring a timestamp, journaling the work, and setting the creation timestamp to make the fragment live. Because it's an update, it has to mark the old fragment as deleted also, and does that by setting the deletion timestamp of the original fragment to the transaction timestamp. This combination effectively replaces the old fragment with the new. When the request concludes, it releases its locks. Our document is now deleted, replaced by the new version.

      The old fragment still exists on disk, of course. In fact, any query that was already in progress before the update incremented the timestamp, or any query doing time travel with an old timestamp, can still see it. Eventually the on-disk stand holding the fragment will be merged again, at which point the old fragment will be completely removed from the system. It won't be written into the new on-disk stand. That is, unless the administration "merge timestamp" was set to allow deep time travel. In that case it will live on, sticking around in case any new queries want to time travel to see old fragments.

      Summary

      The following article explains the way in-memory caches are used by MarkLogic Server and how they can be utilized to improve query execution.

       

      Detail

      MarkLogic Server provides several caches that are used to improve the performance during query execution. When a query executes for the first time, the Server will populate these caches to store termlist and data fragments in memory.

      MarkLogic Server keeps a lot of its configuration information in databases, and has a lot of caches to make it run faster, but those caches get populated the first time things are accessed. The server also uses book-keeping terms in the indexes to keep track of whether all documents have been indexed with the current settings. MarkLogic caches this information, but has to query the indexes on the first request to warm the cache.

      The in-memory cache in MarkLogic Server holds data that was recently added to the system and is still in an in-memory stand; that is, it holds data that has not yet been written to disk.

      For updates, if there is no in-memory stand on a forest when a new document is inserted, the server will create it. This stand is big enough for thousands of documents, but the cost of creating it will be seen in the time taken for the first document added to it.

       


      How will the in-memory cache help improve query execution

      When a query is executed, the in-memory data structures like range indexes and lexicons get pinned into RAM the first time they are used. The easiest way to speed things up is to "warm the caches" by running a small sample program that exercises the type-ahead prior to starting production. You can also keep the server warm by doing a non-time-critical stub update at regular intervals (every 30 seconds to 1 minute). If the server is idle, this will serve to keep the caches and in-memory stand warm. If the server is really busy, it will only add a small amount of extra work. Once this is done, the functionality will be fast for all users in all future sessions.

      Introduction

      This Knowledgebase article provides general guidelines for backups that use the journal archiving feature. It covers both the free space requirements and the expected sizes of the files written to the journal archive repository when journal archiving is enabled and active.

      The MarkLogic environment used here was an out-of-the box version 9.x with one change of adding a new directory specific to storing the archive journal backup files.

      It is assumed that the reader of this article already has a basic understanding of the role of Journal Archiving in the Backup and Restore feature of MarkLogic Server. See the references below for further details.

      How much free space is needed for the Archive Journal files in a backup?

      MarkLogic Server uses the forest size of the active forest to confirm whether the journal archive repository has enough free space to accommodate that forest, but if additional forests already exist on the same volume, then there may be an issue in the Server's "free-space" calculation as the other forests are never used in the algorithm that calculates the free space available for the backup and/or archive journal repositories. Only one forest is used in the free-space calculation.

      In other words, if multiple forests exist on the same volume, there may not be enough free space available on that specific volume due to the additional forests; especially during a high rate of ingestion. If that is the case, then it is advised to provide enough free space on that volume to accommodate the sizes of all the forests. Required Free Space(approximately) = (Number of Forests) x (Size of largest Forest).
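
      As a quick worked example of that rule of thumb, the Python sketch below applies the formula to a list of forest sizes. The function name and the sample sizes are made up for illustration; any consistent unit works.

```python
def required_free_space(forest_sizes):
    """Approximate free space needed: (number of forests) x (largest forest)."""
    if not forest_sizes:
        return 0
    return len(forest_sizes) * max(forest_sizes)


# Three forests of 40, 55 and 60 GB on one volume (made-up sizes).
print(required_free_space([40, 55, 60]))  # 180
```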

      What can we expect to see in the journal archiving repository in terms of file sizes for specific ingestion types and sizes? The next section addresses that question.

      How is the Journal Archive repository filling up?

      1 MByte of raw XML data loaded into the server (as either a new document ingestion or a document update) will result in approximately 5 to 6 MBytes of data being written to the corresponding Journal Archive files.  Additionally, adding Range Indexes will contribute to a relatively small increase in consumed space.

      Ingesting/updating RDF data results in slightly less data being written to the journal archive files.

      In conclusion, for both new document ingestion and document updates, the typical expansion ratio of Journal Archive size to Input file size is between 5 and 6, but it can be higher than that depending on the document structure and any added range indexes.
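
      For capacity planning, the ratio can be turned into a rough estimate. The Python sketch below simply multiplies the raw input size by the 5x to 6x range quoted above; the function name is hypothetical, and real results may be higher depending on document structure and range indexes.

```python
def journal_archive_estimate(raw_mb, low_ratio=5.0, high_ratio=6.0):
    """Return a (low, high) estimate of journal archive growth in MB."""
    return raw_mb * low_ratio, raw_mb * high_ratio


# Loading 200 MB of raw XML (made-up figure).
print(journal_archive_estimate(200))  # (1000.0, 1200.0)
```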

      References:

      Introduction

      Content processing applications often require multi-step processing. Each step in the process performs a particular task or set of tasks. The Content Processing Framework (CPF) in MarkLogic Server supports these types of multi-step conversion processes. During a document delete operation, the CPF action may sometimes fail with an 'XDMP-CONFLICTINGUPDATES' error, which is recorded in the document's properties, for example:

      Sample message:

      <error:format-string>XDMP-CONFLICTINGUPDATES: xdmp:document-set-property("FILE-NAME", <cpf:state xmlns:cpf="http://marklogic.com/cpf">http://marklogic.com/states/deleted</cpf:state>) -- Conflicting updates xdmp:document-set-property("FILE-NAME", /cpf:state) and xdmp:document-delete("FILE-NAME")</error:format-string>

      This error message indicates that an update statement (e.g. xdmp:document-set-property) is trying to update a document in a way that conflicts with another update (e.g. xdmp:document-delete) occurring in the same transaction.

       

      Detail

      Actions that want to delete the target URI need special handling because CPF also needs to keep track of progress in the document's properties, and a plain document delete [ xdmp:document-delete($cpf:document-uri) ] does not allow that.

      Following are ways to achieve the expected behavior and get past the XDMP-CONFLICTINGUPDATES error:

      1) Perform a "soft delete" on the document and let CPF take care of deleting it. This can be done by setting the document's processing status to "deleted" via the cpf:document-set-processing-status API function. Setting the processing status to "deleted" tells CPF to clean up the document and not update properties at the same time.

      cpf:document-set-processing-status( $uri-to-delete, "deleted" )

      Additional details can be found at: http://docs.marklogic.com/cpf:document-set-processing-status


      2) If you want to keep a record of the URI that is being deleted, you can delete its root node instead of the document. The CPF state can then still be recorded in the document's properties, even though the document content is gone.

      xdmp:node-delete(doc($uri-to-delete))

      Details at: http://docs.marklogic.com/xdmp:node-delete

      Introduction

      Sometimes, when a host is removed from a cluster in an improper manner (e.g., by some means other than the Admin UI or Admin API), the remote host can still try to communicate with its old cluster, but the cluster will recognize it as a "foreign IP" and will log a message like the one below:

      2014-12-16 00:00:20.228 Warning: XDQPServerConnection::init(10.0.80.7:7999-10.0.80.39:44247): SVC-SOCRECV: Socket receive error: wait 10.0.80.7:7999-10.0.80.39:44247: Timeout

      Explanation: 

      XDQP is the internal protocol that MarkLogic uses for communication among the hosts in a cluster, and it uses port 7999 by default. In this message, the local host 10.0.80.7 is receiving socket connections from foreign host 10.0.80.39.
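
      When there are many such messages, it can help to collect the remote addresses automatically. The Python sketch below extracts them with a regular expression; the pattern is based only on the sample line above (IPv4 addresses, "local-remote" ordering), so treat it as an assumption and adjust it to your own logs.

```python
import re

# Pattern for the endpoint pair inside XDQPServerConnection::init(...),
# derived only from the sample log line shown above.
ENDPOINTS = re.compile(
    r"XDQPServerConnection::init\("
    r"(?P<local_ip>[\d.]+):(?P<local_port>\d+)-"
    r"(?P<remote_ip>[\d.]+):(?P<remote_port>\d+)\)"
)


def remote_hosts(log_lines):
    """Return the set of remote IP addresses seen in XDQP connection messages."""
    found = set()
    for line in log_lines:
        match = ENDPOINTS.search(line)
        if match:
            found.add(match.group("remote_ip"))
    return found
```

      The resulting set can then be compared against the host addresses you expect to be part of the cluster.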

       

      Debugging Procedure, Step 1

      To find out if this message indicates a socket connection from an IP address that is not part of the cluster, the first place to look is the hosts.xml file. If the IP address is not found in hosts.xml, then it is a foreign IP. In that case, the following steps will help to identify the processes that are listening on port 7999.

       

      Debugging Procedure, Step 2

      To find out who is listening on XDQP ports, try running the following command in a shell window on each host:

            $ sudo netstat -tulpn | grep 7999

      You should only see MarkLogic as a listener:

           tcp 0 0 0.0.0.0:7999 0.0.0.0:* LISTEN 1605/MarkLogic

      If you see any other process listening on 7999, you have found your culprit. Shut down those processes and the messages will go away.
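
      If you need to run this check across many hosts, the netstat output can be screened with a short script. The Python sketch below assumes the column layout of the sample line above (protocol, queues, local address, foreign address, state, PID/program); the function name is hypothetical.

```python
def unexpected_listeners(netstat_output, port=7999, expected="MarkLogic"):
    """Return the PID/program fields of listeners on `port` that are not `expected`."""
    offenders = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        if not fields[3].endswith(":%d" % port):
            continue
        program = fields[6]
        if program.split("/", 1)[-1] != expected:
            offenders.append(program)
    return offenders
```

      An empty list means only MarkLogic was found listening on the port in the text that was passed in.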

       

      Debugging Procedure, Step 3

      If the issue persists, run tcpdump to trace packets to/from "foreign" hosts using the following command:

           tcpdump -n host {unrecognized IP}

      Shut down MarkLogic on those hosts. Also, shut down any other applications that are using port 7999.

       

      Debugging Procedure, Step 4

      If the cluster's hosts are on AWS, you may also want to check your Elastic Load Balancer ports. This may be tricky, because instances will change IP addresses if they are rebooted, so work with AWS Support to help you find the AMI or load balancer instance that is pinging your cluster.

      In the case that the "foreign host" is an elastic load balancer, be sure to remove port 7999 from its rotation/scheduler. In addition, you should set the load balancer to use port 7997 for the heartbeat functionality.

      Introduction

      Sometimes, when a cluster is under heavy load, it may show a lot of XDQP-TIMEOUT messages in the error log. Often, a subset of hosts in the cluster may become so busy that the forests they host get unmounted and remounted repeatedly. Depending on your database and group settings, the act of remounting a forest may be very time-consuming, because all hosts in the cluster are forced to do the extra work of index detection.

      Forest Remounts

      Every time a forest remounts, the error log will show a lot of messages like these:

      2012-08-27 06:50:33.146 Debug: Detecting indexes for database my-schemas
      2012-08-27 06:50:33.146 Debug: Detecting indexes for database Triggers
      2012-08-27 06:50:35.370 Debug: Detected indexes for database Last-Login: sln
      2012-08-27 06:50:35.370 Debug: Detected indexes for database Triggers: sln
      2012-08-27 06:50:35.370 Debug: Detected indexes for database Schemas: sln
      2012-08-27 06:50:35.370 Debug: Detected indexes for database Modules: sln
      2012-08-27 06:50:35.373 Debug: Detected indexes for database Security: sln
      2012-08-27 06:50:35.485 Debug: Detected indexes for database my-modules: sln
      2012-08-27 06:50:35.773 Debug: Detected indexes for database App-Services: sln
      2012-08-27 06:50:35.773 Debug: Detected indexes for database Fab: sln
      2012-08-27 06:50:35.805 Debug: Detected indexes for database Documents: ss, fp

      ... and so on ...

      This can go on for several minutes and will cost you more down time than necessary, since you already know the indexes for each database.

      Improving the situation

      Here are some suggestions for improving this situation:

      1. Browse to Admin UI -> Databases -> my-database-name
      2. Set ‘index detection’ to ‘none’
      3. Set ‘expunge locks’ to ‘none’

      Repeat these steps for all active databases.

      Now tweak the group settings to make the cluster less sensitive to an occasional busy host:

      1. Browse to Admin UI -> Groups -> E-Nodes
      2. Set ‘xdqp timeout’ to 30
      3. Set ‘host timeout’ to 90
      4. Click OK to make this change effective.

      The database-level changes tell the server to speed up cluster startup time when a server node is perceived to be offline. The group changes will cause the hosts on that group to be a little more forgiving before declaring a host to be offline, thus preventing forest unmounting when it's not really needed.

      If, after performing these changes, you find that you are still experiencing XDQP-TIMEOUT messages, the next step is to contact MarkLogic Support for assistance. You should also alert your Development team, in case there is a stray query that is causing the data nodes to gather too many results.

      Related Reading

      XML Data Query Protocol (XDQP)

      Introduction

      Under normal operations, only a single user object is created for a user-name. However, when users are migrated from another security database and the recommended checks are not performed, duplicate user-names might be created.

      Resolution

      When there are duplicate user-names in the database, you may see the following message on the Admin UI or in the error logs:

      500: Internal Server Error
      XDMP-AS: (err:XPTY0004) get-element($col, "sec:user", "sec:user-name", $user-name, "SEC-USERDNE") -- Invalid coercion: (fn:doc("http://marklogic.com/xdmp/users/*******")/sec:user, fn:doc("http://marklogic.com/xdmp/users/*******")/sec:user) as element()?

       

      To fix duplicate user-names, the extra security object that is created needs to be removed. You can delete one of the extra security objects, which should have a URI similar to:


      http://marklogic.com/xdmp/users/******* where "*******" represents the user ID.

       

      To resolve the issue, follow the steps below:

      1. Perform a backup of your Security database in case manual recovery is required.

      2. Log in to Query Console with admin credentials.

      3. Select the "Security" database as the content source.

      4. Delete the security object by executing xdmp:document-delete($uri) with $uri set to the URI of the duplicate user.

      Introduction

      For hosts that don't use a standard US locale (en_US) there are instances where some lower level calls will return data that cannot be parsed by MarkLogic Server. An example of this is shown with a host configured with a different locale when making a call to the Cluster Status page (cluster-status.xqy):

      XDMP-LEXVAL exception

      The problem

      The problem you have encountered is a known issue: MarkLogic Server uses a call to strtof() to parse the values as floats:

      http://linux.die.net/man/3/strtof

      Unfortunately, this uses a locale-specific decimal point. The issue in this environment is likely due to the Operating System using a numeric locale where the decimal point is a comma, rather than a period.
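
      To see why the decimal point matters, the Python sketch below imitates the way a strtof()-style parser consumes the longest leading number it recognizes, using a configurable decimal-point character. It is a simplified model written for illustration (no exponents, hex, or special values) and is not the code MarkLogic Server runs.

```python
import re


def parse_leading_float(text, decimal_point="."):
    """Parse the longest leading decimal number using the given decimal point.

    Returns (value, unparsed_rest), similar in spirit to strtof() and its
    end pointer.
    """
    point = re.escape(decimal_point)
    pattern = r"\s*[+-]?(\d+(%s\d*)?|%s\d+)" % (point, point)
    match = re.match(pattern, text)
    if not match:
        return 0.0, text
    number = match.group(0).strip().replace(decimal_point, ".")
    return float(number), text[match.end():]


# With a period as the decimal point the whole value is consumed.
print(parse_leading_float("0.75", "."))  # (0.75, '')
# With a comma as the decimal point parsing stops at the period.
print(parse_leading_float("0.75", ","))  # (0.0, '.75')
```

      The second call shows the kind of mismatch that arises when a value written with a period is parsed under a numeric locale that expects a comma.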

      Resolving the issue

      The workaround for this is as follows:

      1. Create a file called /etc/marklogic.conf (unless one already exists)

      2. Add the following line to /etc/marklogic.conf:

      export LC_NUMERIC=en_US.UTF-8

      After this is done, you can restart the MarkLogic process so the change is detected and try to access the cluster status again.

      Summary

      This Knowledgebase article outlines the steps required to import an existing (pre-signed) Certificate into MarkLogic Server and to configure a MarkLogic Application Server to use that certificate.

      Existing (Pre-signed) Certificate vs. Certificate Request Generated by MarkLogic

      MarkLogic will allow you to use an existing certificate or will allow you to generate a Certificate Request. The key difference between the two lies in who generates the public and private keys and the other fields in the certificate.

      For a Pre-Signed Certificate: In this instance, the keys already exist outside of MarkLogic Server, and a third-party tool would have populated the CN (Common Name) and other subject fields to generate a Certificate Request File (.csr) containing a public key.

      For a Certificate Request Generated by MarkLogic: In this instance, new keys are generated by MarkLogic Server (it does this while creating the new template), while CN and other fields are added by the MarkLogic Server Administrator (or user) through the web-based MarkLogic admin GUI during New Certificate Template creation.

      The section in MarkLogic's online documentation on Creating a Certificate Template covers the steps required to generate a certificate template from within MarkLogic Server: http://docs.marklogic.com/guide/security/SSL#id_35140

        

      Steps to Import Pre-Signed Certificate and Key into MarkLogic

      1) Create a Certificate Template 

      Create a new Certificate Template with the fields similar to your existing Pre-Signed Certificate

      For example, your current Certificate file - presigned.marklogic.com.crt

      [amistry@engrlab18-128-026 PreSignedCert]$ openssl x509 -in ML.pem -text 
      Certificate:
          Data:
              Version: 1 (0x0)
              Serial Number: 7 (0x7)
          Signature Algorithm: sha1WithRSAEncryption
              Issuer: C=US, ST=CA, L=San Carlos, O=MarkLogic Corporation, OU=Engineering, CN=MarkLogic CA
              Validity
                  Not Before: Nov 30 04:12:33 2015 GMT
                  Not After : Nov 29 04:12:33 2017 GMT
              Subject: C=US, ST=NJ, L=Princeton, O=DemoLab Corporation, OU=Engineering, CN=presigned.engrlab.marklogic.com
              Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                      Public-Key: (1024 bit)
       
       
      For the above Certificate, we will create the Custom Template shown below in the Admin GUI under Configure -> Security -> Certificate Template -> Create tab.
      We will save our new template as "DemoLab Corporation Template".
       
       
       Template.jpg

      Note - The fields above are only placeholders for the signed Certificate; MarkLogic mainly uses them to generate a Certificate Signing Request (.csr). For a Certificate request generated by a third-party tool, it does NOT matter whether the template fields exactly match the final signed Certificate.

      Once the Signed Certificate is imported, the App Server will use it, and the SSL Client will only see field values from the Signed Certificate (even if they differ from those on the Template configuration page).

      2) Create an HTTPS App Server

      Please follow Procedures for Enabling SSL on App Servers except for the "Creating Certificate Template" part as we have created the Template to match our existing pre-signed Certificate. 

      3) Verify Pre-signed Certificate and Private Key file 

      Prior to installing a pre-signed certificate and private key the following verification should be performed to ensure that both certificate and key are valid and are in the correct format. 

      * Generate and display the certificate checksum using the OpenSSL utility

      [admin@sitea ~]# openssl x509 -noout -modulus -in cert.pem | openssl md5

      (stdin)= 2ddd2ca48ad2eb4eba082f5da3fd33ab

      * Generate and display the private key checksum

      [admin@sitea ~]# openssl rsa -noout -modulus -in key.key | openssl md5

      (stdin)= 2ddd2ca48ad2eb4eba082f5da3fd33ab

      Both commands should return identical checksums. If the values do not match, or if you are prompted for additional information such as the private key password, then the certificate and private key are not valid and should be corrected before proceeding.

      Note: Proceeding to the next step without verifying the certificate and the private key could lead to the MarkLogic server being made inaccessible. 
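The comparison described above can be automated once the two command outputs have been captured. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the outputs look like the `(stdin)= <hex digest>` lines shown in this article.

```python
import re


def parse_openssl_digest(output: str) -> str:
    """Extract the hex digest from a line such as '(stdin)= 2ddd2c...'."""
    match = re.search(r"=\s*([0-9a-fA-F]+)\s*$", output.strip())
    if match is None:
        raise ValueError(f"no digest found in: {output!r}")
    return match.group(1).lower()


def certificate_matches_key(cert_output: str, key_output: str) -> bool:
    """Return True when the certificate and key modulus digests are identical."""
    return parse_openssl_digest(cert_output) == parse_openssl_digest(key_output)


if __name__ == "__main__":
    cert_line = "(stdin)= 2ddd2ca48ad2eb4eba082f5da3fd33ab"
    key_line = "(stdin)= 2ddd2ca48ad2eb4eba082f5da3fd33ab"
    print(certificate_matches_key(cert_line, key_line))
```

A mismatch (or output that contains no digest at all, for example a password prompt) should be treated as a reason to stop before installing anything.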

      4) Install Pre-signed Certificate and Key file to Certificate Template using Query Console

      Because the certificate was pre-signed, MarkLogic does not have the private key that goes with it. We will install both the pre-signed certificate and its key into MarkLogic by running the XQuery below in Query Console.

      Note: The query must be run against the Security database.

      Please change the certificate template name and the certificate/key file locations in the XQuery below to reflect the values from your environment.

      xquery version "1.0-ml";
      import module namespace pki = "http://marklogic.com/xdmp/pki" at "/MarkLogic/pki.xqy";
      import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";
      
      (: Update Template name for your environment :)
      let $templateid := pki:template-get-id(pki:get-template-by-name("TemplateName"))
      (: Path on the MarkLogic host that is readable by the MarkLogic server process (default daemon) :)
      (:   File suffix could also be .txt or other format :)
      let $path-to-cert := "/cert.pem"
      let $path-to-key := "/key.key"
      
      return
      pki:insert-host-certificate($templateid,
        xdmp:document-get($path-to-cert,
          <options xmlns="xdmp:document-get"><format>text</format></options>),
        xdmp:document-get($path-to-key,
          <options xmlns="xdmp:document-get"><format>text</format></options>)
      )
      

      The above associates our pre-signed certificate and key with the template created earlier, which is linked to the HTTPS App Server.

      Important note: pki:insert-trusted-certificates can also be used in place of pki:insert-host-certificate in the above example.

      Introduction

      This article discusses the effects of the incremental backup implementation on Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

      Details

      With MarkLogic 8 you can have multiple daily incremental backups with minimal impact on database performance.

      Incremental backups complete more quickly than full backups, which reduces the backup window. A smaller backup window enables more frequent backups, reducing the RPO of the database in case of disaster.

      However, RTO can be longer when using incremental backups compared to just full backups, because multiple backups must be restored to recover.

      There are two modes of operation when using incremental backups:

      Incremental since last full. Here, each incremental has to store all the data that has changed since the last full backup. Since a restore only has to go through a single incremental data set, the server is able to perform a faster restore.  However, each incremental data set is bigger and takes longer to complete than the previous data set because it stores all changes that were included in the previous incremental.

      Please note, when doing “Incremental since last full”:

      - Create a new incremental backup directory for each incremental backup

      - Call database-incremental-backup with incremental-dir set to the new incremental backup directory

       

      Incremental since last incremental.  In this case, a new incremental stores only changes since the last incremental, also known as delta backups. By storing only the changes since the last incremental, the incremental backup sets are smaller in size and are faster to complete.  However, a restore operation would have to go through multiple data sets.

      Please note, when doing “Incremental since last incremental”:

      - Create an incremental backup directory ONCE

      - Call database-incremental-backup with the same incremental backup directory.
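To make the RTO trade-off between the two modes concrete, the following Python sketch lists which backup sets a restore would have to read in each mode. It is a simplified model, not MarkLogic code: the mode labels and the `restore_chain` function are hypothetical names chosen for illustration.

```python
from typing import List


def restore_chain(mode: str, incrementals_since_full: int) -> List[str]:
    """Return the backup sets, in order, needed to restore to the latest state."""
    if incrementals_since_full < 0:
        raise ValueError("incrementals_since_full must not be negative")
    chain = ["full"]
    if incrementals_since_full == 0:
        return chain
    if mode == "since-last-full":
        # Each incremental holds every change since the full backup,
        # so only the most recent incremental is needed.
        chain.append(f"incremental-{incrementals_since_full}")
    elif mode == "since-last-incremental":
        # Each incremental holds only the delta since the previous one,
        # so every incremental must be applied in order.
        chain.extend(
            f"incremental-{i}" for i in range(1, incrementals_since_full + 1)
        )
    else:
        raise ValueError(f"unknown mode: {mode}")
    return chain


if __name__ == "__main__":
    print(restore_chain("since-last-full", 3))
    print(restore_chain("since-last-incremental", 3))
```

With three incrementals taken since the full backup, the first mode restores two data sets and the second restores four, which is why the second mode tends to lengthen RTO even though each of its backups is smaller.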

      See also the documentation on Incremental Backup.

       

       

      Indexing Best Practices

      MarkLogic Server indexes records (or documents/fragments) on ingest. When a database's index configuration is changed, the server will consequently reindex all matching records.

      Indexing and reindexing can be a CPU and I/O intensive operation. Reindexing creates a lot of new fragments, with the original fragments being marked for deletion. These deleted fragments will then need to be merged out. All of this activity can potentially affect query performance, especially in systems with under-provisioned hardware.

      Reindexing in Production

      If you need to add or modify an index on a production cluster, consider scheduling the reindex during a time when your cluster is less busy. If your database is too large to completely reindex during a single period of low usage, consider running the reindex over several periods of time. For example, if your low usage period is during a weekend, the process may look like:

      • Change your index configuration on a Friday night
      • Let the reindex run for most of the weekend
      • To pause the reindex, set the reindexer-enable field to 'false' for the database being reindexed. Be sure to allow sufficient time for the associated merging to complete before system load comes back.
      • If needed, reindexing can continue over the next weekend - the reindexer process will pick up where it left off before it was disabled.

      You can refer to https://help.marklogic.com/Knowledgebase/Article/View/18/15/how-reindexing-works-and-its-impact-on-performance for more details on invoking reindexing on production.

      Avoid Unused Range Indexes, Fields, and Path Indexes

      In addition to taking up extra disk space, Range, Field, and Path Indexes require extra work when it's time to reindex. Field and Path indexes may also require extra indexing passes.

      Avoid Using Namespaces to Implement Multi-Tenancy

      It's a common use case to want to create some kind of partition (or multiple partitions) between documents in a particular database. In such a scenario it's far better to 1) constrain the partitioning information to a particular element in a document (then include a clause over that element in your searches), than it is to 2) attempt to manage partitions via unique element namespaces corresponding to each partition. For example, given two documents in two different partitions, you'll want them to look like this:

      1a. <doc><partition>partition1</partition><name>Joe Smith</name></doc>

      1b. <doc><partition>partition2</partition><name>John Smith</name></doc>

      ...vs. something like this:

      2a. <doc xmlns:p="http://partition1"><p:name>Joe Smith</p:name></doc>

      2b. <doc xmlns:p="http://partition2"><p:name>John Smith</p:name></doc>

      Why is #1 better? In terms of searching the data once it's indexed, there's not much of a difference: one could easily create searches to accommodate both approaches. The issue is how the indexing works in practice.

      MarkLogic Server indexes all content on ingest. In scenario #2, every time a new partition is created, a new range element index needs to be defined in the Admin UI, which means your index settings have changed, which means the server now needs to reindex all of your content, not just the documents corresponding to the newly introduced partition.

      In contrast, for scenario #1, all that needs to be done is to ingest the documents corresponding to the new partition, which are then indexed just like all the other existing content. The searches in scenario #1 would need to change, as they would not yet include a clause to accommodate the new partition (for example: cts:element-value-query(xs:QName("partition"), "partition2")). However, changing the searches is far less intrusive than reindexing your entire database, as would be required in scenario #2. Note that, in addition to the database-wide reindex, searches would also need to change in scenario #2.
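The following Python sketch illustrates the shape of scenario #1 outside of MarkLogic: the partition is ordinary document content, so supporting a new partition only means passing a new value to the same lookup. It uses the standard library XML parser on in-memory strings; the function name `names_in_partition` is hypothetical, and nothing here reflects how MarkLogic evaluates the query internally.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

DOCS = [
    "<doc><partition>partition1</partition><name>Joe Smith</name></doc>",
    "<doc><partition>partition2</partition><name>John Smith</name></doc>",
]


def names_in_partition(docs: Iterable[str], partition: str) -> List[str]:
    """Return the <name> values of documents whose <partition> matches."""
    names = []
    for text in docs:
        root = ET.fromstring(text)
        if root.findtext("partition") == partition:
            names.append(root.findtext("name"))
    return names


if __name__ == "__main__":
    print(names_in_partition(DOCS, "partition2"))
    # A new partition needs no structural change, only new documents.
    more_docs = DOCS + [
        "<doc><partition>partition3</partition><name>Jane Smith</name></doc>"
    ]
    print(names_in_partition(more_docs, "partition3"))
```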

      Keep an Eye on I/O Throughput

      Reindexing can lead to heavy merge activity and may lead to disk I/O bottlenecks if not managed carefully. If you have a system that is available 24-7 with no downtime window, then you may need to throttle the reindexer in order to keep the disk I/O to a minimum. We suggest the following database settings for reindexing a system that must always remain in use:

      • reindexer-throttle = 3
      • large-size-threshold = 1048576

      You can also adjust the following group settings to help limit background I/O:

      • background-io-limit = 100

      This will limit the background I/O for that group to 100 MB/sec per host across all hosts in that group. This should only be configured if merges are causing problems; it is a way of throttling back the I/O used by the merging process. This is a good starting point, and may be increased in increments of 50 if you find that your merges are progressing too slowly. Proceed with caution, as too low a background I/O limit can have negative performance consequences, or even catastrophic ones.
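As a small planning aid, the suggested starting values and the "increase in increments of 50" rule can be written down as data. The Python sketch below is only an illustration under stated assumptions: the dictionaries simply restate the settings named in this article, the helper name `next_background_io_limit` is hypothetical, and how the values are actually applied (Admin UI or an administrative API) is deliberately left out.

```python
from typing import Dict

# Suggested starting values from this article for an always-on system.
DATABASE_SETTINGS: Dict[str, int] = {
    "reindexer-throttle": 3,
    "large-size-threshold": 1048576,
}
GROUP_SETTINGS: Dict[str, int] = {
    "background-io-limit": 100,  # MB/sec per host
}


def next_background_io_limit(current: int, step: int = 50) -> int:
    """Return the next limit to try if merges are progressing too slowly."""
    if current <= 0 or step <= 0:
        raise ValueError("current and step must be positive")
    return current + step


if __name__ == "__main__":
    limit = GROUP_SETTINGS["background-io-limit"]
    for _ in range(3):
        print(limit)
        limit = next_background_io_limit(limit)
```

Raising the limit one step at a time, and observing merge progress between steps, keeps the change easy to reverse.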

      General Recommendations

      In general, your indexing/reindexing and subsequent search experience will be better if you

      Summary

      The MarkLogic Admin GUI is a convenient place to deploy a normal certificate infrastructure or to use the temporary certificate generated by MarkLogic. However, certain advanced solutions and deployments require XQuery-based admin operations to configure MarkLogic.

      This knowledgebase article discusses how to deploy a SAN or wildcard certificate in a cluster of 3 or more nodes.

       

      Certificate Types and MarkLogic Default Config

      Certificate Types

      In general, when browsers connect to a server using HTTPS, they check that the SSL certificate matches the host name in the address bar. There are three ways for browsers to find a match:

      a) The host name (in the address bar) exactly matches the Common Name in the certificate's Subject.

      b) The host name matches a wildcard Common Name. Please find an example at the end of this article.

      c) The host name is listed in the Subject Alternative Name (SAN) field as part of the X509v3 extensions. Please find an example at the end of this article.

      The most common form of SSL name matching is for the SSL client to compare the server name it connected to with the Common Name (CN field) in the server's Certificate. It's a safe bet that all SSL clients will support exact common name matching.

      MarkLogic allows the common scenario (a) to be configured from the Admin GUI. The rest of this article discusses deploying certificates of types (b) and (c).
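The three matching rules described above can be sketched in a few lines of Python. This is a simplified model of the description in this article, not the logic of any particular browser or TLS library (real clients apply stricter rules); the function names are hypothetical.

```python
from typing import Iterable


def _matches_pattern(host: str, pattern: str) -> bool:
    """Match a host name against an exact name or a '*.domain' wildcard."""
    if pattern.startswith("*."):
        suffix = pattern[1:]  # for example ".engrlab.marklogic.com"
        if not host.endswith(suffix):
            return False
        label = host[: -len(suffix)]
        # In this sketch the wildcard covers exactly one leftmost label.
        return bool(label) and "." not in label
    return host == pattern


def hostname_matches(
    hostname: str, common_name: str, san_dns_names: Iterable[str] = ()
) -> bool:
    """Return True if the host name matches the CN (a, b) or a SAN entry (c)."""
    host = hostname.lower()
    candidates = [common_name, *san_dns_names]
    return any(_matches_pattern(host, name.lower()) for name in candidates)


if __name__ == "__main__":
    node = "engrlab-128-101.engrlab.marklogic.com"
    print(hostname_matches(node, "*.engrlab.marklogic.com"))
    print(hostname_matches(node, "TestDomain.com", [node]))
    print(hostname_matches(node, "TestDomain.com"))
```

The host names and Common Names in the usage lines are taken from the example certificates in this article.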

      Default Admin GUI based Configuration 

      By default, MarkLogic generates a temporary certificate for every node in the group of the current cluster when a template is assigned to MarkLogic Server (the exception is when the template assignment is done through XQuery).

      The temporary certificate generated for each node has that node's hostname as its CN field, which is designed for the common scenario (a).

      There are two paths to install a CA-signed certificate in MarkLogic:

      1) Generate a certificate request, get it signed by a CA, and import it through the Admin GUI

      or 2) Generate a certificate request and private key outside of MarkLogic, get the certificate request signed by a CA, and import the signed certificate and private key using an admin script

      Problem Scenario

      In both of the above cases, while installing/importing the signed certificate, MarkLogic looks for a temporary certificate to replace by comparing the CN field of the installed certificate with the CN field of the temporary certificate.

      If we have a wildcard certificate (b) or a SAN certificate (c), the signed certificate's CN field will never match the temporary certificate's CN field. MarkLogic will therefore NOT remove the temporary certificates and will continue using them.

       

      Solution

      After installing a SAN or wildcard certificate, we may run into an App Server that still uses the temporary certificate (which was not replaced when the SAN/wildcard certificate was installed).

      Use the XQuery below against the Security database to remove all temporary certificates. The XQuery requires the URI lexicon to be enabled (it is enabled by default). Please change the certificate template name in the XQuery to reflect the value from your environment.

      xquery version "1.0-ml";
      
      import module namespace pki = "http://marklogic.com/xdmp/pki"  at "/MarkLogic/pki.xqy";
      import module namespace admin = "http://marklogic.com/xdmp/admin"  at "/MarkLogic/admin.xqy";
            
      
      let $hostIdList := let $config := admin:get-configuration()
                         return admin:get-host-ids($config)
                           
      for $hostid in $hostIdList
      return
        (: FQDN matching the certificate's CN field value :)
        let $fdqn := "TestDomain.com"
      
        (: Change to your Template Name string :)
        let $templateid := pki:template-get-id(pki:get-template-by-name("YourTemplateName"))
      
        for $i in cts:uris()
        where 
        (   (: locate Cert file with Public Key :)
            fn:doc($i)//pki:template-id=$templateid 
            and fn:doc($i)//pki:authority=fn:false()
            and fn:doc($i)//pki:host-name=$fdqn
        )
        return <h1> Cert File - {$i} .. inserting host-id {$hostid}
        {xdmp:node-insert-child(doc($i)/pki:certificate, <pki:host-id>{$hostid}</pki:host-id>)}
        {
            (: extract cert-id :)
            let $certid := fn:doc($i)//pki:certificate/pki:certificate-id
            for $j in cts:uris()
            where 
            (
                (: locate Cert file with Private key :)
                fn:doc($j)//pki:certificate-private-key/pki:template-id=$templateid 
                and fn:doc($j)//pki:certificate-private-key/pki:certificate-id=$certid
            )
            return <h2> Cert Key File - {$j}
            {xdmp:node-insert-child(doc($j)/pki:certificate-private-key,
              <pki:host-id>{$hostid}</pki:host-id>)}
            </h2>
        } </h1>
      

      The above will remove all temporary certificates (including the template CA) and their private keys, leaving only the installed certificate associated with the template and forcing all nodes to use the installed certificate.

       

      Example: SAN (Subject Alternative Name) Certificate

      For a 3-node cluster (engrlab-128-101.engrlab.marklogic.com, engrlab-128-164.engrlab.marklogic.com, engrlab-128-130.engrlab.marklogic.com):

      $ openssl x509 -in ML.pem -text -noout
      Certificate:
          Data:
              Version: 3 (0x2)
              Serial Number: 9 (0x9)
              Signature Algorithm: sha1WithRSAEncryption
              Issuer: C=US, ST=NY, L=NewYork, O=MarkLogic, OU=Engineering, CN=Support CA
              Validity
                  Not Before: Apr 20 19:50:51 2016 GMT
                  Not After : Jun  6 19:50:51 2018 GMT
              Subject: C=US, ST=NJ, L=Princeton, O=MarkLogic, OU=Eng, CN=TestDomain.com
              Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                  RSA Public Key: (1024 bit)
                      Modulus (1024 bit):
                          00:97:8e:96:73:16:4a:cd:99:a8:6a:78:5e:cb:12:
                          5d:e5:36:42:d2:b8:52:51:53:6c:cf:ab:e4:c6:37:
                          2c:15:12:80:c1:1b:53:29:4c:52:76:84:80:1d:ee:
                          16:41:a6:31:c5:7b:0d:ca:d7:e5:da:d7:67:fe:80:
                          89:9f:0d:bc:46:4f:f0:7e:46:88:26:d5:a0:24:a6:
                          06:d1:fa:c0:c7:a2:f2:11:7f:5b:d5:8d:47:94:a8:
                          06:d9:46:8f:af:dd:31:d5:15:d2:7a:13:39:3e:81:
                          32:bd:5c:bd:62:9d:5a:98:1d:20:0e:30:d4:57:3f:
                          7f:89:e6:20:ae:88:4d:85:d7
                      Exponent: 65537 (0x10001)
              X509v3 extensions:
                  X509v3 Key Usage: 
                      Key Encipherment, Data Encipherment
                  X509v3 Extended Key Usage: 
                      TLS Web Server Authentication
                  X509v3 Subject Alternative Name: 
                      DNS:engrlab-128-101.engrlab.marklogic.com, DNS:engrlab-128-164.engrlab.marklogic.com, DNS:engrlab-128-130.engrlab.marklogic.com
          Signature Algorithm: sha1WithRSAEncryption
              52:68:6d:32:70:35:88:1b:70:df:3a:56:f6:8a:c9:a0:9d:5c:
              32:88:30:f4:cc:45:29:7d:b5:35:18:a0:9a:45:37:e9:22:d1:
              c5:50:1d:50:b8:20:87:60:9b:c1:d6:a8:0c:5a:f2:c0:68:8d:
              b9:5d:02:10:39:40:b3:e5:f6:ae:f3:90:31:57:4c:e0:7f:31:
              e2:79:e6:a8:c0:e6:3f:ea:c5:75:67:3e:cd:ea:88:5d:60:d6:
              01:59:3c:dc:e0:47:96:3b:59:4a:13:85:bb:87:70:d0:a2:6b:
              0f:d4:84:1d:d1:be:e8:a5:67:c3:e3:59:05:0d:5d:a5:86:e6:
              e4:9e

      Example: Wild-Card Certificate

      For a 3-node cluster (engrlab-128-101.engrlab.marklogic.com, engrlab-128-164.engrlab.marklogic.com, engrlab-128-130.engrlab.marklogic.com):

      $ openssl x509 -in ML-wildcard.pem -text -noout
      Certificate:
          Data:
              Version: 1 (0x0)
              Serial Number: 7 (0x7)
              Signature Algorithm: sha1WithRSAEncryption
              Issuer: C=US, ST=NY, L=NewYork, O=MarkLogic, OU=Engineering, CN=Support CA
              Validity
                  Not Before: Apr 24 17:36:09 2016 GMT
                  Not After : Jun 10 17:36:09 2018 GMT
              Subject: C=US, ST=NJ, L=Princeton, O=MarkLogic Corporation, OU=Engineering Support, CN=*.engrlab.marklogic.com
       

      Introduction

      Okta provides secure identity management and single sign-on to any application, whether in the cloud, on-premises or on a mobile device.

      The following procedure describes the steps required to integrate MarkLogic with Okta identity management and Microsoft Windows Active Directory using the Okta AD Agent.

      This document assumes that the users accessing MarkLogic are defined in the Windows Active Directory only and do not currently have Okta User Profiles defined.

      Authentication Flow

       The authentication flow in this scenario will be as follows:

      1. The user opens a Browser connection to the Site Single Sign-On Portal page.
      2. The user enters their Active Directory credentials
      3. Okta verifies the user credentials using the Okta AD Agent
      4. If successful, the user is presented with a selection of applications they can sign-on to.
      5. The user selects the required application and Okta completes the sign-on using the stored user credentials.

      Requirements

      • MarkLogic Server version 8 or 9
      • Okta Admin account access
      • Okta AD Agent
      • Active Directory Server

      For the purpose of this document the following Active Directory user entry will be used as an example:

      # LDAPv3
      # base <dc=MarkLogic,dc=Local> with scope subtree
      # filter: (sAMAccountName=martin.warnes)
      # requesting: *
      #
      
      # Martin Warnes, Users, marklogic.local
      dn: CN=Martin Warnes,CN=Users,DC=marklogic,DC=local
      objectClass: top
      objectClass: person
      objectClass: organizationalPerson
      objectClass: user
      cn: Martin Warnes
      sn: Warnes
      givenName: Martin
      distinguishedName: CN=Martin Warnes,CN=Users,DC=marklogic,DC=local
      sAMAccountName: martin.warnes
      memberOf: CN=mladmins,CN=Users,DC=marklogic,DC=local
      sAMAccountType: 805306368
      userPrincipalName: martin.warnes@marklogic.local

      Notes

      1. By default, Okta uses the email address as the username. However, MarkLogic usernames cannot contain certain special characters such as the @ symbol, so the sAMAccountName will be used to sign on to MarkLogic. This will be configured later during the Okta Application definition.
      2. One or more memberOf attributes should be assigned to the Active Directory user entry. These will be used to assign MarkLogic roles without the need to configure duplicate user entries in the MarkLogic security database.

      Step 1. Create a MarkLogic External Security definition

       An External Security definition is required to authenticate and authorize Okta users against a Microsoft Windows Active Directory server.

       Full details on configuring an external security definition can be found at:

       https://docs.marklogic.com/8.0/guide/security/external-auth

       You should ensure that both “authentication” and “authorization” are set to “ldap”, for details on the remaining settings you should consult your Active Directory administrator.

      Step 2. Assign Active Directory group membership to MarkLogic Roles

      In order to assign the correct roles and permissions to Okta users, you will need to map Active Directory memberOf attributes to MarkLogic roles.

      In my example Active Directory user entry martin.warnes belongs to the following Group:

       memberOf: CN=mladmins,CN=Users,DC=marklogic,DC=local

      To ensure that all members of this group are assigned the MarkLogic admin role, you simply need to add the memberOf attribute value as an external name in the admin role.
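Conceptually, the mapping works like a lookup from group distinguished names to roles. The Python sketch below models that idea only: MarkLogic performs the real matching internally, and the names `ROLE_EXTERNAL_NAMES` and `roles_for_user`, as well as the case-insensitive comparison and the naive comma split of the DN, are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set

# Role name -> external names configured on that role (example from this article).
ROLE_EXTERNAL_NAMES: Dict[str, Set[str]] = {
    "admin": {"CN=mladmins,CN=Users,DC=marklogic,DC=local"},
}


def _normalize_dn(dn: str) -> str:
    """Lower-case a DN and trim spaces around its comma-separated parts."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def roles_for_user(
    member_of: Iterable[str],
    role_external_names: Dict[str, Set[str]] = ROLE_EXTERNAL_NAMES,
) -> Set[str]:
    """Return the roles whose external names include one of the user's groups."""
    groups = {_normalize_dn(dn) for dn in member_of}
    return {
        role
        for role, names in role_external_names.items()
        if groups & {_normalize_dn(name) for name in names}
    }


if __name__ == "__main__":
    print(roles_for_user(["CN=mladmins,CN=Users,DC=marklogic,DC=local"]))
```

A user whose memberOf values match no external name receives no roles from this mapping, which mirrors why at least one memberOf attribute must be mapped for the user to be authorized.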

      Step 3. Configure the MarkLogic AppServer

      For each App Server that you wish to integrate with Okta, you will need to set the “authentication” to “basic” and select the “external security” definition.

      As HTTP Basic Authentication is considered insecure, it is highly recommended that you secure the AppServer connection using HTTPS by configuring and selecting an “SSL certificate template”.

       Further details on configuring SSL for AppServers can be found at:

       https://docs.marklogic.com/8.0/guide/admin/SSL

      Step 4. Install and Configure Okta AD Integration

      In order for Okta to authenticate your Active Directory users, you will first need to download and install the Okta AD Agent using the following instructions supplied by Okta

      https://support.okta.com/help/Documentation/Knowledge_Article/Install-and-Configure-the-Okta-Active-Directory-Agent-1689483166

       Once installed your Okta Administrator will be able to complete the AD Agent configuration to select which AD users to import into Okta.

      Step 5. Create Okta MarkLogic application

      From the Okta Administrator, select “Add Application”, search for the Basic Authentication template, and click “Add”.

      On the “General Settings” tab, enter the MarkLogic AppServer URL, making sure to use HTTP or HTTPS depending on whether you have chosen to secure the listening port using TLS.

       Check the “Browser plugin auto-submit” option.

      On the Sign-On options panel, select “Administrator sets username, password is the same as user’s Okta password”.

       For “Application username format” select “AD SAM Account name” from the drop-down selection.

      Once the Okta application is created, you should assign the users permitted to access the application.

      When assigning a user, you will be prompted to check the AD Credentials. At this point you should just check that Okta has selected the correct "sAMAccountName" value; the password will not be modifiable.

      Repeat Step 5. for each AppServer you wish to access via the Okta SSO portal.

      Step 6. Sign-on to Okta SSO Portal

      All assigned MarkLogic applications should be shown:

      Selecting one of the MarkLogic applications should automatically log you in using your AD Credentials stored within Okta.


      Introduction

      MarkLogic Server provides pre-commit and post-commit triggers. These triggers listen for certain events and then invoke a configured XQuery module after the event occurs. It is a common use case to create a shared function in a library module that is used by the different trigger modules called by various triggers. This article shows an example of creating and using such a shared library module in a post-commit trigger.

      Example

      This example shows a simple post commit trigger that fires when a new document is created.

      1. For this example, create a database 'minidb' and then set its triggers database to itself (minidb). Also create another database, 'minimodule', to store all modules.

      2. Using Query Console, create a trigger by evaluating the trigger definition XQuery below against the triggers database (minidb):

      3. Create a module by running below XQuery against modules database:

      4. Insert a library module into the modules database (minimodules):

      5. Now insert the sample document into the content database (minidb):

      6. Check output in logs:

      After a new document whose URI is prefixed with "/mini" is inserted into the content database, the TaskServer log file contains the following message:

      2018-04-25 11:40:50.224 Info: *****Document with /mini root /mini/test-25-1-1.xml was created.*****2018-04-25T11:40:50+05:30

      NOTE: Module imports are relative to root.

      References:

      1. Creating and Managing Triggers With triggers.xqy - https://docs.marklogic.com/guide/app-dev/triggers

      Introduction

      We are always looking for ways to understand and address performance issues within the product. To that end, the following new diagnostic features have been added.

      New Trace Events in MarkLogic Server

      Some new diagnostic trace events have been added to MarkLogic Server:

      • Background Time Statistics - Background thread period and further processing timings are added to xdmp:host-status() output if this trace event is set.
      • Journal Lag 30 - A forest will now log a warning message if a frame takes more than 30 seconds to journal.
        • Please note that this limit can be adjusted down by setting the Journal Lag # trace event (where # is {1, 2, 5 or 10} seconds).
      • Canary Thread 10 - A new "canary thread" that does nothing but sleep for a second and check how long it was since it went to sleep.
        • It will log messages if the interval between sleeping has exceeded 10 seconds.
        • This can be adjusted down by setting the Canary Thread # trace event (where # is {1, 2, 5 or 10} seconds).
      • Canary Thread Histogram - Adding this trace event will cause MarkLogic to write to the ErrorLog a histogram of timings once every 10 minutes.
      • Forest Fast Query Lag 10 - By default, a forest will now warn if the fast query timestamp is lagging by more than 30 seconds.
        • This can be adjusted down by setting the Forest Fast Query Lag # (where # is {1, 2, 5, or 10} seconds).
        • Note that Warning level messages will be repeatedly logged at intervals while the lag limit is exceeded, with the time between logged messages doubling until it reaches 60 seconds.
        • There will be a final warning when the lag drops below the limit again as a way to bracket the period of lag.
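The canary thread idea described above can be sketched generically: a thread sleeps for a fixed interval and measures how long the sleep really took, so a long gap indicates that the process or host was stalled. The Python class below is an illustration only, not MarkLogic's implementation; the class name, parameters, and log message text are hypothetical.

```python
import threading
import time
from typing import Callable, List


class CanaryThread:
    """Sleep repeatedly and report when a sleep takes longer than a threshold."""

    def __init__(
        self,
        sleep_seconds: float = 1.0,
        warn_seconds: float = 10.0,
        log: Callable[[str], None] = print,
    ) -> None:
        self.sleep_seconds = sleep_seconds
        self.warn_seconds = warn_seconds
        self.log = log
        self.delays: List[float] = []  # measured duration of each sleep
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        while not self._stop.is_set():
            before = time.monotonic()
            # Sleep for the interval, but wake early if stop() is called.
            self._stop.wait(self.sleep_seconds)
            elapsed = time.monotonic() - before
            self.delays.append(elapsed)
            if elapsed > self.warn_seconds:
                self.log(f"canary sleep took {elapsed:.3f} seconds")

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()


if __name__ == "__main__":
    canary = CanaryThread(sleep_seconds=0.1, warn_seconds=0.5)
    canary.start()
    time.sleep(0.35)
    canary.stop()
    print(len(canary.delays), "sleeps measured")
```

The recorded `delays` list is also the kind of raw data from which a histogram of timings could be built.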

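      Examples of some of the new statistics can be viewed in the Admin UI by going to the following URLs in a browser (replacing hostname with the name of a node in your cluster and replacing TheDatabase with the name of the database that you would like to monitor):

      http://hostname:8001/forest-journal-statistics.xqy?database=TheDatabase
      http://hostname:8001/forest-insert-statistics.xqy?database=TheDatabase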

      You can clear the forest insert and journal statistics by adding clear=true to your request; executing the following in a browser:

      These changes now feature in the current releases of both MarkLogic 7 and MarkLogic 8 and are available for download from our developer website:

      Hints for interpreting new diagnostic pages

      Here's some further detail on what the numbers mean.

      First, a note about how bucketing is performed on these diagnostic pages:

      For each operation category (e.g. Timestamp Wait, Semaphore, Disk), the wait time will fall into a range of values, which need to be bucketed.

      The bucketing algorithm starts with 1000 buckets to cover the whole range, but then collapses them into a small set of buckets that cover the whole span of values. The algorithm aims to

      1. End up with a small number of buckets

      2. Include extreme (outlier) values

      3. Spread out multiple values so that they are not too "bunched-up" and are therefore easier to interpret.

      Forest Journal Statistics (http://hostname:8001/forest-journal-statistics.xqy?database=TheDatabase)

      When we journal a frame, there is a sequence of operations.

      1. Wait on a semaphore to get access to the journal.
      2. Write to the journal buffer (possibly waiting for I/O if exceeding the 512k buffer)
      3. Send the frame to replica forests
      4. Send the frame to journal archive/database replica forests
      5. Release the semaphore so other threads can access the journal
      6. Wait for everything above to complete, if needed.
        1. If it's a synchronous op (e.g. prepare, commit, fast query timestamp), we wait for disk I/O
        2. If there are replica forests, we wait for them to acknowledge that they have journaled and replayed.
        3. If the journal archive or database replica is lagged, wait for it to no longer be lagged.

      We note the wall clock time before and after these various operations, so we can track how long they're taking.

      On the replica side, we also measure the "Journal Replay" time which would be inserting into the in-memory stand, committing, etc.

      Here's an example for a master and its replica.

      Forest F-1-1

      Timestamp Wait
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          280     99.64    280          99.64
      50..59        1       0.36     281          100.00

      Semaphore
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          816     100.00   816          100.00

      Disk
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          204     99.51    204          99.51
      10..19        1       0.49     205          100.00

      Local-Disk Replication
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          804     99.26    804          99.26
      10..119       6       0.74     810          100.00

      Journal Archive
      No Information

      Database Replication
      No Information

      Journal Total
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          810     99.26    810          99.26
      10..119       6       0.74     816          100.00

      Journal Replay
      No Information

      Forest F-1-1-R

      Timestamp Wait
      No Information

      Semaphore
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          811     100.00   811          100.00

      Disk
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          203     99.02    203          99.02
      10..59        2       0.98     205          100.00

      Local-Disk Replication
      No Information

      Journal Archive
      No Information

      Database Replication
      No Information

      Journal Total
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          809     99.75    809          99.75
      10..59        2       0.25     811          100.00

      Journal Replay
      Bucket (ms)   Count   %        Cumulative   Cumulative %
      0..9          807     99.63    807          99.63
      10..119       3       0.37     810          100.00
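When reading these tables, it can help to see how the columns relate to each other. The Python sketch below derives the %, Cumulative, and Cumulative % columns from the bucket counts alone. It is an interpretation aid, not the server's code: the function name `bucket_rows` is hypothetical, and the server's own rounding may differ in the last digit.

```python
from typing import List, Sequence, Tuple

Row = Tuple[str, int, float, int, float]


def bucket_rows(buckets: Sequence[Tuple[str, int]]) -> List[Row]:
    """Return (bucket, count, %, cumulative, cumulative %) for each bucket."""
    total = sum(count for _, count in buckets)
    if total == 0:
        return []
    rows: List[Row] = []
    cumulative = 0
    for label, count in buckets:
        cumulative += count
        rows.append(
            (
                label,
                count,
                round(100 * count / total, 2),
                cumulative,
                round(100 * cumulative / total, 2),
            )
        )
    return rows


if __name__ == "__main__":
    # Counts taken from the "Timestamp Wait" table for Forest F-1-1.
    for row in bucket_rows([("0..9", 280), ("50..59", 1)]):
        print(row)
```

Because the Cumulative % of the last row is always 100, the row just before it shows what share of operations finished within the faster buckets.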

      Forest Insert Statistics (http://hostname:8001/forest-insert-statistics.xqy?database=TheDatabase)

      When we're inserting a fragment into an in-memory stand, we also have a sequence of operations.

      1. Wait on a semaphore to get access to the in-memory stand.
      2. Wait on the insert throttle (e.g. if there are too many stands)
      3. Wait for the stand's journal semaphore, to serialize with the previous insert if needed.
      4. Release the stand insert semaphore.
      5. Journal the insert.
      6. Release the stand journal semaphore.
      7. Start the checkpoint task if the stand is full.

      As with the journal statistics, we note the wall clock time between these operations so we can track how long they're taking.

      On the replica side, the behavior is similar, although the journal and insert are in reverse order (we journal before inserting into the in-memory stand). If it's a database replica forest, we also have to regenerate the index information (Filled IPD).

      Here is an example for a master and its replica.

      Forest F-1-1

      Journal Throttle
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             606   100.00          606         100.00

      Insert Sem
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             604    99.67          604          99.67
      80..199            2     0.33          606         100.00

      Filled IPD
      No Information

      Stand Throttle
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             606   100.00          606         100.00

      Stand Insert
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             605    99.84          605          99.84
      100..109           1     0.17          606         100.00

      Journal Sem
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             604    99.67          604          99.67
      10..119            2     0.33          606         100.00

      Journal
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             603    99.50          603          99.50
      10..119            3     0.50          606         100.00

      Total
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             597    98.51          597          98.51
      10..19             6     0.99          603          99.50
      200..229           3     0.50          606         100.00

      Forest F-1-1-R

      Journal Throttle
      No Information

      Insert Sem
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             606   100.00          606         100.00

      Filled IPD
      No Information

      Stand Throttle
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             606   100.00          606         100.00

      Stand Insert
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             605    99.84          605          99.84
      110..119           1     0.17          606         100.00

      Journal Sem
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             606   100.00          606         100.00

      Journal
      No Information

      Total
      Bucket (ms)    Count        %   Cumulative   Cumulative %
      0..9             605    99.84          605          99.84
      110..119           1     0.17          606         100.00

      Further reading

      To learn more about diagnostic trace events, please refer to our documentation and Knowledgebase articles. Note that some trace events may only log information if logging is set to debug.

      Summary

      The jemalloc library is included with the MarkLogic Server install and is configured to be used by default. Its use is recommended because it has shown a performance boost over the default Linux malloc library.

      There have been cases where the library is not used even though it is configured.  This article offers possible ways to debug that.

      Diagnostics

      The following message appears in the ErrorLog on startup if jemalloc is not in use:

      Warning: Memory allocator is not jemalloc; check /etc/sysconfig/MarkLogic

      Solutions

      1) Restart MarkLogic from a superuser shell or with sudo, using 'service MarkLogic restart'.

      2) Verify that the jemalloc library is present in the install directory (i.e. /opt/MarkLogic/lib/libjemalloc.so.1).

      3) Has the /etc/sysconfig/MarkLogic configuration file been modified from the default?  Try setting the configuration file back to the default and restarting the server.

      4) Confirm that /etc/sysconfig/MarkLogic contains the following lines:
      # preload jemalloc
      if [ -e $MARKLOGIC_INSTALL_DIR/lib/libjemalloc.so.1 ]; then
         export LD_PRELOAD=$MARKLOGIC_INSTALL_DIR/lib/libjemalloc.so.1
      fi
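
      To confirm whether the allocator is actually loaded, two checks can be scripted. The sketch below is illustrative only: the function names are made up for this example, and the paths mentioned in the usage note (the Linux /proc/<pid>/maps file and the ErrorLog.txt location) are assumptions that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of the MarkLogic install.

# Succeeds if the given memory-map listing (for example a copy of
# /proc/<pid>/maps on Linux) mentions the jemalloc shared library.
maps_has_jemalloc() {
    grep -q 'libjemalloc' "$1"
}

# Succeeds if the given log file contains the startup warning quoted above.
log_has_jemalloc_warning() {
    grep -q 'Memory allocator is not jemalloc' "$1"
}
```

      For example, maps_has_jemalloc /proc/<pid>/maps (where <pid> is the process ID of the running server) succeeds when the library is mapped into the process, and log_has_jemalloc_warning /var/opt/MarkLogic/Logs/ErrorLog.txt (assuming the default Linux log location) succeeds when the warning has been logged.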

      Details

      For more information on the jemalloc library, please review the article provided by Facebook Engineering

      https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919/

      Introduction

      This article compares JSON support in MarkLogic Server versions 6, 7, and 8, and the upgrade path for JSON in the database.

      How is native JSON different from the previous JSON support?

      Previous versions of MarkLogic Server provided XQuery APIs that converted between JSON and XML. This translation is lossy in the general case, meaning developers were forced to make compromises on either or both ends of the transformation. Even though the transformation was implemented in C++, it still added significant overhead to ingestion. All of these issues go away with JSON as a native document format. 

      How do I upgrade my JSON façade data to native JSON?

      For applications that use the previous JSON translation façade (for example: through the Java or REST Client APIs), MarkLogic 8 comes with sample migration scripts to convert JSON stored as XML into native JSON.

      The migration script will upgrade a database’s content and configuration from the XML representation of JSON used in MarkLogic 6 and 7 to native JSON, specifically converting documents in the http://marklogic.com/xdmp/json/basic namespace.
       
      If you are using the MarkLogic 7 JSON support, you will also need to migrate your code to use the native JSON support. The resulting application code is expected to be more efficient, but it will require minor code changes to your application.
       
      See also:
       
      Version 8 JSON incompatibilities
       

      Introduction

      MarkLogic Server provides a couple of useful techniques for keeping values in memory or resolving values without having to scan for documents on-disk.

      Options

      There are a few options available:

      1. cts:element-values performs a lexicon lookup so it's directly getting those values from the range indexes; you can add an options node and use the "map" parameter to get the call to return a map directly as per the documentation, which may give you what you need without having to do any further work.

      See: http://docs.marklogic.com/cts:element-values

      2. Storing a map as a server field is a popular approach and is widely used for storing data that needs to be accessed routinely by queries.

      Bear in mind that there is a catch to this approach as the map is not available to all nodes in a cluster - it is only available to the node responsible for evaluating the original request, so if you're using this technique in a clustered environment, the results may not be what is expected.

      Also note that if you're planning on storing a large number of maps in server fields on nodes on the cluster, it's important to make sure the hosts are provisioned with enough memory to accommodate these maps on top of group level caches and memory for query allocation, stands, range indexes, document retrieval and the like.

      See: http://docs.marklogic.com/map:map

      And: http://docs.marklogic.com/xdmp:set-server-field

      3. xdmp:set only allows you to set a value for the life of a single query but this technique can be useful in some circumstances - especially in situations where you're interested in keeping track of certain values throughout the processing of a module or a function within a module.

      See: http://docs.marklogic.com/xdmp:set

      4. If you have a situation where you have a large number of complex queries - particularly ones where lexicon lookups or calls to range indexes won't resolve the data you need and where lots of documents will need to be retrieved from disk, you should consider using registered queries.

      See: http://docs.marklogic.com/cts:registered-query

      Note that registered queries utilise the List Cache so, if you plan to adopt this method, we recommend careful testing to ensure your caches are sized sufficiently to suit the needs of your application.

      Summary

      This article explains how to kill a long-running query and describes the related timeout configuration.

      Problem Scenario

      At some point, we've all run into an inefficient long running query. What should we do if we don't want to wait for the query to complete? If we cancel the browser request, that would end the connection, but it wouldn't end the program invocation (called a "request") on the MarkLogic Server side. On the server side, that program invocation would continue to run until the execution is complete.

      Most of the time, this isn't really an issue. The server, of course, is multi-threaded, handling many concurrent transactions. We can just cancel the browser request, move on, and let the query finish when it finishes. However, sometimes it becomes necessary to free up server resources by killing the query and starting over. To do this, we need access to the Admin interface. 

      Sample Long running Query 

      Example only, please don't try this on any production machines!

      for $x in 1 to 1000000
      return collection()[1 + xdmp:random(1000)]
       
      This query is asking for 1,000,000 random documents, and will take a long time to execute. How can we cancel this query?

      How to Cancel/Kill the Query

      Go to the Administrative interface (at http://localhost:8001/ if you're running MarkLogic locally). At the top of the screen, you'll see a tab labeled "Status." Click that:

      screenshot1.jpg

      This will take you to the "System Status" screen. This page reveals status information about hosts, databases, forests, and app servers. The App Server section is what we're concerned with. Scanning down the "Queries" column, we see that the "Admin" server is processing a query (namely, the one that generated the page we see). Everything looks okay so far. But just below that, we see that the "App-Services" server is just over 3 minutes into processing a query. That's our slow one. Query Console runs on the "App-Services" app server, which explains why we see it there. Go ahead and click the "App-Services" link:

      screenshot2.jpg

      This takes us to the "App-Services" status page. So far, there's still no "cancel" button. One more click will reveal it ("show more"):

      screenshot3.jpg

      We can now see an individual entry for the currently running query. Here we see it's called "eval.xqy"; that's the query module that Query Console invokes when you submit a query. If you were running your own query module (instead of using Query Console), then you would see its name here instead. To cancel the query, click the "[cancel]" link:

      screenshot4.jpg

      One more click (on the confirmation page).

      screenshot5.jpg

      This takes us back to the status page, where we see MarkLogic Server is in the process of canceling our query:

      screenshot6.jpg

      The page above will continue to say "cancelling..." until it is refreshed, even though the query has already been killed and no longer exists.

      A quick refresh of the above page shows that the query is no longer present.

      screenshot7.jpg

       

      What happens if you forget to cancel a query?

      MarkLogic will continue to execute the query until a time limit is reached, at which point the Server will cancel the query for you. For example, here's what Query Console eventually returns if we don't bother to cancel the query:

      screenshot8.jpg

      How long is this time limit?

      This depends on your server configuration. We can actually set the timeout in the query itself, using the xdmp:set-request-time-limit() function, but even that will be limited by your server's "max time limit."

      For example, on the "Configure" tab of my "App-Services" app server, you can see that the "default time limit" is set to 10 minutes (600 seconds), and the longest any query can allow itself to run (by setting its own request time limit) is one hour (3600 seconds):

      screenshot9.jpg

       

      Introduction

      MarkLogic Server can be configured so that users are authenticated using an external authentication protocol, such as Lightweight Directory Access Protocol (LDAP) or Kerberos. These external agents serve as centralized points of authentication or repositories for user information from which authorization decisions can be made. If, after following the configuration instructions in our documentation, the authentication does not work as expected, this article gives some additional debugging ideas.

      Details

      The following areas should be checked when your LDAP authentication is not working as expected:

      1. Verify that the cyrus-sasl-md5 library is installed on the MarkLogic Server node.

      2. Run the following LDAP search command to check whether the LDAP server is set up properly.

      ldapsearch -H ldap://{Your LDAP Server URI}:389 -x -s base

      a. In the output of the search command, make sure DIGEST-MD5 is supported.

      supportedSASLMechanisms: DIGEST-MD5

      b. Identify the correct LDAP Service name:

      e.g. ldapServiceName: MLTEST1.LOCAL:dc1$@MLTEST1.LOCAL


      3. On Windows platforms, the services.keytab file is created using Active Directory Domain Services (AD DS) on a Windows server. If you are using Active Directory Domain Services (AD DS) on a computer that is running Windows Server 2008 or Windows Server 2008 R2, be sure that you have installed the hot fix described in http://support.microsoft.com/kb/975697.
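
      The checks in step 2 can also be scripted against saved ldapsearch output. The following is a minimal sketch: the two function names are invented for this example, and it assumes the output uses the usual "attribute: value" line format shown above.

```shell
#!/bin/sh
# Hypothetical helpers that read saved ldapsearch output on stdin.

# Succeeds if DIGEST-MD5 is listed among the supported SASL mechanisms.
supports_digest_md5() {
    grep -i '^supportedSASLMechanisms:' | grep -qi 'DIGEST-MD5'
}

# Prints the value of the ldapServiceName attribute, if present.
ldap_service_name() {
    sed -n 's/^ldapServiceName:[[:space:]]*//p'
}
```

      For example, after redirecting the output of the ldapsearch command in step 2 to a file such as rootdse.txt (a file name chosen here for illustration), supports_digest_md5 < rootdse.txt covers step 2a and ldap_service_name < rootdse.txt covers step 2b.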

      Introduction: the issue

      MarkLogic performs nested lookups on the LDAP groups assigned to a user to determine which roles the user will be assigned. If the groups belong to multiple Active Directory Domains within a federated Active Directory Forest, then MarkLogic user authorization could fail with a subordinate referral error, as seen below:

      2019-07-30 13:27:23.002 Notice: XDMP-LDAP: ldap_search_s failed on ldap server ldap://ad1.myhost.com:389: Referral (10)

      Cause

      MarkLogic has been configured to connect to the Local Domain Controller LDAP ports 389 (LDAP) or 636 (LDAPS). However, a Local Domain Controller can only search domains to which it has access.

      Example

      A user is a member of the following groups which belong to two separate Active Directory domains, subA, and subC.

      Using a Local Domain Controller for subA for external authorization would result in a login failure when attempting to perform the nested group lookup for the domain subC.

      member=CN=Group One,OU=OrgUnitAGroups,OU=OrgUnitA,DC=subA,DC=domain
      member=CN=Group Two,OU=OrgUnitAGroups,OU=OrgUnitA,DC=subA,DC=domain
      member=CN=Group Three,OU=OrgUnitCGroups,OU=OrgUnitC,DC=subC,DC=domain
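
      One quick way to see whether a user's groups span more than one domain is to reduce each group DN to its DC components and list the distinct results. The helper below is a sketch with an invented name (dn_domains); it assumes one DN per line, upper-case "DC=" components, and DNs without escaped commas.

```shell
#!/bin/sh
# Hypothetical helper: reads group DNs on stdin (one per line) and prints
# the distinct domain suffixes, built from the DC= components of each DN.
dn_domains() {
    awk -F',' '{
        dom = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^DC=/) {
                if (dom != "") dom = dom ","
                dom = dom $i
            }
        }
        if (dom != "") print dom
    }' | sort -u
}
```

      More than one line of output means the groups live in more than one domain, which is the situation in which a Local Domain Controller can return a referral.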

      Solution

      If you have multiple Active Directory Domains federated into an Active Directory forest, you should use the Global Catalog port 3268 (LDAP) or 3269 (LDAPS) to prevent failures when searching for group memberships that are defined in other domains.
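
      As an illustration, the port change can be expressed as a small URI rewrite. This is a sketch only: to_global_catalog_uri is a hypothetical helper, and it relies on the standard Active Directory port numbers (389/636 for a Domain Controller, 3268/3269 for the Global Catalog).

```shell
#!/bin/sh
# Hypothetical helper: rewrites a Domain Controller LDAP URI to the
# matching Global Catalog port. Other URIs are printed unchanged.
to_global_catalog_uri() {
    printf '%s\n' "$1" | sed \
        -e 's#^\(ldap://[^:/]*\):389$#\1:3268#' \
        -e 's#^\(ldaps://[^:/]*\):636$#\1:3269#'
}
```

      For example, to_global_catalog_uri ldap://ad1.myhost.com:389 prints ldap://ad1.myhost.com:3268, which is the value that would then be entered as the LDAP server URI in the External Security configuration.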

      Optional workaround

      A large number of nested groups can potentially slow down logins. If you do not need to rely on nested lookups to determine group membership for MarkLogic roles (i.e. all required groups are returned by the initial user search request), you should consider disabling nested lookups by setting the "ldap nested lookup" parameter to false in the External Security configuration.

      Doing this would also prevent subordinate domain searches and allow you to continue to use an Active Directory Domain Controller instead of switching to the Global Catalog.

      Further reading

      Summary

      A leap second, as defined by Wikipedia, is "a one-second adjustment that is occasionally applied to Coordinated Universal Time (UTC) in order to keep its time of day close to the mean solar time. Without such a correction, time reckoned by Earth's rotation drifts away from atomic time because of irregularities in the Earth's rate of rotation."  At the time of this writing, the next leap second to be inserted is on June 30, 2015 at 23:59:60 UTC.

      For systems that use the Network Time Protocol (NTP) to synchronize the network time across all the hosts in their MarkLogic cluster, the MarkLogic Server software is not impacted by the leap second (i.e. we expect everything to work fine at the MarkLogic layer).

      For systems where synchronizing the system clocks requires UTC time to be set backwards, the change must be accounted for anywhere time-dependent data is stored. In this case, we recommend that our customers implement NTP in their environment.  Otherwise, the application layer will need to handle discontinuous time. 

      Transactional Consistency

      The algorithm that MarkLogic Server uses to maintain transactional consistency of data is not wall clock dependent and, as such, is not affected by the leap second.

      Network Time Protocol (NTP)

      NTP generally works very hard not to make time go backwards: clock readings are constrained to always increase. NTP adjusts things gradually by slowing down or speeding up the clock, and does not make discrete changes unless time is off by a lot. A second is not a lot.  An hour is a lot. Regardless of the leap second, adjustments for computer clock drift can easily be more than a second and happen frequently. 

      When Time Goes Backwards

      Without NTP and left on their own, computer clocks are really not that accurate. If synchronization of the system clocks on the hosts of a MarkLogic cluster requires the clocks to be set backwards, then the application layer will need to account for and handle discontinuous date-time values in its data. 

      The temporal feature was introduced in MarkLogic Server version 8.  If the system clock is adjusted backwards, there are conditions where temporal document inserts and updates will fail with an appropriate error code.  This is by design and expected.

      Our recommendation is to implement NTP on all hosts of a MarkLogic cluster to eliminate the need to handle discontinuous time at the application layer. 
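
      Where the application layer does have to cope with discontinuous time, a first step is simply detecting it. The sketch below is a hypothetical helper (the name find_backward_steps and the one-timestamp-per-line input format are assumptions for illustration): it scans a sequence of recorded epoch-second readings and reports every point at which the clock stepped backwards.

```shell
#!/bin/sh
# Hypothetical helper: reads one epoch-seconds timestamp per line, in the
# order the readings were taken, and reports each backwards step.
find_backward_steps() {
    awk '
        NR > 1 && ($1 + 0) < prev { print "line " NR ": " prev " -> " ($1 + 0) }
        { prev = $1 + 0 }'
}
```

      No output means the recorded readings never decreased.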

      Further Reading

      Red Hat article on the Leap Second - https://access.redhat.com/articles/15145

      Microsoft Support article on the Leap Second - http://support.microsoft.com/kb/909614

       

      Summary

      The internal mechanisms MarkLogic Server uses to implement security are query constraints. Lexicon search performance may be impacted by security query constraints.  If performed with admin credentials, Lexicon searches will not be impacted by the security query constraints.  

      Detail

      Query time grows proportionately with the number of matches from a given search across a set of documents (not the actual number of documents in your database). The presence of security constraints will contribute a significantly larger number of matches than if the same lexicon search was performed with admin credentials.  In order to minimize the number of matches (and therefore query time) for a given lexicon search, you'll want to amp your lexicon searches to an admin user.

      For MarkLogic Server v6.0, the absolute maximum number of MarkLogic Servers in a Cluster is 256, but the optimum is around 64.

      Summary

      MarkLogic recommends the default "ordered" option for Linux ext3 and ext4 file-systems.

      File system administrators in Linux are tempted to use the data=writeback option to achieve higher throughput from their file system, but this comes with the side effects of potential data corruption and data-security breaches. This article explains both file system options with respect to MarkLogic Server. 

      "data=ordered"

      The Linux ext3 and ext4 file systems have a default data option of "ordered", which writes data to the main file system before committing the associated metadata to the journal.

      https://www.kernel.org/doc/Documentation/filesystems/ext4.txt

      https://www.kernel.org/doc/Documentation/filesystems/ext3.txt

      Both of these file systems go the extra mile to protect your files: with the default data=ordered, the data associated with a metadata change is written first, thus assuring file-system integrity to the application layer, which is essential for MarkLogic Server data integrity. 

      "data=writeback"

      Other journaled file systems, such as XFS and JFS, journal only metadata; to make ext3 and ext4 behave like XFS and other journaled file systems, an administrator could set 'data=writeback' in their mount options.

      The 'data=writeback' mode does not preserve data ordering when writing to the disk, so commits to the journal may happen before the data is written to the file system. This method is faster because data writes no longer have to complete before the metadata is committed to the journal, but it is not good at protecting data integrity in the face of a system failure.

      If there is a crash between the time when metadata is committed to the journal and when data is written to disk, the post-recovery metadata can point to incomplete, partially written or incorrect data on disk, which can lead to corrupt data files. Additionally, data which was supposed to be overwritten in the file system could be exposed to users, resulting in a security risk.
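
      To check whether any ext3 or ext4 file system on a host is mounted this way, the mount table can be scanned for the option. This is a minimal sketch under stated assumptions: find_writeback_mounts is a hypothetical helper, and it expects input in the Linux /proc/mounts layout (device, mount point, type, options, ...).

```shell
#!/bin/sh
# Hypothetical helper: prints the mount point of every ext3/ext4 file
# system whose mount options include data=writeback. Reads the file given
# as the first argument, or /proc/mounts when no argument is supplied.
find_writeback_mounts() {
    awk '($3 == "ext3" || $3 == "ext4") &&
         $4 ~ /(^|,)data=writeback(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
```

      Any mount point printed that holds MarkLogic forest data is worth reviewing against the recommendation above.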

      Linus Torvalds comments on 'data=writeback'

      "it makes things much smoother, since now the actual data is no longer in the critical path for any journal writes, but anybody who thinks that's a solution is just incompetent.  We might as well go back to ext2 then. If your data gets written out long after the metadata hit the disk, you are going to hit all kinds of bad issues if the machine ever goes down."   - http://thread.gmane.org/gmane.linux.kernel/811167/focus=811654

       

      Introduction

      Here we discuss management of temporal documents.

      Details

      In MarkLogic, a temporal document is managed as a series of versioned documents in a protected collection. The ‘original’ document inserted into the database is kept and never changes. Updates to the document are inserted as new documents with different valid and system times. A delete of the document is also inserted as a new document.

      In this way, a temporal document always retains knowledge of when the information was known in the real world and when it was recorded in the database.

      APIs

      By default the normal xdmp:* document functions (e.g., xdmp:document-insert) are not permitted on temporal documents.

      The temporal module (temporal:* functions; see Temporal API) contains the functions used to insert, delete, and manage temporal documents.

      All temporal updates and deletes create new documents and in normal operations this is exactly what will be desired.

      See also the documentation: Managing Temporal Documents.

      Updates and deletes outside the temporal functions

      Note: normal use of the temporal feature will not require this sort of operation.

      The function temporal:collection-set-options can be used with the updates-admin-override option to specify that users with the admin role can change or delete temporal documents using non-temporal functions, such as xdmp:document-insert and xdmp:document-delete.

      For example, you might need to run a CoRB job or other administrative transform without updating the system dates on the documents; say, to change the values M/F to Male/Female.

       

      Introduction

      When CPF is installed, a number of new documents are created for the nominated Triggers database associated with that database.

      This Knowledgebase article is designed to show you what CPF creates on install, in the event that you want to safely disable and remove it from your system.

      Getting started

      Below is a layout of all databases and their associated document counts with a clean install of MarkLogic 9.0-2:

      Database ID           Database Name  Document Count
      8723423541597683063   App-Services   14
      12316032390759111212  Modules        0
      1695527226691932315   Fab            0
      11723073009075196192  Security       1526
      15818912922008798974  Triggers       0
      5212638700134402198   Documents      0
      4320540002505594119   Extensions     0
      9023394855382775954   Last-Login     0
      11598847197347642387  Schemas        0
      12603105430027950215  Meters         48

      Adding CPF

      After installing CPF on the Documents database (with conversion enabled), we now see:

      Database ID           Database Name  Document Count
      8723423541597683063   App-Services   15
      12316032390759111212  Modules        0
      1695527226691932315   Fab            0
      11723073009075196192  Security       1526
      15818912922008798974  Triggers       39
      5212638700134402198   Documents      0
      4320540002505594119   Extensions     0
      9023394855382775954   Last-Login     0
      11598847197347642387  Schemas        0
      12603105430027950215  Meters         498

      If we ignore Meters and App-Services, we can see that, by default, a CPF install will create a number of documents in the Triggers database:

      /cpf/domains.css
      /cpf/pipelines.css
      http://marklogic.com/cpf/configuration/configuration.xml
      http://marklogic.com/cpf/domains/4361761515557042908.xml
      http://marklogic.com/cpf/pipelines/10451885084298751684.xml
      http://marklogic.com/cpf/pipelines/11486027894562997537.xml
      http://marklogic.com/cpf/pipelines/1182872541253698578.xml
      http://marklogic.com/cpf/pipelines/11925472395644624519.xml
      http://marklogic.com/cpf/pipelines/12665626287133680551.xml
      http://marklogic.com/cpf/pipelines/12977232154552215987.xml
      http://marklogic.com/cpf/pipelines/13371411038103584886.xml
      http://marklogic.com/cpf/pipelines/13468360248543629252.xml
      http://marklogic.com/cpf/pipelines/13721894103731640519.xml
      http://marklogic.com/cpf/pipelines/14473927355946353823.xml
      http://marklogic.com/cpf/pipelines/16071401642383641119.xml
      http://marklogic.com/cpf/pipelines/17008133204004114953.xml
      http://marklogic.com/cpf/pipelines/1707825679528566193.xml
      http://marklogic.com/cpf/pipelines/17486255598951175231.xml
      http://marklogic.com/cpf/pipelines/1789191734187967847.xml
      http://marklogic.com/cpf/pipelines/2145494300111008849.xml
      http://marklogic.com/cpf/pipelines/2272288885870389220.xml
      http://marklogic.com/cpf/pipelines/2585221667797881502.xml
      http://marklogic.com/cpf/pipelines/4684095308382280821.xml
      http://marklogic.com/cpf/pipelines/6055693256331806191.xml
      http://marklogic.com/cpf/pipelines/7250675434061295808.xml
      http://marklogic.com/cpf/pipelines/7354167915842037706.xml
      http://marklogic.com/cpf/pipelines/7492839190910743342.xml
      http://marklogic.com/cpf/pipelines/8329675320036351600.xml
      http://marklogic.com/cpf/pipelines/8537493622930387355.xml
      http://marklogic.com/cpf/pipelines/8877791654658876902.xml
      http://marklogic.com/cpf/pipelines/8988716724908642408.xml
      http://marklogic.com/cpf/pipelines/9432621469736814202.xml
      http://marklogic.com/xdmp/triggers/10905847201437369653
      http://marklogic.com/xdmp/triggers/11663386212502595308
      http://marklogic.com/xdmp/triggers/12471659507809075185
      http://marklogic.com/xdmp/triggers/15932603084768890631
      http://marklogic.com/xdmp/triggers/16817738273312375366
      http://marklogic.com/xdmp/triggers/17731123999892629453
      http://marklogic.com/xdmp/triggers/6779751200800194600
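
      A listing like this is easier to reason about when grouped by URI prefix. The sketch below uses an invented helper name (cpf_uri_summary) and assumes the URIs have been saved one per line; it counts how many fall under each of the prefixes seen above and lumps everything else together as "other".

```shell
#!/bin/sh
# Hypothetical helper: reads one URI per line on stdin and prints a count
# per category, sorted by category name.
cpf_uri_summary() {
    awk '
        /^http:\/\/marklogic\.com\/cpf\/pipelines\//     { n["pipelines"]++; next }
        /^http:\/\/marklogic\.com\/xdmp\/triggers\//     { n["triggers"]++; next }
        /^http:\/\/marklogic\.com\/cpf\/domains\//       { n["domains"]++; next }
        /^http:\/\/marklogic\.com\/cpf\/configuration\// { n["configuration"]++; next }
        { n["other"]++ }
        END { for (k in n) print k, n[k] }' | sort
}
```

      Run against the 39 URIs listed above, the categories break down as 28 pipelines, 7 triggers, 1 domain, 1 configuration document and 2 others (the two CSS files).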

      Files created by CPF

      http://marklogic.com/cpf/configuration

      One of these files is the CPF configuration.xml file

      http://marklogic.com/cpf/domains

      One of these documents describes the default domain which is created when CPF is installed:

      Default Documents

      http://marklogic.com/cpf/pipelines

      Of the 39 files created, we can see from the URI listing above that the majority (28) of these are prefaced with http://marklogic.com/cpf/pipelines. These files describe each of the standard conversion pipelines that ship with the server. These are:

      Alerting
      Alerting (spawn)
      Calais Entity Enrichment Sample
      Conversion Processing
      Conversion Processing (Basic)
      Data Harmony Enrichment Sample
      DocBook Conversion
      Document Filtering (Properties)
      Document Filtering (XHTML)
      Entity Enrichment
      Flexible Replication
      HTML Conversion
      Janya Entity Enrichment Sample
      MS Office Conversion
      Office OpenXML Extract
      PDF Conversion
      PDF Conversion (Image Batching)
      PDF Conversion (Page Layout with Reblocking)
      PDF Conversion (Page Layout, Image Batching)
      PDF Conversion (Page Layout)
      PDF Conversion (Paged Text, No Rendering)
      Schema Validation
      SRA NetOwl Entity Enrichment Sample
      Status Change Handling
      Temis Entity Enrichment Sample
      WordprocessingML Process
      XHTML Conversion Processing
      XInclude Processing

      http://marklogic.com/xdmp/triggers

      Seven of the files are triggers - all of which are namespaced with the cpf prefix:

      cpf:any-property Default Documents
      cpf:create Default Documents
      cpf:delete Default Documents
      cpf:restart
      cpf:state Default Documents
      cpf:status Default Documents
      cpf:update Default Documents

      Removing the core files created when CPF was initially installed will disable it from further functioning in your environment.

      Scripting the removal of default CPF components

      This GitHub gist demonstrates a method for removing CPF configuration from a given database - in the example below, the "Triggers" database is specified:

      Introduction

      If you have an existing MarkLogic Server cluster running on EC2, there may be circumstances where you need to upgrade the existing AMI with the latest MarkLogic rpm available. You can also add a custom OS configuration.

      This article assumes that you have started your cluster using the CloudFormation templates with Managed Cluster feature provided by MarkLogic.

      Procedure
      To upgrade the MarkLogic AMI manually, follow these steps:

      1. Launch a new small MarkLogic instance from the AWS MarketPlace, based on the latest available image. For example, t2.small based on MarkLogic Developer 9 (BYOL). The instance should be launched only with the root OS EBS volume.
      Note: If you are planning to leverage the PAYG (Pay As You Go) model, you must choose MarkLogic Essential Enterprise.
      a. Launch a MarkLogic instance from AWS MarketPlace, click Select and then click Continue:

      b. Choose instance type. For example, one of the smallest available, t2.small
      c. Configure instance details. For example, default VPC with a public IP for easy access
      d. Remove the second EBS data volume (/dev/sdf)
      e. Optional - Add Tags
      f. Configure Security Group - only SSH access is needed for the upgrade procedure
      g. Review and Launch
      Review step - AWS view:

      2. SSH into your new instance and switch the user to root in order to execute the commands in the following steps.

      $ sudo su -

      Note: As an option, you can also use "sudo ..." for each individual command.

      3. Stop MarkLogic and uninstall MarkLogic rpm:

      $ service MarkLogic stop
      $ rpm -e MarkLogic

      4. Update and patch the OS:

      $ yum -y update

      Note: If needed, restart the instance (for example, after a kernel or core-library upgrade).
      Note: If you would like to add more custom options/configuration/..., they should be done between steps 4 and 5.

      5. Install the new MarkLogic rpm
      a. Upload the MarkLogic rpm to the instance. (For example, via "scp" or S3)
      b. Install the rpm:

      $ yum install [<path_to_MarkLogic_RPM>]/[MarkLogic_RPM]

      Note: Do not start MarkLogic at any point during the AMI's preparation.

      6. Double check to be sure that the following files and log traces do not exist. If they do, they must be deleted.

      $ rm -f /var/local/mlcmd.conf
      $ rm -f /var/tmp/mlcmd.trace
      $ rm -f /tmp/marklogic.host
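The check in step 6 can also be scripted. The following is only a minimal sketch and not part of the MarkLogic procedure: `report_leftovers` is a hypothetical helper that prints any of the given paths that still exist, so an empty result means the instance is clean.

```shell
# Hypothetical helper: print each given path that still exists.
# Prints nothing when none of the paths are present.
report_leftovers() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# The three files named in step 6.
report_leftovers /var/local/mlcmd.conf /var/tmp/mlcmd.trace /tmp/marklogic.host
```

If the function prints any path, remove that file with the rm commands above and run the check again.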

      7. Remove artifacts
      Note: Performing the following actions will remove the ability to ssh back into the baseline image. New credentials are applied to the AMI when launched as an instance. If you need to add/change something, mount the root drive to another instance to make changes.

      $ rm -f /root/.ssh/authorized_keys
      $ rm -f /home/ec2-user/.ssh/authorized_keys
      $ rm -f /home/ec2-user/.bash_history
      $ rm -rf /var/spool/mail/*
      $ rm -rf /tmp/userdata*
      $ rm -f [<path_to_MarkLogic_RPM>]/[MarkLogic_RPM]
      $ rm -f /root/.bash_history
      $ rm -rf /var/log/*
      $ sync

      8. Optional - Create an AMI from the stopped instance.[1] The AMI can be created at the end of step 7.

      $ init 0

      [1] For more information: https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/tkv-create-ami-from-instance.html

      At this point, your custom AMI should be ready and it can be used for your deployments. If you are using multiple AWS regions, you will have to copy the AMI as needed.

      Additional references:
      [2] Upgrading the MarkLogic AMI - https://docs.marklogic.com/8.0/guide/ec2/managing#id_69624

      Introduction

      A powerful new feature was added to MarkLogic 8 - the ability to build applications around a declarative HTTP rewriter. You can read more about MarkLogic Server's HTTP rewriter and some of the new features it provides in our documentation.

      This article will cover some basic tips for debugging applications that make use of this feature.

      Validating your rewriter rules (Using XML Schema)

      The rewriter adheres to an XML Schema. At runtime the rewriter is not validated against this schema; this is by design so that potentially minor errors don't risk taking your application offline. As a best practice, we recommend validating your rewriters manually every time you make a change. In order to do this, you can use MarkLogic Server or any other tool that supports XML validation (the schema is standard XSD 1.0).  If you want to view the schema, it's copied to Config/rewriter.xsd when you install the product.

      In order to validate from within MarkLogic using XQuery you can simply execute:

      validate { fn:doc("/path/to/your/rewriter.xml") }

      The above will validate the XML if your rewriter rules are stored in a database. If you're using the filesystem, you can use xdmp:document-get instead.

      Alternatively, you can copy / paste the XML body into Query Console and wrap it with a call to validate as below:

      validate { * Paste your rewriter rules here * }

      The above approach should work without any issue as long as there is no content in your rewriter XML that contains any XQuery reserved syntax.

      General rewriter debugging and tracing

      For simple "print"-style debugging, you can manually add trace statements at any point where an eval rule is allowed, like this:

      <trace event="customevent">data</trace>

      Then enable diagnostics (in your group settings) and add "customevent" as a trace event; your custom trace will now show up in ErrorLog.txt whenever that endpoint is accessed. To read more on the use of trace events in your applications, refer to this Knowledgebase article.

      The rewriter also provides error code handling:

      <error code="MYAPP-EXCEPTION" data1="value1" data2="... 

      You can also add ids to rules; these are written to the trace output, which may aid debugging:

      <match id="match-id-for-myregex" regex=".* ...

      Useful diagnostic trace events

      Note that additional trace events can generate a lot of data and may slow your application down, so make sure they are not left enabled in a production-critical environment.

      Below are some trace events you can use and a brief description of what each trace event does:

      Rewriter Parser: Details of the parsing of the rewriter XML file
      Rewriter Evaluator: Execution traces of rules as evaluated
      Rewriter Evaluator Verbose: Additional (more verbose) tracing
      Declarative Rewriter: Entry points into and out of the rewriter from the app server request handler
      Rewriter Print Rules: After parsing and validation of the rewriter, a full dump of the internal data structures that resulted

      Additional points to note

      Use of the "Evaluator" traces will write to the ErrorLog.txt on every request.

      The "Parser" trace event will only occur once or upon updating your rewriter.

      Introduction

      Prior to the 9.0-9 release, MarkLogic provided support for Oracle JDK 8. However, Oracle has announced the End of Public Updates for Java SE 8.

      What can we expect from MarkLogic?

      MarkLogic will support OpenJDK 9, OpenJDK 10 and OpenJDK 11 starting with MarkLogic Server 9.0-9 and associated products.

      These products include:

      From the 9.0-9 release onwards, we will no longer QA test our products with Oracle JDK.

      We will support Amazon Corretto JDK as part of our Amazon offerings.  Corretto meets the Java SE standard and is certified compliant by AWS using the Java Technical Compatibility Kit.

      The latest version of MarkLogic Server is available to download from:

      http://developer.marklogic.com/products

      Summary

      The default configuration of MarkLogic Application Servers is not vulnerable to the FREAK SSL attack.

      What is the FREAK SSL attack?

      Tuesday 2015/03/03 - Researchers from the miTLS team (a joint project between Inria and Microsoft Research) disclosed a new SSL/TLS vulnerability, the FREAK SSL attack (CVE-2015-0204). The vulnerability allows attackers to intercept HTTPS connections between vulnerable clients and servers and force them to use ‘export-grade’ cryptography, which can then be decrypted or altered.

      Read more about the FREAK SSL attack.

      Testing a webserver

      You can verify whether a webserver is attackable by the FREAK attack with this free SSL vulnerability checker.

      FIPS

      MarkLogic Server uses FIPS-capable OpenSSL to implement the Secure Sockets Layer (SSL v3) and Transport Layer Security (TLS v1) protocols. When you install MarkLogic Server, FIPS mode is enabled by default and SSL RSA keys are generated using secure FIPS 140-2 cryptography. This implementation disallows weak ciphers and uses only FIPS 140-2 approved cryptographic functions. Read more about OpenSSL FIPS mode in MarkLogic Server, and how to configure it.

      As long as FIPS mode was not explicitly disabled, MarkLogic Application Servers are not vulnerable to the FREAK SSL attack. 

      OpenSSL

      Eliminating the vulnerability for all configurations requires an update to the OpenSSL library. MarkLogic Server continually updates its version of the OpenSSL library, so every MarkLogic Server maintenance release published after the discovery of this vulnerability will include an OpenSSL version that is not vulnerable to the FREAK attack.

      Conclusion

      As long as FIPS mode is enabled, which is the default configuration, MarkLogic Application Servers are not vulnerable to the FREAK SSL attack.

       

      Summary

      MarkLogic 9 introduces Certificate based User Authentication, which allows users to log in to MarkLogic Server without being required to enter a user name and password. In previous versions, certificates were only used to restrict client access to MarkLogic Server in combination with the Digest/Basic User Authentication scheme. In addition to Certificate based User Authentication using Internal User and External Name verification, MarkLogic 9 also permits authenticating and authorizing user certificates against an LDAP or Active Directory database to permit access based on MarkLogic Roles and LDAP Group membership. By using this method of authentication and authorization, a site is able to manage all user access externally without the need to maintain a separate set of users within the MarkLogic security database.


      This document expands on the concepts and configuration examples described in the associated "MarkLogic Certificate based User Authentication" knowledge base article and shows the additional steps required to configure MarkLogic to authorize a User certificate against an LDAP or Active Directory server. It is highly recommended that you familiarize yourself with the previous article, as it covers in more detail the steps required to set up the MarkLogic App Server so that TLS Client Authentication is configured correctly to request and verify the certificates presented by the user.

      Creating the External Security definition

      To authorize users presenting a certificate you should first create a new External Security definition selecting “Certificate” for authentication and LDAP for authorization.

       ExternalSecurity.png

      Next, configure the LDAP server entry.

      LDAPServer.png

      Notes:

      • Unlike standard user authorization, when MarkLogic searches for the user certificate it uses a base Object search with the full certificate distinguished name rather than a sub-tree search off the “ldap base”. The MarkLogic UI currently requires an entry for the “ldap base” even though it is not used, so you will need to enter a dummy value to satisfy UI verification.
      • When performing the LDAP search, MarkLogic will request the “ldap attribute” value to use when creating the temporary userid. Care should be taken when selecting this value to ensure that it is unique for all possible Certificate DNs that may be presented.
      • Ensure that the “ldap default user” has the required permissions to search for the Certificate within the LDAP or Active Directory server and return the required attributes.
      • MarkLogic uses the “memberOf” and “member” attributes to return Group and Group of Group membership. If your LDAP or Active Directory server uses different attributes, such as “isMemberOf”, you can override them in the “memberOf” and “member” attribute fields.

      Configuring the App Server

      Configure the App Server to use “certificate” authentication, set “Internal Security” to false and select the external security definition created above.

      AppServer1.png

      Enable TLS Client Authentication and configure the SSL Client Certificate Authorities that you will accept as signers of the user certificates. Any certificate presented that is not signed by one of the specified CAs will be rejected.

      AppServer2.png 

      AppServer3.png

      For more details on configuring the CA certificates required for certificate based authentication, please refer to the knowledge base article "MarkLogic Certificate based User Authentication".

      Configure MarkLogic Security Roles

      For each role specify one or more external names that match the “memberOf” attribute returned for the Certificate DN.

      Role.png
       
      To confirm that users are being authorized to the MarkLogic AppServer correctly, connect using your browser or a command line tool such as “curl”.

      MacPro-4505:~ $ curl -k --cert ./mluser1.p12:password https://localhost:8013
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
      <head>
      <title>Welcome to the MarkLogic Test page.</title>
      </head>
      <body><p>This application is running on MarkLogic Server version 9.0-1.1</p></body>

       
      Within the AppServer AccessLog, you should see a mapping for a new temporary userid to the expected role.

      External User(mluser1) is Mapped to Temp User(mluser1) with Role(s): mladmin
      ::1 - mluser1 [18/Jul/2017:16:07:05 +0100] "GET / HTTP/1.1" 200 347 - "curl/7.51.0"

      Troubleshooting

      If a user is not able to connect using their certificate, the first thing to check is if the Certificate Distinguished Name (DN) can be found in the LDAP or Active Directory database and if it contains the required userid and memberOf attributes.

      Using a tool such as OpenSSL, determine the correct Certificate Subject DN, e.g.

      MacPro-4505:~ $ openssl x509 -in mluser1.pem -text
      Certificate:
      Data:
      Version: 3 (0x2)
      Serial Number: 1497030421 (0x593adf15)
      Signature Algorithm: sha256WithRSAEncryption
      Issuer: CN=User Signing Authority, O=MarkLogic, OU=Support
      Validity
      Not Before: Jun 9 17:47:13 2017 GMT
      Not After : Jun 9 17:47:13 2018 GMT
      Subject: CN=mluser1, OU=Users, DC=MarkLogic, DC=Local
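As a small convenience, the Subject line can be pulled out of the `openssl x509 -text` output and reshaped into the comma-separated form used as the LDAP search base. This is only a sketch: `subject_to_dn` is a hypothetical helper, and it assumes the Subject is printed in the same order as the LDAP DN (as in the example above) and that no value in the DN contains a comma.

```shell
# Hypothetical helper: read `openssl x509 -text` output on stdin and print
# the Subject DN with the whitespace after each comma removed.
subject_to_dn() {
    sed -n 's/^[[:space:]]*Subject:[[:space:]]*//p' | sed 's/,[[:space:]]*/,/g'
}

# Example usage with the certificate from this article:
#   openssl x509 -in mluser1.pem -text | subject_to_dn
```

The attribute type names keep the case printed by OpenSSL (CN, OU, DC); LDAP treats attribute type names as case-insensitive, so this should not affect the search base.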
       
      Next, using an LDAP lookup tool such as “ldapsearch” or "ldp.exe" on Microsoft Windows, perform a base Object search for the Certificate DN, requesting the LDAP user and memberOf attributes (with the entries matching your LDAP External Security settings).

      If either the userid or memberOf attributes are missing access will be denied.


      MacPro-4505:~ $ ldapsearch -H ldap://192.168.66.240:389 -x -D "cn=manager,dc=marklogic,dc=local" -W -s base -b "cn=mluser1,ou=Users,dc=MarkLogic,dc=Local" "memberOf" "uid"
      # extended LDIF
      #
      # LDAPv3
      # base <cn=mluser1,ou=Users,dc=MarkLogic,dc=Local> with scope baseObject
      # filter: (objectclass=*)
      # requesting: memberOf uid
      #
      # mluser1, Users, MarkLogic.Local
      dn: cn=mluser1,ou=Users,dc=MarkLogic,dc=Local
      uid: mluser1
      memberOf: cn=AppAdmin,ou=Groups,dc=MarkLogic,dc=Local
      # search result
      search: 2
      result: 0 Success
       
      If MarkLogic is able to locate the certificate successfully and return the required attributes, then check whether the external names in the security role match (case-sensitive) the “memberOf” attribute returned by the LDAP search.

      The following XQuery can be used to show all the external names assigned to a specific role. 


      (: execute this against the security database :)
      xquery version "1.0-ml";
      import module namespace sec = "http://marklogic.com/xdmp/security"
          at "/MarkLogic/security.xqy";
      sec:role-get-external-names("mladmin")


      Result

      cn=AppAdmin,ou=Groups,dc=MarkLogic,dc=Local
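Because the comparison is case-sensitive, a value that differs only in letter case is an easy mistake to miss. The sketch below is a hypothetical shell helper (`compare_external_name` is not a MarkLogic command) that compares a role's external name with the memberOf value returned by ldapsearch and reports whether they match exactly, differ only by case, or differ entirely.

```shell
# Hypothetical helper: compare a role external name ($1) with an LDAP
# memberOf value ($2) and print "match", "case-mismatch", or "mismatch".
compare_external_name() {
    if [ "$1" = "$2" ]; then
        echo "match"
        return
    fi
    # Lower-case both values to detect a difference in letter case only.
    lower1=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    lower2=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$lower1" = "$lower2" ]; then
        echo "case-mismatch"
    else
        echo "mismatch"
    fi
}

# Values taken from the role and ldapsearch output in this article.
compare_external_name "cn=AppAdmin,ou=Groups,dc=MarkLogic,dc=Local" \
                      "cn=AppAdmin,ou=Groups,dc=MarkLogic,dc=Local"
```

A "case-mismatch" result suggests the external name on the role should be corrected to the exact value returned by the LDAP server.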


      If MarkLogic is still not able to authenticate users, it is very useful to use a packet capture tool such as Wireshark to check whether MarkLogic is able to contact the LDAP or Active Directory server and is receiving the expected successful responses to the Admin bind and the search for the Certificate DN.

      The following example trace shows a successful BIND using the LDAP Default user followed by a successful search for the Certificate DN.

      LDAPWireshark.png

      Further Reading

      Summary

      MarkLogic 9 introduces Certificate based User Authentication, which allows users to log in to MarkLogic Server without being required to enter a user name and password. In previous versions, certificates were only used to restrict client access to MarkLogic Server in combination with the Digest/Basic User Authentication scheme. Certificate based User Authentication can be configured using either Internal User or External Name based user configurations.

      Certificate Authentication: Internal User vs External Name based Authentication:

      The difference between Internal User and External Name based authentication lies in whether the user named in the Certificate CN field (demoUser1 in our example) exists in the MarkLogic Security Database (Internal User), or whether the certificate's whole Subject field (as a DN) is mapped as an External Name value on an existing user.

      User Certificate Example:

      A few common steps and examples are listed below for clarity. For our example setup, the certificate presented by the App Server user (demoUser1) is as follows.

      $ openssl x509 -in UserCert.pem -text -noout
      Certificate:
          Data:
              Version: 1 (0x0)
              Serial Number: 7 (0x7)
          Signature Algorithm: sha1WithRSAEncryption
              Issuer: C=US, ST=NY, L=New York, O=MarkLogic Corporation, OU=Engineering, CN=MarkLogic DemoCA
              Validity
                  Not Before: Jul 11 02:58:24 2017 GMT
                  Not After : Aug 27 02:58:24 2019 GMT
              Subject: C=US, ST=NJ, L=Princeton, O=MarkLogic Corporation, OU=Engineering, CN=demoUser1
              Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                      Public-Key: (1024 bit)
                      Modulus:
                          .....................
                      Exponent: 65537 (0x10001)
          Signature Algorithm: sha1WithRSAEncryption

      CA Certificate (User Cert Signer) Import from Admin GUI

      In order for MarkLogic Server to accept the certificate presented by a user, the Certificate Authority (CA) used to sign the User Certificate must be installed into MarkLogic. We can install the CA Certificate (below) used to sign the demoUser1 certificate using the Admin GUI->Configure->Security->Certificate Authority Import tab.

      $ openssl x509 -in CACert.pem -text -noout
      Certificate:
          Data:
              Version: 3 (0x2)
              Serial Number: 9774683164744115905 (0x87a6a68cc29066c1)
          Signature Algorithm: sha256WithRSAEncryption
              Issuer: C=US, ST=NY, L=New York, O=MarkLogic Corporation, OU=Engineering, CN=MarkLogic DemoCA
              Validity
                  Not Before: Jul 11 02:53:18 2017 GMT
                  Not After : Jul  6 02:53:18 2037 GMT
              Subject: C=US, ST=NY, L=New York, O=MarkLogic Corporation, OU=Engineering, CN=MarkLogic DemoCA
              Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                      Public-Key: (4096 bit)
                      Modulus:
                         ......................
                      Exponent: 65537 (0x10001)
              X509v3 extensions:
                  X509v3 Subject Key Identifier:
                      D9:45:B9:9A:DC:93:7B:DB:47:07:C6:96:63:57:13:A7:A8:F1:D0:C8
                  X509v3 Authority Key Identifier:
                      keyid:D9:45:B9:9A:DC:93:7B:DB:47:07:C6:96:63:57:13:A7:A8:F1:D0:C8
                  X509v3 Basic Constraints: critical
                      CA:TRUE
                  X509v3 Key Usage: critical
                      Digital Signature, Certificate Sign, CRL Sign
          Signature Algorithm: sha256WithRSAEncryption

      CA Certificate Import into MarkLogic from Query Console

      We can also import the above Certificate Authority with the XQuery function pki:insert-trusted-certificates to load the trusted CA into MarkLogic.  The sample Query Console code below demonstrates this process.

      (Please ensure this query is executed against the Security database)

      Certificate Template & Template CA import into Client (Browser/SSL Client)

      To enable an SSL App Server, we will either:

      1) Create a Certificate Template to utilize a Self Signed Certificate,

      or 2) Import a pre-signed Certificate into MarkLogic.

      In both of the above cases, we will need to import the CA used to sign the Certificate used by the MarkLogic SSL App Server into the client browser or SSL client (example clients below).

      Importing a Self Signed Certificate Authority into Mozilla Firefox 

      Importing a Self Signed Certificate Authority into Windows

      Once the template is created, we will link it to our App Server to enable SSL on that App Server.

      Certificate Authentication: CN as Internal User vs External Name based Internal User

      The difference between the two lies in whether the Certificate CN field user (demoUser1 in our example) exists in the MarkLogic Security Database as an Internal User, or whether the user retrieved from the Certificate Subject field is mapped as an External Name on an existing user.

      1.) Certificate Authentication: Certificate CN field value as MarkLogic Security Database Internal User

      Steps to configure Certificate based User Authentication for our User demoUser1 as MarkLogic Internal User.

      a.) Create User "demoUser1" with necessary roles in MarkLogic Security (Internal User).

      DemoUser1_Internal_User.png

      b.) On the AppServer page, set the Authentication scheme to "Certificate" with Internal Security set to "true". Also, unless you want some users to be authenticated as External Users as well, leave the External Security object set to "none".

      AppServer_Authentication_Certificate.png

      c.) On the AppServer, also select the CA used to sign the Client/User Certificate as an accepted Certificate Authority (please see section: CA Certificate earlier for our example).

      ClientCert_CA.png

      Once configured, a browser with the User Certificate (demoUser1) installed will be able to access the above App Server and log in to MarkLogic as the internal user demoUser1 (Note: the necessary Roles also need to be assigned to the Internal User to access resources as needed).

      2.) Certificate Authentication: User Certificate Subject field value as External Name for Internal User

      Steps to configure Certificate based User Authentication for our User demoUser1 as MarkLogic External Name for Internal User "newUser1".

      a.) Create User "newUser1" with the necessary roles in MarkLogic Security (Internal User), and configure the User Certificate Subject field as an External Name for the user.

      NewUser1_External_Name.png

      b.) Create an External Security object with Certificate based Authentication.

      External_Sec_Object.png

      c.) On the External Security Object configuration itself, select the CA used to sign the Client/User Certificate as an accepted Certificate Authority (please see section: CA Certificate earlier for our example).

      Please note: the configuration below is different from configuring the Client CA on the App Server (required for Internal User).

      External_Sec_ClientCert_CA.png

      d.) For External Name (Cert Subject field) based linkage to an Internal User, the App Server needs to point to our External Security Object.

      AppServer_ExternalSec_Link.png

      Summary

      MarkLogic may fail to start with an XDMP-ENCODING error: Initialization: XDMP-ENCODING: (err:XQST0087) Unsupported character encoding: ascii.  This is caused by a mismatch between the Linux locale character set and the UTF-8 character set required by MarkLogic.

      Solution

      This issue occurs when the Linux locale LANG setting does not use the UTF-8 character set.  It can be resolved by changing the value of LC_ALL to "en_US.UTF-8".  For default installations of MarkLogic, this should be done for the root user.  To change the system-wide locale settings, /etc/locale.conf needs to be modified. This can be done using the localectl command.

      • sudo localectl set-locale LANG=en_US.utf8

      If MarkLogic is configured to run as a non-root user, then the locale can be set in that user's environment using the $HOME/.i18n file.  If the file does not exist, please create it and ensure it contains the following:

      • export LANG="en_US.UTF-8"

      If that does not resolve the issue in the user environment, then you may need to look at setting LC_CTYPE, or LC_ALL for the locale.

      • LC_CTYPE will override the character set part of the LANG setting, but will not change other locale settings.
      • LC_ALL will override both LC_CTYPE and all locale configurations of the LANG setting.
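The override order described above can be sketched as a small shell function. This is an illustration only: `effective_ctype` is a hypothetical helper that takes the values of LC_ALL, LC_CTYPE, and LANG and prints the one that determines the character set, following the standard locale precedence (LC_ALL first, then LC_CTYPE, then LANG).

```shell
# Hypothetical helper: given LC_ALL ($1), LC_CTYPE ($2) and LANG ($3),
# print the value that decides the character set. Empty values are skipped.
effective_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$3"
    fi
}

# Check the current environment; MarkLogic needs a UTF-8 character set.
# Only the two common spellings of the UTF-8 suffix are recognised here.
case "$(effective_ctype "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}")" in
    *.UTF-8|*.utf8) echo "character set is UTF-8" ;;
    *)              echo "character set is not UTF-8" ;;
esac
```

On a running system, `locale charmap` prints the character set that is actually in effect for the current shell, which is a quick way to confirm the result.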

      References

      https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-keyboard_configuration

      https://access.redhat.com/solutions/974273

      https://www.unix.com/man-page/centos/1/localectl/

      http://man7.org/linux/man-pages/man1/locale.1.html

      Introduction

      There is a lot of useful information in MarkLogic Server's documentation surrounding many of the new features of MarkLogic 9, including the new SQL implementation, improvements made to the ODBC driver and the new system for generating SQL "view" templates for your data. This article attempts to pull it all together by showing all the steps needed to create a successful connection and to verify that everything is set up correctly and works as expected.

      This guide presents a step-by-step walkthrough covering the installation of all the necessary components, the configuration of the ODBC driver and the loading of data into MarkLogic in order to create a Template View that will allow a SQL query to be rendered.

      Prerequisites

      We're starting with a clean install of Red Hat Enterprise Linux 7:

      $ uname -a
      Linux engrlab-128-084.engrlab.marklogic.com 3.10.0-327.4.5.el7.x86_64 #1 SMP Thu Jan 21 04:10:29 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

      In this example, I'm using yum to manage the additional dependencies (openssl-libs and unixODBC) required for the MarkLogic ODBC driver:

      $ sudo yum install openssl-libs
      Package 1:openssl-libs-1.0.2k-8.el7.x86_64 already installed and latest version
      Nothing to do
      
      $ sudo yum install unixODBC
      Package unixODBC-2.3.1-11.el7.x86_64 already installed and latest version
      Nothing to do
      

      If you want to use the latest version of unixODBC (2.3.4 at the time of writing), you can download it using curl:

      $ curl -O ftp://ftp.unixodbc.org/pub/unixODBC/unixODBC-2.3.4.tar.gz
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100 1787k  100 1787k    0     0   235k      0  0:00:07  0:00:07 --:--:--  371k

      Please note - as per the documentation, this method will require unixODBC to be compiled so additional dependencies may need to be met for this.

      This article assumes that you have downloaded the ODBC driver for MarkLogic Server and the MarkLogic 9 install binary and have those available on your machine:

      $ ll
      total 310112
      -r--r--r-- 1 support support 316795526 Nov 16 04:19 MarkLogic-9.0-3.x86_64.rpm
      -r--r--r-- 1 support support    754596 Nov 16 04:18 mlsqlodbc-1.3-3.x86_64.rpm
      
      Getting started: installing and configuring MarkLogic 9 with an ODBC Server

      We will start by installing and starting MarkLogic 9:

      $ sudo rpm -i MarkLogic-9.0-3.x86_64.rpm
      $ sudo service MarkLogic start
      Starting MarkLogic:                                        [  OK  ]

      From there, we can point our browser at http://host:8001 and walk through the initial MarkLogic install process:

      As soon as the install process has been completed and you have created an Administrator user for MarkLogic Server, we're ready to create an ODBC Application Server.

      To do this, go to Configure > Groups > Default > App Servers and select the Create ODBC tab:

      Next we're going to make the minimal configuration necessary by entering the required fields - the odbc server name, the Application Server module directory root and the port.

      In this example we will configure the Application Server using the following values:

      odbc server name: ml-odbc
      root: /
      port: 5432

      After this is done, confirm that the Application Server has been created by going to Configure > Groups > Default > App Servers and ensure that you can see the ODBC Server listed and configured on port 5432 as per the image below:

      Getting started: Setting up the MarkLogic ODBC Driver

      Use RPM to install the ODBC driver:

      $ sudo rpm -i mlsqlodbc-1.3-3.x86_64.rpm
      odbcinst: Driver installed. Usage count increased to 1.
          Target directory is /etc

      Configure the base template as instructed in the installation guide:

      $ odbcinst -i -s -f /opt/MarkLogic/templates/mlsql.template

      Getting started: ensure unixODBC is configured

      To ensure the unixODBC command-line client is configured, you can run isql -h to bring up the help options:

      $ isql -h
      
      **********************************************
      * unixODBC - isql                            *
      **********************************************
      * Syntax                                     *
      *                                            *
      *      isql DSN [UID [PWD]] [options]        *
      *                                            *
      * Options                                    *
      *                                            *
      * -b         batch.(no prompting etc)        *
      * -dx        delimit columns with x          *
      * -x0xXX     delimit columns with XX, where  *
      *            x is in hex, ie 0x09 is tab     *
      * -w         wrap results in an HTML table   *
      * -c         column names on first row.      *
      *            (only used when -d)             *
      * -mn        limit column display width to n *
      * -v         verbose.                        *
      * -lx        set locale to x                 *
      * -q         wrap char fields in dquotes     *
      * -3         Use ODBC 3 calls                *
      * -n         Use new line processing         *
      * -e         Use SQLExecDirect not Prepare   *
      * -k         Use SQLDriverConnect            *
      * --version  version                         *
      *                                            *
      * Commands                                   *
      *                                            *
      * help - list tables                         *
      * help table - list columns in table         *
      * help help - list all help options          *
      *                                            *
      * Examples                                   *
      *                                            *
      *      isql WebDB MyID MyPWD -w < My.sql     *
      *                                            *
      *      Each line in My.sql must contain      *
      *      exactly 1 SQL command except for the  *
      *      last line which must be blank (unless *
      *      -n option specified).                 *
      *                                            *
      * Please visit;                              *
      *                                            *
      *      http://www.unixodbc.org               *
      *      nick@lurcher.org                      *
      *      pharvey@codebydesign.com              *
      **********************************************

      If you're not seeing the above message, another application on your system may be overriding the command. In this configuration, the isql command is found at /usr/bin/isql:

      $ which isql
      /usr/bin/isql

      Getting started: initial connection test

      If you're happy that isql is correctly installed, we're ready to test the connection using isql -v:

      $ isql -v MarkLogicSQL admin admin
      +---------------------------------------+
      | Connected!                            |
      |                                       |
      | sql-statement                         |
      | help [tablename]                      |
      | quit                                  |
      |                                       |
      +---------------------------------------+
      SQL>

      Let's confirm that it's really working by loading some data into MarkLogic and creating an SQL view around that data.

      Loading sample data into MarkLogic

      To load data, we're going to use Query Console to insert the same sample data that is created in the Quick Start Documentation:

      To access Query Console, point your browser at http://host:8000 and make note of the following:

      Ensure the database is set to Documents (or at least, matches the database specified by your ODBC Application Server) and ensure that the Query Type is set to JavaScript

      When these are both set correctly, run the code to generate sample data (note that this data is taken from the quick start guide and reproduced here for convenience):

      After that has run, you should see a null response back from the query:

      To confirm that the data was loaded successfully, you can use the Explore button.  You should now see that 22 employee documents (rows) are now in the database:

      Create the template view

      Now the documents are loaded, a tabular view for that data needs to be created.

      Ensure the database is (still) set to Documents (or at least, matches the database specified by your ODBC Application Server) and ensure that the Query Type is now set to XQuery

      As soon as this is set, you can run the code below to generate the template view (note that this data is taken from the quick start guide and reproduced here for convenience):

      And to confirm this was loaded, Query Console should report an empty sequence was returned.

      Test the template using a SQL Query

      Keep the database set to Documents and ensure that the Query Type is now set to SQL:

      Then you can run the following SQL Query:

      SELECT * FROM employees

      If everything has worked correctly, Query Console should render a view of the table in response to your query:

      Test the SQL Query via the ODBC Driver

      All that remains now is to go back to the shell and test the same connection over ODBC.

      To do this, we're going to use the isql command again and run the same request there:

      $ isql -v MarkLogicSQL admin admin
      +---------------------------------------+
      | Connected!                            |
      |                                       |
      | sql-statement                         |
      | help [tablename]                      |
      | quit                                  |
      |                                       |
      +---------------------------------------+
      SQL> select * from employees
      <<< RESPONSE CUT >>>
      SQLRowCount returns 7
      7 rows fetched
      

      Further reading

      Introduction

      This article details changes to the upgrade procedures for MarkLogic 9 AMIs.

      MarkLogic 9 now supports 1-click deployment in AWS Marketplace. This is in addition to the existing options of manually launching an AMI and launching MarkLogic clusters via CloudFormation templates. In order to make 1-click launch possible, our AMIs have a pre-configured data volume (device on /dev/sdf).  The updated CloudFormation templates account for the pre-configured data volume. This change also requires a different approach to our documented upgrade process.

      Details

      As per MarkLogic EC2 Guide, the main goal of the upgrade is to update AMI IDs in CloudFormation in order to upgrade all instances in the stack. There is now an additional step to handle the blank data volume that is pre-configured on MarkLogic AMIs.

      Always back up your data before attempting any upgrade procedure!

      Scenario 1:  You are using unmodified CF templates that were published by MarkLogic on http://developer.marklogic.com/products/cloud/aws starting from version 8.0-3.

      1. Update your CloudFormation stack with the latest template, as there have been no breaking changes since 8.0-3. The current templates for MarkLogic 9 include new AWS regions, new AMI IDs, and code to remove the blank data volume that is bundled with current AMIs.
      2. In the EC2 Dashboard, stop one instance at a time and wait for it to be replaced with a new one.
      3. For a rolling upgrade (and as a good practice), terminate the other nodes one by one, starting with the node that has the Security database. They will come up and reconnect without any UI interaction.
      4. Go to port 8001 on any new instance, where an upgrade prompt should be displayed.
      5. Click OK and wait for the upgrade to complete on the instance.

      Scenario 2: You made some changes to MarkLogic templates or you are using custom templates.

      1. Download current templates from http://developer.marklogic.com/products/cloud/aws.
      2. Locate the AMI IDs by searching for the "AWSRegionArch2AMI" block in the template.
        "AWSRegionArch2AMI": {
              "us-east-1": {
                "HVM": "ami-54a8652e"
              },
              "us-east-2": {
                "HVM": "ami-2ab29f4f"
              }, ...
      3. Locate AMI IDs in the old template and replace them with the ones from the new template. 
      4. Locate the "BlockDeviceMappings" section in the new template that was downloaded in step 1. This block of code was added to remove the blank volume that is part of the new 1-click AMIs.
      5. Update the old template to include "BlockDeviceMappings" as a property of LaunchConfig. There will be one or three LaunchConfig blocks depending on the template used. Those can be located by searching for "AWS::AutoScaling::LaunchConfiguration". Here is an example of the new property under LaunchConfig.
        "LaunchConfig": {
          "Type": "AWS::AutoScaling::LaunchConfiguration",
          "Properties": {
            "BlockDeviceMappings": [{
              "DeviceName": "/dev/sdf",
              "NoDevice": true,
              "Ebs": {}
            }],
            ...
      6. Once all the changes are saved, update your stack with the updated CloudFormation template. Make sure the stack update is complete.
      7. In the EC2 Dashboard, terminate nodes one by one, starting with the node that has the Security database. New nodes will come up after a couple of minutes and reconnect without any UI interaction.
      8. Wait for all nodes to be up and in green state.
      9. Go to port 8001 on any new instance, where an upgrade prompt should be displayed.
      10. Click OK and wait for the upgrade to complete on the instance.
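      Steps 2 to 5 of Scenario 2 can also be scripted. The following is a minimal sketch, not part of the official upgrade procedure. It assumes both templates are JSON that has already been parsed into Python dictionaries, and that the "AWSRegionArch2AMI" block sits under the template's top-level "Mappings" section (the usual CloudFormation layout, but check your own template). The function name patch_template is hypothetical.

```python
import copy

LAUNCH_CONFIG_TYPE = "AWS::AutoScaling::LaunchConfiguration"

# Property from the article: suppresses the blank /dev/sdf volume bundled with 1-click AMIs.
BLANK_VOLUME_MAPPING = {"DeviceName": "/dev/sdf", "NoDevice": True, "Ebs": {}}


def patch_template(old_template, new_template):
    """Return a patched copy of old_template; the inputs are left unmodified."""
    patched = copy.deepcopy(old_template)

    # Steps 2-3: take the AMI IDs from the new template.
    # Assumption: the AMI map lives under the top-level "Mappings" section.
    new_amis = new_template["Mappings"]["AWSRegionArch2AMI"]
    patched.setdefault("Mappings", {})["AWSRegionArch2AMI"] = copy.deepcopy(new_amis)

    # Steps 4-5: add the /dev/sdf mapping to every LaunchConfiguration resource,
    # unless the resource already declares a mapping for that device.
    for resource in patched.get("Resources", {}).values():
        if resource.get("Type") != LAUNCH_CONFIG_TYPE:
            continue
        props = resource.setdefault("Properties", {})
        mappings = props.setdefault("BlockDeviceMappings", [])
        if not any(m.get("DeviceName") == "/dev/sdf" for m in mappings):
            mappings.append(copy.deepcopy(BLANK_VOLUME_MAPPING))
    return patched
```

      The templates can be read with json.load and the result written back with json.dump before updating the stack in step 6. Review the patched template by hand before applying it.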

      Scenario 3: You have instances that were brought up directly from MarkLogic AMI. For each MarkLogic instance in your cluster, do the following:

      1. Terminate the instance.
      2. Launch a new instance from the upgraded AMI.
      3. Detach the blank volume that is mounted on /dev/sdf (it should be 10GB in size).
      4. Attach the EBS data volume associated with the original instance.

      More details on how to update CloudFormation stack can be found at http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks.html

      Introduction: the decimal type

      In order to be compliant with the XQuery specification and to satisfy the needs of customers working with financial data, MarkLogic Server implements a decimal type, available in XQuery and server-side JavaScript.

      The decimal type has been implemented for very specific requirements. Decimals have about a dozen more bits of precision than doubles, but they take up more memory and arithmetic operations on them are much slower.

      Use the double where possible

      Unless you have a specific requirement to use a decimal data type, in most cases it's better and faster to use the double data type to represent large numbers.

      Specific details about the decimal data type

      If you still want or need to use a decimal data type, its limitations and the details of how it is implemented in MarkLogic Server are given below:

      o   Precision

      • How many decimal digits of precision does it have?

      The MarkLogic implementation of xs:decimal representation is designed to meet the XQuery specification requirements to provide at least 18 decimal digits of precision. In practice, up to 19 decimal digits can be represented with full fidelity.

      • If it is a binary number, how many binary digits of precision does it have?

      A decimal number is represented inside MarkLogic with 64 bits of digits and an additional 64 bits for the sign and a scale (which specifies where the decimal point is).

      • What are the exact upper and lower bounds of its precision?

      -18446744073709551615 to 18446744073709551615 

      Any operation producing a number outside this range will result in an XDMP-DECOVRFLW error (decimal overflow).
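
      The digit counts above follow from simple arithmetic on the 64-bit bound. The sketch below is illustrative Python, not MarkLogic code, and the helper name fits_in_digits is hypothetical. It checks that every 19-digit integer fits in 64 bits of digits while not every 20-digit integer does, and that a double (with its 53-bit significand) cannot hold the upper bound exactly.

```python
# Upper bound quoted above: the largest value that fits in 64 bits of digits.
MAX_DIGITS = 2**64 - 1


def fits_in_digits(n):
    """True if the magnitude of n fits in the 64-bit digits field."""
    return abs(n) <= MAX_DIGITS


largest_19_digit = 10**19 - 1   # 9999999999999999999
largest_20_digit = 10**20 - 1   # 99999999999999999999

print(fits_in_digits(largest_19_digit))   # every 19-digit integer fits
print(fits_in_digits(largest_20_digit))   # some 20-digit integers do not

# A double rounds the bound to the nearest representable value.
print(int(float(MAX_DIGITS)) == MAX_DIGITS)
```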

      o   Scale

      • Does it have a fixed scale or floating scale?

      It has a floating scale.

      • What are the limitations on the scale?

      -20 to 0

      So the magnitudes you can represent range from 1 * (10^-20) to 18446744073709551615

      • Is the scale binary or decimal?

      Decimal

      • How many decimal digits can it scale?

      20

      • How many binary digits can it scale?

      N/A

      • What is the smallest number it can represent and the largest?

      smallest: -1*(2^64 - 1)
      closest to zero: 1*(10^-20)
      largest: (2^64 - 1)

      • Are all integers safe or does it have a limited safe range for integers?

      It can represent 64 bit unsigned integers with full fidelity.

       

      o   Limitations

      • Does it have binary rounding errors?

      The division algorithm on Linux in particular does convert to an 80-bit binary floating point representation to calculate reciprocals - which can result in binary rounding errors. Other arithmetic algorithms work solely in base 10.

      • What numeric errors can it throw and when?

      Overflow: Number is too big or small to represent
      Underflow: Number is too close to zero to represent
      Loss of precision: The result has too many digits of precision (essentially the 64bit digits value has overflowed)

      • Can it represent floating point values, such as NaN, -Infinity, +Infinity, etc.?

       No

      o   Implementation

      • How is the DECIMAL data type implemented?

      It has a representation with 64 bits of digits, a sign, and a base 10 negative exponent (fixed to range from -20 to 0). So the value is calculated like this:

      sign * digits * (10 ^ -exponent)

      • How many bytes does it consume?

      On disk, for example in triple indexes, it's not a fixed size as it uses integer compression. At maximum, the decimal scalar type consumes 16 bytes per value: eight bytes of digits, four bytes of sign, and four bytes of scale. It is not space efficient but it keeps the digits aligned on eight-byte boundaries.
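
      The representation described above can be modeled in a few lines. This is a sketch under stated assumptions, not MarkLogic's actual code: the class name SketchDecimal is hypothetical, the scale is interpreted as the number of digits after the decimal point (0 to 20), and the Python exceptions stand in for the server's own overflow errors.

```python
from dataclasses import dataclass
from fractions import Fraction

MAX_DIGITS = 2**64 - 1   # 64 bits of digits
MAX_SCALE = 20           # the decimal point can move at most 20 places


@dataclass(frozen=True)
class SketchDecimal:
    sign: int    # +1 or -1
    digits: int  # unsigned integer, 0 .. MAX_DIGITS
    scale: int   # digits after the decimal point, 0 .. MAX_SCALE (assumption)

    def __post_init__(self):
        if self.sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        if not 0 <= self.digits <= MAX_DIGITS:
            raise OverflowError("digits do not fit in 64 bits")
        if not 0 <= self.scale <= MAX_SCALE:
            raise OverflowError("scale must be between 0 and 20")

    def value(self):
        # sign * digits * 10^-scale, kept exact by using a rational number
        return Fraction(self.sign * self.digits, 10**self.scale)


price = SketchDecimal(sign=1, digits=12345, scale=2)      # 123.45
closest_to_zero = SketchDecimal(sign=1, digits=1, scale=20)
print(price.value(), closest_to_zero.value())
```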

      Summary

      A database or forest backup in MarkLogic Server may be significantly slower than just performing a file copy (cp in Linux).  Why is this so?

      Details

      Using cp on very large files on a large-memory Linux system can produce huge amounts of dirty pages that can saturate i/o channels for minutes while the data is flushed to disk. cp also doesn’t wait for the data to be written before returning.  As a result, cp is very unfriendly to other applications running on the same system.

      When MarkLogic Server performs a backup, it works hard not to saturate any subsystem or resource. MarkLogic takes care that the number of dirty pages at any one time is never very large, and it keeps the i/o queues short so that any concurrent database queries and updates are not significantly impacted by the backup. Finishing the backup in the fastest possible time is not the priority. 

      Can I make it go faster?

      Yes, there is a diagnostic trace event “Unthrottle Backup” that turns off throttling in MarkLogic. However, even with throttling turned off, MarkLogic will still work to keep the number of dirty pages low.

      The diagnostic trace event can be enabled from the MarkLogic Server Admin UI: navigate to Configure -> Groups -> {group-name} -> Diagnostic, set trace events activated to true, add “Unthrottle Backup” (without quotes), and press "ok".

      Introduction

      MarkLogic automatically provides 

      • ANSI REPEATABLE READ level of isolation for update transactions, and 
      • Serializable isolation for read-only (query) transactions.

      MarkLogic can be made to provide ANSI SERIALIZABLE isolation for update transactions, but doing so requires developers to manage their own predicate locks.

      Isolation Levels - Background

      There are many possible levels of isolation, and many different taxonomies of isolation levels. The most common taxonomy (familiar to those with an RDBMS background) is the one defined by ANSI SQL, which defines four levels of isolation based on read phenomena that are possible at each level. ANSI has a definition for each phenomenon, but these definitions are open to interpretation. Broad interpretation results in more rigorous criteria for each isolation level (and therefore better isolation at each level), whereas strict interpretation results in less rigorous isolation at each level. Here I’ll use a shorthand notation to describe these phenomena, and will use the broad rather than the strict interpretation. The notation specifies the operation, the transaction performing the operation, and the item or domain on which the operation is performed. Operations in my notation are:

      • Write (w)
      • Read (r)
      • Commit (c)
      • Abort/rollback (a)

      An example of this shorthand: w1[x] means transaction1 writes to item x.

      Now the phenomena:

      • A dirty read happens when a transaction T2 reads an item that is being written by concurrently running transaction T1. In other words: w1[x]…r2[x]…((c1 or a1) and (c2 or a2) in any order). This phenomenon could lead to an anomaly in the case where T1 later aborts, and T2 has then read a value that never existed in the database.
      •  A non-repeatable read happens when a transaction T2 writes an item that was read by a transaction T1 prior to T1 completing. In other words: r1[x]…w2[x]…((c1 or a1) and (c2 or a2) in any order). Non-repeatable reads don’t produce the same anomalies as dirty reads, but can produce errors in cases where T1 relies on the value of x not changing between statements in a multi-statement transaction (e.g. reading and then updating a bank account balance).
      • A phantom read happens when a transaction T1 retrieves a set of data items matching some search condition and concurrently running transaction T2 makes a change that modifies the set of items that match that condition. In other words: (r1[P] and w2[x in P] in any order)…((c1 or a1) and (c2 or a2) in any order), where P is a set of results. Phantom reads are usually less serious than dirty or non-repeatable reads because it generally doesn’t matter if item x in P is written before or after T1 finishes unless T1 is itself explicitly reading x. And in this case the phenomenon would no longer be a phantom, but would instead be a dirty or non-repeatable read per the definitions above. That said, there are some cases where phantom reads are important.
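
      The first two patterns can be checked mechanically against a history written in the shorthand above. The following is a small illustrative sketch (the function name find_phenomena and the tuple encoding are assumptions made for this example). It uses the broad interpretation: a read of an item written by a still-active transaction is a dirty read, and a write to an item read by a still-active transaction is a non-repeatable read. Phantom reads are not modeled because they require a notion of predicate sets.

```python
from collections import defaultdict


def find_phenomena(history):
    """Scan a history of (op, txn, item) tuples and list the phenomena found.

    op is "r", "w", "c" or "a"; item is ignored for "c" and "a".
    Each result is (name, first_txn, second_txn, item).
    """
    writers = defaultdict(set)  # item -> active transactions that have written it
    readers = defaultdict(set)  # item -> active transactions that have read it
    found = []
    for op, txn, item in history:
        if op == "r":
            # w1[x]...r2[x] while T1 is still active
            for other in sorted(writers[item] - {txn}):
                found.append(("dirty read", other, txn, item))
            readers[item].add(txn)
        elif op == "w":
            # r1[x]...w2[x] while T1 is still active
            for other in sorted(readers[item] - {txn}):
                found.append(("non-repeatable read", other, txn, item))
            writers[item].add(txn)
        elif op in ("c", "a"):
            # A completed transaction no longer takes part in either pattern.
            for holders in list(writers.values()) + list(readers.values()):
                holders.discard(txn)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return found


# w1[x] r2[x] a1 c2: T2 read a value that never existed in the database.
dirty = find_phenomena([("w", 1, "x"), ("r", 2, "x"), ("a", 1, None), ("c", 2, None)])
print(dirty)
```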

       The isolation levels ANSI defines are based on which of these three phenomena are possible at that isolation level. They are:

      • READ UNCOMMITTED – all three phenomena are possible at this isolation level.
      • READ COMMITTED – Dirty reads are not possible, but non-repeatable and phantom reads are.
      • REPEATABLE READ – Dirty and non-repeatable reads are not possible, but phantom reads are.
      • SERIALIZABLE – None of the three phenomena are possible at this isolation level.

      Note that as defined above, ANSI SERIALIZABLE is not sufficient for transactions to be truly serializable (in the sense that running them concurrently and running them in series would in all cases produce the same result), so SERIALIZABLE is an unfortunate choice of names for this isolation level, but that’s what ANSI called it.

      Update Transaction Locks

      Typically, a DBMS will avoid dirty and non-repeatable reads by taking locks on records (called item locks). Locks are either shared locks (which can be held by more than one transaction) or exclusive locks (which can be held by only one transaction at a time). In most DBMSes (including MarkLogic), locks taken when reading an item are shared and locks taken when writing an item are exclusive.

      MarkLogic prevents dirty and non-repeatable reads in update transactions by taking item locks on items that are being read or written during a transaction and releasing those locks only on completion of the transaction (post-commit or post-abort). When a transaction needs to lock an item on which another transaction has an exclusive lock, that transaction waits until either the lock is released or the transaction times out. Deadlock detection prevents cases where two transactions are waiting on each other for exclusive locks. In this case one of the transactions will abort and restart.

      In addition, MarkLogic prevents some types of phantom reads by taking item locks on the set of items in a search result. This prevents phantom reads involving T2 removing an item in a set that T1 previously searched, but does not prevent phantom reads involving T2 inserting an item in a set that T1 previously searched, or those involving T2 searching for items and seeing a deletion caused by T1.

      Avoiding All Phantom Reads

      To avoid all phantom reads via locking, it is necessary to take locks not just on items that currently match the search criteria, but also on all items that could match the search criteria, whether they currently exist in the database or not. Such locks are called predicate locks. Because you can search for pretty much anything in MarkLogic, guaranteeing a predicate lock for arbitrary searches would require locking the entire database. From a concurrency and throughput perspective, this is obviously not desirable. MarkLogic therefore leaves the decision to take predicate locks and the scope of those locks in the hands of application developers. Because the predicate domain can frequently be narrowed down with some application-specific knowledge, this provides the best balance between isolation and concurrency. To take a predicate lock, you lock a synthetic URI representing the predicate domain in every transaction that reads from or writes to that domain. You can take shared locks on a synthetic URI via fn:doc(URI). Exclusive locks are taken via xdmp:lock-for-update(URI).

      Note that predicate locks should only be taken in situations where phantom reads are intolerable. If your application can get by with REPEATABLE READ isolation, you should not take predicate locks, because any additional locking results in additional serialization and will impact performance.
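
      To make the synthetic-URI idea concrete, here is a toy model of shared and exclusive locks in Python. It is only a sketch of the locking rules described above, not of MarkLogic's lock manager: the class LockTable, its methods, and the URI are all hypothetical, and there is no waiting, timeout, or deadlock detection. In MarkLogic itself, the shared lock would be taken with fn:doc(URI) and the exclusive lock with xdmp:lock-for-update(URI).

```python
class LockTable:
    """Toy lock table: many shared holders or one exclusive holder per URI."""

    def __init__(self):
        self._locks = {}  # uri -> (mode, set of holding transactions)

    def try_lock(self, txn, uri, mode):
        """Return True if txn now holds the lock, False if it would have to wait."""
        if mode not in ("shared", "exclusive"):
            raise ValueError(f"unknown lock mode: {mode!r}")
        held = self._locks.get(uri)
        if held is None:
            self._locks[uri] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may keep or upgrade its own lock.
            upgraded = "exclusive" if "exclusive" in (mode, held_mode) else "shared"
            self._locks[uri] = (upgraded, holders)
            return True
        if mode == "shared" and held_mode == "shared":
            holders.add(txn)
            return True
        return False

    def release_all(self, txn):
        """Release every lock held by txn (on commit or abort)."""
        for uri in list(self._locks):
            _, holders = self._locks[uri]
            holders.discard(txn)
            if not holders:
                del self._locks[uri]


# Hypothetical synthetic URI standing for "all orders of customer 42".
PREDICATE_URI = "/locks/orders/customer-42"

locks = LockTable()
reader_ok = locks.try_lock("T1", PREDICATE_URI, "shared")      # T1 searches the domain
writer_ok = locks.try_lock("T2", PREDICATE_URI, "exclusive")   # T2 wants to insert into it
locks.release_all("T1")                                        # T1 completes
writer_retry_ok = locks.try_lock("T2", PREDICATE_URI, "exclusive")
print(reader_ok, writer_ok, writer_retry_ok)
```

      Because T2 cannot obtain the exclusive lock while T1 holds the shared one, T2's insert cannot appear in the middle of T1's search, which is the phantom this technique is meant to rule out.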

      Summary

      To summarize, MarkLogic automatically provides ANSI REPEATABLE READ level of isolation for update transactions and true serializable isolation for read-only (query) transactions. MarkLogic can be made to provide ANSI SERIALIZABLE isolation for update transactions, but doing so requires developers to manage their own predicate locks.

      Summary

      Text is stored in MarkLogic Server in Unicode NFC normalized form.

      Discussion

      In MarkLogic Server, all text is converted into Unicode NFC normalized form before tokenization and storage. 

      Unicode considers NFC-compatible characters to be essentially equivalent. See the Unicode normalization FAQ and Conformance Requirements in the Unicode Standard.

      Example

      For example, consider the NFC equivalence of the codepoints x2126 (Ω, OHM SIGN) and x03A9 (Ω, GREEK CAPITAL LETTER OMEGA). This is shown for the x2126 entry in the Unicode code chart for the U2100 block.

      You can see the effects of normalization alone, and during tokenization, by running the following in MarkLogic Server's Query Console:

      xquery version "1.0-ml";
      (: equivalence of Ω forms :)
      let $s := fn:codepoints-to-string (xdmp:hex-to-integer ('2126'))
      let $token := cts:tokenize ($s)
      return (
          'original: '||xdmp:integer-to-hex (fn:string-to-codepoints ($s)),
          'normalized: '||xdmp:integer-to-hex (fn:string-to-codepoints (fn:normalize-unicode ($s, 'NFC'))),
          'tokenized: '||xdmp:describe ($token, (), ())
      )
      

      The results show the original value, the normalized value, and the resulting token:

      original: 2126
      normalized: 3a9
      tokenized: cts:word("&#x03a9;")
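
      The normalization step itself is not specific to MarkLogic and can be reproduced outside the server. As a cross-check, the following Python sketch uses only the standard unicodedata module; it shows the same NFC mapping but says nothing about MarkLogic's tokenizer.

```python
import unicodedata

ohm_sign = "\u2126"  # OHM SIGN
normalized = unicodedata.normalize("NFC", ohm_sign)

print(f"original: {ord(ohm_sign):x}")      # original: 2126
print(f"normalized: {ord(normalized):x}")  # normalized: 3a9
```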
      

      Abstract

      In MarkLogic Server version 9, the default tokenization and stemming code has been changed for all languages (except English tokenization). Some tokenization and stemming behavior will change between MarkLogic 8 and MarkLogic 9. We expect that, in most cases, results will be better in MarkLogic 9.

      Information is given for managing this change in the Release Notes at Default Stemming and Tokenization Libraries Changed for Most Languages, and for further related features at New Stemming and Tokenization.

      In-depth discussion is provided below for those interested in details.

      General Comments on Incompatibilities

      General implications of tokenization incompatibilities

      If you do not reindex, old content may no longer match the same searches, even for unstemmed searches.

      General tokenization incompatibilities

      There are some edge-case changes in the handling of apostrophes in some languages; in general this is not a problem, but some specific words may include/break at apostrophes.

      Tokenization is generally faster for all languages except English and Norwegian (which use the same tokenization as before).

      General implications of stemming incompatibilities

      Where there is only one stem, and it is now different:  Old data will not match stemmed searches without reindexing, even for the same word.

      Where the new stems are more precise:  Content that used to match a query may not match any more, even with reindexing.

      Where there are new stems, but the primary stem is unchanged:  Content that used to not match a query may now match it with advanced stemming or above. With basic stemming there should be no change.

      Where the decompounding is different, but the concatenation of the components is the same:  Under decompounding, content may match a query when it used to not match, or may not match a query when it used to match, when the query or content involves something with one of the old/new components. Matching under advanced or basic stemming would be generally the same.

      General stemming incompatibilities

      • MarkLogic now has general algorithms backing up explicit stemming dictionaries.  Words not found in the default dictionaries will sometimes be stemmed when they previously were not.
      • Diminutives/augmentatives are not usually stemmed to base form.
      • Comparatives/superlatives are not usually stemmed to base form.
      • There are differences in the exact stems for pronoun case variants.
      • Stemming is more precise and restricted by common usage. For example, if the past participle of a verb is not usually used as an adjective, then the past participle will not be included as an alternative stem. Similarly, plural forms that only have technical or obscure usages might not stem to the singular form.
      • Past participles will typically include the past participle as an alternative stem.
      • The preferred order of stems is not always the same: this will affect search under basic stemming.

      Reindexing

      It is advisable to reindex to be sure there are no incompatibilities. Where the data in the forests (tokens or stems) does not match the current behavior, reindexing is recommended. This will have to be a forced reindex or a reload of specific documents containing the offending data. For many languages this can be avoided if queries do not touch on specific cases. For certain languages (see below) the incompatibility is great enough that it is essential to reindex.

      Language Notes

      Below we give some specific information and recommendations for various languages.

      Arabic

      stemming

      The Arabic dictionaries are much larger than before. Implications:  (1) better precision, but (2) slower stemming.

      Chinese (Simplified)

      tokenization

      Tokenization is broadly incompatible.

      The new tokenizer uses a corpus-based language model.  Better precision can be expected.

      recommendation

      Reindex all Chinese (simplified).

      Chinese (Traditional)

      tokenization

      Tokenization is broadly incompatible.

      The new tokenizer uses a corpus-based language model.  Better precision can be expected.

      recommendation

      Reindex all Chinese (traditional).

      Danish

      tokenization

      This language now has algorithmic stemming, and may have slight tokenization differences around certain edge cases.

      recommendation

      Reindex all Danish content if you are using stemming.

      Dutch

      stemming

      There will be much more decompounding in general, but MarkLogic will not decompound certain known lexical items (e.g., "baastardwoorden").

      recommendation

      Reindex Dutch if you want to query with decompounding.

      English

      stemming

      British variants may include the British variant as an additional stem, although the first stem will still be the US variant.

      Stemming produces more alternative stems. Implications are (1) stemming is slightly slower and (2) index sizes are slightly larger (with advanced stemming).

      Finnish

      tokenization

      This language now has algorithmic stemming and may have slight tokenization differences around certain edge cases.

      recommendation

      Reindex all content in this language if you are using stemming.

      French

      See general comments above.

      German

      stemming

      Decompounding now applies to more than just pure noun combinations. For example, it applies to "noun plus adjectives" compound terms. Decompounding is more aggressive, which can result in identification of more false compounds. Implications: (1) stemming is slower, (2) decompounding takes more space, and (3) for compound terms, search gives better recall, with some loss of precision.

      recommendation

      Reindex all German.

      Hungarian

      tokenization

      This language now has algorithmic stemming and may have slight tokenization differences around certain edge cases.

      recommendation

      Reindex all content in this language if you are using stemming.

      Italian

      See general comments above.

      Japanese

      tokenization

      Tokenization is broadly incompatible.

      The tokenizer provides internal flags that the stemmer requires.  This means that (1) tokenization is incompatible for all words at the storage level due to the extra information and (2) if you install a custom tokenizer for Japanese, you must also install a custom stemmer.

      stemming

      Stemming is broadly incompatible.

      recommendation

      Reindex all Japanese content.

      Korean

      stemming

      Particles (e.g., 이다) are dropped from stems; they used to be treated as components for decompounding.

      There is different stemming of various honorific verb forms.

      North Korean variants are not in the dictionary, though they may be handled by the algorithmic stemmer.

      recommendation

      Reindex Korean unless you use decompounding.

      Norwegian (Bokmal)

      stemming

      Previously, hardly any decompounding was in evidence; now it is pervasive.

      Implications: (1) stemming is slower, (2) decompounding takes more space, and (3) search gives better recall, with some loss of precision, at least when it comes to compounds.

      recommendation

      Reindex Bokmal if you want to query with decompounding.

      Norwegian (Nynorsk)

      stemming

      Previously hardly any decompounding was in evidence; now it is pervasive.

      Implications: (1) stemming is slower, (2) decompounding takes more space, and (3) search gives better recall, with some loss of precision, at least when it comes to compounds.

      recommendation

      Reindex Nynorsk if you want to query with decompounding.

      Norwegian (generic 'no')

      stemming

      Previously 'no' was treated as an unsupported language; now it is treated as both Bokmal and Nynorsk: for a word present in both dialects, all stem variants from both will be present.

      recommendation

      Do not use 'no' unless you really must; reindex if you want to query it.

      Persian

      See general comments above.

      Portuguese

      stemming

      More precision with respect to feminine variants (e.g., ator vs atriz).

      Romanian

      tokenization

      This language now has algorithmic stemming and may have slight tokenization differences around certain edge cases.

      recommendation

      Reindex all content in this language if you are using stemming.

      Russian

      stemming

      Inflectional variants of cardinal or ordinal numbers are no longer stemmed to a base form.

      Inflectional variants of proper nouns may stem together due to the backing algorithm, but it will be via affix-stripping, not to the nominal form.

      Stems for many verb forms used to be the perfective form; they are now the simple infinitive.

      Stems used to drop ё but now preserve it.

      recommendation

      Reindex all Russian.

      Spanish

      See general comments above.

      Swedish

      stemming

      Previously hardly any decompounding was in evidence; now it is pervasive.

      Implications: (1) stemming is slower, (2) decompounding takes more space, and (3) search gives better recall, with some loss of precision, at least when it comes to compounds.

      recommendation

      Reindex Swedish if you want to query with decompounding.

      Tamil

      tokenization

      This language now has algorithmic stemming and may have slight tokenization differences around certain edge cases.

      recommendation

      Reindex all content in this language if you are using stemming.

      Turkish

      tokenization

      This language now has algorithmic stemming and may have slight tokenization differences around certain edge cases.

      recommendation

      Reindex all content in this language if you are using stemming.

      What is MarkLogic Data Hub?

      MarkLogic’s Data Hub increases data integration agility, in contrast to time consuming upfront data modeling and ETL. Grouping all of an entity’s data into one consolidated record with that data’s context and history, a MarkLogic Data Hub provides a 360° view of data across silos. You can ingest your data from various sources into the Data Hub, standardize your data - then more easily consume that data in downstream applications. For more details, please see our Data Hub documentation.

      Note: Prior to version 5.x, Data Hub was known as Data Hub Framework (DHF).

      Takeaways:

      • In contrast to previous versions, Data Hub 5 is largely configuration-based. Upgrading to Data Hub 5 will require either:
        • Conversion of legacy flows from the code-based approach of previous versions to the configuration-based format of Data Hub 5
        • Executing your legacy flows with the “hubRunLegacyFlow” Gradle task
      • It’s very important to verify the “Version Support” information on the Data Hub GitHub README.md before installing or upgrading to any major Data Hub release

      Pre-requisites:

      One of the pre-requisites for installing Data Hub is to check for the supported/compatible MarkLogic Server version. For details, see our version compatibility matrix. Other pre-requisites can be seen here.

      New installations of Data Hub

      We always recommend installing the latest Data Hub version compatible with your current MarkLogic Server version. For example:

      -If a customer is running MarkLogic Server 9.0-7, they should install the most recent compatible Data Hub version (5.0.2), even though previous Data Hub versions (such as 5.0.1, 5.0.0, 4.x and 3.x) also work with server version 9.0-7.

      -Similarly, if a customer is running 9.0-6, the recommended Data Hub version would be 4.3.1 rather than the earlier versions 4.0.0, 4.1.x, 4.2.x and 3.x.

      Note: A specific MarkLogic server version can be compatible with multiple Data Hub versions and vice versa, which allows independent upgrades of either Data Hub or MarkLogic Server.

       

      Upgrading from a previous version

      1. To determine your upgrade path, first find your current Data Hub version in the “Can upgrade from” column in the version compatibility matrix.
      2. While Data Hub should generally work with future server versions, it’s always best to run the latest Data Hub version that's also explicitly listed as compatible with your installed MarkLogic Server version.
      3. If required, make sure to upgrade your MarkLogic Server version to be compatible with your desired Data Hub version. You can upgrade MarkLogic Server and Data Hub independently of each other as long as you are running a version of MarkLogic Server that is compatible with the Data Hub version you plan to install. If you are running an older version of MarkLogic Server, then you must upgrade MarkLogic Server first, before upgrading Data Hub.

      Note: Data Hub is not designed to be 'backwards' compatible with any version before the MarkLogic Server version listed with the release. For example, you can’t use Data Hub 3.0.0 on 9.0-4 – you’ll need to either downgrade to Data Hub 2.0.6 while staying on MarkLogic Server 9.0-4, or alternatively upgrade MarkLogic Server to version 9.0-5 while staying on Data Hub 3.0.0.

      • Example 1 - Scenario where you DO NOT NEED to upgrade MarkLogic Server:

               

      • Current Data Hub version: 4.0.0
      • Target Data Hub version: 4.1.x
      • ML server version: 9.0-9
      • The “Can upgrade from” value for the target version shows 2.x, which means you need to be on at least Data Hub 2.x. Since the current Data Hub version is 4.0.0, this requirement has been met.
      • Unless there is a strong reason for choosing 4.1.x, we highly recommend upgrading to the latest 4.x version compatible with MarkLogic Server 9.0-9, which in this example is 4.3.2. Consequently, the recommended upgrade path here becomes 4.0.0-->4.3.2 instead of 4.0.0-->4.1.x.
      • Since 9.0-9 is supported by the recommended Data Hub version 4.3.2, there is no need to upgrade MarkLogic Server.
      • Hence, the recommended path is Data Hub 4.0.0-->4.3.2

       

      • Example 2 - Scenario where you NEED to upgrade MarkLogic Server:

                 

      • Current Data Hub version: 3.0.0
      • Target Data Hub version: 5.0.2
      • ML server version: 9.0-6
      • The “Can upgrade from” value for the target version shows Data Hub version 4.3.1, which means you need to be on at least 4.3.x (4.3.1 or 4.3.2, depending on your MarkLogic Server version). Since the current Data Hub version 3.0.0 doesn’t satisfy this requirement, the upgrade path after this step becomes Data Hub 3.0.0-->4.3.x
      • As per the matrix, the latest compatible Data Hub version for 9.0-6 is 4.3.1, so the path becomes 3.0.0-->4.3.1
      • From the matrix, the minimum supported MarkLogic Server version for 5.0.2 is 9.0-7, so you will have to upgrade your MarkLogic Server version before upgrading your Data Hub version to 5.0.2.
      • Because 9.0-7 is supported by all 3 versions under consideration (3.0.0, 4.3.1 and 5.0.2), the recommended path can be either
        1. Upgrade Data Hub from 3.0.0 to 4.3.1-->upgrade MarkLogic Server to at least 9.0-7-->upgrade Data Hub to 5.0.2
        2. Upgrade MarkLogic Server to at least 9.0-7-->upgrade Data Hub from 3.0.0 to 4.3.1-->upgrade Data Hub to 5.0.2
      • Recall that Data Hub 5 moved to a configuration-based approach from previous versions’ code-based approach. Upgrading to Data Hub 5 from a previous major version will require either:
        • Conversion of legacy flows from the code-based approach of previous versions to the configuration-based format of Data Hub 5
        • Executing your legacy flows with the “hubRunLegacyFlow” Gradle task
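The server-version check that drives both examples can be sketched in Python. This is only an illustration, not part of Data Hub: the names `parse_ml_version`, `MIN_SERVER_FOR_DATA_HUB` and `server_meets_minimum` are hypothetical, and the table holds just the two minimums quoted in this article (Data Hub 5.0.2 needs at least MarkLogic Server 9.0-7; Data Hub 3.0.0 needs at least 9.0-5). Always confirm real upgrades against the version compatibility matrix.

```python
import re

def parse_ml_version(version):
    """Turn a MarkLogic Server version such as '9.0-7' into a comparable tuple."""
    return tuple(int(part) for part in re.split(r"[.-]", version))

# Minimum server versions quoted in this article only (NOT the full matrix).
MIN_SERVER_FOR_DATA_HUB = {
    "5.0.2": "9.0-7",
    "3.0.0": "9.0-5",
}

def server_meets_minimum(server_version, data_hub_version):
    """True if the server is at or above the minimum for that Data Hub version."""
    minimum = MIN_SERVER_FOR_DATA_HUB[data_hub_version]
    return parse_ml_version(server_version) >= parse_ml_version(minimum)

print(server_meets_minimum("9.0-6", "5.0.2"))  # False: upgrade the server first
print(server_meets_minimum("9.0-9", "5.0.2"))  # True
```

Comparing parsed tuples rather than raw strings avoids mistakes such as treating 9.0-10 as older than 9.0-7.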

      Links for Reference:

      https://docs.marklogic.com/datahub/upgrade.html


      Summary

      In addition to supporting multiple languages, MarkLogic Server supports the ISO codes listed below for representing the names of these languages.

       

      MarkLogic supported ISO codes

      MarkLogic supports the following ISO codes for the representation of language names:
      1. ISO 639-1
      2. ISO 639-2/T, and
      3. ISO 639-2/B

      Further, NOTE:
      a. MarkLogic uses the 2-letter ISO 639-1 codes, including zh's zh_Hant variant, and
      b. MarkLogic uses the 3-letter ISO 639-2 codes. For a more specific list of ISO 639-2 codes, see http://www.loc.gov/standards/iso639-2/php/code_list.php
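To illustrate how the three code families relate, here is a small Python sketch. The table is an illustrative subset of standard ISO 639 codes for some of the languages named in this article. It is not MarkLogic's internal table, and the helper name `to_iso_639_1` is hypothetical.

```python
# (ISO 639-1, ISO 639-2/T, ISO 639-2/B) for a few languages.
# The /T (terminology) and /B (bibliographic) codes differ for some languages.
ISO_CODES = [
    ("en", "eng", "eng"),  # English
    ("fr", "fra", "fre"),  # French
    ("de", "deu", "ger"),  # German
    ("nl", "nld", "dut"),  # Dutch
    ("zh", "zho", "chi"),  # Chinese
    ("fa", "fas", "per"),  # Persian (Farsi)
]

def to_iso_639_1(code):
    """Normalize any of the three code forms to the 2-letter ISO 639-1 code."""
    wanted = code.lower()
    for codes in ISO_CODES:
        if wanted in codes:
            return codes[0]
    raise KeyError(f"unknown language code: {code}")

print(to_iso_639_1("fre"))  # fr
print(to_iso_639_1("fra"))  # fr
```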


      Again, MarkLogic supports only the languages listed below (see http://docs.marklogic.com/guide/search-dev/languages#id_64343):
      English
      French
      Italian
      German
      Russian
      Spanish
      Arabic
      Chinese (Simplified and Traditional)
      Korean
      Persian (Farsi)
      Dutch
      Japanese
      Portuguese
      Norwegian (Nynorsk and Bokmål)
      Swedish

       

      Suggestion

      The function cdict:get-languages() can be used to get ISO Codes for all supported languages. Here is an example of the usage:

        xquery version "1.0-ml";
        import module namespace cdict = "http://marklogic.com/xdmp/custom-dictionary" 
      		  at "/MarkLogic/custom-dictionary.xqy";
      
        cdict:get-languages()
      
        ==> ("en", "ja", "zh", "zh_Hant")

       

      Summary

      There are many different kinds of locks present in MarkLogic Server.

      Transaction locks are obtained when MarkLogic Server detects the potential of a transaction to change the database, at which point the server considers it to be an update transaction. Once a lock is acquired, it is held until the transaction ends. Transaction locks are set by MarkLogic Server either explicitly or implicitly depending on the configured commit mode. Because it's very common to see poorly performing application code written against MarkLogic Server due to unintentional locking, the two concepts of transaction type and commit mode have been combined into a single, simpler control - transaction mode.

      MarkLogic Server also has the notion of document and directory locks. Unlike transaction locks, document and directory locks must be set explicitly and are persistent in the database - they are not tied to a transaction. Document locks also apply to temporal documents. Any version of a temporal document can be locked in the same way as a regular document.

      Cache partition locks are used by threads that can make changes to a cache. A thread needs to acquire a write lock for both the relevant cache and cache partition before it makes the change.

      Transaction Locks and Commit Mode vs. Transaction Mode

      Transaction lock types are associated with transaction types. Query type transactions do not use locks to obtain a consistent view of data, but rather the state of the data at a particular timestamp. Update type transactions have the potential to change the database and therefore require locks on documents to ensure transactional integrity. 

      So - if an update transaction type is run in explicit commit mode, then locks are acquired for all statements in an update transaction -  whether or not those statements perform updates. Once a lock is acquired, it is held until the transaction ends. If an update transaction type is run in auto commit mode, by default MarkLogic Server detects the transaction type through static analysis of the first statement in that transaction. If the server detects the potential for updates during static analysis, then the transaction is considered an update transaction - which results in a write lock being acquired.

      In multi-statement transactions, if an update transaction type is run in explicit commit mode, then the transaction is an update transaction and locks are acquired for all statements in an update transaction - even if no update occurs. In auto commit mode MarkLogic Server determines the transaction type through static analysis of the first statement. If in auto commit mode, and the first statement is a query, and an update occurs later in that transaction, MarkLogic Server will throw an exception. In multi-statement transactions, the transaction ends only when it is explicitly committed or rolled back. Failure to explicitly commit or roll back a multi-statement transaction might retain locks until the transaction times out or reaches the end of the session - at which point the transaction rolls back.

      Best practices:

      1) Avoid unnecessary transaction locks or holding on to transaction locks for too long. For single-statement transactions, do not explicitly set the transaction type to update if running a query. For multi-statement transactions, always explicitly commit or rollback the relevant transaction to free transaction locks as soon as possible.

      2) It's very common for users to write code that unintentionally takes write locks. One of the best ways to avoid unintentional locks is to use transaction modes instead of transaction types/commit modes. Transaction mode combines transaction type and commit mode into a single configurable value. You can read more about transaction mode in our documentation at: Transaction Mode Overview.

      3) Be aware that when setting transaction mode, the xdmp:commit and xdmp:update XQuery prolog options affect only the next transaction created after their declaration; they do not affect an entire session. Use xdmp:set-transaction-mode or xdmp.setTransactionMode if you need to change the transaction mode settings at the session level.

      Document and Directory Locks

      Document and directory locks are not tied to a transaction. The locks must be explicitly set and stored as a lock document in a MarkLogic Server database. So the locks can last a specified time period or be persistent until explicitly unlocked.

      Each document and directory can have a lock. The lock can be used as part of an application's update strategy. MarkLogic Server gives clients the flexibility to set up a lock policy that suits their environment. For example, if only one user is allowed to update specific database objects, you can set the lock to be "exclusive." In contrast, if multiple users update the same database object, you can set the lock to be "shared."

      Unlike transaction locks, document and directory locks are persistent in the database and are consequently searchable.   

      Temporal Document Locks

      A temporal collection contains bi-temporal or uni-temporal documents. Each version of a temporal document can be locked in the same way as a regular, non-temporal document.

      Cache and Cache Partition Locks

      If a thread attempts to make a change to a database cache, it needs to acquire a write lock for the relevant cache and cache partition. This write lock serializes write access, which keeps data in the relevant cache or cache partition thread-safe. While cache and cache partition locks are short-lived, be aware that with a single cache partition, all threads needing access must serialize through that partition's single write lock. With multiple cache partitions, multiple write locks can be acquired, one per partition, which allows multiple threads to make concurrent cache partition updates.
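The effect of partitioning on lock contention can be illustrated with a simplified Python analogy. This is not how MarkLogic Server implements its caches: the class `PartitionedCache` is hypothetical, and it models only one write lock per partition (the additional cache-level lock described above is omitted).

```python
import threading

class PartitionedCache:
    """Toy cache with one lock per partition (an analogy, not MarkLogic code)."""

    def __init__(self, partitions):
        self._locks = [threading.Lock() for _ in range(partitions)]
        self._data = [{} for _ in range(partitions)]

    def _index(self, key):
        # Each key always maps to the same partition.
        return hash(key) % len(self._locks)

    def put(self, key, value):
        i = self._index(key)
        # Only writers targeting partition i wait on each other here.
        with self._locks[i]:
            self._data[i][key] = value

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._data[i].get(key)

cache = PartitionedCache(partitions=4)
cache.put(5, "value")
print(cache.get(5))  # value
```

With `partitions=1` every writer queues on the same lock. With more partitions, writers whose keys land in different partitions can proceed concurrently.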

      References and Additional Reading:

      1) Understanding Transactions in MarkLogic Server

      2) Cache Partitions

      3) Document and Directory Locks

      4) Understanding Locking in MarkLogic Server Using Examples

      5) Understanding XDMP-DEADLOCK

      6) Understanding the Lock Trace Diagnostic Trace Event

      7) How MarkLogic Server Supports ACID Transactions

      With the release of MarkLogic Server versions 8.0-8 and 9.0-4, detailed memory use, broken out by major area, is periodically recorded to the error log. These diagnostic messages can be useful for quickly identifying memory resource consumption at a glance and aid in determining where to investigate memory-related issues.

      Error Log Message and Description of Details

      At one hour intervals, an Info level log message will be written to the server error log in the following format:

      Info: Memory 18% phys=147456 virt=246146(166%) rss=27330(18%) anon=53794(36%) file=250(0%) forest=1021(0%) cache=40960(27%) registry=1(0%)

      The error log entry contains memory-related figures for non-zero statistics. Raw figures are in megabytes; percentages are relative to the amount of physical memory reported by the operating system. The figures include:

      Memory: Percentage of physical memory consumed by the MarkLogic Server process;
      phys: Size of physical memory in the machine;
      virt: Size of virtual address space reported by the operating system. This figure is often greater than 100%;
      swap: The amount of swap consumed by the MarkLogic Server process;
      rss: Resident Set Size reported by the operating system;
      anon: Anonymous mapped memory used by the MarkLogic Server;
      file: Total amount of memory-mapped data files used by the MarkLogic Server. (The MarkLogic Server executable itself, for example, is memory-mapped by the operating system, but is not included in this figure.);
      forest: Forest-related memory allocated by the MarkLogic Server process;
      cache: User configured cache memory (list cache, expanded tree cache, etc) consumed by the MarkLogic Server process;
      registry: Amount of memory consumed by registered queries;
      huge: Huge page memory reserved by the operating system, and percentage comparing this to total physical memory;
      join: Memory consumed by joins for active running queries within the MarkLogic Server process, and percentage comparing this to total physical memory;
      unclosed: Unclosed memory, signifying memory consumed by unclosed or obsolete stands still held by the MarkLogic Server process, and percentage comparing this figure to total physical memory.
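If you want to extract these figures from the error log for trending, a minimal Python sketch follows. It assumes the message layout matches the sample line shown above. The function name `parse_memory_line` is hypothetical and is not a MarkLogic API.

```python
import re

# name=megabytes, optionally followed by (NN%)
_FIELD = re.compile(r"(\w+)=(\d+)(?:\((\d+)%\))?")

def parse_memory_line(line):
    """Return (overall_percent, {name: {"mb": int, "pct": int or None}})."""
    head = re.search(r"Memory (\d+)%", line)
    if head is None:
        raise ValueError("not a Memory log line")
    fields = {}
    for name, mb, pct in _FIELD.findall(line):
        fields[name] = {"mb": int(mb), "pct": int(pct) if pct else None}
    return int(head.group(1)), fields

sample = ("Info: Memory 18% phys=147456 virt=246146(166%) rss=27330(18%) "
          "anon=53794(36%) file=250(0%) forest=1021(0%) cache=40960(27%) "
          "registry=1(0%)")
overall, fields = parse_memory_line(sample)
print(overall)                # 18
print(fields["cache"]["mb"])  # 40960
```

Because only non-zero statistics are logged, code that consumes the result should treat fields such as swap or huge as optional.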

      In addition to reporting once an hour, the Info level error log entry is written whenever the amount of main memory used by MarkLogic Server changes by more than five percent from one check to the next. MarkLogic Server will check the raw metering data obtained from the operating system once per minute. If metering is disabled, the check will not occur and no log entries will be made.

      As of MarkLogic Server versions 8.0-8 and 9.0-5, this same information is also available in the output from the function xdmp:host-status().

      <host-status xmlns="http://marklogic.com/xdmp/status/host">
      . . .
      <memory-process-size>246162</memory-process-size>
      <memory-process-rss>27412</memory-process-rss>
      <memory-process-anon>54208</memory-process-anon>
      <memory-process-rss-hwm>73706</memory-process-rss-hwm>
      <memory-process-swap-size>0</memory-process-swap-size>
      <memory-system-pagein-rate>0</memory-system-pagein-rate>
      <memory-system-pageout-rate>14.6835</memory-system-pageout-rate>
      <memory-system-swapin-rate>0</memory-system-swapin-rate>
      <memory-system-swapout-rate>0</memory-system-swapout-rate>
      <memory-size>147456</memory-size>
      <memory-file-size>279</memory-file-size>
      <memory-forest-size>1791</memory-forest-size>
      <memory-unclosed-size>0</memory-unclosed-size>
      <memory-cache-size>40960</memory-cache-size>
      <memory-registry-size>1</memory-registry-size>
      . . .
      </host-status>


      Additionally, with the release of MarkLogic Server 8.0-9.3 and 9.0-7, Warning-level log messages may be reported when the host is low on memory — the messages will indicate the areas involved, for example:

      Warning: Memory low: forest+cache=97%phys

      The messages are reported if the total memory used by the mentioned areas is greater than 90% of physical memory (phys). As best practice, the total of the areas should never be more than around 80% of physical memory, and should be even less if you are using the host for query processing.
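The thresholds above amount to simple arithmetic, sketched here in Python. The function names are hypothetical, and which areas the server sums for a given warning is taken from the message itself (forest and cache in the example), so treat this as an approximation of the rule rather than the server's actual logic.

```python
def memory_share_pct(phys_mb, **areas_mb):
    """Percentage of physical memory used by the given areas (all in MB)."""
    return 100.0 * sum(areas_mb.values()) / phys_mb

def classify(phys_mb, **areas_mb):
    """Apply the 90% warning threshold and the ~80% best-practice ceiling."""
    pct = memory_share_pct(phys_mb, **areas_mb)
    if pct > 90:
        return "warning"
    if pct > 80:
        return "above best practice"
    return "ok"

# Figures from the sample Info line earlier in this article:
# forest and cache together are about 28% of physical memory.
print(classify(147456, forest=1021, cache=40960))  # ok
```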

      If the hosts are regularly encountering these warnings, remedial action to support the memory requirements might include:

      • Adding more physical memory to each of the hosts;
      • Adding additional hosts to the cluster to spread the data across;
      • Adding additional forests to any under-utilized hosts.

      Other action might include:

      • Archiving/dropping any older forest data that is no longer used;
      • Reviewing the group level cache settings to ensure they are not set too high, as they make up the cache part of the total. For reference, default (and recommended) group level cache settings based on common RAM configurations may be found in our Group Level Cache Settings based on RAM Knowledgebase article.

      Summary

      This enhancement to MarkLogic Server allows for easy periodic monitoring of memory consumption over time, and records it in a summary fashion in the same place as other data pertaining to the operation of a running node in a cluster. Since all these figures have at their source raw Meters data, more in-depth investigation should start with the Meters history. However, having this information available at a glance can aid in identifying whether memory-related resources need to be explored when investigating performance, scale, or other like issues during testing or operation.

      Introduction

      The MarkLogic Monitoring History feature allows you to capture and view critical performance data from your cluster. By default, this performance data is stored in the Meters database. This article explains how you can plan for the additional disk space required for the Meters database.

      Meters Database Disk Usage

      Just like any other database, the Meters database is made up of forests, which in turn are made up of stands that reside physically on disk. Because the Meters database is used by Monitoring History to store critical performance data for your cluster, the amount of information can grow significantly as the number of hosts, forests, databases, etc. increases. Hence the need to plan and manage the disk space required by the Meters database.

      Recommendation

      The Meters database stores critical performance data for your cluster. The size of the data is proportional to the number of hosts, app servers, forests, databases, etc. Typically, the raw retention settings have the largest impact on size.

      MarkLogic's recommendation for a new install is to start with the default settings and monitor usage over the first two weeks of an install. The performance history charts, constrained to just show the Meters database, will show an increasing storage utilization over the first week, then leveling off for the second week. This would give you a decent idea of space utilization going forward.

      You can then adjust the number of days of raw measurements that are retained.

      You can also add additional forests to spread the Meters database over more hosts if needed.

      Monitoring History

      The Monitoring History feature allows you to capture and view critical performance data from your cluster. Monitoring History capture is enabled at the group level. Once the performance data has been collected, you can view the data in the Monitoring History page.

      By default, the performance data is stored in the Meters database. A consolidated Meters database that captures performance metrics from multiple groups can be configured, if there is more than one group in the cluster.

      Monitoring History Data Retention Policy

      The data retention policy controls how long performance data is kept in the Meters database before it is deleted (http://docs.marklogic.com/guide/monitoring/history#id_80656).

      If it is observed that meters data is not being cleared according to the retention policy, the first place to check would be the range indexes configured for the Meters database.

      Range indexes and the Meters Database

      The Meters database is configured with a set of range indexes which, if misconfigured or missing, can prevent the Meters database from being cleaned up according to the configured retention policy.

      Range indexes can be missing or misconfigured in either of the following scenarios:

      •  if the cluster was upgraded from a version of MarkLogic Server before 7.0 and the upgrade had some issues
      •  if the indexes were manually created (when using another database for meters data instead of the default Meters database)

      The size of the meters database can grow significantly as the cluster grows, so it is important that the meters database is cleared per the retention policy.

      The required indexes (as of 8.0-5 and 7.0-6) are attached as an ML Configuration Manager package (http://docs.marklogic.com/guide/admin/config_manager#id_38038). Once these are added, the Meters database will reindex and the older data should be deleted.

      Note that data is never deleted before it reaches the age set by the retention policy. Data older than the retention policy may still be kept for an unspecified amount of time.

      Related documentation

      http://docs.marklogic.com/guide/monitoring

      https://help.marklogic.com/Knowledgebase/Article/View/259/0/metering-database-disk-space-requirements


      SUMMARY:

      Prior to MarkLogic 4.1-5, role-ids were randomly generated.  We now use a hash algorithm that ensures that roles created with the same name will be assigned the same role-id.  Attempting to migrate data from a forest created prior to MarkLogic 4.1-5 to a newer installation can result in a "role not defined" error.  To work around this issue, you will need to create a new role with the role-id defined in the legacy system. 

      Procedure:

      This process creates a new role with the same role-id from your legacy installation and assigns this old role to your new role with the correct name.

      Step 1: You will need to find the role-id of the legacy role. This will need to be run against the security DB on the legacy server. 

      <code>

      xquery version "1.0-ml";
      import module namespace sec="http://marklogic.com/xdmp/security" at
      "/MarkLogic/security.xqy";

      let $role-name := "Enter Role Name Here" 

      return
      /sec:role[./sec:role-name=$role-name]/sec:role-id/text()

      </code>


      Step 2: In the new environment, store the attached module to the following location on the host containing the security DB.

      /opt/MarkLogic/Modules/role-edit/create-master-role.xqy

      Step 3: Ensure that you have created the role on the new cluster.

      Step 4: Run the following code against the new cluster's security DB. This will create a new role with the legacy role-id. Be sure to enter the role name, description, and role-id from Step 1.

      <code>
      xquery version "1.0-ml";
      import module namespace cmr="http://role-edit.com/create-master-role" at
      "/role-edit/create-master-role.xqy";

      let $role-name := "ENTER ROLE NAME"
      let $role-description := "ENTER ROLE DESCRIPTION"
      let $legacy-role-id := 11658627418524087702 (: Replace this with the Role ID from Step 1:)

      let $legacy-role := fn:concat($role-name,"-legacy")
      let $legacy-role-create := cmr:create-role-with-id($legacy-role, $role-description, (), (), (), $legacy-role-id)

      return
      fn:concat("Inserted role named ",$legacy-role," with id of ",$legacy-role-id)

      </code>


      Step 5: Run the following code against the new cluster's security database to assign the legacy role to the new role.

      <code>
      xquery version "1.0-ml";
      import module namespace sec="http://marklogic.com/xdmp/security" at
      "/MarkLogic/security.xqy";

      let $role-name := "ENTER ROLE NAME"
      let $legacy-role := fn:concat($role-name,"-legacy")

      return
      (
      sec:role-set-roles($role-name, ($legacy-role)),
      "Assigned ",$legacy-role," role to ",$role-name," role"
      )

      </code>

       

      You should now have a new role named [your-role]-legacy.  This legacy role will contain the role-id from your legacy installation and will be assigned to [your-role] on the new installation.  Legacy documents in your DB will now have the same rights they had in the legacy system.

      Introduction

      Those familiar with versions of MarkLogic Server prior to MarkLogic 7 may have heard the 3X disk space rule being mentioned. At the time of writing, references to it can be found in the MarkLogic 5 documentation and the MarkLogic 6 documentation.

      The Monitoring Metrics of Interest section in the Monitoring MarkLogic Guide refers to the 3X rule in a preparatory question on disk allocation for a database:

      • Is there enough disk space for forest data and merges? Merges require at least twice as much free disk space as used by the forest data (3X rule). If a merge runs out of disk space, it will fail.

      If you have read the requirements guidelines for MarkLogic 7 (and above), you may have noticed a section that suggests you should plan for the following disk space to be available:

      • 1.5 times the disk space of the total forest size. Specifically, each forest on a filesystem requires its filesystem to have at least 1.5 times the forest size in disk space (or, for each forest less than 32GB, 3 times the forest size). This translates to 1.5 times the disk space of the source content after it is loaded.

        For example, if you plan on loading content that will result in a 100 GB database, reserve at least 150GB of disk space. The disk space reserve is required for merges.
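The sizing guideline quoted above can be expressed as a small Python calculation. This is a sketch of the arithmetic only. The function name `required_disk_gb` is hypothetical, and the 32GB cut-off and the 1.5 and 3 multipliers are taken directly from the quoted guideline.

```python
SMALL_FOREST_GB = 32  # forests below this size need 3 times their size

def required_disk_gb(forest_gb):
    """Minimum filesystem space for one forest under the MarkLogic 7+ guideline."""
    factor = 3.0 if forest_gb < SMALL_FOREST_GB else 1.5
    return forest_gb * factor

print(required_disk_gb(100))  # 150.0
print(required_disk_gb(10))   # 30.0
```

The guideline applies per forest and per filesystem, so for a database with several forests, apply the calculation to each forest rather than to the database total.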

      This Knowledgebase article will cover both requirements and offer some further guidance as to how to plan and size your databases and - crucially - how you can take advantage of the newer 1.5X rule.

      3X

      The original logic behind the allocation of 3X disk space was to provide ample space to allow for a situation where a database is fully reindexed. The allocation would be in thirds according to the following measures:

      1. Your Data
      2. Space for reindexing
      3. Space for merges

      The 3X disk provision rule was offered as a very general (and very safe for production) rule to cover the most extreme example where your data gets reindexed in its entirety and then merges have to take place on top of that.

      ... but why 3X?

      To understand this, we need to briefly explore what happens when a document is updated in MarkLogic Server.

      As an update is made to a document - and the same rule applies to an update to a document when index changes are concerned - the transaction takes place at a given timestamp (a given point in time). At that point, the original fragment is marked as deleted and a new fragment is written to an in-memory stand. Eventually, the in-memory stand is written to disk.

      For a period of time - especially at times where a MarkLogic instance/cluster is busy performing a large number of updates - it's likely that there will be occasions where two versions of the same fragment exist in different stands on disk; one stand will contain the fragment now marked as deleted and the other stand will contain the newly written fragment - which will be used by any subsequent queries running at later timestamps.

      ... so that covers 2X - what about the other third?

      When a merge takes place, merge candidate stands are identified and a new stand is created. As the candidate stands are read through, the active fragments are copied over to the new stand.

      At the point where the merge takes place, the new stand coexists with the older stands because - as with updates and reindexing - queries will still need to run against the candidate stands; the timestamp will only get moved on to accommodate the data in the new stand once the process has completed in its entirety.

      While all of this is taking place, other updates could be taking place to documents in other stands and the same rules apply to those fragments too.

      So the 3X rule provides a true safeguard; allowing for a situation where forest sizes are likely to swell way above and beyond the size of the data they contain, to accommodate the fragments marked deleted for queries at earlier timestamps and to accommodate the additional headroom required by a merge of some very large stands.

      1.5X

      Some changes were made in MarkLogic 7 which effectively reduce the footprint of your data on-disk. With some careful planning, you can take advantage of the lower sizing rule.

      While the documentation still acknowledges the 3X rule (which is still true if you're performing an upgrade directly from MarkLogic 6 or earlier without making any other configuration changes), a new default configuration has been introduced to databases created under MarkLogic 7; this is the merge max size.

      What does the merge max size do?

      This setting enforces an upper limit of 32GB on the size of an individual stand.

      With previous versions of the product, the expectation would be for the contents of a forest to merge down to one large stand. That is: given a quiesced database, on full completion of a merge, all content (all active fragments) should be in a single stand.

      For databases on MarkLogic 7 (and later), you can now expect to see more stands - each with a maximum size of 32GB.

      This means you should expect to see your data in more stands than you would have done on prior versions of the product, but it also means that you can lower the amount of disk space you need due to this size restriction.

      From MarkLogic 7 onwards - with the merge max size correctly set - the largest amount of space a single merge operation should require would be 64GB.
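The figures above can be worked through in Python. This sketch reads the 64GB figure as twice the 32GB stand limit, which is an interpretation of the text rather than a documented formula. It uses the default merge max size of 32768 (megabytes), and the function names are hypothetical.

```python
import math

MERGE_MAX_SIZE_MB = 32768  # default merge max size, in megabytes

def max_stand_gb(merge_max_size_mb=MERGE_MAX_SIZE_MB):
    """Upper limit on the size of a single merged stand, in GB."""
    return merge_max_size_mb / 1024

def worst_case_merge_gb(merge_max_size_mb=MERGE_MAX_SIZE_MB):
    """Assumed worst case: twice the stand size limit."""
    return 2 * max_stand_gb(merge_max_size_mb)

def min_stand_count(forest_gb, merge_max_size_mb=MERGE_MAX_SIZE_MB):
    """Lower bound on the number of stands a fully merged forest will hold."""
    return math.ceil(forest_gb / max_stand_gb(merge_max_size_mb))

print(max_stand_gb())         # 32.0
print(worst_case_merge_gb())  # 64.0
print(min_stand_count(100))   # 4
```

The stand count is only a lower bound, because stands are not necessarily filled to the limit.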

      ... but why 1.5X?

      If we return to this line in the documentation:

      • For example, if you plan on loading content that will result in a 100 GB database, reserve at least 150GB of disk space. The disk space reserve is required for merges.

      Given that we now have an upper limit on the size of a stand (32GB), as two smaller stands are being merged to create the new, larger stand and given the space required by other concurrent operations that may be taking place in other stands, a space limit of 1.5X should now cover any merges (and subsequent updates to documents).

      For further understanding of the 1.5X rule, read our knowledgebase article 'Explanation of the 1.5X Disk Space Requirement'.

      How do I find out whether my database is configured for this new merge max size?

      If you're on the admin interface at http://[yourhostname]:8001

      Go to: Configure > Databases > [Your Database Name] > Merge Policy

      On the right-hand panel, you should see the merge max size; the default should now be 32768

      Important caveats

      MarkLogic 7 is designed to allow you to work with more stands. While you should still be concerned if a system has a very large number of small stands, the new behavior requires a shift in thinking, and this has implications in particular when you start to think about applying the 1.5x disk space rule in your environment.

      In releases prior to MarkLogic 7, the expectation (over time) was that all data in a forest would ultimately attempt to get merged into a single stand.

      In MarkLogic 7, at least with the default setting of the merge-max-size (to 32768 - 32GB), it is understood that a reasonably large forest would now be divided into a number of 32GB stands.

      If you are strictly following this rule for all reasonably large forests on your system, the 1.5x rule can safely be used operationally in a production environment. Relying on the rule requires careful management when migrating an existing system, however, as running out of disk space can have catastrophic consequences for a live system.

      For very small forests, the 1.5X rule does not apply.  Due to the 32GB stand size overhead, your forests need to be sufficiently large before the 1.5X rule can be used.

      You should treat the 1.5x rule as an absolute minimum requirement for disk space for a given database. If you are going to use it, we would recommend having a strategy in place for allocating more space until you are confident that the cluster can run safely within the lower (1.5x) boundaries.

      I'm upgrading from an earlier version of MarkLogic to MarkLogic 7 - I have changed the merge max size to 32768. Can I reclaim the disk space?

      It's important to note that the 1.5x guidelines will only work if your forests all contain stands that have the new maximum size of 32GB. If your forests still contain larger stands, you'll need to break these down before you can consider reclaiming disk space. 

      ... Breaking Large Stands Down

      If your forests contain stands larger than 32 GB, you will want to break these stands down in order to take advantage of the lower disk space requirements.

      Different techniques can be followed to break the stands and reclaim disk space:

      1. Re-ingesting the content of the forests with large stands - When documents are re-ingested in a forest, the old fragments will be marked as deleted and the new fragment will be written to a new stand. Once there are sufficient deleted fragments, the large stands will be merged down into smaller stands.
      2. Perform re-indexing – A Forced re-index will update every fragment in the database, effectively re-loading the content - the original fragments will be marked as deleted and the new fragments will be written to a new stand. Once there are sufficient deleted fragments, the large stands will be merged down into smaller stands.  
      3. Forest rebalancing - With the merge max size configured, rebalance the active fragments out of the existing forests and retire the old forests. The deleted fragments are merged out of the old stands, and the active fragments are kept in smaller stands in the other, rebalanced forests.

      Conclusion

      The major points for the 1.5X rule:

      • The estimated 1.5X disk space utilization is only true for databases where merge-max-size is correctly set and for forests that are sufficiently large. For databases created in MarkLogic Server v7 or later, the default merge-max-size is 32768 (32GB).
      • If you're upgrading from earlier releases, you would need to make sure you set this value as part of your upgrade process.
        • After upgrading from a version earlier than MarkLogic 7, you will have to take explicit steps to decrease the size of any pre-existing large stands.

       

      Summary

      New and updated mimetypes were added for MarkLogic 8.  If your MarkLogic Server instance has customized mimetypes, the upgrade to MarkLogic Server v8.0-1 will not update the mimetypes table. 

      Details

      MarkLogic 8 includes the following new mimetype values:

      Name                                   Extension         Format
      application/json                       json              json
      application/rdf+json                   rj                json
      application/sparql-results+json        srj               json
      application/xml                        xml xsd xvs sch   xml
      text/json                              (none)            json
      text/xml                               (none)            xml
      application/vnd.marklogic-javascript   sjs               text
      application/vnd.marklogic-ruleset      rules             text

      If you upgraded to 8.0 from a previous version of MarkLogic Server and if you have ever customized your mimetypes (for example, using the MIME Types Configuration page of the Admin Interface), the upgrade will not automatically add the new mimetypes to your configuration. If you have not added any mimetypes, then the new mimetypes will be automatically added during the upgrade. You can check if you have these mimetypes configured by going to the Mimetype page of the Admin Interface and checking if the above mimetypes exist. If they exist, then there is nothing you need to do.
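      The check described above amounts to a set comparison. The Python sketch below is a hypothetical helper, not a MarkLogic tool: it assumes you have already copied the configured mimetype names from the Admin Interface into a list, and it compares names only, so a match does not confirm that the extensions and format also agree with the table.

```python
# The eight mimetype names listed in the table above.
NEW_MIMETYPES = [
    "application/json",
    "application/rdf+json",
    "application/sparql-results+json",
    "application/xml",
    "text/json",
    "text/xml",
    "application/vnd.marklogic-javascript",
    "application/vnd.marklogic-ruleset",
]


def missing_mimetypes(configured_names):
    """Return the MarkLogic 8 mimetype names that are absent from configured_names."""
    configured = set(configured_names)
    return [name for name in NEW_MIMETYPES if name not in configured]


# Hypothetical configuration that only knows two of the names.
print(missing_mimetypes(["application/xml", "text/xml"]))
```

      An empty result means every name in the table is already present.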

      Effect

      Not having these mimetypes may lead to application-level failures. For example, running JavaScript code via Query Console will fail.

      Resolving Manually

      If you do not have the above mimetypes after upgrading to 8.0, you can manually add the mimetypes to your configuration using the Admin Interface. To manually add the configuration, perform the following steps:

      1. Open the Admin Interface in a browser (for example, open http://localhost:8001).
      2. Navigate to the Mimetypes page, near the bottom of the tree menu.
      3. Click the Create tab.
      4. Enter the name, the extension, and the format for the mimetype (see the table above).
      5. Click OK.
      6. Repeat the preceding steps for each mimetype in the above table.

      Please be aware that updating the mimetype table results in a MarkLogic Server restart.  You will want to execute this procedure when MarkLogic Server is idle or during a maintenance window.

      Resolve by Script

      Alternatively, if you do not have the above mimetypes after upgrading to 8.0, you can add the mimetypes to your configuration by executing the following script in Query Console:

      xquery version "1.0-ml";

      import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";
      declare namespace mt = "http://marklogic.com/xdmp/mimetypes";

      let $config := admin:get-configuration()
      let $all-mimetypes := admin:mimetypes-get($config) (: existing mimetypes defined :)
      let $new-mimetypes := (admin:mimetype("application/json", "json", "json"),
          admin:mimetype("application/rdf+json", "rj", "json"),
          admin:mimetype("application/sparql-results+json", "srj", "json"),
          admin:mimetype("application/xml", "xml xsd xvs sch", "xml"),
          admin:mimetype("text/json", "", "json"),
          admin:mimetype("text/xml", "", "xml"),
          admin:mimetype("application/vnd.marklogic-javascript", "sjs", "text"),
          admin:mimetype("application/vnd.marklogic-ruleset", "rules", "text"))
      (: remove intersection to avoid conflicts :)
      let $delete-mimetypes :=
          for $mimetype in $all-mimetypes
          return if ($mimetype//mt:name/data() = $new-mimetypes//mt:name/data()) then $mimetype else ()
      let $config := admin:mimetypes-delete($config, $delete-mimetypes)
      (: save new mimetype definitions :)
      return admin:save-configuration( admin:mimetypes-add( $config, $new-mimetypes))
      (: executing this query will result in a restart of MarkLogic Server :)

      Please be aware that updating the mimetype table results in a MarkLogic Server restart.    You will want to execute this script when MarkLogic Server is idle or during a maintenance window.

      Fixes

      At the time of this writing, it is expected that the upgrade scripts will be improved in a maintenance release of MarkLogic Server so that these updates occur automatically.

      Introduction

      In this article, we discuss use of xdmp:cache-status in monitoring cache status, and explain the values returned.

      Details

      Note that this is a relatively expensive operation, so it’s not something to run every minute, but it may be valuable to run it occasionally for information on current cache usage.

      Output format

      The values returned by xdmp:cache-status are per host, defaulting to the current host. It takes an optional host-id to allow you to gather values from a specific host in the cluster.

      The output of xdmp:cache-status will look something like this:

      <cache-status xmlns="http://marklogic.com/xdmp/status/cache">
        <host-id>18349804367231394552</host-id>
        <host-name>macpro-2113.local</host-name>
        <compressed-tree-cache-partitions>
          <compressed-tree-cache-partition>
            <partition-size>512</partition-size>
            <partition-table>0.2</partition-table>
            <partition-used>0.8</partition-used>
            <partition-free>99.2</partition-free>
            <partition-overhead>0</partition-overhead>
          </compressed-tree-cache-partition>
        </compressed-tree-cache-partitions>
        <expanded-tree-cache-partitions>
          <expanded-tree-cache-partition>
            <partition-size>1024</partition-size>
            <partition-table>0.7</partition-table>
            <partition-busy>0</partition-busy>
            <partition-used>30.4</partition-used>
            <partition-free>69.6</partition-free>
            <partition-overhead>0</partition-overhead>
          </expanded-tree-cache-partition>
        </expanded-tree-cache-partitions>
        <list-cache-partitions>
          <list-cache-partition>
            <partition-size>1024</partition-size>
            <partition-table>0.2</partition-table>
            <partition-busy>0</partition-busy>
            <partition-used>0</partition-used>
            <partition-free>100</partition-free>
            <partition-overhead>0</partition-overhead>
          </list-cache-partition>
        </list-cache-partitions>
        <triple-cache-partitions>
          <triple-cache-partition>
            <partition-size>1024</partition-size>
            <partition-busy>0</partition-busy>
            <partition-used>0</partition-used>
            <partition-free>100</partition-free>
          </triple-cache-partition>
        </triple-cache-partitions>
        <triple-value-cache-partitions>
          <triple-value-cache-partition>
            <partition-size>512</partition-size>
            <partition-busy>0</partition-busy>
            <partition-used>0</partition-used>
            <partition-free>100</partition-free>
          </triple-value-cache-partition>
        </triple-value-cache-partitions>
      </cache-status>
      

      Values

      cache-status contains information for each partition of the caches:

      • The list cache holds search term lists in memory and helps optimize XPath expressions and text searches.
      • The compressed tree cache holds compressed XML tree data in memory. The data is cached in memory in the same compressed format that is stored on disk.
      • The expanded tree cache holds the uncompressed XML data in memory (in its expanded format).
      • The triple cache holds triple data.
      • The triple value cache holds triple values.

      The following are descriptions of the values returned:

      • partition-size: The size of a cache partition, in MB.
      • partition-table: The percentage of the table for a cache partition that is currently used. The table is a data structure that has a fixed overhead per cache entry, for cache admin. This will fix the number of entries that can be resident in the cache. If the partition table is full, something will need to be removed before another entry can be added to the cache.
      • partition-busy: The percentage of the space in a cache partition that is currently used and cannot be freed.
      • partition-used: The percentage of the space in a cache partition that is currently used.
      • partition-free: The percentage of the space in a cache partition that is currently free.
      • partition-overhead: The percentage of the space in a cache partition that is currently overhead.

      When do I get errors?

      You will get a cache-full error when nothing can be removed from the cache to make room for a new entry.

      The "partition-busy" value is the most useful indicator of getting a cache-full error. It tells you what percent of the cache partition is locked down and cannot be freed to make room for a new entry. 
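      As an illustration of watching partition-busy, the Python sketch below parses a cache-status document and lists the partitions whose busy percentage is at or above a threshold. It assumes the XML has already been fetched and saved as text. The function name, the 90% threshold, and the sample values are all invented for the example; they are not MarkLogic recommendations.

```python
import xml.etree.ElementTree as ET

# Namespace used by the cache-status output shown earlier.
NS = {"cs": "http://marklogic.com/xdmp/status/cache"}


def busy_partitions(cache_status_xml, threshold=90.0):
    """Return (partition element name, partition-busy %) for partitions
    whose partition-busy value is at or above threshold."""
    root = ET.fromstring(cache_status_xml)
    flagged = []
    for elem in root.iter():
        local_name = elem.tag.split("}")[-1]
        # Match the per-partition elements, not their "-partitions" containers.
        if not local_name.endswith("-cache-partition"):
            continue
        busy = elem.find("cs:partition-busy", NS)
        # Some partitions report no partition-busy value; skip those.
        if busy is not None and float(busy.text) >= threshold:
            flagged.append((local_name, float(busy.text)))
    return flagged


# Made-up sample: one nearly locked-down partition and one idle partition.
sample = """<cache-status xmlns="http://marklogic.com/xdmp/status/cache">
  <expanded-tree-cache-partitions>
    <expanded-tree-cache-partition>
      <partition-size>1024</partition-size>
      <partition-busy>95.5</partition-busy>
    </expanded-tree-cache-partition>
  </expanded-tree-cache-partitions>
  <list-cache-partitions>
    <list-cache-partition>
      <partition-size>1024</partition-size>
      <partition-busy>0</partition-busy>
    </list-cache-partition>
  </list-cache-partitions>
</cache-status>"""

print(busy_partitions(sample))  # [('expanded-tree-cache-partition', 95.5)]
```

      Because xdmp:cache-status is relatively expensive, a check like this is better suited to occasional sampling than to tight polling.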

       

      We do not recommend configuring multiple forests for the Security database, as this can cause failover issues when doing upgrades and restarts. The Security database should have a single primary forest and replicas on all hosts to ensure High Availability.

      Summary

      When restarting very large forests, some customers have noted that it may take a while for them to mount. While the forests are mounting, the database is unable to come online, thus impacting the availability of your main site. This article shows you how to change a few database settings to improve forest-mounting time.

       


       

      When encountering delays with forest mounting time after restarts, we usually recommend the following settings:

      format-compatibility set to the latest format
      expunge-locks set to none
      index-detection set to none

      Additionally, some customers might be able to spread out the work of memory mapping forest indexes by setting preload-mapped-data to false - though it should be noted that instead of the necessary time being taken during the mounting of the forest, memory-mapped file data will be loaded on demand through page faults as the server accesses it.

      While the above settings should help with forest mounting time, in general, their effects can be situationally dependent. You can read more about each of these settings in our documentation here: http://docs.marklogic.com/admin-help/database. In particular:


      1) Regarding format compatibility: "The automatic detection occurs during database startup and after any database configuration changes, and can take some time and system resources for very large forests and for very large clusters. The default value of automatic is recommended for most installations." So while automatic is recommended in most cases, you should try changing the setting if you're seeing long forest mount times.

      2) Regarding expunge-locks: "Setting this to none is only recommended to speed cluster startup time for extremely large clusters. The default setting of automatic, which cleans up the locks as they expire, is recommended for most installations."

      3) Regarding index-detection: "This detection occurs during database startup and after any database configuration changes, and can take some time and system resources for very large forests and for very large clusters. Setting this to none also causes queries to use the current database index settings, even if some settings have not completed reindexing. The default value of automatic is recommended for most installations"

      It may also be worth considering why forests are taking a long time to mount. If your data size has grown significantly over the lifetime of the affected database, it might be the case that your forests are now overly large, in which case a better approach might be to instead distribute the data across more forests.

      Introduction
       
      MarkLogic Server's 'DatabaseClient' instance represents a database connection sharable across threads. The connection is stateless, except that authentication is done the first time a client interacts with the database via a Document Manager, Query Manager, or other manager. For instance: you may instantiate a DatabaseClient as follows:
       
      // Create the database client

      DatabaseClient client = DatabaseClientFactory.newClient(host, port,
                                                user, password, authType);

      And release it as follows:
      // release the client
      client.release();

      Details on DatabaseClient Usage

      To use the Java Client API efficiently, it helps to know a little bit about what goes on behind the scenes.

      You specify the enode or load balancer host when you create a database client object.  Internally, the database client object instantiates an Apache HttpClient object to communicate with the host.

      The internal Apache HttpClient object creates a connection pool for the host.  The connection pool makes it possible to reuse a single persistent HTTP connection for many requests, typically improving performance.

      Setting up the connection pool has a cost, however.

      As a result, we strongly recommend that applications create one database client for each unique combination of host, database, and user.  Applications should share the database client across threads.  In addition, applications should keep a reference to the database client for the entire life of the application interaction with that host.


      For instance, a servlet might create the database client during initialization and release the database client during destruction. The same servlet may also use two separate database client instances with different permissions, one for read-only users and one with read/write permissions for editors. In the latter case, both client instances are used throughout the life of the servlet and released during servlet destruction.

      Summary

      Clock synchronization plays a critical part in the operation of a MarkLogic Cluster.

      MarkLogic Server expects the system clocks to be synchronized across all the nodes in a cluster, as well as between Primary and Replica clusters. The acceptable level of clock skew (or drift) between hosts is less than 0.5 seconds, and values greater than 30 seconds will trigger XDMP-CLOCKSKEW errors, and could impact cluster availability.
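      The two thresholds above can be expressed as a small check. The Python sketch below is illustrative only: it assumes you already have a clock reading (in seconds) taken from each host at the same instant, and the function name and the "ok", "warning", and "clockskew-error" labels are hypothetical, not MarkLogic terms.

```python
ACCEPTABLE_SKEW_S = 0.5    # acceptable skew is less than this (from the article)
CLOCKSKEW_ERROR_S = 30.0   # skew greater than this triggers XDMP-CLOCKSKEW (from the article)


def classify_skew(host_times):
    """host_times maps host name -> clock reading in seconds, sampled at the same instant.
    Returns (largest skew between any two hosts, label)."""
    skew = max(host_times.values()) - min(host_times.values())
    if skew < ACCEPTABLE_SKEW_S:
        label = "ok"
    elif skew > CLOCKSKEW_ERROR_S:
        label = "clockskew-error"
    else:
        label = "warning"  # outside the acceptable level, below the error threshold
    return skew, label


print(classify_skew({"mlHost01": 100.0, "mlHost02": 100.25}))  # (0.25, 'ok')
print(classify_skew({"mlHost01": 100.0, "mlHost02": 145.0}))   # (45.0, 'clockskew-error')
```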

      Tools

      Network Time Protocol (NTP) is the recommended solution for maintaining system clock synchronization.  NTP services can be provided by public (internet) servers, private servers, network devices, peer servers and more.

      NTP Basics

      NTP uses a daemon process (ntpd) that runs on the host.  The ntpd periodically wakes up, polls the configured NTP servers to get the current time, and then adjusts the local system clock as necessary.  Time can be adjusted in two ways: by immediately changing to the correct time, or by slowly speeding up or slowing down the system clock until it has reached the correct time. The frequency at which the ntpd wakes up, called the polling interval, can be adjusted anywhere between 1 and 17 minutes, based on the level of accuracy needed.  NTP uses a hierarchy of servers called strata.  Each stratum synchronizes with the layer above it, and provides synchronization to the layer below it.

      Public NTP Reference Servers

      There are many public NTP reference servers available for time synchronization.  It's important to note that the most common public NTP reference server addresses are for a pool of servers, so hosts synchronizing against them may end up using different physical servers.  Additionally, the level of polling recommended for cluster synchronization is usually higher, and excessive polling could result in the reference server throttling or blocking traffic from your systems.

      Stand Alone Cluster

      For a cluster that is not replicated or connected to another cluster in some way, the primary concern is that all the hosts in the cluster be in sync with each other, rather than being accurate to UTC.

      Primary/Replica Clusters

      Clusters that act as either Primary or Replicas need to be synchronized with each other for replication to work correctly.  This usually means that the hosts in both clusters should reference the same NTP servers.

      NTP Configuration

      It is common to have multiple servers referenced in the NTP configuration file, /etc/ntp.conf. NTP may not choose the server based on the order in the file.  Because of this, hosts could synchronize with different reference servers, introducing differences in the system clocks between the hosts in the cluster. Many organizations already have devices in their infrastructure that can act as NTP servers, as many network devices are capable of acting as NTP servers, as are Windows Primary Domain Controllers.  These devices can use default polling intervals, which avoids excessive polling against public servers.

      Once you have identified your NTP server, you can configure NTP on the cluster hosts. We suggest using a single reference server for all the cluster hosts, then adding all the hosts in the cluster as peers of the current node.  We also suggest adding an entry for the local host as its own server, assigning it a low-priority stratum.  Using peers and the local host allows the cluster hosts to negotiate and choose one of them to act as the reference server, providing redundancy in case the reference server is unavailable.

      The following is a sample ntp.conf file:

      #The current host has an ip of 10.10.0.1
      server ntpserver burst iburst minpoll 4 maxpoll 4
       
      #All of the cluster hosts are peered with each other.
      peer mlHost01 burst iburst minpoll 4 maxpoll 4
      peer mlHost02 burst iburst minpoll 4 maxpoll 4
      peer mlHost03 burst iburst minpoll 4 maxpoll 4
       
      #Add the local host so the peered servers can negotiate
      # and choose a host to act as the reference server
      server 10.10.0.1
      fudge 10.10.0.1 stratum 10

      The burst option sends a burst of 8 packets at each poll to improve the averaging of time offset statistics.  Using it against a public NTP server is considered abuse.

      The iburst option sends a burst of 8 packets at initial synchronization, which is designed to speed up the initial synchronization.  Using it against a public NTP server is considered aggressive.

      The minpoll and maxpoll settings are exponents of two, with the result measured in seconds, so a setting of 4 is 2^4 = 16 seconds.
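      The power-of-two rule for minpoll and maxpoll can be checked with a short Python sketch. The function name is hypothetical; the exponents 6 and 10 are included on the assumption that they are the usual ntpd defaults, which would correspond to the range of roughly 1 to 17 minutes mentioned under NTP Basics.

```python
def poll_interval_seconds(exponent):
    """Convert a minpoll/maxpoll exponent into a polling interval in seconds."""
    return 2 ** exponent


print(poll_interval_seconds(4))   # 16, the value used in the sample file
print(poll_interval_seconds(6))   # 64, about 1 minute
print(poll_interval_seconds(10))  # 1024, about 17 minutes
```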

      The fudge setting is used to alter the stratum of the server from the default of 0.

      As always, system configuration changes should be tested and validated prior to putting them into production use.


      Summary

      On March 1, 2016, a vulnerability in OpenSSL named DROWN, a man-in-the-middle attack that stands for "Decrypting RSA with Obsolete and Weakened eNcryption", was announced. All MarkLogic Server versions 5.0 and later are *not* affected by this vulnerability.

      Advisory

      The Advisory reported by OpenSSL.org states

      CVE-2016-0800 (OpenSSL advisory)  [High severity] 1st March 2016: 

      A cross-protocol attack was discovered that could lead to decryption of TLS sessions by using a server supporting SSLv2 and EXPORT cipher suites as a Bleichenbacher RSA padding oracle. Note that traffic between clients and non-vulnerable servers can be decrypted provided another server supporting SSLv2 and EXPORT ciphers (even with a different protocol such as SMTP, IMAP or POP) shares the RSA keys of the non-vulnerable server. This vulnerability is known as DROWN (CVE-2016-0800). Recovering one session key requires the attacker to perform approximately 2^50 computation, as well as thousands of connections to the affected server. A more efficient variant of the DROWN attack exists against unpatched OpenSSL servers using versions that predate 1.0.2a, 1.0.1m, 1.0.0r and 0.9.8zf released on 19/Mar/2015 (see CVE-2016-0703 below). Users can avoid this issue by disabling the SSLv2 protocol in all their SSL/TLS servers, if they've not done so already. Disabling all SSLv2 ciphers is also sufficient, provided the patches for CVE-2015-3197 (fixed in OpenSSL 1.0.1r and 1.0.2f) have been deployed. Servers that have not disabled the SSLv2 protocol, and are not patched for CVE-2015-3197 are vulnerable to DROWN even if all SSLv2 ciphers are nominally disabled, because malicious clients can force the use of SSLv2 with EXPORT ciphers. OpenSSL 1.0.2g and 1.0.1s deploy the following mitigation against DROWN: SSLv2 is now by default disabled at build-time. Builds that are not configured with "enable-ssl2" will not support SSLv2. Even if "enable-ssl2" is used, users who want to negotiate SSLv2 via the version-flexible SSLv23_method() will need to explicitly call either of: SSL_CTX_clear_options(ctx, SSL_OP_NO_SSLv2); or SSL_clear_options(ssl, SSL_OP_NO_SSLv2); as appropriate. 
Even if either of those is used, or the application explicitly uses the version-specific SSLv2_method() or its client or server variants, SSLv2 ciphers vulnerable to exhaustive search key recovery have been removed. Specifically, the SSLv2 40-bit EXPORT ciphers, and SSLv2 56-bit DES are no longer available. In addition, weak ciphers in SSLv3 and up are now disabled in default builds of OpenSSL. Builds that are not configured with "enable-weak-ssl-ciphers" will not provide any "EXPORT" or "LOW" strength ciphers. Reported by Nimrod Aviram and Sebastian Schinzel.

      Fixed in OpenSSL 1.0.1s (Affected 1.0.1r, 1.0.1q, 1.0.1p, 1.0.1o, 1.0.1n, 1.0.1m, 1.0.1l, 1.0.1k, 1.0.1j, 1.0.1i, 1.0.1h, 1.0.1g, 1.0.1f, 1.0.1e, 1.0.1d, 1.0.1c, 1.0.1b, 1.0.1a, 1.0.1)

      Fixed in OpenSSL 1.0.2g (Affected 1.0.2f, 1.0.2e, 1.0.2d, 1.0.2c, 1.0.2b, 1.0.2a, 1.0.2)

      MarkLogic Server Details

      MarkLogic Server disallows SSLv2 and disallows weak ciphers in all supported versions.  As a result, MarkLogic Server is not affected by this vulnerability.

      Whenever MarkLogic releases a new version of MarkLogic Server, OpenSSL versions are reviewed and updated. 

       

      Introduction

      Ops Director enables you to monitor MarkLogic clusters ranging from a single node to large multi-node deployments. A single Ops Director server can monitor multiple clusters. Ops Director provides a unified browser-based interface for easy access and navigation.

      Ops Director presents a consolidated view of your MarkLogic infrastructure, to streamline monitoring and troubleshooting of clusters with alerting, performance, and log data. Ops Director provides enterprise-grade security of your cluster configuration and performance data with robust role-based access control and information security powered by MarkLogic Server.

      Problems installing Ops Director 2.0.0, 2.0.1 & 2.0.1-1

      Check gradle.properties

      To successfully install Ops Director, the value for mlhost in gradle.properties must be a hostname, and that hostname must match the name of one of the hosts in the cluster.  You cannot use localhost to install Ops Director, nor can you use a host name other than one that is listed as a host in the cluster, as this affects the use of certificates for authentication to the OpsDirectorSystem application server.

      Check for App-Services

      Ops Director can sometimes encounter errors when attempting to install in groups other than Default. To successfully install, the Ops Director installer needs to be able to connect to the App-Services application server on port 8000 in the group where Ops Director is being installed.  There are two ways to work around this issue:

      • Create a copy of the App-Services app server in the new group, then install Ops Director
        • Be aware this allows QConsole access in the new group, for users with appropriate privileges. 
        • If you wish to prevent QConsole access in that group, the App-Services application server should be deleted after Ops Director has been installed.
      • Install Ops Director in the Default group, then move the host to the new group, and create the OpsDirector app servers in the new group.
        • Be aware this allows Ops Director access to remain in the Default group.
        • If you wish to prevent Ops Director access in the Default, the Ops Director application servers should be deleted from the Default group.
          • To do this you must also copy the scheduled tasks associated with Ops Director over to the new group, and delete the scheduled tasks from the old group

      See the attached Workspace OpsDirCopyAppServers.xml which has scripts to do the following:

      • Copy and/or remove the App-Services app server
      • Copy and/or remove the OpsDirectorSystem/OpsDirectorApplication/SecureManage app servers
      • Copy and/or remove the scheduled tasks associated with the Ops Director application.

      Also note that Ops Director will install forests on all hosts in the cluster, regardless of group assignments.

      Managing a Cluster

      Check DNS Settings

      When setting up a managed host, it's important to note that the hosts in both the Ops Director cluster, and the cluster being managed must be able to resolve hostnames via DNS.  Modifying the /etc/hosts file is not sufficient.

      Check Ops Director Scheduled Tasks

      When setting up a managed host, you may encounter an XDMP-DEADLOCK error, or have an issue seeing the data for a managed cluster.  If this occurs, do the following:

      • Un-manage the affected cluster.  If there are any issues un-managing the cluster, use the procedures in this KB under 'Problems with Un-managing Clusters' to un-manage the cluster
      • Disable the scheduled tasks associated with Ops Director
        • /common/tasks/info.xqy
        • /common/tasks/running.xqy
        • /common/tasks/expire.xqy
        • /common/tasks/health.xqy
      • Manage the cluster again
      • Enable the scheduled tasks that were disabled

      Upgrading Ops Director

      When upgrading to a new version of Ops Director, it is frequently necessary to uninstall the previous version.  To do that, you must un-manage any clusters being managed by Ops Director, prior to uninstalling the application.

      Un-managing Clusters

      The first step in uninstalling Ops Director is to stop any clusters from being managed by Ops Director.  This is done via the Admin UI on a host in the managed cluster, as detailed in the Ops Director Guide: Disconnecting a Managed Cluster from Ops Director

      Uninstalling Ops Director 2.0.0 & 2.0.1

      These versions of Ops Director use the ml-gradle plugin for deployment.  To uninstall these versions, you will also use gradle, as detailed in the Ops Director Guide: Removing Ops Director 2.0.0 and 2.0.1

      Uninstalling Ops Director 1.1 or Earlier

      If you are using the 1.1  version that was installed via the Admin UI, then it can be uninstalled via the Admin UI as detailed in the Ops Director Guide: Removing Ops Director 1.1 or Earlier

      Problems with Uninstalling Ops Director

      Occasionally an Ops Director installation may partially fail, due to misconfiguration, or missing dependencies.  Issues can also occur that prevent the standard removal methods from working correctly.  In these cases, Ops Director can be removed manually using the attached QConsole Workspace, OpsDirRemove.xml.  The instructions for running the scripts are contained in the first tab of the workspace.

      Problems with Un-managing Clusters

      Occasionally, disconnecting a managed cluster from Ops Director may partially fail.  If this occurs, you can use the attached QConsole Workspace, OpsDirUnmanage.xml.  The instructions for running the scripts are contained in the first tab of the workspace.

      Further Reading

      Installing, Uninstalling, and Configuring Ops Director

      Monitoring MarkLogic with Ops Director

      Summary

      This article briefly looks at the performance implications of ad hoc queries versus passing external variables to a query in a module.

      Details

      Programmatically, you can achieve similar results by dynamically generating ad hoc queries on the client as you can by defining your queries in modules and passing in external variable values as necessary.

      Dynamically generating ad hoc queries on the client side results in each of your queries being compiled and linked with library modules before they can be evaluated - for every query you submit. In contrast, queries in modules only experience that performance overhead the first time they're invoked.

      While it's possible to submit queries to MarkLogic Server in any number of ways, in terms of performance, it's far better to define your queries in modules, passing in external variable values as necessary.
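
      As a minimal sketch of the module-based approach, assume a main module stored at a hypothetical path such as /queries/find-by-title.xqy in the modules database. The path, the variable name $title, and the title element are illustrative only and are not taken from any particular application:

```xquery
xquery version "1.0-ml";

(: Hypothetical module: /queries/find-by-title.xqy :)
(: The caller supplies $title at invocation time as an external variable. :)
declare variable $title as xs:string external;

cts:search(
  fn:doc(),
  cts:element-value-query(xs:QName("title"), $title)
)
```

      A caller could then run it with xdmp:invoke("/queries/find-by-title.xqy", (xs:QName("title"), "Hamlet")), changing only the variable value between requests rather than the query text.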

      Summary

      MarkLogic does not enforce a programmatic upper limit on how many range indexes you *can* have. This leaves open the question of how many range indexes should be used in your application. The answer is that you should have as many as the application requires, but with the caveat that there are some infrastructure limits that should be taken into account. For instance:

      1. More Memory Mapped file Handles (file fd)

      The operating system limits how many file handles a given process can have open at a given point in time. This limit therefore affects how many range index files, and hence how many range indexes, a given MarkLogic process can have. However, higher file handle limits can be configured on most platforms (ulimit, vm.max_map_count).

      2. More RAM requirement 

      The in-memory footprint of a node includes in-memory structures such as the in-memory list cache, in-memory tree cache, in-memory range index, in-memory reverse index (if reverse queries are enabled), and in-memory triple index (if triple positions are enabled). Multiply those by the total number of forests and add a buffer.

      A large number of range indexes can result in a large expansion in memory use. Also, the values mentioned above are in addition to the memory that MarkLogic Server requires to maintain its HTTP servers, perform merges, reindex, and rebalance, as well as to process queries.

      Tip: Memory consumption can be reduced by configuring a database to optimize range indexes for minimum memory usage (memory-size). The default is optimized for maximum performance (facet-time).

      UI : Admin UI > Databases > {database-name} > Configure > range index optimize [facet-time or memory-size]

      API : admin:database-set-range-index-optimize 
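
      A minimal sketch of the API route, assuming a database named "test" (a placeholder) and a user with the privileges needed to change the configuration:

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "test" is a placeholder database name. :)
let $config := admin:get-configuration()
let $db := xdmp:database("test")
(: Accepted values are "facet-time" (the default) and "memory-size". :)
let $config := admin:database-set-range-index-optimize($config, $db, "memory-size")
return admin:save-configuration($config)
```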

      3. Longer Merge Times (Bigger stands due to Large index expansion)

      A large number of range indexes expands the data in forests. For a given host size and number of hosts, larger stand sizes in a forest will make range index queries faster; however, they will also make merges slower. To keep both queries and merges fast with a large number of range indexes, you will need to scale out the number of physical hosts.

      4. More CPU, Disk & IO requirement 

      Merges are IO intensive processes; this, combined with frequent updates/load could result in CPU as well as IO bottlenecks.

      5. Longer Forest Mount times

      In general, each configured range index with data takes two memory mapped files per stand.

      A typical busy host has on the order of 10 forests, each forest with on the order of 10 stands; So a typical busy host has on the order of 100 stands.

      Now, for 100 stands:

      • With 100 range indexes, we have on the order of 10,000 files to open and map when the server starts up.
      • With 1,000 range indexes, we have on the order of 100,000 files to open and map when the server starts up.
      • With 10,000 range indexes, we have on the order of 1,000,000 files to open and map when the server starts up.
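
      The arithmetic behind these estimates can be sketched as follows. This is only an illustration using the figures quoted above (roughly 100 stands per host and two mapped files per range index per stand); exact counts on a real host will differ, and the results land within the same order of magnitude as the bullet points:

```xquery
xquery version "1.0-ml";

(: Figures taken from the text: about 10 forests with about 10 stands each,
   and two memory mapped files per range index per stand. :)
let $stands := 10 * 10
let $files-per-index-per-stand := 2
for $range-indexes in (100, 1000, 10000)
return
  fn:concat(
    $range-indexes, " range indexes: ",
    $stands * $range-indexes * $files-per-index-per-stand, " mapped files"
  )
(: 100 -> 20000, 1000 -> 200000, 10000 -> 2000000 :)
```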

      As the number of range indexes increases, at some point the server will take an unreasonably long time to start up (unless equivalent processing power is added).

      The amount of time one is willing to wait for the server to start up is not a hard limit; the question should be "what is 'reasonable' behavior for server start-up in the eyes of the server administrator, based on the current hardware?"

      Conclusion

      Range indexes numbering on the order of a thousand start to affect performance if they are not managed properly and if the considerations above are not taken into account. In most scenarios, the question to ask is not "How many indexes can we configure?" but rather "How many indexes do we need?"

      MarkLogic considers a number of configured range indexes on the order of 100 to be a “reasonable” limit, because it results in “reasonable” behavior of the server.

      Tips for Best Performance for Solutions with lots of Range Indexes

      Before launching your application, review the number of Range Indexes and work to 1) Remove ones that are not being used, and 2) Consolidate any range indexes that are mutually redundant. This will help you get under the prescribed 100 range index limit.

      On systems that already have a large number of range indexes (say 100+), merging multiple stands may become a performance issue, so you will need to think about easing the query and merge load. Here are some strategies for easing the load on your system:

      1. Increase merge-max-size from 32768 to 49152 on your database. This will create larger stands and will lower the number of merges that need to be performed.
      2. The configuration setting "preload mapped data" defaults to false; leaving it as false will speed up the merging of forest stands. Bear in mind that this will come at the cost of slower query performance immediately after forests mount.
      3. If your system begins to slow down due to merging activity, you can spread the load by adding more hosts & forests to your cluster. The smaller forests and stands will merge and load faster when there are more CPU cores and IO bandwidth to service them.
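
      The first two settings can also be applied through the Admin API. The following is a sketch only, assuming a placeholder database named "test"; check the current values on your own database before changing them:

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "test" is a placeholder database name. :)
let $config := admin:get-configuration()
let $db := xdmp:database("test")
(: Strategy 1: allow larger stands so that fewer merges are needed. :)
let $config := admin:database-set-merge-max-size($config, $db, 49152)
(: Strategy 2: keep "preload mapped data" switched off. :)
let $config := admin:database-set-preload-mapped-data($config, $db, fn:false())
return admin:save-configuration($config)
```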

      Further Reading

      Performance implications of updating Module and Schema databases

      This article briefly looks at the performance implications of adding or modifying modules or schemas to live (production) databases.

      Details

      When XQuery modules or schemas are referenced for the first time after upload, they are parsed and then cached in memory so that subsequent access is faster.

      When a module is added or updated, the modules cache is invalidated and every module (for all Modules databases within the cluster) will need to be parsed again before it can be evaluated by MarkLogic Server.

      Special consideration should be made when updating modules or schemas in a production environment as reparsing can impact the performance of MarkLogic server for the duration that the cache is being rebuilt.

      MarkLogic was designed with the assumption that modules and schemas are rarely updated. As such, the recommendation is that updates to modules or schemas in production environments are made during periods of low activity or out of hours.

      Further reading

      Overview

      Performance issues in MarkLogic Server typically involve either 1) unnecessary waiting on locks or 2) overlarge workloads. The goal of this knowledgebase article is to give a high level overview of both of these classes of performance issue, as well as some guidelines in terms of what they look like - and what you should do about them.

      Waiting on Locks

      We often see customer applications waiting on unnecessary read or write locks. 

      What does waiting on read or write locks look like? You can see read or write lock activity in our Monitoring History dashboard at port 8002 in the Lock Rate, Lock Wait Load, Lock Hold Load, and Deadlock Wait Load displays. This scenario will typically present with low resource utilization, but spikes in the read/write lock displays and high request latency.

      What should you do when faced with unnecessary read or write locks? Remediation of this scenario almost always involves optimizing the request code, the data model, or both. Additional hardware resources will not help in this case because the workload is not bound by any hardware resource. You can learn more about data model optimizations through MarkLogic University's On-Demand courses, in particular XML and JSON Data Modeling Best Practices and Impact of Normalization: Lessons Learned.

      Relevant Knowledgebase articles:

      1. Understanding XDMP Deadlock
      2. How Do Updates Work in MarkLogic Server?
      3. Fast vs Strict Locking
      4. Read Only Queries Run at a Timestamp & Update Transactions use Locks
      5. Performance Theory: Tales From MarkLogic Support

      Overlarge Workloads

      Overlarge workloads typically take two forms: a. too many concurrent workloads or b. work intensive individual requests

      Too Many Concurrent Workloads

      With regard to too many concurrent workloads - we often see clusters exhibit poor performance when subjected to many more workloads than the cluster can reasonably handle. In this scenario, any individual workload could be fine - but when the total amount of work over many, many concurrently running workloads is large, the end result is often the oversubscription of the underlying resources.

      What does too many concurrent workloads look like? You can see this scenario in our Monitoring History at port 8002, in the Disk I/O, CPU, Memory Footprint, App Server Request Rate, App Server Latency, or Task Server Queue Size displays. This scenario will typically present with spikes in both App Server Latency and App Server Request Rate, and correlated maximum level plateaus in one or more of the aforementioned hardware resource utilization charts.

      What should you do when faced with too many concurrent workloads? Remediation of this scenario pretty much always involves the addition of more rate-limiting hardware resource(s). This assumes, of course, that request code and/or data model are both already fully optimized. If either could be further optimized, then it might be possible to enable a higher request count given the same amount of resources - see the "Work Intensive Individual Requests" section, below. Rarely, in circumstances where traffic spikes are unpredictable - but likely - we’ve seen customers incorporate load shedding or traffic management techniques in their application architectures. For example, when request times pass a certain threshold, traffic is then routed through a less resource hungry code path.

      Note that concurrent workloads entail both request workload and maintenance activities such as merging or reindexing. If your cluster is not able to serve both requests and maintenance activities, then the remediation tactics are the same as listed above: you need to either a. add more rate-limiting hardware resource(s) to serve both, or b. incorporate load shedding or traffic management techniques, such as restricting maintenance activities to periods where the necessary resources are indeed available.

      Relevant Knowledgebase articles:

      1. When submitting lots of parallel queries, some subset of those queries take much longer - why?
      2. How reindexing works, and its impact on performance
      3. MarkLogic Server I/O Requirements Guide
      4. Sizing E-nodes
      5. Performance Theory: Tales From MarkLogic Support

      Work Intensive Individual Requests

      With regard to work intensive individual requests - we often see clusters exhibit poor performance when individual requests attempt to do too much work. Too much work can entail an unoptimized query, but it can also be seen when an otherwise optimized query attempts to work over a dataset that has grown past its original hardware specification.

      What do work intensive requests look like? You can see this scenario in our Monitoring History at port 8002, in the Disk I/O, CPU, Memory Footprint, App Server Request Rate, App Server Latency, or Task Server Queue Size displays. This scenario will typically present with spikes in one or more system resources (Disk I/O, CPU, Memory Footprint) and App Server Latency. In contrast to the "Too Many Concurrent Workloads" scenario, App Server Request Rate should not exhibit a spike.

      What should you do when faced with work intensive requests? As in the case with too many concurrent requests, it's sometimes possible for customers to address this situation with additional hardware resources. However, remediation in this scenario more typically involves finding additional efficiencies via code or data model optimizations. Code optimizations can be made with the use of xdmp:plan() and xdmp:query-trace(). You can learn more about data model optimizations through MarkLogic University's On-Demand courses, in particular XML and JSON Data Modeling Best Practices and Impact of Normalization: Lessons Learned. If the increase in work is rooted in data growth, it's also possible to reduce the amount of data. Customers pursuing this route will typically do periodic data purges or use features like Tiered Storage.

      Relevant Knowledgebase articles:

      1. Gathering information to troubleshoot long-running queries
      2. Fast searches: resolving from the indexes vs. filtering
      3. What do I do about XDMP-LISTCACHEFULL errors?
      4. Resolving XDMP-EXPNTREECACHEFULL errors
      5. When should I look into query or data model tuning?
      6. Performance Theory: Tales From MarkLogic Support

      Additional Resources

      1. Monitoring MarkLogic Guide
      2. Query Performance and Tuning Guide
      3. Performance: Understanding System Resources

       

      This article is a snapshot of the talk that Jason Hunter and Franklin Salonga gave at MarkLogic World 2014, also titled, “Performance Theory: Tales From The MarkLogic Support Desk.” Jason Hunter is Chief Architect and Frank Salonga is Lead Engineer at MarkLogic. 

      MarkLogic is extremely well-designed, and from the ground up it’s built for speed, yet many of our support cases have to do with performance. Often that’s because people are following historical conventions that no longer apply. Today, there are big-memory systems using a 64-bit address space with lots of CPU cores, holding disks that are insanely fast (but that haven’t grown in speed as much as they have in size*), hooked together by high-speed bandwidth. MarkLogic lives natively in this new reality, and that changes the guidelines you want to follow for finding optimal performance in your database.

      The Top 10 (Actually 16) Tips

      The following is a list of the top 16 tips for realizing optimal performance when using MarkLogic, all based on some of the common problems encountered by our customers:

      1. Buy Enough Iron
      MarkLogic is optimized for server-grade systems, those just to the left of the hockey-stick price jump. Today (April 2014) that means 16 cores, 128-256 Gigs of RAM, 8-20 TB of disk, 2 disk controllers.

      2. Aim for 100KB docs +/- 2 Orders of Magnitude
      MarkLogic’s internal algorithms are optimized for documents around 100 KB (remember, in MarkLogic, each document should be one unit of query and should be seen more like relational rows than tables). You can go down to 1 KB but below that the memory/disk/lock overhead per document starts to be troublesome. And, you can go up to 10 MB but above that line the time to read it off disk starts to be noticeable.

      3. Avoid Fragmentation
      Just avoid it, but if you must, then understand the tradeoffs.  See also Search and Fragmentation.

      4. Think of MarkLogic Like an Only Child
      It’s not a bug to use 100 percent of the CPU—that’s a feature. MarkLogic assumes you want maximum performance given available resources. If you’re using shared resources (a SAN, a virtual machine) you may want to impose restrictions that limit what MarkLogic can use.

      5. Six Forests, Six Replicas
      Every use case is different, but in general deployments of MarkLogic 7 are proving optimal with 6 forests on each computer and (if doing High Availability) 6 replicas.

      6. Earlier Indexing is Better Indexing
      Adding an index after loading requires touching every document with data relating to that index. Turning off an index is instant, but no space will be reclaimed until the re-index occurs. A little thought into index settings before loading will save you time.

      7. Filtering: Your Friend or Foe
      Indexes isolate candidate documents, then filtering verifies the hits. Filtering lets you get accurate results even without accurate indexes (e.g., a case sensitive query without the case sensitive index). So, watch out, as filtering can hide bad index settings! If you really trust the indexes, you can use “unfiltered.” It is best to perfect your index settings in a small test environment, then apply them to production.

      8. Use Meaningful Markup If You Can
      If you can use meaningful markup (where the tags describe the content they hold) you get both prettier XML and XML that’s easier to write indexes against.

      9. Don’t Try to Outsmart Merging
      Contact support if you plan to change any of the advanced merge settings (max size, min size, min ratio, timeout periods). You shouldn’t usually tweak these. If you’re thinking about merge settings, you’re probably underprovisioned (See Recommendation #1).

      10. Big Reads Go In Queries, Not Updates
      Hurrah! Using MVCC for transaction processing means lock-free reads. But, to be a “read” your module can’t include any update calls. This is determined by static analysis in advance, so even if the update call isn’t made, it still changes your behavior. Locks are cheap but they’re not free, and any big search to find the top 10 results will lock the full result set during the sort. Whenever possible, do update calls in a separate nested transaction context using xdmp:invoke() with an option specifying “different-transaction”.
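
      As a rough sketch of this pattern, the main module below only reads, and hands its write to a separate module that runs in its own transaction. The module path /updates/record-hit-count.xqy and the variable name hits are hypothetical; the invoked module is assumed to declare $hits as an external variable and to perform the update itself:

```xquery
xquery version "1.0-ml";

(: This main module contains no update calls, so it can run as a
   lock-free query at a timestamp. :)
let $hits := xdmp:estimate(cts:search(fn:doc(), cts:word-query("example")))
return
  (: The write happens in a separate transaction inside the invoked module. :)
  xdmp:invoke(
    "/updates/record-hit-count.xqy",
    (xs:QName("hits"), $hits),
    <options xmlns="xdmp:eval">
      <isolation>different-transaction</isolation>
    </options>
  )
```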

      11. Taste Test
      Load a bit of data early, so you can get an idea about rates, sizes, and loads. Different index settings will affect performance and sizes. Test at a few sizes because some things scale linearly, some logarithmically.

      12. Measure
      Measure before. Measure after. Measure at all levels. When you know what’s normal, you can isolate when something goes different. MarkLogic 7 can internally capture “Monitoring History” to a Meters database. There are also tools such as Cacti, Ganglia, Nagios, Graphite, and others.

      13. Keep a Staging Box
      A staging box (or cluster) means you can measure changes in isolation (new application code, new indexes, new data models, MarkLogic upgrades, etc.). If you’re running on a cluster, then stage on a cluster (because you’ll see the effects of distribution, like net traffic and 2-phase commits). With AWS it’s easier than ever to “spin up” a cluster to test something.

      14. Adjust as Needed
      You need to be measuring so you know what is normal and then know what you should adjust. So, what can you adjust?

      • Code: Adjusting your code often provides the biggest bang
      • Memory sizes: The defaults assume a combo E-node/D-node server
      • Indexes: Best in advance, maybe during tasting. Or, try on staging
      • Cluster size and forest distribution: This is much easier in MarkLogic 7

      15. Follow Our Advice on Swap Space
      Our release notes tell you:

      • Windows: 2x the physical memory
      • Linux: 1x the physical memory (minus any huge pages), or 32GB, whichever is lower
      • Solaris: 1x-2x the physical memory

      MarkLogic doesn’t intend to leverage swap space! But, for an OS to give memory to MarkLogic, it wants the swap space to exist. Remember, disk is 100x cheaper than RAM, and this helps us use the RAM.

      16. Don’t Forget New Features
      MarkLogic has plenty of features that help with performance, including MLCP, tiered storage, and semantics. With the MLCP fast-load option, you can perform forest assignments on the client, and directly insert to that forest. It’s really a sharp tool, but don’t use it if you’re changing forest topology or assignment policies. With tiered storage, you can use HDFS as cheap mass storage of data that doesn’t need high performance. Remember, you can “partition” data (e.g., based on dates) and let it age to slower disks. With semantics, you have a whole new way to model your data, which in many cases can produce easier to optimize queries.

      That’s it! With these pro tips, you should be able to handle the most common performance issues. 

      *With regard to storage, as you add capacity, it is critical that you add throughput in order to maintain a fast system (http://tylermuth.wordpress.com/2011/11/02/a-little-hard-drive-history-and-the-big-data-problem/)

      Introduction

      Administrators can achieve very fine granularity on restores when incremental backups are used in conjunction with journal archiving.

      Details

      Journal archiving can enable a restore to be performed to any timestamp since the last incremental backup.  For example, when using daily incremental backups in conjunction with 24-hour log archive retention, a restore can be made to any point in the previous 24 hours.

      This capability enables administrators to go back to the exact point in time before a user error caused bad data to be ingested into the database, minimizing any data loss on the restore. Although this is a very powerful capability, the restore itself is simple to perform: the administrator supplies a timestamp, and the server restores the backup set and replays the journal up to that timestamp.

      For further information, see the documentation Restoring from an Incremental Backup with Journal Archiving.
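
      A hedged sketch of such a point-in-time restore is shown below. The database name, backup directory, and timestamp are placeholders, and the argument order reflects my understanding of the xdmp:database-restore signature; please verify it against the function documentation for your server version before relying on it:

```xquery
xquery version "1.0-ml";

xdmp:database-restore(
  xdmp:database-forests(xdmp:database("test")),  (: forests to restore (placeholder database) :)
  "/backups/test",                                (: hypothetical backup directory :)
  xs:dateTime("2016-02-12T12:00:00Z"),            (: point in time to restore to :)
  fn:true(),                                      (: replay archived journals :)
  (),                                             (: journal archive path, left unspecified here :)
  fn:true(),                                      (: include incremental backups :)
  ()                                              (: incremental backup path, left unspecified here :)
)
```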

      Summary

      There are index settings that may be problematic if your documents contain encoded binary data (such as Base64 encoded binary).  This article identifies a couple of these index settings and explains the potential pitfall.

      Details

      When word lexicons or string range indexes are enabled, each stand in the database's forests will contain a file called the 'atom data' file.  The contents of this file include all of the relevant unique tokens, which could be all the unique tokens in the stand.  If your documents contain encoded binary data, all of the encoded binary may be replicated as atom data and stored in the atom data file.

      Pitfall: There is an undocumented limit on the size of the atom data file of 4GB.  If this limit is exceeded for the content of a forest, then stand merges will begin to fail with the error

          "XDMP-FORESTERR: Error in merge of forest forest-nameSVC-MAPBIG: Mapped file too large to map: NNN bytes: '\path\Forests\forest-name\stand-id\AtomData'"

      Workarounds

      There are a few options that you can pursue to get around these problems

      1. Do not include encoded binary data in your documents.  An alternative is to store the binary content separately using MarkLogic Server support for binary documents and to include a reference to the binary document in the original.

      2. If word lexicons are required, and the encoded binary data is limited to a finite number of elements in your documents, then you can create word query exclusions for those elements. In the MarkLogic Server Admin UI, word query element exclusions can be configured by navigating to Configure -> Databases -> {database-name} -> Word Query -> Exclude tab. 

      3. If a string range index is defined on an element that contains encoded binary, then you can either remove the string range index or change the document data model so that the element containing the encoded binary is not shared with an element that requires a string range index. 
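
      For the second workaround, the exclusion can also be scripted with the Admin API. This is a sketch under assumptions: "test" is a placeholder database, and encodedData is a hypothetical no-namespace element that holds the Base64 content:

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $db := xdmp:database("test")
(: Hypothetical element (in no namespace) that carries the encoded binary. :)
let $excluded := admin:database-excluded-element("", "encodedData")
let $config := admin:database-add-word-query-excluded-element($config, $db, $excluded)
return admin:save-configuration($config)
```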

       

       

      Introduction

      Looking at the MarkLogic Admin UI, you may have noticed that the status page for a given database displays the last backup date and time for that database. We have been asked in the past how this gets computed so the same check can be performed using your own code. This Knowledgebase article shows examples that utilise XQuery to get this information and explores the possibility of retrieving this using the MarkLogic ReST API.

      XQuery: How does the code work?

      The simple answer is in the forest status for each of the forests in the database (note these values only appear if you have created a backup already).  For the sake of these examples, let's say I have a database (called "test") which contains 12 forests (test-1 to test-12).  I can get the backup status using a call to our ReST API:

      http://localhost:8002/manage/LATEST/forests/test-1?view=status&format=html

      In the results returned, you should see something like:

      last-backup : 2016-02-12T12:30:39.916Z datetime
      last-incr-backup : 2016-02-12T12:37:29.085Z datetime
      

      In generating that status page in the MarkLogic Admin UI code, we create an aggregate - a database doesn't contain documents in MarkLogic, it contains forests and those forests contain documents.

      Continuing the example above (with a database called "test" containing 12 forests) if I run the following:
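
      The query itself is missing from this copy of the article. A minimal sketch that would produce the output described, using only built-in functions and an arbitrary fs prefix bound to the forest status namespace, is:

```xquery
xquery version "1.0-ml";
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Status of every forest attached to the "test" database,
   narrowed to the forest names. :)
xdmp:forest-status(
  xdmp:database-forests(xdmp:database("test"))
)/fs:forest-name
```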

      This will return the forest status(es) for all forests in the database "test" and return the forest names using XPath, so in my case, I would see:

      <forest-name xmlns="http://marklogic.com/xdmp/status/forest">test-1</forest-name>
      [...]
      <forest-name xmlns="http://marklogic.com/xdmp/status/forest">test-12</forest-name>
      

      The MarkLogic Admin UI interrogates each forest in turn for that database and finds the metrics for the last backup.  To put that into context, if we ran the following:
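
      The query is again missing from this copy. A sketch that would return one last-backup element per forest, assuming the same placeholder database "test", is:

```xquery
xquery version "1.0-ml";
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: One last-backup element per forest; nothing is returned for a forest
   that has never been backed up. :)
xdmp:forest-status(
  xdmp:database-forests(xdmp:database("test"))
)/fs:last-backup
```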

      This gives us:

      <last-backup xmlns="http://marklogic.com/xdmp/status/forest">2016-02-12T12:30:39.946Z</last-backup>
      [...]
      <last-backup xmlns="http://marklogic.com/xdmp/status/forest">2016-02-12T12:30:39.925Z</last-backup>
      

      The code (or the status report) doesn't want values for all 12 forests, it just wants the time the last forest completed the backup (because that's the real time the backup completed), so our code is running a call to fn:max:
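
      The Admin UI's own code is not shown here; a sketch of the equivalent call, with an explicit cast to xs:dateTime added as a precaution, could look like this:

```xquery
xquery version "1.0-ml";
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Most recent full-backup completion time across all forests of "test". :)
fn:max(
  xdmp:forest-status(
    xdmp:database-forests(xdmp:database("test"))
  )/fs:last-backup ! xs:dateTime(.)
)
```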

      Which gives us the max value (as these are all xs:dateTimes, it's finding the most recent date), which in the case of this example is:

      2016-02-12T12:30:39.993Z

      The same is true for the last incremental backup (note all that we're changing here is the XPath to get to the correct element):

      So we can get the max value for this by getting the most recent time across all forests:
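
      As before, the original queries are not present in this copy. A sketch covering the incremental case, where only the element name in the XPath changes, is:

```xquery
xquery version "1.0-ml";
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

let $status :=
  xdmp:forest-status(xdmp:database-forests(xdmp:database("test")))
(: Most recent incremental-backup completion time across all forests. :)
return fn:max($status/fs:last-incr-backup ! xs:dateTime(.))
```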

      This would give us 2016-02-12T12:37:29.161Z

      Using the ReST API

      The ReST API does allow you to get this information but you'd need to jump through a few hoops to get to it:

      The ReST API status for a given database would give you the names of all the forests attached to that database:

      http://localhost:8002/manage/LATEST/databases/test

      And from there you could GET the information for all of those forests:

      http://localhost:8002/manage/LATEST/forests/test-1?view=status&format=html
      [...]
      http://localhost:8002/manage/LATEST/forests/test-12?view=status&format=html

      Once you'd got all those values, you could calculate the max values for them - but at this point, I think it would make more sense to write a custom endpoint that returns this information, something like:
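
      The original module is not included in this copy. A minimal sketch of such an endpoint, assuming it is deployed on an HTTP App Server, that the caller has the privileges needed to read forest status, and that the wrapper element names (backup-status and its children) are free to choose, might be:

```xquery
xquery version "1.0-ml";
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: The database name arrives as the "db" request field, e.g. ?db=test :)
let $db := xdmp:get-request-field("db")
let $status :=
  xdmp:forest-status(xdmp:database-forests(xdmp:database($db)))
return
  <backup-status database="{$db}">
    <last-backup>{fn:max($status/fs:last-backup ! xs:dateTime(.))}</last-backup>
    <last-incr-backup>{fn:max($status/fs:last-incr-backup ! xs:dateTime(.))}</last-incr-backup>
  </backup-status>
```

      This sketch does no input validation; a production endpoint would need to handle a missing or unknown db value.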

      Where you could make a call to that module to get the aggregates (e.g.):

      http://[server]:[port]/[modulename.xqy]?db=test

      This would return the database status for any given parameter-name that is passed in.

      Introduction

      In this Knowledgebase article, we will discuss a technique which will allow you to scope queries in such a way to ensure that they occur only within a parent element.

      Details

      cts:element-query

      Consider a scenario where you have an XML document structured in this way:

      <rootElement>
        <id>7635940284725382398</id>
        <parentElement>
          <childElement1>valuea</childElement1>
          <childElement2>false</childElement2>
        </parentElement>
        <parentElement>
          <childElement1>valuea</childElement1>
          <childElement2>truthy</childElement2>
        </parentElement>
        <parentElement>
          <childElement1>valueb</childElement1>
          <childElement2>true</childElement2>
        </parentElement>
        <childElement1>valuec</childElement1>
      </rootElement>

      And you want to find the document where a parentElement has a childElement1 with a value of 'valuec'.

      A search like

      cts:search (/,
          cts:element-value-query(xs:QName('childElement1'), 'valuec', 'exact')
      )

      will give you the above document, but doesn't consider where the childElement1 value is. This isn't what you want. Search queries perform matching per fragment, so there is no constraint that childElement1 be in any particular spot in the fragment.

      Wrapping a cts:element-query around a subquery will constrain the subquery to exist within an instance of the named element. Therefore,

      cts:search (/,
          cts:element-query (
              xs:QName ('parentElement'),
              cts:element-value-query(xs:QName('childElement1'), 'valuec', 'exact')
          )
      )

      will not return the above document since there is no childElement1 with a value of 'valuec' inside a parentElement.

      This applies to more-complicated subqueries too. For example, looking for a document that has a childElement1 with a value of 'valuea' AND a childElement2 with a value of 'true' as

      cts:search (/, 
          cts:and-query ((
              cts:element-value-query(xs:QName('childElement1'), 'valuea', 'exact'),
              cts:element-value-query(xs:QName('childElement2'), 'true', 'exact')
          ))
      )

      will return the above document. But you may want these two child element-values both inside the same parentElement. This can be accomplished with

      cts:search (/, 
          cts:element-query (
              xs:QName ('parentElement'),
              cts:and-query ((
                  cts:element-value-query(xs:QName('childElement1'), 'valuea', 'exact'),
                  cts:element-value-query(xs:QName('childElement2'), 'true', 'exact')
              ))
          )
      )

      This should give you expected results, as it won't return the above document since the two child element-value queries do not match inside the same parentElement instance.

      Filtering and indexes

      Investigating a bit further, if you run the query with xdmp:query-meters you will see (depending on your database settings) 

          <qm:filter-hits>0</qm:filter-hits>
          <qm:filter-misses>1</qm:filter-misses>

      What is happening is that the query can only determine from the current indexes that there is a fragment with a parentElement, and a childElement1 with a value of 'valuea', and a childElement2 with a value of 'true'. Then, after retrieving the document and filtering, it finds that the document is not a complete match and so does not return it (thus filter-misses = 1).
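
      One way to observe these counters, sketched here on the assumption that the sample document is loaded in the current database, is to run the search and read the meters in the same request:

```xquery
xquery version "1.0-ml";
declare namespace qm = "http://marklogic.com/xdmp/query-meters";

let $matches :=
  cts:search(fn:doc(),
    cts:element-query(
      xs:QName("parentElement"),
      cts:and-query((
        cts:element-value-query(xs:QName("childElement1"), "valuea", "exact"),
        cts:element-value-query(xs:QName("childElement2"), "true", "exact")
      ))
    )
  )
return (
  (: Counting forces the search, including filtering, to run first. :)
  fn:count($matches),
  xdmp:query-meters()//(qm:filter-hits | qm:filter-misses)
)
```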

      (To learn more about filtering, refer to Understanding the Search Process section in our Query Performance and Tuning Guide.)

      At scale you may find this filtering slow, or the query may hit Expanded Tree Cache limits if it retrieves many false positives to filter through.

      If you have the correct positions enabled, the indexes can resolve this query without retrieving the document and filtering. In this case, after setting both

      element-word-positions

      and

      element-value-positions

      to true on the database and reindexing, xdmp:query-meters now shows

      <qm:filter-hits>0</qm:filter-hits>
      <qm:filter-misses>0</qm:filter-misses>

      (To track element-value-queries inside element-queries you need element-word-positions and element-value-positions enabled. The former is for element-query and the latter is for element-value-query.)
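
      Both settings can be enabled in the Admin UI or scripted. The following Admin API sketch assumes a placeholder database named "test"; saving the configuration will trigger a reindex if reindexing is enabled on the database:

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $db := xdmp:database("test")
(: Needed for the enclosing cts:element-query. :)
let $config := admin:database-set-element-word-positions($config, $db, fn:true())
(: Needed for the nested cts:element-value-query calls. :)
let $config := admin:database-set-element-value-positions($config, $db, fn:true())
return admin:save-configuration($config)
```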

      Now this query can be run without filtering. However, if you have a lot of relationship instances in a document, the calculations using positions can become quite expensive to compute.

      Position details

      Further details: Empty-element positions are problematic. Positions are word positions, and the position of an element is the word position of the first word after the element starts to the word position of the first word after the element ends. Positions of attributes are the positions of their element. If everything is an empty element, you have no words and everything has the same position and so positions cannot discriminate between elements.

      Reindexing

      Note that if you change these settings you will need to reindex your database, and the usual tradeoffs apply (larger indexes and slower indexing). See also the following for guidance on adding an index and reindexing in general:

      Reindexing impact
      Adding an index in production

      Summary

      This article explains why you may encounter a Cross-Site Request Forgery (CSRF) error (SECURITY-BADREQUEST) when using MarkLogic Server's Query Console application and how the issue can be resolved.

      Details

      Since the 8.0-6 release of MarkLogic Server, the security of Query Console has been increased. Every time you load the application in the browser, there is a handshake between the browser and server, generating a secure CSRF token for the logged in user. This pairs the client with the server, allowing for secure communication. If another person logs into Query Console as the same user, their browser will perform another handshake, generating a new token and storing it on the server for that user. The user who was previously paired with the server will now have the wrong token and will see the CSRF error when performing any action in the app that makes a request to the server, until they refresh.

      MarkLogic is implementing the industry standard recommendation for CSRF. At this time, there is no option to disable this security feature.

      Best Practice

      Best practice would be to create a new user on MarkLogic Server for each person using the system. The "qconsole-user" role is enough to use the Query Console application. If they must be administrators, you can give them the "admin" role, but note that with this special role, the user will have the authority to perform any activity in MarkLogic Server, including adding or deleting users, adding or deleting documents, changing passwords, and so on.

      Summary

      There is a limit to the number of registered queries held in the forest registry.  If your application does not account for that fact, you may get unexpected results. 

      Where is it?

      If a specific registered query is not found, a cts:search operation that uses the invalid cts:registered-query throws an XDMP-UNREGISTERED exception. The XDMP-UNREGISTERED error occurs when a query cannot be found in a forest query registry. If a previously registered query cannot be found, it may have been discarded automatically. In the most recent versions of MarkLogic Server at the time of this writing, the forest query registry only holds about 48,000 of the most recently used registered queries. If you register more than that, the least recently used ones are discarded.

      Recommendation

      To avoid registered queries being dropped, it’s a good idea to unregister queries when you know they aren’t needed any more.
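
      The discard behaviour described above, and one way an application might account for it, can be sketched with a toy least-recently-used registry. This is a simplified model, not the server's implementation: the class, the counter-based ids and the run_registered helper are all hypothetical, and Unregistered merely stands in for XDMP-UNREGISTERED.

```python
from collections import OrderedDict


class Unregistered(Exception):
    """Stands in for XDMP-UNREGISTERED in this toy model."""


class ToyRegistry:
    """Keeps only the most recently used registered queries, up to a fixed capacity."""

    def __init__(self, capacity=48_000):
        self.capacity = capacity
        self._entries = OrderedDict()  # query id -> query, oldest use first
        self._next_id = 0

    def register(self, query):
        self._next_id += 1
        self._entries[self._next_id] = query
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # discard the least recently used entry
        return self._next_id

    def lookup(self, query_id):
        if query_id not in self._entries:
            raise Unregistered(query_id)
        self._entries.move_to_end(query_id)  # mark as most recently used
        return self._entries[query_id]

    def deregister(self, query_id):
        self._entries.pop(query_id, None)


def run_registered(registry, query_id, query):
    """Use a registered query, registering it again once if it has been discarded."""
    try:
        return query_id, registry.lookup(query_id)
    except Unregistered:
        new_id = registry.register(query)
        return new_id, registry.lookup(new_id)
```

      The point of the sketch is the shape of the calling code: keep the original query available so it can be registered again on a miss, and deregister queries that are no longer needed so that useful ones are less likely to be pushed out.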

      Summary

      If your index settings have a very large number of range indexes specified (on the order of thousands or even tens of thousands), you may find your MarkLogic Server instance returning a message saying that it "Cannot allocate memory" - even when your OS monitoring metrics indicate that there appears to be plenty of unused RAM.

      XDMP-FORESTERR: Error in startup of forest: SVC-MAPINI: Mapped file initialization error: mmap: Cannot allocate memory

      Detail

      The issue is not how much memory a system has, but how it's being used. In the interests of performance, MarkLogic Server indexes your content upon ingestion to the system, then memory maps those indexes to serialized data structures on disk. While it's true that each of those memory maps requires some amount of RAM, if you've got thousands of indexes and system monitoring is reporting RAM to spare, then you might be running up against Linux's default vm.max_map_count value.

      While it's possible to get past this issue by simply increasing the vm.max_map_count limit, you should seriously consider revisiting your index usage, for two reasons: 1) it's likely the current indexing scheme could be replaced by a different one that uses far fewer indexes, and 2) once your configuration exceeds roughly 100 range indexes, you'll likely need to take special care to size and manage your topology so that you don't run out of system resources, and you may also need to make configuration changes to the Linux kernel on the d-nodes to which the relevant forests are assigned.
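
      To see how close a process is to the limit, the number of lines in /proc/<pid>/maps can be compared with the value in /proc/sys/vm/max_map_count. The following is a minimal sketch under the assumption that both files are readable on the host; the function names are hypothetical and the process id has to be supplied by the caller.

```python
def count_mappings(maps_text):
    """Count mappings in text formatted like /proc/<pid>/maps (one mapping per line)."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def map_headroom(mapping_count, max_map_count):
    """Return the fraction of the vm.max_map_count limit that is still unused."""
    return 1.0 - mapping_count / max_map_count


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Read the current kernel limit on memory map areas per process."""
    with open(path) as handle:
        return int(handle.read().strip())


def read_process_mappings(pid):
    """Count the mappings currently held by the process with the given id."""
    with open(f"/proc/{pid}/maps") as handle:
        return count_mappings(handle.read())
```

      A headroom value close to zero for the MarkLogic process while the OS still reports free RAM would be consistent with the "Cannot allocate memory" situation described here.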

      ---

      Related Blog Post - 10000 Range Indexes  

      Introduction

      Are you frequently seeing "stand limit" messages in your logs? This article explains what this message means for your application and what actions you should take.

      What are stands and how can their number increase?

      A stand holds a subset of the forest data and exists as a physical subdirectory under the forest directory. This directory contains a set of compressed binary files with names like TreeData, IndexData, Frequencies, Qualities, and such. This is where the actual compressed XML data (in TreeData) and indexes (in IndexData) can be found.

      At any given time, a forest can have multiple stands. To keep the number of stands to a manageable level MarkLogic runs merges in the background. A merge takes some of the stands on disk and creates a new singular stand out of them, coalescing and optimizing the indexes and data, as well as removing any previously deleted fragments.

      MarkLogic Server has a fixed limit for the maximum number of stands (64). When that limit is reached you will no longer be able to update your system. While MarkLogic manages merges automatically and it is unlikely that you will reach this limit, a few configuration settings under user control can affect merges and lead to this issue. For example:

      1.) You can manage merges using Merge Policy Controls. Setting a low merge max size, for example, stops merges beyond the configured size, so the overall number of stands keeps growing.

      2.) A low background-io-limit value leaves less I/O available for background tasks such as merges. This may reduce the merge rate, so the number of stands may grow.

      3.) Low in-memory settings may not keep up with an aggressive data load. For example, if you are bulk loading large documents with a low in memory tree size, stands may accumulate and reach the hard limit.

       

      What can you do to keep the number of stands within a manageable limit?

      While MarkLogic manages merges automatically to keep the number of stands at a manageable level, it adds a Warning entry to the logs when it sees the number of stands growing alarmingly, e.g. Warning: Forest XXXXX is at 92% of stand limit

      If you see such messages in your logs, you should take some action as reaching the hard limit of 64 would mean you will no longer be able to update your system.
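
      A small script can help spot these warnings before the limit is reached. The sketch below only assumes the message wording shown above; the timestamp prefix in the example lines and the function names are assumptions for illustration.

```python
import re

# Matches the message wording quoted in this article.
STAND_LIMIT_PATTERN = re.compile(
    r"Warning: Forest (?P<forest>\S+) is at (?P<percent>\d+)% of stand limit"
)


def stand_limit_warnings(log_lines):
    """Yield (forest, percent) for each 'stand limit' warning found in the lines."""
    for line in log_lines:
        match = STAND_LIMIT_PATTERN.search(line)
        if match:
            yield match.group("forest"), int(match.group("percent"))


def worst_by_forest(log_lines):
    """Return the highest percentage reported for each forest."""
    worst = {}
    for forest, percent in stand_limit_warnings(log_lines):
        worst[forest] = max(percent, worst.get(forest, 0))
    return worst


# Hypothetical log excerpt.
sample = [
    "2019-01-01 10:00:00.000 Warning: Forest Documents is at 92% of stand limit",
    "2019-01-01 10:00:05.000 Info: some unrelated message",
    "2019-01-01 10:01:00.000 Warning: Forest Documents is at 95% of stand limit",
    "2019-01-01 10:02:00.000 Warning: Forest Other is at 80% of stand limit",
]
print(worst_by_forest(sample))
```

      In practice the lines would come from the server's error log file rather than a hard-coded list.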

      Here's what you can check and do to lower the number of stands.

      1.) If you have configured merge policy controls, check whether they actually match your application usage, and change the settings as needed. For instance:

      • There should be no merge blackouts during ingestion, or at any time when there is heavy updating of your content.

      • Beginning with MarkLogic version 7, the server is able to manage merges with less free space required on your drives (1.5 times the size of your content). This is accomplished by setting the merge max size to 32768 (32GB). Although this does create more stands, this is OK on newer systems, since the server is able to use extra CPU cores in parallel.

      2.) If you have configured background-io-limit, check whether it is sufficient for your application usage. If needed, increase the value so that merges can make use of more I/O. You should only use this setting on systems that have limited disk I/O. In general you want to first set it to 200 and, if the disk I/O still seems to be overwhelmed, set it to 150, and so on. A setting of 100 may be too low for systems that are doing ingestion, since the merge process needs to be able to keep up with stand creation.

      3.) If you are performing bulk loads, check whether the in-memory settings are sufficient or can be increased. If needed, increase the relevant values so that in-memory stands (and, as a result, on-disk stands) accommodate more data, which decreases the number of stands. If you do grow the in-memory caches, make sure to grow the database journal files by a corresponding amount. This will ensure that a single large transaction will be able to fit in the journals.

       

      Conclusion 

      If you decide to control MarkLogic's merge process, you should monitor the system for any adverse effect that it may cause and take actions accordingly. MarkLogic Server continuously assesses the state of each database and the default merge settings and the dynamic nature of merges will keep the database tuned optimally at all times. So if you are unsure - let MarkLogic handle the merges for you!

      Introduction

      This article presents the steps to create a read-only access user and a full-access user for a WebDAV server.

      Details

      For read-only WebDAV access you can connect to WebDAV using the credentials of a user who does not have the rights to insert/update documents. This can be accomplished by creating a user and assigning roles to them through steps given below.

      1. If one does not already exist, create a WebDAV server (Instructions available in the MarkLogic Server Administrators Guide)

      • leave default user to "nobody", and 
      • leave required privilege empty

      2. Create a role - for the purpose of these instructions, call the new role "Read_only_Access" 

      • After you have entered a name for the new role (Read_only_Access), refresh the page and scroll to the "Default Permissions" section near the end of the page. The default permissions section allows you to assign a capability to a particular role. In this case, select the "Read_only_Access" role from the role drop-down and the "read" capability.

      3. Create a user and grant that user the "Read_only_Access" role.

      4. Create another role - for the purpose of these instructions, call the new role "Write_only_Access"

      • After you have entered a name for the new role (Write_only_Access), refresh the page and scroll to the "Default Permissions" section near the end of the page. The default permissions section allows you to assign a capability to a particular role. In this case, select the "Write_only_Access" role from the role drop-down and the "read", "insert", "execute" and "update" capabilities.

      5. Create another user and grant that user the "Write_only_Access" role.

      6. Set permissions on the "/" directory so that the "Read_only_Access" and "Write_only_Access" roles can view and make changes respectively. This can also be accomplished in code:

         xdmp:document-add-permissions("/", xdmp:permission("Read_only_Access", "read")),
         (: xdmp:permission takes one capability per call, so build one permission per capability :)
         xdmp:document-add-permissions("/", for $c in ("read", "insert", "execute", "update") return xdmp:permission("Write_only_Access", $c))

      7. When you connect with a WebDAV client, both users will be able to view the root "/" directory, but cannot create files or folders. To allow this, create a URI privilege for the "/" URI and add the "Write_only_Access" role to it.

      Now the "Read_only" user can read those documents, and the "Write_only" user can both read and update the documents.

      Existing Documents

      While the users just created will have the expected access to all new documents, for documents that already exist in the database you will need to add the read permission. This can be accomplished with xdmp:document-add-permissions().

      For example:
          xdmp:document-add-permissions("/example.xml", xdmp:permission("Read_only_Access", "read"))

      MarkLogic Documentation

      For more details on how to manage security, please refer to the Security Administration section of our Administrator's Guide.

       

       

       

       

      Overview

      Update transactions run with readers/writers locks, obtaining locks as needed for documents accessed in the transaction. Because update transactions only obtain locks as needed, update statements always see the latest version of a document. The view is still consistent for any given document from the time the document is locked. Once a document is locked, any update statements in other transactions wait for the lock to be released before updating the document.

      Read only query transactions run at a particular system timestamp, instead of acquiring locks, and have a read-consistent view of the database. That is, the query transaction runs at a point in time where all documents are in a consistent state.

      The system timestamp is a number maintained by MarkLogic Server that increases every time a change or a set of changes occurs in any of the databases in a system (including configuration changes from any host in a cluster). Each fragment stored in a database has system timestamps associated with it to determine the range of timestamps during which the fragment is valid.

      On a clustered system where there are multiple hosts, the timestamps need to be coordinated across all hosts. MarkLogic Server does this by passing the timestamp in every message communicated between hosts of the cluster, including the heartbeat message. Typically, the message carries two important pieces of information:

      • The origin host id
      • The precise time on the host at the time that heartbeat took place

      In addition to the heartbeat information, the "Label" file for each forest in the database is written as changes are made. The Label file also contains timestamp information; this is what each host uses to ascertain the current "view" of the data at a given moment in time. This technique is what allows queries to be executed at a 'point in time' to give insight into the data within a forest at that moment.

      You can learn more about transactions in MarkLogic Server by reading the Understanding Transactions in MarkLogic Server section of the MarkLogic Server Application Developers Guide.

      The distribute timestamps option on the Application Server specifies how the latest timestamp is distributed after updates. This affects the performance of updates and the timeliness of read-after-write query results from other hosts in the group.

      When set to fast, updates return as quickly as possible. No special timestamp notification messages are broadcast to other hosts. Instead, timestamps are distributed to other hosts when any other message is sent. The maximum amount of time that could pass before other hosts see the update timestamp is one second, because a heartbeat message is sent to other hosts every second.

      When set to strict, updates immediately broadcast timestamp notification messages to every other host in the group. Updates do not return until their timestamp has been distributed. This ensures timeliness of read-after-write query results from other hosts in the group.

      When set to cluster, updates immediately broadcast timestamp notification messages to every other host in the cluster. Updates do not return until their timestamp has been distributed. This ensures timeliness of read-after-write query results from any host in the cluster, so requests made to any app server on any host in the cluster will see immediately consistent results.

      The default value for "distribute timestamps" option is fast. The remainder of this article is applicable when fast mode is used.

      Read after Write in Fast Mode

      We will look at the different scenarios for the case where a read occurs in a transaction immediately following an update transaction.

      • If the read transaction is executed against an application server on the same node of the cluster (or any node that participated in the update) then the read will execute at a timestamp equal to or greater than the time that the update occurred.
      • If the read is executed in the context of an update transaction, then, by acquiring locks, the view of the documents will be the latest version of the documents.
      • If the read is executed in a query transaction, then the query will execute at the latest timestamp that the host on which it was executed is aware of. Although this will always produce a transactionally consistent view of the database, it may not return the latest updates. The remainder of this article addresses this case.

      Consider the following code:

      The above example performs the following steps:

      • Instantiates two XCC ContentSource Objects - each connecting to a different host in the cluster.
      • Establishes a short loop (which runs the enclosed steps 10 times)
        • Creates a unique UUID which is used as a URI for the Document
        • Establishes a session with the first host in the cluster and performs the following:
          • Gets the timestamp (session.getCurrentServerPointInTime()) and writes it out to the console / stdout
          • Inserts a simple, single element (<ok/>) as a document-node into a given database
          • Gets the timestamp again and writes it out to the console / stdout
        • The session with the first host is then closed. A new session is established with the second host and the following steps are performed:
          • Gets the timestamp at the start of the session and writes it out to the console / stdout
          • An attempt is made to retrieve the document which was just inserted
        • On success the second session will be closed.
        • If the document could not be read successfully, an immediate retry follows, which will result in a successful retrieval.

      Running this test will yield one of two results for each iteration of the loop:

      Query Transaction at Timestamp that includes Update

      Most of the time, you will find that the timestamps will be in lockstep with the host before - note that there is no time difference between the output from getCurrentServerPointInTime() after the document has been inserted and before the attempt is made to retrieve the document from the connection to the second host in the cluster.

      ----------------- START OF INSERT / READ CYCLE (1) -----------------
      First host timestamp before document is inserted: 	13673327800295300
      First host timestamp after document is inserted: 	13673328229180040
      Second host timestamp before document is read: 	13673328229180040
      ------------------ END OF INSERT / READ CYCLE (1) ------------------

      However, you may also see this:

      ----------------- START OF INSERT / READ CYCLE (10) -----------------
      First host timestamp before document is inserted: 	13673328311216780
      First host timestamp after document is inserted: 	13673328322546380
      Second host timestamp before document is read: 	13673328311216780
      ------------------ END OF INSERT / READ CYCLE (10) ------------------

      Note that on this run, the timestamps are out of sync; at the point where getCurrentServerPointInTime() is called, the timestamp for the second connection is at that point just before the document is inserted.

      Yet this also returns results that include the updates; in the interval between the timestamp being written to the console and the construction and submission of the newAdhocQuery(), the document has become available and was successfully retrieved during the read process.

      The path with an immediate retry

      Now let's explore what happens when the read only query transaction runs at a point in time that does not include the updates:

      ----------------- START OF INSERT / READ CYCLE (2) -----------------
      First host timestamp before document is inserted: 	13673328229180040
      First host timestamp after document is inserted: 	13673328240679460
      Second host timestamp before document is read: 		13673328229180040
      WARNING: Immediate read failed; performing an immediate retry
      Second host timestamp for read retry: 		13673328240679460
      Result Sequence below:
      <?xml version="1.0" encoding="UTF-8"?>
      <ok/>
      ------------------ END OF INSERT / READ CYCLE (2) ------------------

      Note that on this occasion, we see an outcome that starts much like the previous example; the timestamps mismatch and we see that we've hit the point in the code where our validation of the response fails.

      Also note that the timestamp at the point where the retry takes place is now back in step; from this, we can see that the document should be available even before the retry request is executed. Under these conditions, the response (the result) is also written to stdout so we can be sure the document was available on this attempt.

      Multi Version Concurrency Control

      In order to guarantee that the "holistic" view of the data is current and available in a read only query transaction across each host in the cluster, two things need to take place:

      • All forests need to be up-to-date and all pending transactions need to be committed.
      • Each host must be in complete agreement as to the 'last known good' (safest) timestamp from which the query can be allowed to take place.

      In all situations, to ensure a complete (and reliable) view of the data, the read only query transaction must take place at the lowest known timestamp across the cluster.

      The latest timestamp information is communicated with every message between nodes in the cluster. The first "failed" attempt to read the document necessitates communication between the hosts in the cluster, and this communication propagates a new "agreed" timestamp to every node.

      Because of this, a single retry is always guaranteed to work: at the point where the immediate read after write fails, timestamp changes are propagated, and the new timestamp is in place for the retry query.
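
      The read-after-write behaviour in fast mode can be illustrated with a toy model. This is not XCC code and does not talk to a server: ToyHost, ToyCluster and read_with_retry are hypothetical names, and the model assumes only what is described above, namely that a query runs at the latest timestamp its host knows about and that any exchange of messages between hosts carries the latest timestamp.

```python
class ToyHost:
    """Hypothetical stand-in for a cluster host that tracks the latest timestamp it knows."""

    def __init__(self, name):
        self.name = name
        self.known_timestamp = 0

    def receive(self, timestamp):
        # Any inter-host message carries a timestamp; keep the newest one seen.
        self.known_timestamp = max(self.known_timestamp, timestamp)


class ToyCluster:
    def __init__(self):
        self.committed = {}  # uri -> commit timestamp
        self.clock = 0

    def insert(self, host, uri):
        """Commit a document through the given host, advancing the timestamp."""
        self.clock += 1
        self.committed[uri] = self.clock
        host.known_timestamp = self.clock  # the updating host knows its own commit

    def read(self, host, uri, peers):
        """Run a read-only query at the timestamp the host currently knows about."""
        query_timestamp = host.known_timestamp
        visible = uri in self.committed and self.committed[uri] <= query_timestamp
        # Resolving the query involved messages between hosts, which carry timestamps.
        for peer in peers:
            host.receive(peer.known_timestamp)
        return visible


def read_with_retry(cluster, host, uri, peers):
    """Try the read once and, if the document is not visible, retry immediately."""
    if cluster.read(host, uri, peers):
        return "first attempt"
    if cluster.read(host, uri, peers):
        return "retry"
    return "not found"


first, second = ToyHost("first"), ToyHost("second")
cluster = ToyCluster()
cluster.insert(first, "/example.xml")
print(read_with_retry(cluster, second, "/example.xml", [first]))
```

      In the model, the first read on the second host runs at a stale timestamp and misses the document, while the communication it triggers brings the host up to date so that the retry succeeds.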

      Context

      This KB article talks specifically about how the Rebalancer interacts with database replication, and how to solve the issues that may arise if not configured correctly.

      For a general discussion on how rebalancing works in MarkLogic, refer to this article and the server documentation.

      Rebalancing and replication

      When database replication is configured for a database, rebalancing will not take place on the Replica until database replication is broken. Until the time when the primary is available, forest to forest mapping will remain.

      It is important to make sure that the assignment policy on the Replica is the same as the Master - so that in a DR situation, when the Replica takes over as the Primary, rebalancing is not triggered.

      Forest order mismatch can cause Rebalancing

      Forest order is the order in which forests are attached to the database. When the document assignment policy is set to either 'Legacy' or 'Bucket', the Replica database configuration must have the same forest order as the Master to ensure rebalancing does not occur if or when replication is deconfigured.

      If there is a difference in forest orders between the Master and the Replica, a Warning level message is logged on the Replica, which looks like this:

      2015-10-21 13:34:59.359 Warning: forest order mismatch: local forest Test_12 is at position 15 
      while foreign master forest 2108358988113530610 (cluster=8893136914265436826) is at position 12

      In this state, when replication is broken and the Replica takes over as the primary, rebalancing starts, and it can take a variable amount of time depending on how many documents need to be rebalanced.
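
      The comparison behind the warning can be sketched as follows. This is an illustration only: it assumes you have already obtained the ordered forest lists for both databases and a mapping from each Replica forest to its Master forest, and the function name and zero-based positions are hypothetical choices.

```python
def forest_order_mismatches(master_order, replica_order, replica_to_master):
    """Return (replica_forest, replica_pos, master_forest, master_pos) for each mismatch."""
    master_position = {forest: index for index, forest in enumerate(master_order)}
    mismatches = []
    for replica_pos, replica_forest in enumerate(replica_order):
        master_forest = replica_to_master[replica_forest]
        master_pos = master_position[master_forest]
        if master_pos != replica_pos:
            mismatches.append((replica_forest, replica_pos, master_forest, master_pos))
    return mismatches


# Hypothetical forest names.
master = ["M1", "M2", "M3"]
replica = ["R1", "R3", "R2"]
mapping = {"R1": "M1", "R2": "M2", "R3": "M3"}
for forest, pos, master_forest, master_pos in forest_order_mismatches(master, replica, mapping):
    print(f"local forest {forest} is at position {pos} "
          f"while master forest {master_forest} is at position {master_pos}")
```

      An empty result means the two databases attach their forests in the same order.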

      Fixing the forest order:

      The following steps help in removing the mismatch and making the forest order same on both Master and Replica:

      i. Break/deconfigure replication

      ii. Disable rebalancing on both clusters for the database in question.

      iii. Obtain the forest order from the Master cluster - below is the query:

      iv. On the Replica cluster, reorder the forests according to the order returned on the Master from step iii:

      v. Re-enable rebalancing on both clusters, and let it finish if it starts rebalancing

      vi. Reconfigure replication.

      Again, make sure that the assignment policy is the same on both the clusters.

      On a MarkLogic 7 cluster or a MarkLogic 8 cluster that was previously upgraded from MarkLogic Server version 6, reindexing of the triple index does not always get triggered when the triple index is turned off. Reindexing is performed after turning off an index in order to reclaim space that the index was using.

      The workaround is to force a manual reindexing.

      Summary

      When used as a file system, GFS needs to be tuned for optimal performance with MarkLogic Server.

      Recommendations

      Specifically, we recommend tuning the demote_secs and statfs_fast parameters. The demote_secs parameter determines the amount of time GFS will wait before demoting a lock on a file that is not in use. (GFS uses a time-based locking system.) One of the ways that MarkLogic Server makes queries go fast is its use of memory mapped index files. When index files are stored on a GFS filesystem, locks on these memory-mapped files are demoted purely on the basis of demote_secs, regardless of use. This is because they are not accessed using a method that keeps the lock active -- the server interacts with the memory map, not direct access to the on-disk file.

      When a GFS lock is demoted, pages from the memory-mapped index files are removed from cache. When the server makes another request of the memory-mapped file, GFS must acquire another lock and the requested page(s) from the on-disk file must be read back into cache. The lock reacquisition process, as well as the I/O needed to load data from disk into cache, may cause noticeable performance degradation.

      Starting with MarkLogic Server 4.0-4, MarkLogic introduced an optimization for GFS. From that maintenance release forward, MarkLogic gets the status of its memory-mapped files every hour, which results in the retention of the GFS locks on those files so that they do not get demoted. Therefore, it is important that demote_secs is equal to or greater than one hour. It is also recommended that the tuning parameter statfs_fast is set to "1" (true), which makes statfs on GFS faster.

      Using gfs_tool, you should be able to set the demote_secs and statfs_fast parameters to the following values:

      demote_secs 3600

      statfs_fast 1

      While we're discussing tuning a Linux filesystem, the following Linux tuning tips are also worth noting:

      • Use the deadline elevator (aka I/O scheduler), rather than cfq, on all hosts in the cluster. This has been added to our installation requirements for RHEL. With RHEL-4, this requires the elevator=deadline option at boot time. With RHEL-5, this can be changed at any time via /sys/block/*/queue/scheduler
      • If you are running on a VM slice, then the noop I/O scheduler is recommended.
      • Set the following kernel tuning parameters:

      Edit /etc/sysctl.conf:

      vm.swappiness = 0

      vm.dirty_background_ratio=1

      vm.dirty_ratio=40

      Use sudo sysctl -f to apply these changes.

      • It is very important to have at least one journal per host that will mount the filesystem. If the number of hosts exceeds the number of journals, performance will suffer. It is, unfortunately, impossible to add more journals without rebuilding the entire filesystem, so be sure to set journals up for each host during your initial build.
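
      The kernel settings listed above can be checked against the contents of /etc/sysctl.conf with a few lines of code. This is a minimal sketch: the function names are hypothetical, and it only compares the three values recommended in this article.

```python
# Values recommended in this article.
RECOMMENDED = {
    "vm.swappiness": "0",
    "vm.dirty_background_ratio": "1",
    "vm.dirty_ratio": "40",
}


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and comments (# or ;)."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def settings_to_change(text, recommended=RECOMMENDED):
    """Return the recommended settings that are missing or set to a different value."""
    current = parse_sysctl_conf(text)
    return {key: value for key, value in recommended.items() if current.get(key) != value}


# Hypothetical file contents.
example = "vm.swappiness = 0\n# tuning\nvm.dirty_ratio=20\n"
print(settings_to_change(example))
```

      On a real host the text would be read from /etc/sysctl.conf; the sketch does not modify the file or apply any settings.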

       

      Working with Red Hat

      Should you run into GFS-related problems, running the following script will provide all the information that you need in order to work with the Red Hat Support Team:


      mkdir /tmp/debugfs

      mount -t debugfs none /tmp/debugfs

      mkdir /tmp/$(hostname)-hangdata

      cp -rf /tmp/debugfs/dlm/ /tmp/$(hostname)-hangdata

      cp -rf /tmp/debugfs/gfs2/ /tmp/$(hostname)-hangdata

      echo 1 > /proc/sys/kernel/sysrq 

      echo 't' > /proc/sysrq-trigger 

      sleep 60

      cp /var/log/messages /tmp/$(hostname)-hangdata/

      clustat > /tmp/$(hostname)-hangdata/clustat.out

      cman_tool services > /tmp/$(hostname)-hangdata/cman_tool-services.out

      mount -l > /tmp/$(hostname)-hangdata/mount-l.out

      ps aux > /tmp/$(hostname)-hangdata/ps-aux.out

      tar cjvf /tmp/$(hostname)-hangdata.tar.bz /tmp/$(hostname)-hangdata/

      umount /tmp/debugfs/

      rm -rf /tmp/debugfs

      rm -rf /tmp/$(hostname)-hangdata

      Introduction

      MarkLogic is supported on the XFS filesystem. The minimum system requirements can be found here:

      https://developer.marklogic.com/products/marklogic-server/requirements-9.0

      The default mount options will generally give good performance, assuming the underlying hardware is capable enough in terms of IO performance and durability of writes, but if you can test your system adequately, you can consider different mount options.

      The values provided here are just general recommendations. If you wish to fine-tune your storage performance, you need to ensure that you do adequate testing, both with MarkLogic and with low-level tools such as fio:

      http://freecode.com/projects/fio

      1. I/O Schedulers

      Unless you have a directly connected single HDD or SSD, noop is usually the best choice, see here for more details:

      https://help.marklogic.com/Knowledgebase/Article/View/8/0/notes-on-io-schedulers

      2. XFS Mount options

      nobarrier If your disk controller has a battery-backup-unit (BBU) or similar technology to protect the cache contents on power loss, adding the mount option nobarrier for XFS or ext4 can significantly increase throughput. If it doesn't this can lead to to journal data corruption upon power loss.

      relatimeThe default atime behaviour is relatime, which has almost no overhead compared to noatime but still maintains sane atime values. All Linux filesystems use this as the default now (since around 2.6.30), but XFS has used relatime-like behaviour since 2006, so no-one should really need to ever use noatime on XFS for performance reasons.

      attr2 This options enables an "opportunistic" improvement to be made in the way inline extended attributes are stored on-disk. It's the default and should be kept as such in most scenarios.

      inode64 - to sum up this allows xfs to create nodes anywhere and not worry about backwards compatibility, which should result in better scalability. See here for more information: https://access.redhat.com/solutions/67091

      sunit=x,swidth=y XFS allows you to specify RAID settings. This enables the file system to optimize its read and write access for RAID alignment, e.g. by committing data as complete stripe sets for maximum throughput. These RAID optimizations can significantly improve performance, but only if your partition is properly aligned or of you are avoiding misalignment by creating the xfs on a device without partitions. 

      largeio, swalloc - these are intended to further optimize streaming performance on RAID storage. You need to do your own testing.

      isize=512 - XFS allows inlining of data into inodes to avoid the need for additional blocks and the corresponding expensive extra disk seeks for directories. In order to use this efficiently, the inode size should be increased to 512 bytes or larger.

      allocsize=131072k (or larger) - XFS can be tuned to a fixed allocation size, for optimal streaming write throughput. This setting could have a significant impact on the interim space usage in systems with many parallel write and create operations.

      As with any advice of this nature, we strongly advise that you always do your own testing to ensure that options you choose are stable and reliable for your workload.
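
      When testing combinations of these options, it is useful to confirm what a file system was actually mounted with. The following is a minimal sketch that reads a file in /proc/mounts format (device, mount point, type, comma-separated options). The function name is an assumption made for this example, and /var/opt/MarkLogic stands in for wherever your forests live.

```shell
# Succeed if the given mount point lists the given option.
# $1 = mounts file in /proc/mounts format, $2 = mount point, $3 = option name
has_mount_option() {
    awk -v mp="$2" -v opt="$3" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                # Match "inode64" exactly, or "allocsize" against "allocsize=131072k".
                if (opts[i] == opt || index(opts[i], opt "=") == 1) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

      For example, "has_mount_option /proc/mounts /var/opt/MarkLogic inode64 && echo present" reports whether inode64 is in effect for that mount point.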

      Summary

      The XDMP-LABELBADMAGIC error appears when attempting to mount a forest with a corrupted or zero length Label file.  This article identifies a potential cause and provides the steps required to work around this issue.

      Details

      The XDMP-LABELBADMAGIC error is often seen on systems where the server was running out of disk space.  If there is no space for MarkLogic Server to write the forest's Label file, a zero length Label file may result. The side effect of that would be the XDMP-LABELBADMAGIC error.

      Below is an example showing how this error might appear in ErrorLog.txt when the Triggers forest has a zero length Label file.

      2013-03-21 13:02:11.835 Alert: XDMP-FORESTERR: Error in mount of forest Triggers: XDMP-LABELBADMAGIC: Bad forest label magic number: 0x0 instead of 0x1020304

      2013-03-21 13:02:11.835 Error: NullAssignment::localMount: XDMP-LABELBADMAGIC: Bad forest label magic number: 0x0 instead of 0x1020304

      In order to recover from this error, you will need to manually remove the bad Label file.  Removing the Label file will force MarkLogic Server to recreate the file and will allow the forest to be mounted.

      Steps for recovery:

      1. Make sure MarkLogic Server is shut down on the affected host.

      2. Remove the Label file for the forest displaying the error.

      a. On Linux the default location is "/var/opt/MarkLogic/Forests/[Forest-Name]/Label"

      b. On Windows the default location is "c:\Program Files\MarkLogic\Data\Forests\[Forest-Name]\Label"

      3. Restart MarkLogic Server.
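
      If several forests may be affected, zero-length Label files can be located before anything is removed. This is a minimal sketch for Linux that assumes the default forest location given above; the function name is made up for this example. It only detects empty Label files, not ones that are corrupted in other ways.

```shell
# List zero-length Label files one level below a forests directory.
# $1 = forests directory (defaults to the Linux location named above)
find_empty_labels() {
    local forests_dir="${1:-/var/opt/MarkLogic/Forests}"
    find "$forests_dir" -mindepth 2 -maxdepth 2 -type f -name Label -empty
}
```

      Running "find_empty_labels" with MarkLogic Server stopped prints one path per affected forest. Review the list before deleting anything.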

      Introduction

      In some situations an existing cluster node needs to be replaced, for example after a hardware failure or as part of a hardware replacement.

      In this Knowledgebase article we will outline the steps necessary to replace the node by reusing the existing cluster configuration without registering it again.

      Important notes:

      • The replacement node must have the same architecture as all other nodes of the cluster (e.g., Windows, Linux, Solaris). The CPUs must also have the same number of bits (e.g., 64, 32).
      • The replacement node must have the same (or higher) count of CPU cores.
      • The replacement node must have the same (or higher) allocated disk space and the same mount points as the old node.
      • The replacement node must have the same hostname as the old node, unless the node is an AWS EC2 instance using MARKLOGIC_EC2=1 (the default when using MarkLogic AMIs).

      Preparation steps for re-joining a node into the cluster

      • Install and configure the operating system
        • make sure the mount points are matching the old setup
        • in case the previous storage is healthy it can be reused (forests located on it will be mounted)
      • For any non-MarkLogic data (such as XQuery modules, Deployment scripts etc.) required to run on this node, ensure these are manually zipped and copied over as part of the staging process
      • Copy over MarkLogic configuration files (/var/opt/MarkLogic/*.xml) from a backup of the old node
        • If xdqp ssl enabled is set to true, change the setting to false. If you can't do this through the Admin UI, you can manually update the value of xdqp-ssl-enabled to false.
        • To re-enable SSL for XDQP connections once the node has rejoined the cluster, you will need to regenerate the replacement host certificate. Follow the instructions in the section of this article on regenerating the XDQP host certificate.

      Downloading MarkLogic for the New Host

      MarkLogic Server and the optional MarkLogic Converters and Filters can be downloaded from the MarkLogic Developer Community. The download pages for the most recent versions give you the option of downloading by either HTTPS or curl.

      If the exact version you are running is not available, you may still be able to download it by getting the download link for the closest current version (8, 9, or 10) and editing the minor version number in the link.

      So if you need 10.0-1, and the current available version is 10.0-2, when you choose the Download via Curl option, you will get a download link that looks like this:

      https://developer.marklogic.com/download/binaries/10.0/MarkLogic-10.0-2-amd64.msi?t=SomeHashValue/1&email=myemail%40mycompany.com

      Update the URL with the minor release version you need:

      https://developer.marklogic.com/download/binaries/10.0/MarkLogic-10.0-1-amd64.msi?t=SomeHashValue/1&email=myemail%40mycompany.com

      If you are unable to get the version you need this way, then contact MarkLogic Support.
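
      The link edit described above can also be scripted. This is a minimal sketch, assuming the link contains the release as "MarkLogic-<release>-" exactly as in the example URLs; the function name is hypothetical, and the token and email parts of the link are passed through unchanged.

```shell
# Replace the release number in a MarkLogic download link.
# $1 = link, $2 = release currently in the link, $3 = release wanted
rewrite_release() {
    local url="$1" from="$2" to="$3" from_re
    case "$url" in
        *"MarkLogic-$from-"*) ;;
        *) echo "release $from not found in link" >&2; return 1 ;;
    esac
    # Escape the dots so sed matches them literally.
    from_re=$(printf '%s' "$from" | sed 's/\./\\./g')
    printf '%s\n' "$url" | sed "s|MarkLogic-$from_re-|MarkLogic-$to-|"
}
```

      For example, "rewrite_release "$link" 10.0-2 10.0-1" prints the link for 10.0-1, which can then be passed to curl. Whether that release is still served is up to the download site.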

      Rejoining the Replacement Node to the Cluster

      There are two methods to rejoin a host into the cluster, depending on the availability of configuration files.

      1. Using an older set of configuration files from the node being replaced
      2. Creating a new set of configuration files from another node in the cluster

      Method 1: Rejoining the Cluster With Existing Configuration Files

      This procedure can only be performed if the existing configuration files from /var/opt/MarkLogic/*.xml are available from the lost/old node; otherwise it will fail and cause a lot of problems.

      • Perform a standard MarkLogic server installation on the new target node
        • $ rpm -Uvh /path/to/MarkLogic-<version>.x86_64.rpm or yum install /path/to/MarkLogic-<version>.x86_64.rpm
        • $ rpm -Uvh /path/to/MarkLogicConverters-<version>.x86_64.rpm or yum install /path/to/MarkLogicConverters-<version>.x86_64.rpm (optional)
        • Verify local configuration settings in /etc/marklogic.conf (optional)
        • Do not start MarkLogic server
      • Create a new data directory
        • $ mkdir /var/opt/MarkLogic (default location; might already exist if this is a separate mount point)
        • Verify ownership of the data directory, daemon.daemon by default.
          • To fix: $ chown -R daemon:daemon /var/opt/MarkLogic
      • Copy an existing set of configuration files into the data directory
        • $ cp /path/to/old/config/*.xml /var/opt/MarkLogic
        • Verify ownership of the configuration files, daemon.daemon by default.
          • To fix: $ chown daemon:daemon /var/opt/MarkLogic/*.xml
      • Perform a last sanity check
        • Hostname must be the same as the old node, except for AWS EC2 nodes as mentioned above
        • Verify firewall or Security Group rules are correct
        • Verify mount points, file ownership and permissions are correct
      • Start MarkLogic
        • $ service MarkLogic start
      • Monitor the startup process

      After starting the node it will reuse the existing configuration settings and assume the identity of the missing node. 

      Method 2: Rejoining the Cluster With Configuration Files From Another Node

      This procedure is required if there is no older configuration file set available, for example because no file backup was made of /var/opt/MarkLogic/*.xml. It requires manual editing of a configuration file.

      • Perform a standard MarkLogic server installation on the new target node
        • $ rpm -Uvh /path/to/MarkLogic-<version>.x86_64.rpm or yum install /path/to/MarkLogic-<version>.x86_64.rpm
        • $ rpm -Uvh /path/to/MarkLogicConverters-<version>.x86_64.rpm or yum install /path/to/MarkLogicConverters-<version>.x86_64.rpm (optional)
        • Verify local configuration settings in /etc/marklogic.conf (optional)
      • Start MarkLogic, and perform a normal server setup as a single node. DO NOT join the cluster now.
        • $ service MarkLogic start
        • Perform a basic setup
        • DO NOT join the host to the cluster!
      • Stop MarkLogic, and move the current configuration files in /var/opt/MarkLogic to a new location
        • $ service MarkLogic stop
        • $ mv /var/opt/MarkLogic/*.xml /some/place
      • Copy a set of configuration files over from one of the other nodes
        • $ scp <othernode>:/var/opt/MarkLogic/*.xml /var/opt/MarkLogic
        • Verify ownership of the data directory, daemon.daemon by default.
          • To fix: $ chown -R daemon:daemon /var/opt/MarkLogic
      • Make note of the <host-id> for the node being recreated in hosts.xml
        • $ grep -B1 hostname /var/opt/MarkLogic/hosts.xml
      • Edit /var/opt/MarkLogic/server.xml. Note: This step is critically important to ensure correct operation of the cluster.
        • Use a UTF-8 safe editor like nano or vi
        • Update <host-id> with the value found in /var/opt/MarkLogic/hosts.xml
        • Update <license-key> value if necessary.
        • Update <licensee> value if necessary.
        • Save the changes
      • Perform a last sanity check
        • <host-id> must match the <host> defined in hosts.xml.
          • Important: host will not start if these values do not match 
        • Hostname must be the same as the old node, unless the node is an AWS EC2 instance using the configuration option MARKLOGIC_EC2=1, which is the default when using the MarkLogic provided AMIs.
        • Firewall or Security Group rules are correct
        • Mount points, ownership and permissions are correct
      • Start MarkLogic and monitor the startup process

      As emphasized in the procedures, it is very important to update server.xml and change the <host-id> to match the value defined in hosts.xml and apply the correct license information. Without these changes the node may not start up, may confuse the other nodes, or it may exhibit unexpected behavior.
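
      Because a mismatched <host-id> prevents the host from starting, it is worth checking the edit before starting MarkLogic. The sketch below assumes what the grep -B1 command above implies: in hosts.xml the <host-id> element sits on the line directly before the matching <host-name> element, and each element is on a single line. The function names are illustrative and this is not an official tool.

```shell
# Print the text of the first <host-id> element in a file.
first_host_id() {
    sed -n 's/.*<host-id>\([^<]*\)<\/host-id>.*/\1/p' "$1" | head -n 1
}

# Print the <host-id> on the line just before <host-name>NAME</host-name>.
host_id_for_name() {
    local hosts_xml="$1" name="$2"
    grep -B1 -F "<host-name>$name</host-name>" "$hosts_xml" |
        sed -n 's/.*<host-id>\([^<]*\)<\/host-id>.*/\1/p' | head -n 1
}

# Succeed when server.xml carries the id that hosts.xml assigns to NAME.
check_host_id() {
    local server_xml="$1" hosts_xml="$2" name="$3"
    local mine expected
    mine=$(first_host_id "$server_xml")
    expected=$(host_id_for_name "$hosts_xml" "$name")
    if [ -n "$mine" ] && [ "$mine" = "$expected" ]; then
        echo "host-id $mine matches hosts.xml"
    else
        echo "MISMATCH: server.xml has '$mine', hosts.xml has '$expected' for $name" >&2
        return 1
    fi
}
```

      For example: check_host_id /var/opt/MarkLogic/server.xml /var/opt/MarkLogic/hosts.xml "$(hostname)". If your hostname as configured in the cluster differs from the output of hostname, pass the configured name instead.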

      Wrapping Up

      For both methods, the startup process is the same. MarkLogic will use the configuration files to rejoin the cluster. Forests that no longer exist will automatically be recreated. Existing forests that have been mounted or copied to the correct location will be mounted as before. Forests configured for local disk failover will automatically start synchronizing with the online forests. If configured, replication will start replicating the forests after the node is started. If neither local disk failover nor replication is configured, the forests can be restored from backup.

      Regenerating an XDQP Host Certificate

      The first step in the process is to check the Certificate to see whether it is valid or not.  If you replaced your node using method 1, the certificate is likely to be valid.  If you replaced your node using method 2, then the certificate is likely to be invalid.

      Log into a terminal on the newly replaced host, and extract the private key from /var/opt/MarkLogic/server.xml and the host's certificate from /var/opt/MarkLogic/hosts.xml:

      • $ cp /var/opt/MarkLogic/server.xml /tmp/server.key
      • Edit /tmp/server.key to remove all XML formatting
        • File should start with "-----BEGIN PRIVATE KEY-----"
        • File should end with "-----END PRIVATE KEY-----"
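
      The manual edit above can be replaced by a small filter. This is a minimal sketch, assuming the PEM block in the file is spread over several lines and may share its first and last lines with XML tags; the function name is made up for this example. It prints only the first block with the given label, so it suits server.xml (one private key) better than hosts.xml, which holds one certificate per host.

```shell
# Print the first PEM block with the given label from a file, dropping any
# XML markup that shares a line with the BEGIN/END markers.
# $1 = file, $2 = PEM label such as "PRIVATE KEY" or "CERTIFICATE"
extract_pem() {
    local file="$1" label="$2"
    sed -n "/-----BEGIN $label-----/,/-----END $label-----/p" "$file" |
        sed -e "s/.*\(-----BEGIN $label-----\)/\1/" \
            -e "s/\(-----END $label-----\).*/\1/" |
        sed "/-----END $label-----/q"
}
```

      For example: extract_pem /var/opt/MarkLogic/server.xml "PRIVATE KEY" > /tmp/server.key. Since the output is a private key, consider restricting the file's permissions and removing it when you are done.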

      Now extract the certificate for the new host from /var/opt/MarkLogic/hosts.xml.

      • $ grep -A25 my-host.name /var/opt/MarkLogic/hosts.xml > /tmp/server.crt
      • Remove all the data from the file, except the certificate for the new host
        • File should start with "-----BEGIN CERTIFICATE-----"
        • File should end with "-----END CERTIFICATE-----"

      Once you have the private key and the certificate, you can use openssl to compare the MD5 digests of their moduli to see if they match.

      • $ openssl rsa -in /tmp/server.key -noout -modulus | openssl md5; openssl x509 -in /tmp/server.crt -noout -modulus | openssl md5

      If the values match, STOP HERE.  The certificate is valid and does not need to be regenerated. If the values do not match, then the certificate needs to be regenerated.

      Make note of the <host-id> from /var/opt/MarkLogic/server.xml.  This will be used to populate the value for the Common Name (CN) when the certificate is generated.

      • $ grep -B1 hostname /var/opt/MarkLogic/hosts.xml

      Create the new self-signed certificate using the server's private key.  Typically these are set to 10 years (3650 days) by default when MarkLogic first runs, but you can choose another value if needed.  Use the <host-id> from the previous step as the CN.

      • $ sudo openssl req -key /tmp/server.key -new -x509 -days 3650 -out /tmp/new-server.crt -subj "/CN=[server-id-number]"

      Compare the MD5 checksums with openssl again. This time they should match:

      • $ openssl rsa -in /tmp/server.key -noout -modulus | openssl md5; openssl x509 -in /tmp/new-server.crt -noout -modulus | openssl md5

      Make a copy of hosts.xml to replace the certs, also note the host-id for use in a later step.

      • $ cp -p /var/opt/MarkLogic/hosts.xml /tmp/hosts.xml

      Edit /tmp/hosts.xml and replace the old certificate for the host with the new certificate.  Find the entry with the correct <host-id> and replace the <ssl-certificate> field with the new certificate in /tmp/new-server.crt

      Replace the existing hosts.xml with our updated copy

      • $ cp -p /tmp/hosts.xml /var/opt/MarkLogic/hosts.xml

      Restart MarkLogic on the node.  This can be done from any host in the cluster, using the Admin Interface, the REST Management API endpoint, or Query Console.

      • Admin Interface: In the left tree menu, click Configure -> Hosts -> [Hostname], then select the Status tab and click Restart
      • REST Management API: $ curl --anyauth --user user:password -X POST -i --data "state=restart" -H "Content-type: application/x-www-form-urlencoded" http://localhost:8002/manage/v2/hosts/[host-name]
      • Query Console: xdmp:restart((xdmp:host("engrlab-129-179.engrlab.marklogic.com")), "To reload hosts.xml after certificate update")

      Verify the changes to hosts.xml have propagated to all hosts in the cluster.  Check that the hosts.xml is now the same for the hosts in the cluster.  One way of doing this is comparing md5 checksums.

      • $ md5sum /var/opt/MarkLogic/hosts.xml
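
      With more than a few hosts, comparing checksums by eye is error-prone. The sketch below reads md5sum-style lines (checksum first, then a label) and reports whether they all agree. The function name is hypothetical, and how you collect the lines (for example over ssh) depends on your environment.

```shell
# Read "checksum  label" lines (md5sum output format) on stdin and
# succeed only if every non-empty line carries the same checksum.
all_checksums_match() {
    awk '
        NF { seen[$1]++; total++ }
        END {
            n = 0
            for (sum in seen) n++
            if (total > 0 && n == 1) { print "all " total " checksums match"; exit 0 }
            print "found " n " distinct checksums in " total " lines"
            exit 1
        }
    '
}
```

      For example, with placeholder host names: for h in node1 node2 node3; do ssh "$h" md5sum /var/opt/MarkLogic/hosts.xml; done | all_checksums_match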

      You should now be able to set xdqp ssl enabled to true in the group configurations.  Check the cluster status page in the Administrative Interface to ensure all the hosts have reconnected successfully, or review the ErrorLog files to ensure there are no SVC-SOCACC errors in the log.

      Additional Notes

      This article explains how to directly replace a node in a cluster by using the same host name. Another way is to add a new node to the cluster and transfer the forests to it, which is explained in the knowledge base article "Replacing a D-Node with local disk failover".

      Some of these steps may differ, such as operating system calls or file system locations. On a different OS, the specific commands will need to be adjusted to match the environment.

      Related Reading

      Replacing a failed MarkLogic node in a cluster: a step by step walkthrough

      Stemming:

      MarkLogic Server supports stemming in English and other languages. If stemmed searches are enabled in the database configuration, MarkLogic Server automatically searches for words that come from the same stem of the word specified in the query, not just the exact string specified in the query. A stemmed search for a word finds the exact same terms as well as terms that derive from the same meaning and part of speech as the search term.

      For example, in a stemmed search, a query for 'running' will match 'running', 'run' and 'ran', as they all stem to 'run'. The query is stemmed before being resolved, so queries for both 'running' and 'ran' are performed as queries for 'run', and they return similar results.

       

      Relevance score for stemmed searches:

       

      Search results in MarkLogic Server return in relevance order; that is, the result that is most relevant to the cts:query expression in the search is the first item in the search return sequence, and the least relevant is the last. (Documentation at http://docs.marklogic.com/guide/search-dev/relevance#chapter gives detailed information of how relevance score is computed).

      However, when using stemmed searches, the original query term and its stemmed matches are ranked equally. That is, a higher relevance score is not given to an exact match of the word.

       

      For example, consider the following 3 documents:

       

      run.xml

      <root>
        <id>001</id>
        <text>run out of time</text>
      </root>

      running.xml

      <root>
        <id>002</id>
        <text>running out of time</text>
      </root>

      ran.xml

      <root>
        <id>003</id>
        <text>ran out of time</text>
      </root>

       

      The below search query for "running" returns all 3 documents ranked equally.

       

      let $query := cts:word-query("running")
      for $hit in cts:search(doc(), $query, "relevance-trace")
      return element hit {
        attribute score { cts:score($hit) },
        xdmp:node-uri($hit)
      }

      ==>

      <hit score="2048">run.xml</hit>
      <hit score="2048">running.xml</hit>
      <hit score="2048">ran.xml</hit>

      This behavior is desirable in most search applications. However, to give a higher score to the original query term, so that it comes up first in the search results, combine stemmed and unstemmed word-queries in an or-query.

      let $query :=
        cts:or-query((
          cts:word-query("running", "stemmed"),
          cts:word-query("running", "unstemmed")
        ))
      for $hit in cts:search(doc(), $query)
      return element hit {
        attribute score { cts:score($hit) },
        xdmp:node-uri($hit)
      }

      ==>

      <hit score="11264">running.xml</hit>
      <hit score="1024">run.xml</hit>
      <hit score="1024">ran.xml</hit>

      Note that for the above cts:or-query, the 'word searches' option must be enabled for the database; otherwise the query returns an XDMP-WORDSEARCH error.

      Introduction

      MarkLogic Server offers Fast Data Directories, and Large Data Directories to allow customers to better utilize their available infrastructure. This allows an organization to offload large objects to cheaper storage, or improve performance with SSDs for portions of a forest.  These directories are defined at the forest level, usually when the forest was created.

      Removing Fast or Large Data Directories

      There are two primary methods to remove these directories from a forest.

      • Rebalance to a new forest
      • Backup/Restore to a new forest

      Rebalancing to a New Forest

      This method takes advantage of the rebalancing mechanism in the server to move data from the forest with the Fast/Large Data Directories. New forests can be defined as part of this process, but it is not required.  The advantage of this method is that it does not require any downtime.  The primary disadvantage is that it can increase the I/O and CPU load on the servers as the data is moved between forests, and can result in data being moved more than once. If needed, these issues can be mitigated by adjusting the rebalancer priority and merge settings.

      Backup/Restore to a new forest

      This method allows a simple 1 for 1 swap of a forest with a Fast/Large Data Directory to one without these directories.  The advantage of this method is that, depending on the size of the forest, it can be completed faster than rebalancing.  There are a couple of disadvantages to this method.  The first is that the forest being replaced needs to be in read only mode when the backup is taken, until the restore is complete to the new forest.  The second is that it does require some downtime when switching between the old and new forests.  These issues can be mitigated with some careful planning.

      Procedures for Using Rebalance

      • Create the new forest/s
      • Attach the new forest/s to the database AND retire the existing forest/s
        • This will cause the database to rebalance, and move the data from the old forest/s to the new forest/s.
      • Detach the old forest/s from the database once the forest/s no longer have active documents or active fragments.
      • Delete the old forest/s

      Procedures for Using Backup/Restore

      • Put the forest/s in read only mode and perform a forest level backup
        • Database level backups can be used, but the whole database will need to be in read only mode when the backups are started.
      • Create a new forest.  Do not attach it to the database yet.
      • Restore the backup to the new forest/s
      • Verify the old forest/s and new forest/s have the same active document and active fragment count.
      • Detach the old forest/s and attach the new forest/s
      • Delete the old forest/s

      References

      This is a procedure to take hosts out of a MarkLogic cluster with minimal unavailability. It is assumed that High availability is configured using local disk failover and all master forests have at least one replica forest configured.

      When a host in a MarkLogic cluster becomes unavailable, the host is not fully disconnected from the cluster until the configured host timeout (default is 30 seconds) expires. If a master forest resides on that host, the database and any application that references it will be unavailable from the time the host becomes unavailable until all replica forests assume the role of acting master.

      If the host unavailability is planned, then you can take steps to minimize the database and application unavailability.  This article describes such a procedure.

      Planning

      When a host is removed from the MarkLogic cluster, all the remaining hosts must assume the workload previously performed by that host.  For this reason, we recommend:

      • Scheduling server maintenance during low usage periods.
      • Evenly distributing a host's replica forests across the other nodes in the cluster so that the extra workload is evenly distributed when that host is unavailable.
      • Minimizing the number of hosts removed for maintenance at any one time.

      If removing more than one host at a time:

      • Define a maintenance group of hosts containing configured master forests that have their local disk replica forests on hosts not in the maintenance group.
      • All required forests must have replica forests defined. This includes all content forests, security database forests and forests for all linked schema databases.

      Important Note: Schemas and Security databases must also be configured for high availability. Unavailability of Schemas and Security databases can impact availability of other databases, and also availability of administrative functions.

      • Maintenance groups should be sized so that the remaining available hosts represent a reasonable portion of compute, memory and I/O resources and can absorb the extra workload required during the maintenance period.

      Step 0: Verify all replica forests are synchronized

      Before initiating this procedure, verify that all replica forests are in sync with their master forests by checking that the forest status of each replica is “sync replicating”.

      This step can also be achieved via script using the MarkLogic Server administrative function xdmp:forest-status or management api GET /manage/v2/forests/{id|name}?view=status.
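
      However the forest states are gathered, the verification itself can be turned into a simple gate that stops a maintenance script when any forest is not where it should be. This is a minimal sketch: the input format (forest name, a tab, then the state) and the function name are assumptions made for this example, not the output of any MarkLogic tool.

```shell
# Read "forest-name<TAB>state" lines on stdin and list the forests whose
# state is not one of the accepted states passed as arguments.
# Exit status is non-zero if any forest is listed.
forests_not_in_state() {
    local accepted="" state
    for state in "$@"; do
        accepted="$accepted${accepted:+,}$state"
    done
    awk -F '\t' -v accepted="$accepted" '
        BEGIN { n = split(accepted, list, ","); for (i = 1; i <= n; i++) ok[list[i]] = 1 }
        NF >= 2 && !($2 in ok) { print $1 ": " $2; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

      For this step, the only accepted state for the replicas would be "sync replicating". The same gate can be reused for the later verification steps with the states that apply there.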

      Step 1: Force failover from master forests to replica forests

      Disable all master forests in the maintenance group together.  This minimizes the database unavailability time, as the failovers from master forests to replica forests can happen in parallel.

      This step can also be achieved via script using the MarkLogic Server administrative function admin:forest-set-enabled or management api POST /manage/v2/forests/{id|name}.

      Step 2: Verify failover succeeded.

      Wait until all of the replica forests take over – configured replica forests are now the acting master forests and in the “open” state, while the configured master forest is now disabled.  You can manually monitor forest status in the Admin UI by refreshing the Forest status display.  Once all forests have assumed their new roles, the database will be online.

      This step can also be achieved via script using the MarkLogic Server administrative function xdmp:forest-status or management api GET /manage/v2/forests/{id|name}?view=status.

      Step 3: Shutdown hosts and perform maintenance

      Shut down the MarkLogic Server instance on all hosts in the maintenance group.

      Verify the rest of the cluster is still responsive before taking down the hosts themselves.

      When maintenance is complete, bring the hosts back online.

      Step 4: Enable configured master forests 

      Once all hosts are back online, enable all forests disabled in step 1.  Once enabled, the configured master forests will assume the role of acting replica forest and will initiate a process to synchronize the master/replica pairs.

      This step can also be achieved via script using the MarkLogic Server administrative function admin:forest-set-enabled or management api POST /manage/v2/forests/{id|name}.

      Step 5: Verify forests synchronized

      Before forcing the configured master forests to assume the role of acting master, verify all acting replica forests are in sync with the acting master forest by checking the forest status of the acting replica forests are in the “sync replicating” state.

      This step can also be achieved via script using the MarkLogic Server administrative function xdmp:forest-status or management api GET /manage/v2/forests/{id|name}?view=status.

      Step 6: Force configured master forests to resume acting master forest role. 

      In order to force the configured master forests to assume the role of acting master forests, restart the configured replica / acting master forests together.  Restarting all forests together will help minimize outage impact.

      This step can also be achieved via script using the MarkLogic Server administrative function xdmp:forest-restart or management api POST /manage/v2/forests/{id|name}. 
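
      As a sketch of the Management API route, the function below prints (but does not run) one curl command per forest, so the list can be reviewed before anything is restarted. It assumes the endpoint accepts state=restart in the request body, and it uses the default Manage port 8002. MANAGE_URL, MANAGE_USER and the function name are placeholders chosen for this example.

```shell
# Placeholders for this sketch; override them in the environment.
MANAGE_URL="${MANAGE_URL:-http://localhost:8002}"
MANAGE_USER="${MANAGE_USER:-user:password}"

# Print (not run) one curl command per forest name given as an argument.
restart_commands() {
    local forest
    for forest in "$@"; do
        printf 'curl --anyauth --user %s -X POST -i --data "state=restart" %s/manage/v2/forests/%s\n' \
            "$MANAGE_USER" "$MANAGE_URL" "$forest"
    done
}
```

      For example, "restart_commands a-3r c-4r" prints two commands, one per forest. Printing first is a deliberate choice: it makes it easy to confirm that only the acting master forests are in the list before the commands are executed.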

      Further Reading

      Scripting Failover: "flipping" replica forests back to their masters using XQuery

      Introduction

      Using MarkLogic Server's Admin UI, it is possible to modify the name of a single host: go to Admin UI -> Configure -> Hosts, select the host in question, update the name, and click OK.

      However, if you want to change the hostnames across a cluster, we recommend that you follow the steps below:

      1) Renaming hosts in a cluster

      • Add the new hostnames to the DNS or /etc/hosts on all hosts.
      • Make sure all new hostnames can be resolved from the nodes in the cluster.
      • Rename all hosts to the new names in the Admin UI or with the Admin API function admin:host-set-name().
        • Note: changing the hostname will require a restart.
      • Host/cluster should come up if the DNS entries have been set up correctly.
      • Remove old host names.
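
      If you rely on /etc/hosts rather than DNS, the first two steps can be checked with a short script on each node before any host is renamed. The following is a minimal sketch; the function name and the host names in the usage line are made up for this example. It only inspects a hosts file, so names served by DNS are better checked with a resolver tool such as getent.

```shell
# List the names (arguments after the first) that do not appear as a
# hostname or alias in a hosts file such as /etc/hosts.
# Exit status is non-zero if any name is missing.
missing_from_hosts_file() {
    local hosts_file="$1"; shift
    local name status=0
    for name in "$@"; do
        if ! awk -v want="$name" '
                { sub(/#.*/, "") }   # drop comments
                { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
                END { exit (found ? 0 : 1) }
             ' "$hosts_file"; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}
```

      For example: missing_from_hosts_file /etc/hosts ml1-new.example.com ml2-new.example.com prints any of the new names that still need an entry.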

      2) Once the hostnames are updated, we recommend you verify the items below that may be affected by hostname changes:

      • Application Servers
      • PKI Certificates
      • Database replication
      • Flexible replication
      • Application code

      Introduction

      In a multiple node cluster with local disk failover configured, there may be a need to replace a server with new hardware. This article explains how to do that while preserving the failover configuration.

      Sample configuration

      Consider a 3-node cluster with local disk failover for database Test, and the forest assignment for the hosts looks like this:  (all forests ending with 'p' are primary and those ending with 'r' are replica)

      Host A         Host B         Host C
      forest a-1p    forest b-3p    forest c-5p
      forest a-2p    forest b-4p    forest c-6p
      forest a-3r    forest b-1r    forest c-2r
      forest a-6r    forest b-5r    forest c-4r

      With this configuration under normal operations, each host will have the two primary forests "open" and the replica forests "sync replicating".

      Failover Example

      In the event of a node failure on, say, Host B, the primary forests on Host B will fail over to Hosts A & C as expected. The forests a-3r and c-4r are now "open" and acting as master forests.

      When Host B comes back online, the replica forests a-3r and c-4r will continue as acting masters, and forests b-3p & b-4p on Host B will now act as replicas. This state will persist until another failover event occurs or the forests are manually restarted.

      Replacing a Host 

      In the case where a node in the cluster needs to be physically replaced with another node, it is important to preserve the original master-replica configuration of the forests, so that there is no performance burden on a single node hosting all the primary forests.

      Example: replacing Host-B with a new Host-D

      The steps listed below show how to replace a node (old Host-B with new Host-D) without affecting the failover configuration:

      1. Shut down Host B and make sure forest failover is successful: forests c-4r & a-3r are "open" (acting masters).
      2. Add Host D as a node to the cluster.
      3. Create new replica forests (d-1r and d-5r) on Host D and make them replicas of the corresponding primary forests on Host A & C.
      4. Create new primary forests 'd-3p' and 'd-4p' on Host D (these will replace b-3p and b-4p).
      5. Break replication between a-1p and b-1r, and between c-5p and b-5r by updating the forest configuration for the primary forests.
      6. Take forest level backup of the failed over forests ('forest a-3r' and 'forest c-4r')
      7. Restore the backups from step 6 to the new primary forests 'forest d-3p' and 'forest d-4p' on Host D
      8. Attach forests 'forest d-3p' and 'forest d-4p' to the database and make forests 'forest a-3r' and 'forest c-4r'  their replicas.

      This will replace Host B with Host D, as well as preserve the previously existing primary-replica configuration among the hosts.

      Host A         Host D         Host C
      forest a-1p    forest d-3p    forest c-5p
      forest a-2p    forest d-4p    forest c-6p
      forest a-3r    forest d-1r    forest c-2r
      forest a-6r    forest d-5r    forest c-4r
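
      A layout like this can be sanity-checked mechanically. The sketch below relies purely on the naming convention used in this example (host prefix, a dash, the forest number, then 'p' for primary or 'r' for replica) and verifies that every forest number has one primary and one replica on different hosts. The function name is hypothetical, and real forest names will usually need their own parsing.

```shell
# Read forest names such as "a-1p" or "forest d-1r" on stdin (the last
# word on each line is used) and report layout problems.
# Exit status is non-zero if any problem is found.
check_layout() {
    awk '
        NF {
            split($NF, part, "-")
            host = part[1]
            role = substr(part[2], length(part[2]))
            num  = substr(part[2], 1, length(part[2]) - 1)
            nums[num] = 1
            if (role == "p") primary[num] = host
            else if (role == "r") replica[num] = host
        }
        END {
            for (num in nums) {
                if (!(num in primary)) { print "forest " num ": no primary"; bad = 1 }
                else if (!(num in replica)) { print "forest " num ": no replica"; bad = 1 }
                else if (primary[num] == replica[num]) {
                    print "forest " num ": primary and replica both on host " primary[num]; bad = 1
                }
            }
            exit (bad ? 1 : 0)
        }
    '
}
```

      Feeding it the twelve forest names from the layout after replacement produces no output and a zero exit status, since each of the six primaries has its replica on a different host.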

      Additional Notes

      It is important to make sure that the database is quiesced before taking the forest backups, so that no ingestion or updates occur on the database. One technique is to make all of its forests 'read-only' (http://docs.marklogic.com/guide/admin/forests#id_72520) during the process and revert once complete.

      Note: This example assumes a distributed master-replica configuration of a 3-node cluster. However, the same procedure works with other configurations with some careful attention to the number of forests on each host and breaking replication between the right set of hosts.

       

       

       

       

       

       

      Summary

      A socket bind error will occur when two or more MarkLogic Server instances are running simultaneously on the same host. Two simultaneous instances of MarkLogic Server might occur if a MarkLogic Server process did not shut down gracefully while a new one was spawned.

      Example error messages seen in the MarkLogic Server ErrorLog.txt file:

      Critical: Server::updateConfigServers: SVC-SOCBIND: Socket bind error: bind 0.0.0.0:8000: Address already in use
      Critical: Server::updateConfigServers: SVC-SOCBIND: Socket bind error: bind 0.0.0.0:8001: Address already in use
      Critical: Server::updateConfigServers: SVC-SOCBIND: Socket bind error: bind 0.0.0.0:8002: Address already in use

      It is dangerous for two instances of the server to be running simultaneously on the same host. Both instances will attempt to operate from the same server configuration files and on the same forest data files. The behavior is unpredictable and, in the worst case, it might lead to inconsistent data.
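      To see which ports are affected, the log can be scanned for these messages. The following is a minimal Python sketch, assuming the messages have exactly the format of the example lines above; the function name is illustrative and not part of any MarkLogic tooling.

```python
import re

# Pattern modelled on the example ErrorLog.txt lines shown above (an assumption
# about the message format, not an official specification).
SOCBIND = re.compile(
    r"SVC-SOCBIND: Socket bind error: bind (\S+): Address already in use"
)


def socbind_addresses(log_text):
    """Return the distinct address:port values from SVC-SOCBIND errors, in first-seen order."""
    seen = []
    for match in SOCBIND.finditer(log_text):
        address = match.group(1)
        if address not in seen:
            seen.append(address)
    return seen
```

      Passing the contents of ErrorLog.txt to socbind_addresses gives the list of addresses that could not be bound, which helps confirm that every App Server port is in conflict rather than a single one.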

      Mitigation

      If you suspect that there are multiple MarkLogic Server instances running at the same time on the same host, you should follow these steps:

      1. To get a list of MarkLogic processes running, execute

      ps -ef | grep -i mark

      Under normal circumstances, it will return 2 processes: a watchdog process running as root and the main MarkLogic Server process. For example, the ps command should return something like:

      root 1766 1 0 Apr03 ? 00:00:00 /opt/MarkLogic/bin/MarkLogic
      daemon 1767 1766 0 Apr03 ? 04:00:24 /opt/MarkLogic/bin/MarkLogic

      2. Run the above command on all hosts in your MarkLogic cluster. If you discover more than the expected 2 processes on any single host, then:
          -  Shut down MarkLogic on the node and verify that no MarkLogic processes are running.
          -  If there are still MarkLogic processes running, kill the processes by executing

           kill -9 <pid>

             where <pid> is the process ID discovered while executing the ps command.

          -  If that still does not clear the errant MarkLogic process, reboot the host machine.

      3. Once there are no more MarkLogic Server processes running, restart MarkLogic Server.
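      The check in steps 1 and 2 can be sketched in Python. This is a minimal sketch, assuming Linux `ps -ef` column layout and the binary path shown in the example output above; the function names are illustrative only.

```python
import subprocess

# Path and expected count are taken from the ps output shown above.
MARKLOGIC_BINARY = "/opt/MarkLogic/bin/MarkLogic"
EXPECTED_PROCESSES = 2  # watchdog (root) plus the main server process


def marklogic_processes(ps_output):
    """Return (user, pid) for each `ps -ef` line whose command is the MarkLogic binary."""
    found = []
    for line in ps_output.splitlines():
        # Linux `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) == 8 and fields[7].split()[0] == MARKLOGIC_BINARY:
            found.append((fields[0], int(fields[1])))
    return found


def has_extra_instances(ps_output):
    """True when more MarkLogic processes are listed than the expected two."""
    return len(marklogic_processes(ps_output)) > EXPECTED_PROCESSES


def current_ps_output():
    """Run `ps -ef` on the local host and return its text output."""
    return subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
```

      Calling has_extra_instances(current_ps_output()) on each host reports whether more than the expected two processes are present. Matching on the exact binary path avoids counting the grep command itself, which the "grep -i mark" filter can include.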

       

      Backwards Compatibility

      Newer versions of MarkLogic will support backups taken from older versions of the software.  This restore may cause a reindex of the data in order to upgrade the database to the current feature release version.  Information on backing-up/restoring can be found in the following documentation:

      Database Level Backups: Backing Up and Restoring a Database

      Forest Level Backups and Restores: Making Backups of a Forest, Restoring a Forest

      Upgrade compatibility: Upgrades and Database Compatibility

      Downgrading

      MarkLogic does not support downgrading to an older version.  Therefore, backups that were taken on a newer version of MarkLogic will not be compatible with older versions of MarkLogic.  For more details please see MarkLogic Server Version Downgrades are Not Supported.

      Introduction

      When a database backup taken on Cluster A is restored on Cluster B using incremental backups, the restore sometimes fails with the following message on the Admin screen:

      The database restore has failed. Please check the server logs for details.

      A quick look at the logs will show an error indicating that the backup directory does not exist, even though the backup was copied from Cluster A to Cluster B:

      Error: TaskManager::runTask: XDMP-FORESTRESTOREFAILED: Restore failed for forest Documents: SVC-DIROPEN: Directory open error: opendir '/tmp/backup/20180827-1607002170310/20180827/1609002389230/Forests/Documents': No such file or directory

      Error: 1-forest database restore from /space/backup/20180827-1607002170310, jobid=472666486696782942 failed: XDMP-FORESTRESTOREFAILED: Restore failed for forest Documents: SVC-DIROPEN: Directory open error: opendir '/tmp/backup/20180827-1607002170310/20180827/1609002389230/Forests/Documents': No such file or directory

      This happens when the backup directory structure is different between the clusters. For example, on Cluster A, the backup directory exists under /tmp/backup.

      When copying the backup for restore on Cluster B, it was copied to /space/backup.

      Even though the backup directory was moved to a different location, per the error logs, the restore job is looking to find it in the old location (/tmp/backup) and fails as it does not find it.

      Resolution

      Every incremental backup stores a reference to the location of the previous incremental backup, and the very first one stores a reference to the location of the full backup. These references are stored in a file named BackupTag.txt. The restore job fetches the backup locations from this file, and if they still point to an older location, the incremental restore will fail.

      To get past this, BackupTag.txt, which is located under incremental-backup-directory/incremental-backup/Forests/forest-name/, should be edited so that the BasePath parameter reflects the current backup directory.

      For example, on Cluster B, BasePath in BackupTag.txt (/space/backup/20180827-1607002170310/20180827/1609002389230/Forests/Documents) should be changed from

      BasePath /tmp/backup/20180827-1607002170310

      to

      BasePath /space/backup/20180827-1607002170310

      This should be done on every incremental backup in the directory.
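      Editing many BackupTag.txt files by hand is error-prone, so the change can be scripted. The following is a minimal Python sketch, assuming BackupTag.txt is a plain-text file in which the entry appears on its own line as "BasePath <path>" (as in the example above); the function names are illustrative, and the original files should be copied somewhere safe before running anything like this.

```python
from pathlib import Path


def rewrite_tag_text(text, old_base, new_base):
    """Return text with every BasePath entry under old_base moved to new_base."""
    updated = []
    for line in text.splitlines(keepends=True):
        key, _, value = line.partition(" ")
        path = value.strip()
        if key == "BasePath" and (path == old_base or path.startswith(old_base + "/")):
            ending = line[len(line.rstrip("\r\n")):]  # preserve the original line ending
            line = "BasePath " + new_base + path[len(old_base):] + ending
        updated.append(line)
    return "".join(updated)


def rewrite_backup_tags(backup_root, old_base, new_base):
    """Apply rewrite_tag_text to every BackupTag.txt below backup_root; return changed files."""
    changed = []
    for tag_file in sorted(Path(backup_root).rglob("BackupTag.txt")):
        original = tag_file.read_text()
        rewritten = rewrite_tag_text(original, old_base, new_base)
        if rewritten != original:
            tag_file.write_text(rewritten)
            changed.append(tag_file)
    return changed
```

      For the example in this article, rewrite_backup_tags("/space/backup", "/tmp/backup", "/space/backup") would update each incremental backup's tag file. Lines other than BasePath are left untouched.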

      Note that the example presented in this article does not specify a separate location for incremental backups.

      Further Reading

      Backup Directory Structure

      Notes about Backup and Restore Operations

      Incremental Backup

      Incremental Backup - RTO and RPO Considerations

      Disaster Recovery

       

      Summary

      When performing a Security database backup on one cluster and restoring on another cluster, there are precautionary measures to be taken. 

      Details

      Since MarkLogic Server version 4.1-5,  the internal user IDs are derived from the hash of the user name when the user object is created. Thus, two user objects created on two different Security databases should have the same user ID if they are created with the same name. This makes it possible to restore a Security database from one environment to another.

      However, we strongly recommend checking for the below conditions before restore in order to avoid any serious damage to the Security database. 

      • Ensure that both the environments are running the same MarkLogic Server versions and are on the same Operating System.
      • Verify that no Users, Roles or Amps have been added to the new cluster, that are not also present in the original cluster. Restoration of the Security database is a complete replacement, and any intentional differences in the two clusters will be lost.   Any applications using obsolete roles might become inaccessible.

      Although the user IDs are derived from the hash of the user name, the IDs can be different in some cases:

      • If there was already an existing user object with that ID when the new user was created (i.e., a hash collision).
      • If the user name was changed on an existing user object.

      Review all the above conditions before restoring the Security database.
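      The check for objects that exist only on the new cluster amounts to a set difference. The following is a minimal Python sketch, assuming the names of users, roles and amps have already been exported from each Security database into simple lists; how they are exported is outside the scope of this sketch, and the function name and sample names are hypothetical.

```python
def objects_lost_on_restore(source, target):
    """Return, per object kind, the names present on the target cluster only.

    source and target map a kind ("users", "roles", "amps") to a list of names.
    Anything returned here would disappear when the target's Security database
    is replaced by the source's.
    """
    lost = {}
    for kind in sorted(set(source) | set(target)):
        extra = sorted(set(target.get(kind, ())) - set(source.get(kind, ())))
        if extra:
            lost[kind] = extra
    return lost
```

      An empty result means nothing on the new cluster would be lost. A non-empty result lists the objects to recreate, or to consciously give up, after the restore.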

      Note: It is recommended that a backup of the security database from the new cluster is created and saved before performing the restore of a Security database from a different cluster.

      Restoring from a different server version

      When restoring the Security database from a backup made on an older version of MarkLogic Server to a newer version, a manual upgrade of the Security database is also required after the restore. Without this additional step, there is a mismatch between the server version and the Security database version, and some features will not work as expected. There may be issues with reindexing, query results, etc.

      A Security database upgrade can be done by navigating to the Admin UI -> 'Support' tab and clicking the 'Upgrade' button in the bottom right corner.

      Note that MarkLogic does not support restoring a backup made on a newer version of MarkLogic Server onto an older version of MarkLogic Server.

      Restoring Security Database with different Certificate template content

      If your App Server is associated with a certificate template and the Security database you intend to restore contains a different template, we recommend removing the App Server's association with the template (disabling SSL) before restoring the Security database, in order to avoid a lingering template ID. For details, please read Security Database restore leading to lingering Certificate Template id in Config files.

       

       

       

       

      Summary

      If MarkLogic Server is installed on an Amazon Elastic Compute Cloud (EC2) instance and you execute queries in the MarkLogic Query Console, it is possible that the queries will be silently cancelled. Long running queries may time out because of an AWS attached Load Balancer.

      Details

      The Amazon Elastic Load Balancer (ELB) performs health checks on running instances using protocols, timeouts, etc. The ELB terminates a connection if it is idle for more than 60 seconds. A connection is considered idle when no read or write is performed on it. Consequently, when a query runs for more than 60 seconds, the load balancer treats the connection as idle and terminates it. When the ELB terminates a Query Console connection, no message is shown in the display. Instead, an “XDMP-CANCELLED” message is logged to the MarkLogic ErrorLog.txt file. An XDMP-CANCELLED message indicates that the query was cancelled either explicitly or as a result of a system event.

      Removing the load balancer from your EC2 instance is one solution for enabling long-running Query Console queries on an Amazon EC2 instance.

       

      [ref: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/ts-elb-healthcheck.html]

      Summary

      When attempting to access files stored in an Amazon AWS S3 bucket using MarkLogic, an SVC-S3SOCERR error is raised.

      Cause

      Under some conditions when installing MarkLogic Server, such as an unattended scripted installation, some required CA root certificates were missed, resulting in the error seen.

      Prior to MarkLogic 9.0-5, there was an error in the CA certificate installation process whereby some certificates were incorrectly flagged as disabled and therefore not installed.

      Resolution

      Upgrade to the latest version of MarkLogic to ensure all required certificates are installed as well as any recent and updated CA Root certificates.

      In the interim, the missing AWS Root certificates can be downloaded from the Amazon Trust Repository and installed manually using the MarkLogic Admin UI (Configure -> Security -> Certificate Authorities)

      References

      Introduction

      If you have forest-level failover configured on your MarkLogic cluster, in the event that a single host in the cluster loses contact with the other hosts, the forests will fail over to the backup set of forests: the replica forests.

      What should I do in the event of a failover?

      Failover shifts the responsibility for a given set of forests over to other hosts in the cluster. If the failing host "loses" control of its forests, control is not automatically given back when the master becomes available; failing the forests back has to be done manually.

      To fail a forest back (to "flip" control back to the master), if both the replica and master forests are in sync with each other, all that's needed is to restart the replica forest. This can be done using the Admin UI (Configure > Forests > Forest Name > Status > Restart), or XQuery (xdmp:forest-restart):

      https://docs.marklogic.com/xdmp:forest-restart

      flip-forests.xqy

      The above code is intended as a sample of something that could be used as a scheduled task to automatically check for failed-over forests and flip them back where possible.

      It's also worth noting that this may not be something you'd want to do; in many cases, a failover event might be a warning of a problem that occurred that needs to be investigated (for example: a disk error), so if you are planning on managing failing back forests automatically, you may want to ensure that you are monitoring the ErrorLogs for evidence of failover events so you know that they're happening.

      Further reading

      Summary

      This article explores fragmentation policy decisions for a MarkLogic database, and how search results may be influenced by your fragmentation settings.

      Discussion

      Fragments versus Documents

      Consider the below example.

      1) Load 20 test documents in your database by running

      let $doc := <test>{
      for $i in 1 to 20 return <node>foo {$i}</node>
      }</test>
      for $i in 1 to 20
      return xdmp:document-insert ('/'||$i||'.xml', $doc)

      Each of the 20 documents will have a structure like so:

      <test>
          <node>foo 1</node>
          <node>foo 2</node>
                 .
                 .
                 .
          <node>foo 20</node>
      </test>
      

      2) Observe the database status: 20 documents and 20 fragments.

      3) Create a fragment root on 'node' and allow the database to reindex.

      4) Observe the database status: 20 documents and 420 fragments. There are now 400 extra fragments for the 'node' elements.

      We will use the data with fragmentation in the examples below.


      Fragments and cts:search counts

      Searches in MarkLogic work against fragments (not documents). In fact, MarkLogic indexes, retrieves, and stores everything as fragments.

      While the terms fragments and documents are often used interchangeably, all the search-related operations happen at fragment level. Without any fragmentation policy defined, one fragment is the same as one document. However, with a fragmentation policy defined (e.g., a fragment root), the picture changes. Every fragment acts as its own self-contained unit and is the unit of indexing. A term list doesn't truly reference documents; it references fragments. The filtering and retrieval process doesn't actually load documents; it loads fragments. This means a single document can be split internally into multiple fragments but they are accessed by a single URI for the document.

      Since the indexes only work at the fragment level, operations that work at the level of indexing can only know about fragments.

      Thus, xdmp:estimate returns the number of matching fragments:

      xdmp:estimate (cts:search (/, 'foo')) (: returns 400 :)

      while fn:count counts the actual number of items in the returned sequence:

      fn:count (cts:search (/, 'foo')) (: returns 20 :)
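      The difference between the two numbers can be reproduced with a toy model. The following Python sketch is not how MarkLogic implements indexing; it only assumes, as described above, that each document splits into one parent fragment plus one fragment per 'node' element, and that the word 'foo' occurs only in the 'node' fragments.

```python
DOCS = 20
NODES_PER_DOC = 20

# Each fragment is modelled as (document URI, kind, text held by that fragment).
fragments = []
for d in range(1, DOCS + 1):
    uri = f"/{d}.xml"
    fragments.append((uri, "root", ""))  # parent <test> fragment, no text of its own
    for n in range(1, NODES_PER_DOC + 1):
        fragments.append((uri, "node", f"foo {n}"))

# An index-level estimate counts matching fragments.
matching = [frag for frag in fragments if "foo" in frag[2].split()]
estimate = len(matching)

# Counting returned items counts documents, because results are addressed by URI.
documents = len({uri for uri, _, _ in matching})

print(len(fragments), estimate, documents)
```

      In this model there are 420 fragments in total, 400 of them match 'foo', and those matches belong to 20 distinct documents, which mirrors the xdmp:estimate and fn:count results above.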


      Fragments and search:search counts

      When using search:search, "... the total attribute is an estimate, based on the index resolution of the query, and it is not filtered for accuracy." This can be seen in the following example:


      import module namespace search = "http://marklogic.com/appservices/search" at "/MarkLogic/appservices/search/search.xqy";
      search:search("foo",
      <options xmlns="http://marklogic.com/appservices/search">
      <transform-results apply="empty-snippet"/>
      </options>
      )

      returns

      <search:response snippet-format="empty-snippet" total="400" start="1" page-length="10" xmlns:search="http://marklogic.com/appservices/search">
      <search:result index="1" uri="/3.xml" path="fn:doc(&quot;/3.xml&quot;)" score="2048" confidence="0.09590387" fitness="1">
      <search:snippet/>
      </search:result>
      <search:result index="2" uri="/5.xml" path="fn:doc(&quot;/5.xml&quot;)" score="2048" confidence="0.09590387" fitness="1">
      <search:snippet/>
      </search:result>
      .
      .
      .
      <search:result index="10" uri="/2.xml" path="fn:doc(&quot;/2.xml&quot;)" score="2048" confidence="0.09590387" fitness="1">
      <search:snippet/>
      </search:result>


      Notice that the total attribute gives the estimate of the results, starting from the first result in the page, similar to the xdmp:estimate result above, and is based on unfiltered index (fragment-level) information. Thus the value of 400 is returned.

      When using search:search:

      • Each result in the report provided by the Search API reflects a document -- not a fragment. That is, the units in the Search API are documents. For instance, the report above has 10 results/documents.
      • Search has to estimate the number of result documents based on the indexes.
      • Indexes are based on fragments and not documents.
      • If no filtering is required to produce an accurate result set and if each fragment is a separate document, the document estimate based on the indexes will be accurate.
      • If filtering is required or if documents aggregate multiple matching fragments, the estimate will be inaccurate. The only way to get an accurate document total in these cases would be to retrieve each document, which would not scale.

      Fragmentation and relevance

      Fragmentation also has an effect on relevance.  See Fragments.


      Should I use fragmentation?

      Fragmentation can be useful at times, but generally it should not be used unless you are sure you need it and understand all the tradeoffs. Alternatively, you can break your document into subdocuments instead. In general, the search API is designed to work better without fragmentation in play.

      What is DLS?

      The Document Library Service (DLS) enables you to create and maintain versions of managed documents in MarkLogic Server. Access to managed documents is controlled using a check-out/check-in model. You must first check out a managed document before you can perform any update operations on the document. A checked out document can only be updated by the user who checked it out; another user cannot update the document until it is checked back in and then checked out by the other user. 

      Searching across latest version of managed documents

      To track document changes, you can store versions of a document by defining a retention policy in DLS. However, it is often the latest version of the document that most people are interested in. MarkLogic provides the function dls:documents-query, which helps you access the latest versions of the managed documents in the database. In some situations, using this function carries a performance overhead: when the database has millions of managed documents, accessing all the latest versions can be slow. This is an intrinsic issue caused by the large number of files and the join across properties.

      How can one improve the search performance?

      A simple workaround is to add your latest versions in a collection (say "latest"). Instead of the API dls:documents-query, you can then use a collection query on this "latest" collection. Below are two approaches that you can use - while the first approach can be used for new changes (inserts/updates), the second approach should be used to modify the existing managed documents in the database.

      1.) To add new inserts/updates to "latest" collection

      Below are two files, manage.xqy, and update.xqy that can be used for new inserts/updates.

      In manage.xqy, we do an insert and manage, and manipulate the collections such that the numbered document has the "historic" collection and the latest document has the "latest" collection. You have to use xdmp:document-add-collections() and xdmp:document-remove-collections() when doing the insert and manage because it's not really managed until after the transaction is done.

      In update.xqy, we do the checkout-update-checkin with the "historic" collection (so that we don't inherit the "latest" collection from the latest document), and then add "latest" and remove "historic" from the latest document. 

      (: manage.xqy :)
      xquery version "1.0-ml";
      import module namespace dls = "http://marklogic.com/xdmp/dls" at "/MarkLogic/dls.xqy";
      dls:document-insert-and-manage(
        "/stuff.xml",
        fn:false(),
        <test>one</test>,
        "created",
        (xdmp:permission("dls-user", "read"),
         xdmp:permission("dls-user", "update")),
        "historic"),
      xdmp:document-add-collections(
        "/stuff.xml",
        "latest"),
      xdmp:document-remove-collections(
        "/stuff.xml",  "historic")

      (: update.xqy :)
      xquery version "1.0-ml";
      import module namespace dls = "http://marklogic.com/xdmp/dls" at "/MarkLogic/dls.xqy";
      dls:document-checkout-update-checkin(
        "/stuff.xml",
        <test>three</test>,
        "three",
        fn:true(),
        (),
        ("historic")),
      dls:document-add-collections(
        "/stuff.xml",
        "latest"),
      dls:document-remove-collections(
        "/stuff.xml",
        "historic")

      2.) To add the already existing managed documents to the "latest" collection

      To add the latest version of documents already existing in your database to the "latest" collection you can do the following in batches.

      xquery version "1.0-ml";
      import module namespace dls = "http://marklogic.com/xdmp/dls" at "/MarkLogic/dls.xqy";
      declare variable $start external ;
      declare variable $end   external ;
      for $uri in cts:search(fn:collection(), dls:documents-query())[$start to $end]/document-uri(.) 
      return xdmp:document-add-collections($uri, ("latest"))
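      The $start and $end values for each batch have to be computed by whatever drives the query above. The following is a minimal Python sketch of that calculation, assuming the total number of managed documents is known in advance and that positions are 1-based and inclusive, as in the [$start to $end] predicate; the function name is illustrative, and how the values are passed to the query is left out.

```python
def batch_ranges(total, batch_size):
    """Yield inclusive, 1-based (start, end) pairs that cover positions 1..total."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(1, total + 1, batch_size):
        yield start, min(start + batch_size - 1, total)
```

      For example, list(batch_ranges(2500, 1000)) produces three batches, the last of which is shorter than the others. Smaller batches keep each update transaction short at the cost of more invocations.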

      This way you can segregate historical and latest version of the managed documents and then, instead of using dls:documents-query, you can use the "latest" collection to search across the latest version of managed documents.

      Note: Although this workaround may work when you want search across the latest version of managed documents, it does not solve all the cases. dls:documents-query is used internally in many dls.xqy calls so not all functionality will be improved.

      Summary

      This knowledge base article discusses various aspects of the vulnerability found in the glibc library (CVE-2015-7547) with respect to MarkLogic Server.

      Please note: we do not expect any changes to be needed at the MarkLogic application software level to protect against this vulnerability, but we highly recommend that affected Linux platforms (those using an affected library version) receive the latest patch to protect against exposure.

       

      1) MarkLogic Dependency 

      Application layer software like MarkLogic relies on the underlying operating system for various operations, critically memory management. On Linux platforms, glibc is the primary library package, providing memory management capabilities to the application layer.

      MarkLogic package installation depends on the availability of the glibc library from the OS layer. The dependency can be checked against the MarkLogic rpm:

      $ rpm -qpR MarkLogic-8.0-4.2.x86_64.rpm 
      lsb 
      gdb 
      libc.so.6(GLIBC_2.11)(64bit) 
      libgcc_s.so.1()(64bit) 
      libstdc++.so.6()(64bit) 
      libc.so.6(GLIBC_2.11) 
      cyrus-sasl 
      /bin/sh 
      /bin/sh 
      rpmlib(PayloadFilesHavePrefix) <= 4.0-1
      rpmlib(CompressedFileNames) <= 3.0.4-1
      rpmlib(PayloadIsXz) <= 5.2-1

      After installation, the dynamic library loaded for the MarkLogic binary on a test platform:

      $ pwd
      /opt/MarkLogic/bin

      $ ldd MarkLogic | grep libc.so
      libc.so.6 => /lib64/libc.so.6 (0x000000316aa00000)

      $ ls -al /lib/libc.so.6 
      lrwxrwxrwx. 1 root root 12 Oct 28 2014 /lib/libc.so.6 -> libc-2.12.so 

       

      2) glibc library Vulnerability (CVE-2015-7547)

      The code that causes the vulnerability was introduced in May 2008 as part of glibc 2.9, and only present in glibc's copy of libresolv which has enhancements to carry out parallel A and AAAA queries. Therefore only programs using glibc's copy of the code have this problem.

      Please read further at - https://sourceware.org/ml/libc-alpha/2016-02/msg00416.html

       

      3) Patch for Red Hat Enterprise Linux 6 & 7 

      This issue does not affect the versions of glibc as shipped with Red Hat Enterprise Linux 3, 4 and 5.
      For Red Hat Enterprise Linux versions 6 and 7, Red Hat made packages containing the fix available as of 02/16/2016:
      https://access.redhat.com/security/cve/cve-2015-7547

       

      Related Reading

      GHOST: glibc vulnerability (CVE-2015-0235) - https://access.redhat.com/articles/1332213

      US-CERT: https://www.us-cert.gov/ncas/current-activity/2016/02/17/GNU-glibc-Vulnerability

      Introduction

      MarkLogic stores certificate files in the Security database. All user-created certificate files are stored in the Security database along with a template ID.

      For example, a newly installed signed certificate is stored under a URI such as http://marklogic.com/xdmp/pki/certificates/160051481396114827.xml, and the document contains the template ID value (<pki:template-id>13176215136521847243</pki:template-id>).

      When a certificate template is attached to a specific App Server, a reference to the template ID is also stored in that App Server's configuration in groups.xml.

      The template ID is the only configuration value that is referenced in two places: in the groups.xml configuration file and inside the certificate documents in the Security database.

      Problem Statement

      When the Security database is restored, the existing certificate files in the Security database are replaced, along with the references to the old template ID. If the old template ID is still referenced by an App Server (that is, an SSL-enabled App Server whose certificate template was not detached before the Security database restore), the groups.xml file will still contain a reference to a nonexistent template ID.

      In that scenario, the user will receive an HTTP 500 Internal Server Error:

      500: Internal Server Error ADMIN-BADCERTTEMPLATE: (err:FOER0000) '18321675798544961903' is not a valid certificate template id In /MarkLogic/admin.xqy on line 15197 In validate-certificate-template-id("18321675798544961903", <xs:element name="ssl-certificate-template" type="ssl-certificate..." .../>) $value = "18321675798544961903" $typ = <xs:element name="ssl-certificate-template" type="ssl-certificate..." .../> $id = xs:unsignedLong("18321675798544961903") $template = ()

      How to avoid the situation from occurring?

      The best approach is to remove every App Server's association with a certificate template before restoring the Security database. Once the Security database restore is done, the App Servers can be associated with the templates from the restored Security database to enable SSL again.

      How to recover? 

      As a workaround, stop the MarkLogic service and remove the template ID from the configuration file. The groups.xml configuration file is located at /var/opt/MarkLogic/groups.xml, and the lingering template ID can be found under the App Server's <ssl-certificate-template> tag, which needs to be reset.

      Please follow the steps below to update groups.xml on the cluster.

      1. Stop the cluster: stop the MarkLogic service on each host, starting with the bootstrap host and then all other hosts (for example, as the root user, run "/sbin/service MarkLogic stop").
      2. Go to groups.xml, located in the /var/opt/MarkLogic folder. Back up the existing groups.xml file to /tmp/groups.xml before editing it.
      3. Set the template to zero for all matching <ssl-certificate-template> lines:

               <ssl-certificate-template>0</ssl-certificate-template>

      4. Restart MarkLogic: restart the service, starting with the bootstrap host.
      5. You can then enable SSL on the App Servers again through the Admin GUI (or Admin API), using the available templates.
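      The edit to groups.xml can also be scripted. The following is a minimal Python sketch, assuming the element appears in the file exactly as <ssl-certificate-template>ID</ssl-certificate-template> with no namespace prefix (as in the line shown above); the function name is illustrative. It operates on the file's text only and should be applied to a backed-up file while MarkLogic is stopped.

```python
import re

# Matches the element as written in the example above (assumed unprefixed).
_TEMPLATE = re.compile(
    r"(<ssl-certificate-template>)\s*(\d+)\s*(</ssl-certificate-template>)"
)


def clear_template_ids(groups_xml, stale_ids=None):
    """Return groups.xml text with certificate template IDs reset to 0.

    With stale_ids=None every template ID is reset; otherwise only the IDs
    listed in stale_ids (as strings) are reset and the rest are kept.
    """
    def replace(match):
        if stale_ids is None or match.group(2) in stale_ids:
            return match.group(1) + "0" + match.group(3)
        return match.group(0)

    return _TEMPLATE.sub(replace, groups_xml)
```

      Passing only the ID reported in the ADMIN-BADCERTTEMPLATE error, for example clear_template_ids(text, {"18321675798544961903"}), leaves any other template references in the file as they are.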

      The latest versions of MarkLogic log a Warning message about a missing certificate template ID in the configuration file. Further work to prevent the issue from occurring altogether is still in progress, as it requires some redesign.

      Related MarkLogic Documentation

      Configuring SSL on App Servers

      Restoring Security Database

      Introduction

      This article discusses the capabilities of JavaScript and XQuery, and the use of JSON and XML, in MarkLogic Server, and when to use one vs the other.

      Details

      Can I do everything in JavaScript that I can do in XQuery? And vice-versa?

      Yes, eventually. Server-side JavaScript builds upon the same C++ foundation that the XQuery runtime uses. MarkLogic 8.0-1 provides bindings for just about every one of the hundreds of built-ins. In addition, it provides wrappers to allow JavaScript developers to work with JSON instead of XML for options parameters and return values. In the very few places where XQuery functionality is not available in JavaScript you can always drop into XQuery with xdmp.xqueryEval(...).

      When should I use XQuery vs JavaScript? XML vs JSON? When shouldn’t I use one or the other?

      This decision will likely depend on skills and aspirations of your development team more than the actual capabilities of XML vs JSON or XQuery vs JavaScript. You should also consider the type of data that you’re managing. If you receive the data in XML, it might be more straightforward to keep the data in its original format, even if you’re accessing it from JavaScript.

      JSON

      JSON is best for representing data structures and object serialization. It maps closely to the data structures in many programming languages. If your application communicates directly with a modern browser app, it’s likely that you’ll need to consume and produce JSON.

      XML

      XML is ideal for mark-up and human text. XML provides built-in semantics for declaring human language (xml:lang) that MarkLogic uses to provide language-specific indexing. XML also supports mixed content (e.g., text with intermingled mark-up), allowing you to "embed" structures into the flow of text.

      Triples

      Triples are best for representing atomic facts and relationships. MarkLogic indexes triples embedded in either XML or JSON documents, for example to capture metadata within a document.

      JavaScript

      JavaScript is the most natural language to work with JSON data. However, MarkLogic’s JavaScript environment also provides tools for working with XML. NodeBuilder provides a pure JavaScript interface for constructing XML nodes.

      XQuery

      XQuery can also work with JSON. MarkLogic 8 extends the XQuery and XPath Data Model (XDM) with new JSON node tests: object-node(), array-node(), number-node(), boolean-node(), and null-node(). One implication of this is that you can use XPath on JSON nodes just like you would with XML. XML nodes also implement a DOM interface for traversal and read-only access.

      Summary

      If you’re working with data that is already XML or you need to model rich text and mark-up, an XML-centric workflow is the best choice. If you’re working with JSON, for example, coming from the browser, or you need to model typed data structures, JSON is probably your best choice.

       

       

      Introduction

      This article discusses how JavaScript is implemented in MarkLogic Server, and how modules can be reused.

      Is Node.js embedded in the server?

      MarkLogic 8 embeds Google's V8 JavaScript engine, just like Node.js does, but not Node.js itself. Both environments use JavaScript and share the core set of types, functions, and objects that are defined in the language. However, they provide completely different contexts.

      Can I reuse code written for Node in Server-Side JavaScript?

      Not all JavaScript that runs in the browser will work in Node.js; similarly, not all JavaScript that runs in Node.js will work in MarkLogic. JavaScript that doesn’t depend on the specific environment is portable between MarkLogic, Node.js, and even the browser.

      For example, the utility lodash library can run in any environment because it only depends on features of JavaScript, not the particular environment in which it’s running.

      Conversely, Node’s HTTP library is not available in MarkLogic because that library is particular to JavaScript running in Node.js, not built into the language. (To get the body of an HTTP request in MarkLogic, for example, you’d use the xdmp.getRequestBody() function, part of MarkLogic’s built-in HTTP server library.) If you’re looking to use Node with MarkLogic, we provide a full-featured, open-source client API.

      Will you allow npm modules on MarkLogic?

      JavaScript libraries that don’t depend on Node.js should work just fine, but you cannot use npm directly today to manage server-side JavaScript modules in MarkLogic. (This is something that we’re looking at for a future release.)

      To use external JavaScript libraries in MarkLogic, you need to copy the source to a directory under an app server’s modules root and point to them with a require() invocation in the importing module.

      What can you import?

      JavaScript modules

      Server-side JavaScript in MarkLogic implements a module system similar to CommonJS. A library module exports its public types, variables, and functions. A main module requires a library module, binding the exported types, variables, and functions to local “namespace” global variables. The syntax is very similar to the way Node.js manages modules. One key difference is that modules are only scoped for a single request and do not maintain state beyond that request. In Node, if you change the state of a module export, that change is reflected globally for the life of the application. In MarkLogic, it’s possible to change the state of a library module, but that state will only exist in the scope of a single request.

      For example:

      // *********************************************
      // X.sjs

      module.exports.blah = function() {
          return "Not Math.random";
      }

      // *********************************************
      // B.sjs

      var x = require("X.sjs");

      function bTest() {
          return x.blah === Math.random;
      }

      module.exports.test = bTest;

      // *********************************************
      // A.sjs

      var x = require("X.sjs");
      var b = require("B.sjs");

      x.blah = Math.random;

      b.test();

      // *********************************************
      // A-prime.sjs

      var x = require("X.sjs");
      var b = require("B.sjs");

      b.test();

      Invoking A.sjs returns true, but subsequently invoking A-prime.sjs still returns false.


      XQuery modules

      MarkLogic also allows server-side JavaScript modules to import library modules written in XQuery and call the exported variables and functions as if they were JavaScript.

       

      Introduction

      The notion of "flipping" back control (from failed-over replica forest back to the master forest) has been covered in previous Knowledgebase articles:

      https://help.marklogic.com/Knowledgebase/Article/View/427/0/scripting-failover-flipping-replica-forests-back-to-their-masters-using-xquery

      In this Knowledgebase article, we will discuss the pros and cons of leaving failed over forests as they are.  Should control be returned to the master forests after a failover event?

      Best Practices

      Can it be considered good practice to leave forests in their failed-over state?

      As long as the original configured master shows that it is in sync replicating state in the database status page, you know it's still ready to take over in the event that the configured replica (acting master) fails at a later time; this means that High Availability is still preserved across the cluster in spite of a prior failover event having taken place.

      In summary, the main reasons to fail back the forests to their initial configured state are as follows:

      • Your operating state will match your configured state, which could avoid surprises if you make assumptions based on configuration or naming of forests (e.g. someone somewhere may assume that forest-001-r is a replica forest and not check whether it is currently acting master due to a failover event that took place some time in the past). This is especially important if your team does not maintain a runbook for your MarkLogic cluster.
        • Additionally, if you restart your cluster in a failed-over state, the configured masters will take over again, so your running state will be different before and after a restart. This could complicate diagnosis of any problems involving the restart (e.g. if the restart was in response to a problem, or if a problem surfaces after the restart).
      • Both master and replica forests can process updates, although only master forests can process queries.  Presumably you sized your cluster and distributed your forests to spread the load; if you're in a failed over state, then the load is likely to be uneven across hosts in your cluster and you probably want to get back to that even load by failing those forests back to their respective masters.
      • There are likely to be implications with backup / restore if you have an unusual distribution of master/ acting master (replica) forests that could cause further work for you.  These issues are covered in the following Knowledgebase articles:

      Conclusion

      In the event of a forest failover, as long as your previous master forests are in their (expected) sync replicating state, the risk of leaving the forest in a failed-over state is minimal; any disturbance that takes the active master forest offline (such as a forest restart) will cause failover to happen again, so you continue to have High Availability.

      However, forest failover can be indicative of a larger symptom: a particular host that appears to be encountering issues for any number of possible reasons.  Keeping track of when forests fail over for a given host can be a useful first line of enquiry into a system that is showing early warning signs of a problem.

      From the perspective of system management, flipping failed-over forests back to their respective masters could be considered as part of an ongoing approach to managing and maintaining general cluster health.  

      After a failover, if the failover details are logged and the forests are failed back to their respective masters, subsequent failover events become apparent at a glance: you can quickly review the status tab of a given database to confirm that all the master forests are in their open state (with their replica forests all sync replicating).

      Adopting a policy of logging what happened and then failing the forests back turns each failover into an event that gets triaged. In the longer run, this makes future events easier to spot and could provide data that gives you advance warning of an underlying issue involving a given host in your cluster.

      SUMMARY

      Some MarkLogic Server sites are installed in a 1 Gb (gigabit) network environment. At some point, your cluster growth may require an upgrade to 10 Gb ethernet. Here are some hints for knowing when to migrate to 10 Gb ethernet, as well as some ways to work around the limit before making the move.

      General Approach

      A good way to check if you need more network bandwidth is to monitor the network packet retransmission rate on each host.  To do this, use the "sar -n EDEV 5" shell command. [For best results, make sure you have an updated version of sar]

      Sample results:

      # sar -n EDEV 5 3
      ...
      10:41:44 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
      10:41:49 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
      10:41:49 AM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

      10:41:49 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
      10:41:54 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
      10:41:54 AM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

      10:41:54 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
      10:41:59 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
      10:41:59 AM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

      Average:        IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
      Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
      Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00


      Explanation of terms:

      FIELD      DESCRIPTION
      IFACE      LAN interface
      rxerr/s    Bad packets received per second
      txerr/s    Bad packets transmitted per second
      coll/s     Collisions per second
      rxdrop/s   Received packets dropped per second because buffers were full
      txdrop/s   Transmitted packets dropped per second because buffers were full
      txcarr/s   Carrier errors per second while transmitting packets
      rxfram/s   Frame alignment errors on received packets per second
      rxfifo/s   FIFO overrun errors per second on received packets
      txfifo/s   FIFO overrun errors per second on transmitted packets

      If the value of txerr/s or txcarr/s is non-zero, packets sent by this host are being dropped over the network and the host needs to retransmit them. By default, a host waits 200ms for an acknowledgment packet before retransmitting. This delay is significant for MarkLogic Server and will factor into overall cluster performance. You can use this as an indicator that it's time to upgrade (or debug) your network.
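
      As a rough illustration of the check described above, the following JavaScript sketch scans text captured from "sar -n EDEV" and reports any interface whose average txerr/s or txcarr/s is non-zero. The function name and the sample values are hypothetical; the column order is assumed to match the sample output shown earlier.

```javascript
// Hypothetical helper: find interfaces with transmit errors in `sar -n EDEV` output.
// Assumes the column order from the sample above:
// <label> IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s ...
function findTransmitErrors(sarText) {
  const flagged = [];
  for (const line of sarText.split("\n")) {
    const fields = line.trim().split(/\s+/);
    // Only inspect the "Average:" summary rows, skipping the header row.
    if (fields[0] !== "Average:" || fields[1] === "IFACE") continue;
    const iface = fields[1];
    const txerr = parseFloat(fields[3]);  // txerr/s column
    const txcarr = parseFloat(fields[7]); // txcarr/s column
    if (txerr > 0 || txcarr > 0) {
      flagged.push({ iface, txerr, txcarr });
    }
  }
  return flagged;
}

// Made-up sample in which eth0 shows transmit and carrier errors.
const sarSample = [
  "Average: IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s",
  "Average: lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00",
  "Average: eth0 0.00 0.40 0.00 0.00 0.00 0.20 0.00 0.00 0.00",
].join("\n");

console.log(findTransmitErrors(sarSample));
```

      An empty result means no transmit or carrier errors were averaged over the sampling window; any entry is a prompt to look more closely at that interface.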

      Other Considerations

      10 gigabit ethernet requires special cables.  These cables are expensive and easy to break.  If a cable is bent even slightly out of specification, you will not get 10 gigabit ethernet out of it. So be sure to work with your IT department to ensure that everything is installed as per the manufacturer's specification. Once installed, double-check that you are actually getting 10 gigabit throughput from the installed network.

      Another option is to use bonded ethernet to increase network bandwidth from 1 Gb to 2 Gb and then to 4 Gb before jumping to 10 Gb.  A description of bonded ethernet lies beyond the scope of this article, but your IT department should be familiar with it and be able to help you set it up.

       

      Introduction

      The performance and resource consumption of E-nodes is determined by the kind of queries executed in addition to the distribution and amount of data. For example, if there are 4 forests in the cluster and the query is asking for only the top-10 results, then the E-node would receive a total of 4 x 10 results in order to determine the top-10 among these 40. If there are 8 forests, then the E-node would have to sort through 8 x 10 results.
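
      The arithmetic above can be sketched in plain JavaScript. This is only an illustration of why E-node work grows with the number of forests, not a description of how MarkLogic merges results internally; the function name, document URIs and scores are all made up.

```javascript
// Hypothetical sketch of the merge step for a top-k query.
// Each forest contributes its own top-k list; the combined candidates are
// sorted and only the best k overall are kept.
function mergeTopK(perForestResults, k) {
  const candidates = perForestResults.flat();   // at most forests x k items
  candidates.sort((a, b) => b.score - a.score); // highest score first
  return { examined: candidates.length, top: candidates.slice(0, k) };
}

// Build 4 forests with 10 made-up scored results each.
const forestResults = Array.from({ length: 4 }, (_, f) =>
  Array.from({ length: 10 }, (_, i) => ({
    uri: `/forest${f}/doc${i}.xml`,
    score: (f * 7 + i * 13) % 101,
  }))
);

const merged = mergeTopK(forestResults, 10);
console.log(merged.examined);   // 40 candidates examined
console.log(merged.top.length); // 10 results returned
```

      With 8 forests the same call would examine 80 candidates to return the same 10 results.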

      Performance Test for Sizing E-Nodes:

      To size E-nodes, it’s best to determine first how much workload a single E-node can handle, and then scale up accordingly.

      Set up your performance test so that it runs at scale and talks to only a single E-node. Start with Application Server settings such as:

      • threads = 32
      • backlog = 512
      • keep alive = 0

      Crank up the number of threads for the test from low to high, and observe the amount of resources being used on the E-node (cpu, memory, network). Measure both response time and throughput during these tests.

      • When the number of threads is low, you should be getting the best response time. This is what the end user would experience when the site is not busy.
      • When the number of threads is high, you will see longer response times, but you should be getting more throughput.

      As you increase the number of threads, you will eventually run out of resources on the E-node - most likely memory. The idea is to identify the number of active threads when the system's memory is exceeded, because that is the maximum number of threads that your E-node can handle.

      Additional Tuning of E-nodes

      Thrashing

      • If you notice thrashing before MarkLogic is able to reach a memory consumption equilibrium, you will need to continue decreasing the threads so that the RAM/thread ratio is near the 'pmap total memory'/thread.
      • The backlog setting can be used to queue up requests without consuming significant resources.
      • Adjusting backlog along with some of the timeout settings might give a reasonable user experience comparable to, or even better than, what you may see with high thread counts. 

      As you continue to decrease the thread count and make other adjustments, the mean time to failure will likely increase until the settings are such that equilibrium is reached before all the memory resources are consumed - at which time we do not expect to see any additional memory failures.

      Swap, RAM & Cache for E-nodes

      • Make sure that the E-nodes have swap space equal to the size of RAM (if the node has less than 32GB of RAM) or 32 GB (if the node has 32GB or more of RAM)
      • For E-nodes, you can minimize the List Cache and Compressed Tree Cache  - set to 1GB each - in your group level configurations.
      • Your Expanded Tree Cache (group level parameter) should be at least equal to 1/8 of RAM, but you can further increase the Expanded Tree Cache so that all three caches (List, Compressed, Expanded) in combination are up to 1/3 of RAM.
      • Another important group configuration parameter is Expanded Tree Cache Partitions.  A good starting point is 2-3 GB per partition, but it should not be more than 12 GB per partition. The greater the number of partitions, the greater the capacity for handling concurrent query loads.
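
      The rules of thumb in the list above can be collected into a small calculator. This is a minimal sketch in plain JavaScript; the function and property names are hypothetical, sizes are in GB, and the choice of roughly 3 GB per partition is an assumption taken from the upper end of the suggested 2-3 GB starting range.

```javascript
// Hypothetical calculator for the E-node rules of thumb listed above (sizes in GB).
function suggestENodeSettings(ramGB) {
  const listCacheGB = 1;           // minimized on E-nodes
  const compressedTreeCacheGB = 1; // minimized on E-nodes
  const expandedTreeCacheMinGB = ramGB / 8; // at least 1/8 of RAM
  // All three caches together may add up to 1/3 of RAM.
  const expandedTreeCacheMaxGB = Math.max(
    expandedTreeCacheMinGB,
    ramGB / 3 - listCacheGB - compressedTreeCacheGB
  );
  return {
    swapGB: Math.min(ramGB, 32), // swap equals RAM, capped at 32 GB
    listCacheGB,
    compressedTreeCacheGB,
    expandedTreeCacheMinGB,
    expandedTreeCacheMaxGB,
    // Assumption: aim for no more than about 3 GB per partition,
    // based on the minimum Expanded Tree Cache size.
    expandedTreeCachePartitions: Math.max(1, Math.ceil(expandedTreeCacheMinGB / 3)),
  };
}

console.log(suggestENodeSettings(64));
```

      For a 64 GB host this suggests 32 GB of swap, an Expanded Tree Cache of at least 8 GB, and 3 partitions. Treat the output as a starting point for the performance tests described earlier, not as a final configuration.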

      Growing your Cluster

      As your application, data and usage change over time, it is important to periodically revisit your cluster sizing and re-run your performance tests.

       

      The recommended way to run MarkLogic on AWS is to use the "managed" Cloud Formation template provided by MarkLogic:

      https://developer.marklogic.com/products/cloud/aws

      The documentation for it is here:

      https://docs.marklogic.com/guide/ec2/CloudFormation

      By default, the MarkLogic nodes are hidden in Private Subnets of a VPC and the only way to access them from the Internet is via the Elastic Load Balancer.

      This is optimal as it distributes the load and shields the nodes from common attack vectors.

      However, for some types of maintenance it may be useful, or even necessary to SSH directly into individual MarkLogic nodes.

      Examples where this is necessary:

      1. Configuring Huge Pages size so that it is correct for the instance size/amount of RAM: https://help.marklogic.com/Knowledgebase/Article/View/420/0/group-level-cache-settings-based-on-ram

      2. Manual MarkLogic upgrade where a new AMI is not yet available (for example for emergency hotfix): https://help.marklogic.com/Knowledgebase/Article/View/561/0/manual-upgrade-for-marklogic-aws-ami

       


      To enable SSH access to MarkLogic nodes you need to:

      I. Create an intermediate EC2 host, commonly known as 'bastion' or 'jump' host.

      II. Put it in the correct VPC and the correct (public) subnet, and ensure that it has a public / Internet-facing IP address.

      III. Adjust security settings so that SSH connections to the bastion host, as well as SSH connections from the bastion to the MarkLogic nodes, are allowed, and launch the bastion instance.

      IV. Additionally, you will need to configure SSH key forwarding or a similar solution so that you don't need to store your private key on the bastion host.

      I. Creating the EC2 instance in AWS Console:

      1. The EC2 instance needs to be in the same region as the MarkLogic Cluster so the starting console URL will be something like this (depending on the region and your account):

      https://eu-west-1.console.aws.amazon.com/ec2/home?region=eu-west-1#LaunchInstanceWizard:

      2. The instance OS can be any Linux of your choice and the default Amazon Linux 2 AMI is fine for this. For most scenarios the jump host does not need to be powerful so any OS that is free tier eligible is recommended:

      [Image: Step1-AMI.png]

      3. Choose the instance size. For most scenarios (including SSH for admin access), the free tier t2.micro is the most cost-effective instance:

      [Image: Step2-Instance-type.png]

      4. Don't launch the instance just yet - go to Step 3 of the Launch Wizard ("Step 3: Configure Instance Details").

      II. Put the bastion host in the correct VPC and subnet and configure public IP:

      The crucial steps here are:

      1. Choose the same VPC that your cluster is in. You can find the correct VPC by reviewing the resources under the Cloud Formation template section of the AWS console or by checking the details of the MarkLogic EC2 nodes.

      2. Choose the correct subnet - you should navigate to the VPC section of the AWS Console, and see which of the subnets of the MarkLogic Cluster has an Internet Gateway in its route table.

      3. Ensure that "Auto-assign Public IP" setting is set to "enable" - this will automatically configure a number of AWS settings so that you won't have to assign Elastic IP, routing etc. manually.

      4. Ensure that you have sufficient IAM permissions to create the EC2 instance and update security rules (to allow SSH traffic).

      [Image: Step3-instance-details.png]

      III. Configure security settings so that SSH connections are allowed and launch:

      1. Go to "Step 6: Configure Security Group" of the AWS Launch Wizard. By default, AWS will suggest creating a "launch" security group that allows incoming SSH from any IP address. You can adjust this as necessary, for example to allow only a certain IP address range.

      [Image: Step6-security.png]

      Additionally, you may need to review the security group setting for your MarkLogic cluster so that SSH connections from bastion host are allowed.

      2. Go to "Step 7: Review Instance Launch" and press "Launch". At this step you need to choose a correct SSH key pair for the region or create a new one. You will need this SSH key to connect to the bastion host.

      [Image: ssh-keypair.png]

      3. Once the EC2 instance launches, review its details to find out the public IP address.

      [Image: instance-publicIP.png]

      IV. Configure SSH key forwarding so that you don't have to permanently store your private SSH key on the bastion host. Please review your options and alternatives here (for example, using ProxyCommand): agent forwarding makes your SSH agent reachable from the bastion host for the duration of your session, so anyone with root access to the bastion host could use your MarkLogic private key while you are logged in.

      1. Add the private key to the SSH agent (on macOS, the -K option also stores the passphrase in the Keychain; omit it on other systems):

      ssh-add -K myPrivateKey.pem

      2. Test the connection (with SSH agent forwarding) to the bastion host using:

      ssh -A ec2-user@<bastion-IP-address>

      3. Once you're connected, ssh from the bastion to a MarkLogic node:

      ssh ec2-user@<MarkLogic-instance-IP-address or DNS-entry>

      [Image: ssh-verify.png]

      For strictly AWS infrastructure issues (VPC, subnets, security groups) please contact AWS support. For any MarkLogic related issues please contact MarkLogic support via:

      help.marklogic.com

      Introduction

      This article discusses why MarkLogic Server should be started with root privileges.

      Details

      It is possible to install MarkLogic Server in a directory that does not require root privileges.

      There's also a section in our Installation Guide (Configuring MarkLogic Server on UNIX Systems to Run as a Non-daemon User) that talks at some length about how to run MarkLogic Server as a user other than daemon on UNIX systems. While that will allow you to configure permissions for non-root and non-daemon users in terms of file ownership and actual runtime, you'll still want to be the root user to start and stop the server.

      It is possible to start MarkLogic without superuser privileges, but this is strongly discouraged.

      The parent (root) MarkLogic process is simply a restarter process. It is there simply to wait for the non-root process to exit, and if the non-root process exits abnormally for some reason, the root process will fork and exec another non-root process. The root process runs no XQuery scripts, opens no sockets, and accesses no database files.

      We strongly recommend starting MarkLogic as root and letting it switch to the non-root user on its own. When the server initializes, if it is running as root, it makes some privileged kernel calls to configure sockets, memory, and threads. For example, it allocates huge pages if any are available, increases the number of file descriptors it can use, binds any configured low-numbered socket ports, and requests the capability to run some of its threads at high priority. MarkLogic Server will function if it isn’t started as root, but it will not perform as well.

      You can work around the root-user requirements for starting/stopping (and even installation/uninstallation) by creating wrapper scripts that call the appropriate script (startup, shutdown, etc.), providing sudo privileges to just the wrapper.  This helps to control and debug execution.

      Further reading

      Knowledgebase - Pitfalls Running Marklogic Process as Non-root User 

      Summary

      Quorum is used to decide whether to evict or keep a node in a cluster, but is quorum required even while starting a cluster?

      What is Quorum?

      Each node in a cluster communicates with all of the other nodes in the cluster at periodic intervals. This periodic communication, known as a heartbeat, circulates key information about host status and availability between the nodes in a cluster. The cluster uses the heartbeat to determine if a node in the cluster is unavailable. This determination is based on a vote from each node in the cluster, based on each node's view of the current state of the cluster. To vote a node out of the cluster, there must be a quorum of nodes voting to remove a node. A quorum occurs if more than 50% of the total number of nodes in the cluster (including any nodes that are down) vote the same way.

      Depending on cluster configuration, this quorum may or may not be required even during startup of a cluster.

      On a cluster without forest level failover configured, no quorum is required to bring up the Admin UI. If you bring up the server hosting the Security (Schemas and Modules) database, then you can access the Admin UI.

      On a cluster with shared disk failover configured, no quorum is required to bring up the Admin UI. If you bring up the server hosting the Security (Schemas and Modules) database, then you can access the Admin UI.

      On a cluster with local disk failover configured, a quorum is required prior to starting operations (e.g. accessing Admin UI). If you do not have quorum, then the MarkLogic admin will have to perform some intervention to bring up the required number of hosts. In case of a power outage, it is expected that all hosts will be powered up simultaneously. The server is designed to handle this well, so there is no need to serialize server startup and in fact we would prefer a simultaneous startup of all hosts in a cluster. If there is any reason for wanting to serialize server startup (such as not wanting to overwhelm the SAN), this is OK too, just be aware that normal cluster operation will start at the point where you have a quorum.

      Why do we need to achieve Quorum of more than 50%?  Understanding network partitioning, or the "split brain" problem

      For failover to occur, you must have a quorum of participant nodes (defined as "n/2 + 1", with n/2 rounded down). This is what protects you against any risk of network partitioning; if a node can't communicate with more than half the hosts in a cluster, it will be unable to tell whether it's on the losing side of a network partition.  If you were to try to put N hosts in one data center and N hosts in another data center, neither one would be able to determine that it is the surviving data center in the event of a network problem. If you were to try to create a cluster that spans multiple data centers, you'd want at least one more machine in a 3rd location that the two data centers would use to break the tie.
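
      The quorum rule and the two-data-center example can be worked through in a few lines of plain JavaScript. This is a sketch of the arithmetic only; the function names are hypothetical and nothing here calls MarkLogic.

```javascript
// Quorum as described above: more than half of all hosts configured in the
// cluster, counting hosts that are currently down.
function quorumSize(totalHosts) {
  return Math.floor(totalHosts / 2) + 1;
}

function hasQuorum(totalHosts, reachableHosts) {
  return reachableHosts >= quorumSize(totalHosts);
}

// Two data centers with 3 hosts each: after a network partition,
// each side can only see 3 of the 6 hosts.
console.log(quorumSize(6));   // 4
console.log(hasQuorum(6, 3)); // false

// Adding one tie-breaker host in a third location gives 7 hosts in total;
// whichever data center can still reach it sees 4 of 7.
console.log(quorumSize(7));   // 4
console.log(hasQuorum(7, 4)); // true
```

      With an even split, neither side reaches quorum, which is why the extra host in a third location is needed.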

      Read more on network partitioning at: https://en.wikipedia.org/wiki/Split-brain_(computing)

      Summary

      When attempting to send email from MarkLogic, from Ops Director, Query Console, or other application, you might encounter one of the following errors in your MarkLogic Server Error Log, or in the Query Console results pane.

      • Error sending mail: STARTTLS: 502 5.5.1 Error: command not implemented
      • Error sending mail: STARTTLS: 554 5.7.3 Unable to initialize security subsystem

      This article explains what these errors mean and provides some ways to resolve them.

      What these Errors Mean

      These errors indicate that MarkLogic is attempting to send an SMTPS email through the relay, and the relay either does not support SMTPS, or SMTPS has not been configured correctly.

      Resolving the Error

      One possible cause of this error is when the smtp relay setting for MarkLogic server is set to localhost.  The error can be resolved by using the Admin Interface to update the smtp relay setting with the organizational SMTP host or relay.  That setting can be found under Configure --> Groups --> [GroupName]: Configure tab, then search for 'smtp relay'.

      If this error occurs when testing the Email configuration for Ops Director, you can configure Ops Director to use SMTP instead of SMTPS by ensuring the Login and Password fields are blank.  These fields can be found under Console Settings --> Email Configuration in the Ops Director application.

      Alternatively, install/configure an SMTP server with SMTPS support.

      Related Reading

      https://en.wikipedia.org/wiki/SMTPS

      https://www.f5.com/services/resources/deployment-guides/smtp-servers-big-ip-v114-ltm-afm

      Introduction

      Stemming is handled differently between a word-query and value-query; a value-query only indexes using basic stemming.

      Discussion

      A word may have more than one stem. For example,

      cts:stem ('placing')

      returns

      place
      placing

      To see how this works with a word-query we can use xdmp:plan. Running

      xdmp:plan (cts:search (/, cts:word-query ('placing')))

      on a database with basic stemming returns

      <qry:final-plan>
        <qry:and-query>
          <qry:term-query weight="1">
            <qry:key>17061320528361807541</qry:key>
            <qry:annotation>word("placing")</qry:annotation>
          </qry:term-query>
        </qry:and-query>
      </qry:final-plan>

      Since basic stemming uses only the first/shortest stem, this is searching just for the stem 'place'.

      Searching with

      cts:search (/, cts:word-query ('placing'))

      will match 'a place of my own' ('placing' and 'place' both stem to 'place') but not 'new placings' ('placings' stems to just 'placing').

      However, on a database with advanced stemming the plan is

      <qry:final-plan>
        <qry:and-query>
          <qry:or-two-queries>
            <qry:term-query weight="1">
              <qry:key>17061320528361807541</qry:key>
              <qry:annotation>word("placing")</qry:annotation>
            </qry:term-query>
            <qry:term-query weight="1">
              <qry:key>17769756368104569500</qry:key>
              <qry:annotation>word("placing")</qry:annotation>
            </qry:term-query>
          </qry:or-two-queries>
        </qry:and-query>
      </qry:final-plan>

      Here you can see that there are two term queries OR-ed together (note the two different key values). The result is that the same cts:word-query('placing') now also matches 'new placings' because it queries using both stems for 'placing' ('place' and 'placing') and so matches the stemmed version of 'placings' ('placing').

      However, a search with

      cts:element-value-query(xs:QName('title'), 'new placing')

      returns

      <qry:final-plan>
        <qry:and-query>
          <qry:term-query weight="1">
            <qry:key>10377808623468699463</qry:key>
            <qry:annotation>element(title,value("new","placing"))</qry:annotation>
          </qry:term-query>
        </qry:and-query>
      </qry:final-plan>

      whether the database has basic or advanced stemming, showing that multiple stems are not used.

      The reason for this is that MarkLogic only does basic stemming when indexing the keys for a value, so there is a single key for each value.  If MarkLogic Server were designed to support multiple stems for values (which it is not), the indexes would expand dramatically, slowing down indexing, merging, and querying. If each word had two stems, there would be 2^N keys for a value of N words, and the number would grow even faster with additional stems per word.
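
      To make the 2^N growth concrete, the following plain JavaScript sketch enumerates every combination of stems for a multi-word value. It illustrates the counting argument only and does not reflect how MarkLogic builds index keys; the function name and the placeholder stems are made up.

```javascript
// Hypothetical illustration: list every stemmed form of a multi-word value
// when each word can have several stems.
function stemCombinations(stemsPerWord) {
  return stemsPerWord.reduce(
    (combos, stems) =>
      combos.flatMap((prefix) => stems.map((stem) => [...prefix, stem])),
    [[]] // start with a single empty combination
  );
}

// Placeholder stems: assume each of three words has exactly two stems.
const combos = stemCombinations([
  ["a1", "a2"],
  ["b1", "b2"],
  ["c1", "c2"],
]);

console.log(combos.length); // 8, i.e. 2^3
```

      A ten-word value under the same assumption would need 1024 keys instead of one.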

      More information on value-queries is available at Understanding Search: value queries.

       

      Summary

      When an SSL certificate is expired or out of date, it is necessary to renew the SSL certificates applied to a MarkLogic Application Server.   

      The following general steps are required to apply an SSL certificate.  

      1. Create a certificate request for a server in MarkLogic
      2. Download Certificate Request and send it to certificate authority
      3. Import signed certificate into MarkLogic

      Detailed Steps

      Before proceeding, please note that you don't need to create a new template to renew an expired certificate, as the existing template will work.

      1. Creating a certificate request - A fresh CSR can be generated from the MarkLogic Admin UI by navigating to Security -> Certificate Templates -> click [your_template] -> click the request tab -> Select the radio button applicable for an expired/out of date certificate. For additional information, refer to the Generating and Downloading Certificate Requests section of our Administrator's Guide.

      2. Download and Send to certificate authority - The certificate template Status page will display the newly generated request. You can download it and send it to your certificate authority for signing.

      3. Import signed certificate into MarkLogic - After receiving the signed certificate back from the certificate authority, you can import it from our Admin UI by navigating to Security -> Certificate Templates -> click [your_template] -> Import tab.  For additional information, refer to the Importing a Signed Certificate into MarkLogic Server section of our Administrator's Guide.

      4. Verify - To verify whether the certificate has been renewed, please look at the summary of your certificate authority. The newly added certificate should appear under that certificate authority. Detailed instructions for this are available in the Viewing Trusted Certificate Authorities section of our Administrator's Guide.

      If you are not able to view the certificate authority, then you may need to add the certificate as if it were a new CA. This can happen if there was a change in the CA certificate chain.

      • Click on the certificate template name and then import the certificate. You should already have this CA listed (as this was already there and only the certificate expired). However if there is a change in certificate authority then you will need to import it - you can do this by navigating in the Admin UI to Configure -> Security -> Certificate Authorities --> click on the import tab - this will be equivalent to adding a new CA certificate into MarkLogic. The CA certificate name will now appear in the list.

       

       

       

      Introduction

      This article will outline a general strategy for distributing a specific task across every node in a cluster.

      There are situations where you would like to execute queries against a number of hosts in a cluster - one such example would be to break a query down so it only operates on the forests on that particular node. Using the patterns described in this article, you will be able to build a mechanism to do just that.

      The problem

      Wouldn't it be useful if you could pass in options into xdmp:spawn() to allow the execution of code on a specific host in a cluster?

      While this has been filed as an RFE (2763) for consideration in a future release of the product, there are a few options open to you.

      From the top down

      1. Gather information about each host in your cluster

      For this you can use a call to xdmp:hosts(). This will give you a sequence of host ids, each corresponding to a node in your cluster. From here, you can pass each id to xdmp:host-name() to get the name of that host.

      2. Create a call to an http endpoint on each host in a cluster

      We can build on the steps outlined in the first part to generate a list of URIs, each mapping to an endpoint (which would be serviced by a corresponding XQuery module to perform a particular task on that host). For example, you can use fn:concat() to generate the links for each host and then issue a call to xdmp:document-get() to hit the same application server endpoint on each host.
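      The URL-building step can also be sketched from the client side. The following is a minimal Python illustration, not the article's original XQuery: the host names, the port 8010 and the /task.xqy endpoint path are all hypothetical placeholders for whatever your cluster and application server actually use.

```python
def endpoint_urls(host_names, port, path, scheme="http"):
    """Build one URL per host, all pointing at the same endpoint path."""
    return [f"{scheme}://{name}:{port}{path}" for name in host_names]


# Hypothetical host names, as might be returned by xdmp:host-name()
# for each id in xdmp:hosts().
hosts = ["host1.example.com", "host2.example.com", "host3.example.com"]

# Hypothetical application server port and module path.
urls = endpoint_urls(hosts, 8010, "/task.xqy")

for url in urls:
    print(url)
```

      Each resulting URL targets the same module on a different host, so issuing a request to every URL in the list runs the task once per node.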

      3. Isolate forests for a given host

      While the above technique might be useful for some purposes, you could allow for further precision by building a query which operates exclusively on the forests managed by that node. Building on the technique above, this variation allows you to "pre-screen" a database's forests so that the query only operates against forests on that host.

      Summary

      This KB article has introduced some fairly simple patterns to allow you to programmatically direct requests to a particular host in a cluster. It also demonstrates a technique for preparing queries to operate at individual forest level.

      Such techniques can be useful for performing administrative tasks on an individual host, auditing the contents of an individual forest (or group of forests) and allow for even more flexibility when you consider bulk processing tools such as CoRB and XQSync - both of which allow you to select documents based on a custom query (which could be restricted by passing in a sequence of one or more forest ids).

      Additionally, as you have the ability to target a specific host in executing a task, you could also use the above techniques to write out a specific properties file to a writable partition on your system (such as /tmp) using a call to xdmp:save().

      Introduction

      This article discusses version differences of temporal documents in MarkLogic Server.

      Details

      MarkLogic Server 8 does not store the "difference" for temporal documents.  Each version of the temporal document is a full document.

      At the time of this writing, MarkLogic Server does not provide any differencing tools to support diff/delta between versions of temporal documents.  It is possible to use tools external to MarkLogic Server to determine document differences.

       

       

      Introduction

      This article discusses how Temporal support interoperates with other MarkLogic features.

      Features that support Temporal collections

      MarkLogic’s Temporal feature is built into the server and is supported by many of MarkLogic’s power features: Search API, Semantics, Tiered Storage, and Flexible Replication. Temporal queries can be written in either JavaScript or XQuery.

      Collections

      How are collections used to implement Temporal documents?

      Temporality is defined on a protected collection, known as a temporal collection. When a document is inserted into a temporal collection, a URI collection is created for that document. Additionally, the latest version of each document will reside in a latest collection.

      Why are collections used to group all revisions of a particular document vs storing it in the properties?

      This was done to avoid unnecessary fragmentation, enhance performance, and make best use of existing infrastructure.

      Does the Temporal implementation use the collection lexicon or just collections?

      It uses only collections. The collection lexicon can be turned on and utilized for applications.

      Won’t Temporal collections also be in the collection lexicon if the lexicon is enabled?

      Yes.

      See also Temporal, URI, and Latest Collections.

      Timezones

      The Temporal axes are based on standard MarkLogic dateTime range indexes.

      All timezone information is handled in the standard way, as for any other dateTime range index in MarkLogic.

      DLS (Library Services API)

      Temporal and DLS are aimed at solving different sorts of problems, so they do not replace each other. They will coexist.

      Tiered Storage

      Temporal documents can be leveraged with our Tiered Storage capabilities.

      The typical use case is where companies need to store years of historical information for purposes such as:

      Compliance. Either internal or external auditing can occur (up to seven years based on Dodd-Frank Legislation). This data can be deployed on commodity hardware at lower cost, and can be remounted when needed.

      Analytics. Many years of historical information can be cheaply stored on commodity hardware to allow data scientists to perform analysis for future patterns and backtesting against previous assumptions.

      JSON/JavaScript

      Temporal documents work with XML/XQuery as well as JSON/JavaScript.

      Java/search/REST/Node API

      Temporal is supported by all of our existing server-side APIs.

      MLCP

      You can specify a Temporal collection with the -temporal_collection option in MLCP.

      Normal document management APIs (xdmp:*)

      By default, updating temporal documents with the normal document management APIs is not allowed and an error will be returned.  Normally the temporal:* API should be used.  For more information, see Managing and Updating Temporal Documents.

      Triples

      MarkLogic supports non-managed triples in a Temporal document.

      Introduction

      How do you find all versions of a temporal document?

      Details

      In MarkLogic Server, a temporal document is managed as a series of versioned documents in a protected temporal collection. In addition, each temporal document added creates another collection based on its URI, and all versions of the document will be in that collection.

      For example, if you have stored a temporal document at URI /orders/koolorder.xml then you can find all the versions of that document by using a collection query as

          cts:search (/, cts:collection-query ('/orders/koolorder.xml'))

      and the uris of all the versions of the document as

          cts:uris ((), (), cts:collection-query ('/orders/koolorder.xml'))

      Introduction

      Allen and ISO operators are comparison operators that can be used in temporal queries.

      Details

      Both operator sets are used to represent relations between two intervals.  ISO operators are more general and usually can be represented by a combination of Allen operators.  For example: iso_succeeds = aln_met_by || aln_after.

      Period Comparison Operators are discussed in more detail in Searching Temporal Documents.
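      The relationship iso_succeeds = aln_met_by || aln_after can be checked with a small worked example. This is a minimal Python sketch, not MarkLogic code: the Period type and the three functions are hypothetical, and the definitions assumed here are the usual Allen ones (x is "met by" y when x starts exactly where y ends, and x is "after" y when x starts later than y ends), with x "succeeding" y when x starts at or after the end of y.

```python
from collections import namedtuple

# A period is modelled as a start and an end point; plain integers
# stand in for dateTime values in this illustration.
Period = namedtuple("Period", ["start", "end"])


def aln_met_by(x, y):
    """Allen: x is met by y when x starts exactly where y ends."""
    return x.start == y.end


def aln_after(x, y):
    """Allen: x is after y when x starts later than y ends."""
    return x.start > y.end


def iso_succeeds(x, y):
    """ISO: x succeeds y when x starts at or after the end of y."""
    return x.start >= y.end


y = Period(10, 20)
candidates = [Period(20, 30), Period(25, 30), Period(15, 30), Period(0, 5)]

for x in candidates:
    combined = aln_met_by(x, y) or aln_after(x, y)
    print(x, iso_succeeds(x, y), combined)
```

      For every candidate period the ISO result and the combined Allen result agree, which is the sense in which the ISO operator is the more general of the two.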

       

      Introduction

      For terms stored in the index, the positions list tracks where they appear within the document. Positions are used to resolve queries where the distance between terms matters (for example, near queries where a term can appear n words away from another term or phrase within a given element or set of search criteria). There are a number of index options involving positions of document terms. When these indexes are enabled, MarkLogic will record positions in a positions list for each term in the universal index. When positions lists get large, MarkLogic may take a long time to load them from disk.  To minimize the impact of large positions lists, MarkLogic imposes a maximum size for these lists per term.

      MarkLogic 7 and above

      Each stand in a forest maintains its own index and its own positions list. The smaller the stands, the less likely you are to encounter the maximum positions list size for a term, as smaller stands are likely to result in smaller term lists.  A maximum stand size was introduced in MarkLogic 7.  By default, the maximum stand size restricts the size of individual stands to 32768 MB (32 GB).

      If you are running into the warning message "Termlist will discard positions at 256MB", you may need to manage your data and forests to ensure the index sizes remain manageable.

      The positions-list-max-size configuration parameter (default 256MB, maximum 512MB) sets the size at which a term list is considered too large and unwieldy.  For example, a 512MB term list would take roughly 26 seconds to load from disk at 20MB/sec, so increasing this value from the default may allow for a quick fix, but it is probably not the best change that could be made.

      In MarkLogic 7 and MarkLogic 8, with the default maximum stand size of 32 GB in place, we expect it to be less likely for new customers to run into the large positions list problem. For a 32GB stand, a single 256MB term list would take almost 1% of all the disk space taken by that stand, which is unlikely.

      Scenario: Understanding what messages to look for

      In the ErrorLog file, you may notice messages appearing at the "Info" level which look like:

      2016-04-13 03:02:17.951 Info: Termlist for X in Y is 151 MB; will discard positions at 256 MB

      This message is just letting you know that the term list is getting large and the limit is getting near for a particular stand ("Y"). The term list is managed by each stand in all your forests so for each stand, a maximum size is allowed (the default is 256MB). If the positions list starts to exceed this maximum size, positions will be discarded by MarkLogic for that database.

      Settings

      The value is set at the database level: 

      Configure -> Databases -> [Database] -> positions list max size

      This value can be increased by changing this database-level setting, although we do not recommend exceeding 512MB. The main reason for this is performance: larger positions lists take longer to load.

      Newer releases of MarkLogic Server set the maximum size of a given stand for new databases to a default size of 32GB (32768). This setting is governed by the Merge Policy for the database:

      Configure -> Databases ->[Database] -> Merge Policy -> merge max size

      As each stand maintains its own positions list, one way to ensure that you don't hit the maximum size is to ensure your on-disk stands are smaller. However, having more stands has a performance implication, as any query needs to traverse all stands in order to compute the result set of fragments that answers the query.

      In our example message above, the positions list is 151MB which is still a decent way off from the upper default per-stand limit of 256MB - so no immediate concern. 

      Steps to resolving the issue

      If it becomes necessary, in order to stay on the right side of the positions-list-max-size limit, you have two choices:

      1. You can modify the positions list max size to allow these lists to grow larger (remember that the recommended upper limit is 512MB). This is a single configuration change that is made at database level.

      2. You can modify the merge max size to ensure that each of your on-disk stands is smaller.

      Either approach will have a performance impact, so neither setting should be changed unless you really need to.

      Given the above example, if the largest value you are seeing is 151MB, there is still a decent amount of headroom for all the stands in this database.

      If you start to see the value getting closer to 256MB, the fastest resolution would be to increase the positions list max size to ensure that positions are not discarded and then to think about managing the maximum size of on-disk stands for your database.

      ErrorLog message escalation: understanding the risks

      If these messages only ever remain at Info level, they can be safely ignored. However, the log messages have been designed to escalate in severity so that you can watch for warning signs. The severity level of the logging will escalate twice:

      1. When you reach 2/3 of the discard threshold, these messages will appear at Notice level in the ErrorLogs.
      2. When you reach 3/4 of the discard threshold, these messages will appear at Warning level in the ErrorLogs.

      At the very least, keeping tabs on when these messages start to appear at Notice level should give you plenty of advance warning.

      Monitoring for Warning level messages should also catch this issue before it becomes a critical issue and starts to impact on search results.
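      The escalation rules above can be expressed as a short calculation. This is a minimal Python sketch of the thresholds as described in this article, not MarkLogic's own implementation: the function name is hypothetical, and treating a value exactly on a threshold as having reached it is an assumption.

```python
def positions_log_level(termlist_mb, max_size_mb=256):
    """Return the expected log level for a term list of the given size.

    Integer arithmetic is used so that the 2/3 and 3/4 comparisons
    are exact (no floating point rounding).
    """
    if termlist_mb * 4 >= max_size_mb * 3:   # reached 3/4 of the threshold
        return "Warning"
    if termlist_mb * 3 >= max_size_mb * 2:   # reached 2/3 of the threshold
        return "Notice"
    return "Info"


# The 151 MB term list from the example message, plus larger values,
# all measured against the default 256 MB discard threshold.
for size in (151, 171, 192):
    print(size, positions_log_level(size))
```

      With the default 256MB limit, Notice level starts at roughly 171MB and Warning level at 192MB, so the 151MB example message is still reported at Info level.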


      Introduction

      Binary documents often have various associated metadata. For example, an image may have metadata such as the time and place at which it was taken. MarkLogic Server offers the ability to extract this metadata from binary documents (e.g. images, MS Office and Adobe PDF) using XQuery built-in functions and conversion pipelines that use third-party software.

      The following article gives details about the security vulnerabilities reported for text extraction and MarkLogic releases containing the resolution.



      Details

      MarkLogic Server's built-in function xdmp:document-filter() allows you to extract metadata and text from binary documents as XHTML. Additionally, the server’s xdmp:pdf-convert() function and the Content Processing Framework (CPF) help convert HTML, Adobe PDF and Microsoft Office documents to XML.

      However, these mechanisms rely on third-party software, such as the "Argus PDF converter" from Iceni and "Perceptive Document Filters" from Lexmark, to extract text and metadata from a wide variety of document formats.

      Recently, both Iceni and Lexmark have issued security alerts for vulnerabilities in these products and have incorporated fixes into their most recent releases. They have published the following CVEs:

      For Iceni:

      • CVE-2016-8333 and CVE-2016-8335
        • An exploitable stack-based buffer overflow vulnerability

      The latest version of Iceni (v6.6.5) patches the security issues listed above.

      For Lexmark:

      • CVE-2016-5646
        • An exploitable heap overflow vulnerability exists in the Compound Binary Format (CBFF) parser functionality of the Lexmark Perceptive Document Filters Library.
      • CVE-2016-4336
        • An exploitable out of bounds write vulnerability exists in the Bzip2 parsing of the Perceptive Document Filters
      • CVE-2016-4335
        • An exploitable buffer overflow vulnerability exists in the XLS parsing of the Perceptive Document Filters conversion functionality

      These are considered to be vulnerabilities of "High" severity based on CVSS base scores in excess of 7.0.  A carefully crafted PDF, CBFF, Bzip2, or XLS file could be used to cause a buffer overflow, which can result in arbitrary code execution.

      The latest version of Lexmark Isys (v11.3) patches the security issues listed above.

       

      Resolution

      MarkLogic has issued an update which includes these fixes.

      The latest releases of MarkLogic Server versions 7 (7.0-6.8) and 8 (8.0-6), which incorporate the latest fixes for Iceni and Lexmark Isys, are available for download from our Community website.



      References

      • For more information on the Lexmark security issues, see

      http://support.lexmark.com/index?page=content&id=TE811&modifiedDate=08/26/16&userlocale=EN_US&locale=en

      • Further details on Iceni issues can be found at:

      https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2016-8333

      https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2016-8335

       

       

      Timezone information and MarkLogic

      Summary

      This article discusses the effect of the implicit timezone on date/time values as indexed and retrieved.

      Discussion

      Timezone information and indexes

      Values are stored in the index effectively in UTC, without any timezone information. When indexed, the value is adjusted to UTC from either the explicit timezone of the data or implicitly from the host timezone, and then the timezone is forgotten. The index data does not save information regarding the source timezone.

      When queried, values from the index are adjusted to the timezone specified in the query, or to the host's implicit timezone if none is specified.

      Therefore, dates and times in the implicit timezone do what would be expected in calculations, unless you have a particular reason for actually knowing the offset from UTC.


      Implicit timezone

      The definition of an implicit timezone is given at https://www.w3.org/TR/xpath20/#dt-timezone.

      The MarkLogic host implicit timezone comes into play when the document is indexed and when values are returned from the indexes.

      fn:implicit-timezone() can be used to show the implicit timezone for a host.


      Changing implicit timezone

      If you change the implicit timezone without reindexing, the implicit timezone at indexing time will differ from the implicit timezone at query time. Values that were indexed using the implicit timezone are then "wrong", in that they were adjusted to UTC using a different offset from the one now applied at query time.

      If you specify a timezone for the data when it is indexed and when it is queried, the implicit timezone will not be a factor.


      Examples

      First we create a dateTime element range index on element <dt>, then insert a document without timezone information:

      xdmp:document-insert ('/test.xml', <doc><dt>2018-01-01T12:00:00</dt></doc>)

      Using a server located in New York (timezone now -05:00), retrieving the value from the index via

      cts:element-values (xs:QName ('dt'), ())

      gives

      2018-01-01T12:00:00

      showing that the implicit timezone works as described above. To see the value stored in the index (as adjusted to UTC) you can specify the timezone on the value retrieved:

      cts:element-values (xs:QName ('dt'), (), 'timezone=+00:00')

      returns

      2018-01-01T17:00:00Z

      so 2018-01-01T17:00:00 is the value coming from the index.

      When the implicit timezone is -5 hours then the call without a timezone returns 12:00. However, if the implicit timezone changed, then the value returned for the query without a timezone would also change, even though the value stored in the index has not changed.
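      The same arithmetic can be reproduced outside MarkLogic. This is a minimal Python sketch of the adjustment described above, not MarkLogic code: the two function names are hypothetical, and the -08:00 offset is an assumed example of a changed implicit timezone.

```python
from datetime import datetime, timedelta, timezone


def to_index_value(local_value, implicit_offset_hours):
    """Adjust a dateTime that has no timezone to UTC, as at indexing time."""
    implicit_tz = timezone(timedelta(hours=implicit_offset_hours))
    return local_value.replace(tzinfo=implicit_tz).astimezone(timezone.utc)


def from_index_value(index_value, query_offset_hours):
    """Adjust a UTC index value to the timezone in effect at query time."""
    query_tz = timezone(timedelta(hours=query_offset_hours))
    return index_value.astimezone(query_tz).replace(tzinfo=None)


inserted = datetime(2018, 1, 1, 12, 0, 0)     # value in the document, no timezone
stored = to_index_value(inserted, -5)         # indexed on a host at -05:00
same_tz = from_index_value(stored, -5)        # queried with implicit timezone -05:00
changed_tz = from_index_value(stored, -8)     # queried after a change to -08:00

print(stored.isoformat())
print(same_tz.isoformat())
print(changed_tz.isoformat())
```

      The stored value stays at 17:00 UTC in both cases; only the value handed back to the query moves when the implicit timezone changes.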

      Introduction

      XQuery modules can be imported from other XQuery modules in MarkLogic Server. This article describes how modules are resolved in MarkLogic when they are imported in XQuery.

      Details

      How modules are imported in code

      Modules can be imported using two approaches.

      By providing a relative path:

      import module namespace m = "http://example.edu/example" at "example.xqy";

      Or by providing an absolute path:

      import module namespace m = "http://example.edu/example" at "/example.xqy";

       

      How MarkLogic resolves the path and loads the module

      If the path starts with a slash, it is a non-relative path and MarkLogic takes it as is. If it does not, it is a relative path, and it is first resolved relative to the URI of the current module to obtain a non-relative path.
       
      Path in hand, MarkLogic always starts by looking in the Modules directory. This is done for security reasons, as we want to make sure that the modules created by MarkLogic are the ones chosen. In general, users should NOT be putting their modules there. It creates issues on upgrade, and if they open up permissions on the directory to ease deployment, it creates a security hole.
       
      Then, depending on whether the appserver is configured to use a modules database or the filesystem, we interpret the non-relative path in terms of the appserver root either on the file system or in the Modules database. 
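      The resolution order described above can be sketched as follows. This is a minimal Python illustration of the described behaviour, not MarkLogic's implementation: both function names are hypothetical, and the /opt/MarkLogic/Modules and /myapp/ locations are assumed example values for the Modules directory and the app server root.

```python
import posixpath


def resolve_import_path(at_hint, importing_module_uri):
    """Turn the 'at' hint of an import into a non-relative path."""
    if at_hint.startswith("/"):
        # Non-relative path: taken as is.
        return at_hint
    # Relative path: resolved against the URI of the importing module.
    base_dir = posixpath.dirname(importing_module_uri)
    return posixpath.normpath(posixpath.join(base_dir, at_hint))


def candidate_locations(path, modules_dir, appserver_root):
    """List the places to look, in order: Modules directory first, then the root."""
    return [modules_dir.rstrip("/") + path, appserver_root.rstrip("/") + path]


relative = resolve_import_path("example.xqy", "/app/lib/main.xqy")
absolute = resolve_import_path("/example.xqy", "/app/lib/main.xqy")

print(relative)
print(absolute)
print(candidate_locations(absolute, "/opt/MarkLogic/Modules", "/myapp/"))
```

      In the second lookup step, the root is interpreted either on the file system or in the modules database, depending on how the app server is configured.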

       

      Debugging module path issue

      To debug this, you can enable the module caching trace events, which show how the paths are being resolved. Enter "module" as the name of the event in Diagnostics > Events and you should see a list of module caching events added. These will give you the working details of how module resolution is happening, and should provide enough information to resolve the issue.

      Be aware that diagnostic traces can fill up your ErrorLog.txt file very fast, so be sure to turn them off as soon as you no longer need them.

       

      Performance Hints

      1. Be sure that your code does not rely on dynamically-created modules. Although these may be convenient at times, they will make overall performance suffer, because every time a module changes, the internal modules cache is invalidated and must be reloaded from scratch.

      2. If you are noticing a lot of XDMP-DEADLOCK messages in your log, be sure your modules are not mixing any update statements within what should be a read-only query. The XQuery parser looks for updates anywhere in the modules stack, including imports, and if it finds one, it assumes that any URI that is gathered by the queries might potentially be updated. Thus, if the query matches 10 URIs, it will put a write lock on them, and if it matches 100000 URIs, it will lock all of them as well, and performance will suffer. To prevent this, be sure to isolate updates in their own transactions via xdmp:eval() or xdmp:spawn().

       

       

      Summary

      There are a number of options for transferring data between MarkLogic Server clusters. The best option for your particular circumstances will depend on your use case.

      Details

      Database Backup and Restore

      To transfer the data between two independent clusters, you may use a database backup and restore procedure, taking advantage of MarkLogic Server's facility to make a consistent backup of a database.

      Note: the backup directory path that you use must exist on all hosts that serve any forests in the database. The directory you specify can be an operating system mounted directory path, it can be an HDFS path, or it can be an S3 path. Further information on using HDFS and S3 storage with MarkLogic is available in our documentation:

      Further information regarding backup and restore may be found in our documentation and Knowledgebase:

      Database Replication

      Database Replication is another method you might choose to use to transfer content between environments. Database Replication will allow you to maintain copies of forests on databases in multiple MarkLogic Server clusters. Once the replica database in the replica cluster is fully synchronized with its master, you may break replication between the two and then go on to use the replica cluster/database as the master.

      Note: to enable Database Replication, a license key that includes Database Replication is required. You also need to ensure that all hosts are running the same maintenance release of MarkLogic Server and using the same type of operating system, and that Database Replication is correctly configured.

      Also note that for optimum efficiency, indexing information is not replicated over the network between the Master and Replica databases and is instead regenerated by the Replica database. The following Knowledgebase article contains further information on this:

      Further details on Database Replication and how it can be configured, may be found in our documentation:

      MarkLogic Content Pump (mlcp)

      Depending on your specific requirements, you may also like to make use of the MarkLogic Content Pump (mlcp), which is a command line tool for getting data out of and into a MarkLogic Server database. Using mlcp, you can export documents and metadata from a database, import documents and metadata to a database, or copy documents and metadata from one database to another.

      If required, you may use mlcp to extract a consistent database snapshot, forcing all documents to be read from the database at a consistent point in time:

      Note: the version of mlcp you use should be the same as the most recent version of MarkLogic Server that will be used in the transfer.

      Also note that mlcp should not be run on a host that is currently running MarkLogic Server, as the Server assumes it has the entire machine available to it, including the CPU and disk I/O capacity.

      Further information regarding mlcp is available in our documentation:

      Further Information

      Related Knowledgebase articles that you may also find useful:

      Problem Statement

      You have an application running on a particular cluster (the source cluster), devcluster, and you wish to port that application to a new cluster (the target cluster), testcluster. Porting the application can be divided into two tasks: configuring the target cluster and copying the code and data. This article is only about porting the configuration.

      In an ideal world, the application is managed in an "infrastructure as code" manner: all of the configuration information about that cluster is codified in scripts and payloads stored in version control and able to be "replayed" at will. (One way to assure that this is the case is to configure testing for the application in a CI environment that begins by using the deployment scripts to configure the cluster.)

      But in the real world, it's all too common for some amount of "tinkering" to have been performed in the Admin UI or via ad hoc calls to the REST Management API (RMA). And even if that hasn't happened, it's not generally possible to be certain that's the case, so you still have to worry that it might have happened.

      Migrating the application

      The central theme in doing this "by hand" is that RMA payloads are re-playable. That is, the payload you GET for the properties of a resource is the same as the payload that you PUT to update the properties of that resource.

      If you were going to migrate an application by hand, you'd proceed along these lines.

      Determine what needs to be migrated

      An application consists (more or less by definition) of one or more application servers. Application servers have databases associated with them (those databases may have additional database associations). Databases have forests.

      A sufficiently complex application might have application servers divided into different groups of hosts.

      Applications may also have users (for example, each application server has a default user; often, but not always, "nobody").

      Users, in turn, have roles, and roles may have roles and privileges. Code may have amps that use privileges.

      That covers most of the bases, but beware that apps can have additional configuration that should be reviewed: security artifacts (certificates, external securities, protected paths or collections, etc.), mime types, etc.

      Get Source Configuration

      Using RMA, you can get the properties of all of these resources:

      • Application servers

        Hypothetically, the App-Services application server.

      curl --anyauth -u admin:admin \
         "http://localhost:8002/manage/v2/servers/App-Services/properties?group-id=Default"
      
      • Groups

        Hypothetically, the Default group.

      curl --anyauth -u admin:admin \
         http://localhost:8002/manage/v2/groups/Default/properties
      
      • Databases

        Hypothetically, the Documents database.

      curl --anyauth -u admin:admin \
         http://localhost:8002/manage/v2/databases/Documents/properties
      
      • Users

        Hypothetically, the ndw user.

      curl --anyauth -u admin:admin \
         http://localhost:8002/manage/v2/users/ndw/properties
      
      • Roles

        Hypothetically, the app-admin role.

      curl --anyauth -u admin:admin \
         http://localhost:8002/manage/v2/roles/app-admin/properties
      
      • Privileges

        Hypothetically, the app-writer execute privilege.

      curl --anyauth -u admin:admin \
         "http://localhost:8002/manage/v2/privileges/app-writer/properties?kind=execute"
      

      And the create-document URI privilege.

      curl --anyauth -u admin:admin \
         "http://localhost:8002/manage/v2/privileges/create-document/properties?kind=uri"
      
      • Amps

        Hypothetically, my-amped-function in /foo.xqy in the Modules
        database using the namespace http://example.com/.

      curl --anyauth -u admin:admin \
         "http://localhost:8002/manage/v2/amps/my-amped-function/properties?modules-database=Modules&document-uri=/foo.xqy&namespace=http://example.com"
      

      Create Target Configuration

      Some of the properties of a MarkLogic resource may be references to other resources. For example, an application server refers to databases and a role can refer to a privilege. Consequently, if you just attempt to POST all of the property payloads, you may not succeed. The references can, in fact, be circular so that no sequence will succeed.

      The easiest way to get around this problem is to simply create all of the resources using minimal configurations: Create the forests (make sure you put them on the right hosts and configure them appropriately). Create the databases, application servers, roles, and privileges. Create the amps. If you need to create other resources (security artifacts, mime types, etc.) create those.

      Finally, PUT the property payloads you collected from the source cluster onto the target cluster. This will update the properties of each application server, database, etc. to be the same as the source cluster.
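      The GET-then-PUT replay can be scripted. This is a minimal Python sketch using only the standard library, under several assumptions: the function names are hypothetical, the cluster names are used as host names, port 8002 and the admin credentials are copied from the curl examples above, and JSON payloads are requested through the Accept header. The target resource must already exist, as described in the previous step.

```python
import json
import urllib.parse
import urllib.request


def properties_url(base_url, resource_type, resource_name, params=None):
    """Build the properties URL for a resource, e.g. databases/Documents."""
    url = f"{base_url}/manage/v2/{resource_type}/{resource_name}/properties"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def make_opener(base_url, user, password):
    """Opener that answers basic or digest challenges (like curl --anyauth)."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, base_url, user, password)
    return urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(mgr),
        urllib.request.HTTPDigestAuthHandler(mgr),
    )


def get_properties(opener, url):
    """GET the properties payload of a resource as a Python dict."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with opener.open(req) as resp:
        return json.load(resp)


def put_properties(opener, url, payload):
    """PUT a previously collected payload back to the same kind of URL."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with opener.open(req) as resp:
        return resp.status


def replay(source_base, target_base, resource_type, resource_name,
           user, password, params=None):
    """Copy one resource's properties from the source to the target cluster."""
    source_url = properties_url(source_base, resource_type, resource_name, params)
    target_url = properties_url(target_base, resource_type, resource_name, params)
    payload = get_properties(make_opener(source_base, user, password), source_url)
    return put_properties(make_opener(target_base, user, password), target_url, payload)
```

      A call such as replay("http://devcluster:8002", "http://testcluster:8002", "databases", "Documents", "admin", "admin") would then copy the Documents database properties across. For application servers, pass params={"group-id": "Default"} so that the group is identified, as in the curl example.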

      Related Reading

      MarkLogic Documentation - Scripting Cluster Management

      MarkLogic Knowledgebase - Transferring data between MarkLogic Server clusters

      MarkLogic Knowledgebase - Best Practices for exporting and importing data in bulk

      MarkLogic Knowledgebase - Deployment and Continuous Integration Tools

      Summary:

      MarkLogic allows SSL certificates to be used to secure application servers.  This article explains some common issues seen when importing certificates, as well as methods to troubleshoot problems.

      Importing a certificate into MarkLogic:

      The general procedure for creating and importing a certificate into MarkLogic can be found in the docs here:  http://docs.marklogic.com/guide/admin/SSL#id_42684

      For a certificate to be successfully imported, the public key of the signed certificate must match a public key contained in the Certificate Template.  MarkLogic will create a new public/private key pair for each Certificate Request that is generated within a Certificate Template.

      Troubleshooting:

      If you are having an issue where MarkLogic is not accepting the signed certificate you should first verify that your certificate is in PEM format.  If this is not the case, you can use openssl to convert your format to PEM.  Below are examples of how to convert between various formats using openssl.

      Convert a DER file to PEM: openssl x509 -inform der -in certificate.cer -out certificate.pem

      Convert a P7B file to PEM: openssl pkcs7 -print_certs -in certificate.p7b -out certificate.cer

      Convert a PKCS#12 file to PEM: openssl pkcs12 -in keyStore.pfx -out keyStore.pem -nodes
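
      Before converting, it can help to confirm whether a certificate file is already PEM. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and it handles only a single DER or PEM X.509 certificate (not P7B or PKCS#12 bundles, which still need the openssl commands above).

```python
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def is_pem_certificate(data: bytes) -> bool:
    """Return True when the bytes look like a PEM-encoded certificate."""
    return data.lstrip().startswith(PEM_HEADER.encode("ascii"))


def ensure_pem(data: bytes) -> str:
    """Return PEM text, wrapping the bytes as PEM when they are not already."""
    if is_pem_certificate(data):
        return data.decode("ascii")
    # Anything else is assumed to be a DER certificate; the bytes are
    # base64-encoded and wrapped without being validated.
    return ssl.DER_cert_to_PEM_cert(data)
```

      Typical use would be to read the file in binary mode and pass its contents to ensure_pem before pasting the result into the Admin UI.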

      If you are still experiencing issues when attempting to import a signed certificate, you should ensure that the public keys for the certificate request and signed certificate match.  This public key should also match with the key contained in the certificate template.

      Use the following commands to extract the public key from the certificate request and signed certificate.

      Certificate Request: openssl req -in request.csr -pubkey

      Signed Certificate: openssl x509 -in certificate.crt -pubkey
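
      The two outputs can be compared by eye, but a small script avoids mistakes with long base64 strings. This sketch assumes each command's output has been captured as text and contains a standard "BEGIN PUBLIC KEY" block; the function names are illustrative.

```python
import re

_PUBLIC_KEY = re.compile(
    r"-----BEGIN PUBLIC KEY-----(.*?)-----END PUBLIC KEY-----", re.DOTALL
)


def public_key_body(text: str) -> str:
    """Extract the base64 body of the first public key block, without whitespace."""
    match = _PUBLIC_KEY.search(text)
    if match is None:
        raise ValueError("no PUBLIC KEY block found")
    return "".join(match.group(1).split())


def same_public_key(request_output: str, certificate_output: str) -> bool:
    """Compare the public keys printed for the request and the signed certificate."""
    return public_key_body(request_output) == public_key_body(certificate_output)
```

      Whitespace and line wrapping are ignored, so differences in how each tool wraps the base64 text do not cause a false mismatch.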

      To obtain the public keys stored for the certificate template, use the following XQuery script.  Note that this script must be run against the Security database by a user with admin rights.  The output will also display private key information.  If you need to provide the output to Support, please remove all data in the <pki:private-key> elements.

      xquery version "1.0-ml";
      import module namespace pki = "http://marklogic.com/xdmp/pki"
        at "/MarkLogic/pki.xqy";

      (: Look up the ID of the certificate template by its name :)
      let $template-id := pki:template-get-id(pki:get-template-by-name("INSERT-TEMPLATE-NAME"))

      (: Return every document in the Security database that references this template :)
      return
        cts:search(fn:doc(),
          cts:element-value-query(xs:QName("pki:template-id"), fn:string($template-id), "exact"))

      The output of this script will contain various <pki:public-key> elements.  One of these public keys needs to match with the public key contained in your signed certificate.

      Summary 

      This article is intended to help investigate certain Kerberos External Authentication issues. Because most Kerberos authentication problems require significant IT involvement, below are a few areas we recommend investigating before involving IT.

      Keytab file location and permission

      MarkLogic Server requires a keytab file with the specific name "services.keytab" at the specified location within the MarkLogic Data directory.

      Note: The Permissions on the keytab must not be World or Group readable.

      [Location] $ pwd
      /var/opt/MarkLogic
      [Permission & Owner] $ ls -alt services.keytab
      -rw------- 1 daemon daemon 86 May  4 09:51 services.keytab
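
      The same checks can be scripted. The sketch below is one possible way to verify the file name and permissions on a Linux host; the function name is illustrative, and it does not check file ownership.

```python
import os
import stat


def keytab_problems(path):
    """Return a list of problems found with the keytab's name or permissions."""
    problems = []
    if os.path.basename(path) != "services.keytab":
        problems.append("file is not named services.keytab")
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # The keytab must not be readable by group or world.
    if mode & (stat.S_IRGRP | stat.S_IROTH):
        problems.append(f"mode {mode:o} is group or world readable")
    return problems
```

      For example, keytab_problems("/var/opt/MarkLogic/services.keytab") returns an empty list when the file is set up as shown above.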

      Sample krb5.conf Configuration file 

      A Kerberos configuration file is essential to the Kerberos handshake. Below is a sample krb5.conf file for reference.

      $ cat /etc/krb5.conf
      [logging]
      default = FILE:/var/log/krb5libs.log
      kdc = FILE:/var/log/krb5kdc.log
      admin_server = FILE:/var/log/kadmind.log
       
      [libdefaults]
      default_realm = MLTEST1.LOCAL
      dns_lookup_realm = true
      dns_lookup_kdc = false
      ticket_lifetime = 24h
      renew_lifetime = 7d
      forwardable = true
       
      [realms]
      MLTEST1.LOCAL = {
         kdc = srv-202-1-vm1.colo.marklogic.com
         admin_server = srv-202-1-vm1.colo.marklogic.com
      }
      [domain_realm]
      .marklogic.com = MLTEST1.LOCAL
      marklogic.com = MLTEST1.LOCAL

       

      Configuring Client Browser to utilize Kerberos authentication 

      Most web browsers are not configured by default to use Kerberos authentication with a web server. Making sure the browser is properly configured for the Kerberos handshake eliminates one more suspect during troubleshooting. The Microsoft blog post below details browser configuration with respect to Kerberos:

      http://blogs.msdn.com/b/friis/archive/2009/12/31/things-to-check-when-kerberos-authentication-fails-using-iis-ie.aspx

      Browser Login Dialog Username

      When a web browser attempts to connect to a Kerberos-enabled web server, the browser presents a login dialog box to the user. The Kerberos handshake expects the user to provide the complete domain/realm along with the username during login.

      Example - UserName : "test1@MLTEST1.LOCAL"

      Case Sensitivity of Kerberos

      The Kerberos username and domain/realm are case sensitive and must match the domain/realm configured in krb5.conf. Incorrect case anywhere in the complete username (including the realm) can lead to an error with limited debugging information.
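
      A quick way to rule out a case mismatch is to compare the realm typed at login with the one in krb5.conf. This is a simplified sketch: it only looks at default_realm, does not consult the [realms] or [domain_realm] sections, and the function names are illustrative.

```python
import re


def default_realm(krb5_conf_text):
    """Return the default_realm value from krb5.conf text, or None if absent."""
    match = re.search(r"^\s*default_realm\s*=\s*(\S+)", krb5_conf_text, re.MULTILINE)
    return match.group(1) if match else None


def realm_matches(principal, krb5_conf_text):
    """Check that user@REALM carries exactly the configured realm (case sensitive)."""
    user, separator, realm = principal.rpartition("@")
    if not separator or not user:
        return False  # the realm is missing from the login name
    return realm == default_realm(krb5_conf_text)
```

      With the sample configuration above, "test1@MLTEST1.LOCAL" passes the check, while "test1@mltest1.local" and a bare "test1" do not.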

      MarkLogic Trace Events

      You can enable the Kerberos trace event as described below and then run the Kerberos login test again so that the ErrorLog captures the trace events, which can provide more information on the Kerberos handshake between MarkLogic and the Kerberos server.


      Add the "Kerberos GSS Negotiate" trace event in the Admin UI by navigating to -> Configure -> Groups -> {group-name} -> Diagnostics -> trace events activated = true; then Add "Kerberos GSS Negotiate"; press the “ok” button.  

      A list of other potential issues and troubleshooting techniques (a well-compiled third-party source):

      https://technet.microsoft.com/en-us/library/bb463167.aspx 

       

       

      Introduction

      With the introduction of Certificate Based Authentication in MarkLogic 9, users can now log into MarkLogic without entering user/password credentials.

      Configuring a MarkLogic AppServer to support TLS Client Certificate Authentication is a little more complex than simple SSL Server based authentication and it may not always be apparent why connections are not working once configuration is completed.

      This Knowledgebase article demonstrates some simple debugging techniques that should help to track down and identify issues encountered with Certificate Based authentication where things are not working as expected.

      What is the difference between Client and Server based authentication?

      Before starting down the path of troubleshooting it's worth ensuring that we understand what the differences are between TLS Server based authentication and TLS Client Authentication:

      With a standard HTTPS connection to a TLS-enabled Application Server, MarkLogic server will send a copy of its X509 Certificate to the client who will then verify the certificate against a list of known Trusted Root certificates installed within the browser or a Java KeyStore for a Java based application (such as MLCP).

      When TLS Client Authentication is enabled in MarkLogic for Certificate Based authentication, as well as sending a certificate to the client, MarkLogic Server will request that the client sends a certificate back to the server.

      The certificate returned by the client is then used to determine which Internal or External user is used within MarkLogic.

      How does the client know which certificate to send?

      A web browser can often have multiple client certificates installed so how does it know which certificate to present to MarkLogic Server?

      The certificate(s) that the Client can use are controlled by MarkLogic Server's application server settings.  Using the Admin GUI on port 8001, during configuration for Certificate Based authentication, you can specify that a client certificate is required (Configure > Groups > [Your Group Name] > App Servers > [Your App Server Name] > ssl require client certificate : true) and you can also select one or more Certificate Authorities under the ssl client certificate authorities section.

      Only Client certificates issued by one of these authorities will be permitted.

      If a browser has multiple Client certificates issued by one of the selected Certificate Authorities, the user will be prompted to select the appropriate client certificate to use.

      Note: In this case it is important to select the certificate that has been issued to you for use within MarkLogic.

      To verify that you have a valid certificate, you can use a local system tool such as Keychain Access (in Mac OS X) to check that the Issuer Name details for your client certificate match those of the Certificate Authorities configured in the MarkLogic Application Server settings as per the example above.

      Alternatively, if you have a PEM representation of your user certificate you can use the OpenSSL utility to display the Issuer information, e.g.

      >$ openssl x509 -in user1.pem -issuer -noout
      issuer= /O=MarkLogic/OU=Support/CN=RootCA

      Verifying the TLS Handshake

      The first stage of MarkLogic Certificate Based authentication requires a successful TLS Handshake to take place between the Client and MarkLogic Server.

      If the TLS Handshake fails at any stage, the session will be rejected.

      Recommendation: While it is not required, it is highly recommended that you have a working HTTPS Application Server configuration (using basic authentication) before enabling certificate based authentication.

      This will ensure you have a valid TLS Server configuration before you enable TLS Client Authentication and should reduce the amount of troubleshooting required.

      The easiest way to view what is taking place during the TLS Handshake is at the TCP packet level using a tool such as Wireshark which has built-in support for decoding the TLS protocol.

      If Wireshark can easily be installed on a Client machine, it can be configured to capture TCP traffic to/from the MarkLogic AppServer port, for example by using the capture filter "port 8010" to capture all traffic on port 8010.

      If it is not possible to install Wireshark on the Client machine, the same information can be captured on the MarkLogic Server using the tcpdump utility.  From there you can create a pcap file on a given port (in this example: 8010), by running the following command:

      tcpdump -i any -s 0 -w certauth.pcap port 8010

      Once you have run tcpdump long enough to have captured the failing transaction, you can attach the resulting pcap file to a MarkLogic support ticket for further analysis.

      If you are able to view the packet trace in Wireshark, you will first need to locate and select the packet where MarkLogic Server sends the Certificate Request

      In the Frame details panel, you can drill down to the list of Distinguished Names and check that there is an entry for the Certificate Authority configured in MarkLogic

      Having ensured that MarkLogic Server is sending a request for the correct client certificate you should locate the subsequent Certificate response being sent by the client

      In the Frame details panel, the first thing to check is whether a Certificate was actually returned by the client. If no Certificate was found by the client that satisfied the MarkLogic Server request, then a Zero Certificate (Certificates Length: 0) is returned

      If a Client sends a Zero Certificate the session will be terminated immediately with a TLS Handshake Failure.

      In this case you should check that you have correctly installed a client certificate that was issued by the Certificate Authorities configured in MarkLogic for this Application Server.

      If a valid certificate was found by the Client you will see the necessary information within the Frame details panel in Wireshark

      If the session is terminated at this point with a TLS Handshake Failure, the most likely cause is that MarkLogic Server was unable to verify the client certificate against a valid chain of root certificates.

      This will typically occur if the Issuing Certificate Authority configured in MarkLogic is part of a chain of Root certificates, often referred to as an Intermediate CA certificate. In this case you should check that all CA certificates in the chain have been installed to the Trusted Store in the MarkLogic Security database.

      If no TLS Handshake Failure occurs, you should see a pair of Encrypted Handshake Message packets which indicate that secure encryption has been enabled by both the client and the server and the TLS Handshake has successfully completed

      I still get a 401 Unauthorized error

      Having first established that TLS Client Authentication has completed successfully, if you still get a 401 Unauthorized error, the likely cause is that MarkLogic has not been able to map the supplied Client certificate to either an Internal or External User.

      The first check that MarkLogic Server will make is to look for an Internal User that matches the Common Name (CN) in the supplied Client certificate.

      Check the userid specified in the certificate Common Name, then check that a corresponding Internal MarkLogic userid exists in your Security database: in the Admin GUI, look for a matching user name (Configure > Security > Users > [your user name])

      Alternatively, if no internal user matches, MarkLogic will attempt to use the full Subject Distinguished Name in the Client certificate to map to an external security name within a previously defined MarkLogic user.

       In this scenario first check that you have a valid External Security definition configured to perform certificate based authentication (Configure > Security > External Security)

      Assign the External Security definition to the Application Server to map the external security name (Configure > Groups > [Your Group Name] > App Servers > [Your App Server Name] > external securities)

      Finally, check that the Internal MarkLogic User has an External Name that matches the Client Certificate Subject Distinguished Name (DN) (Configure > Security > Users > [your user name])

      Note: The ordering of the Subject DN in the External name is critical and should follow the highest to lowest level precedence, e.g.

      O=MarkLogic,OU=App Users, CN=user1

      And not

      CN=user1,OU=App Users,O=MarkLogic

      If you are unsure you can use the OpenSSL command below to list the Subject DN in the expected order:

      >$ openssl x509 -in user1.pem -subject -noout 
      subject= /O=MarkLogic/OU=App Users/CN=user1
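
      The lookup order described in this section, and the conversion of the OpenSSL subject line into an External Name, can be modelled with a short sketch. This is only an illustration of the order described above, not MarkLogic's implementation. It assumes the slash-separated subject layout shown in the example (newer OpenSSL releases may print a different layout), a comma separator with no spaces, and attribute values that contain no "/" characters. All function names are hypothetical.

```python
def parse_openssl_subject(line):
    """Split 'subject= /O=../OU=../CN=..' into (attribute, value) pairs."""
    _, _, dn = line.partition("=")
    pairs = []
    for part in dn.strip().split("/"):
        if part:
            attribute, _, value = part.partition("=")
            pairs.append((attribute, value))
    return pairs


def external_name(pairs):
    """Join the pairs from highest to lowest level, as printed by OpenSSL."""
    return ",".join(f"{attribute}={value}" for attribute, value in pairs)


def resolve_user(line, internal_users, external_names):
    """Model the described order: Common Name first, then the full Subject DN."""
    pairs = parse_openssl_subject(line)
    common_name = next((v for a, v in pairs if a == "CN"), None)
    if common_name in internal_users:
        return common_name
    # No internal user matched the CN, so fall back to the External Name map.
    return external_names.get(external_name(pairs))
```

      A result of None corresponds to the case where neither mapping succeeds, which is when the 401 Unauthorized error would be expected.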

      Further reading

      Summary

      File and semaphore errors (such as SVC-FILREM & SVC-SEMPOST) seen on MarkLogic Servers running on the Microsoft Windows platform can sometimes be attributed to Windows file system handling of MarkLogic Data files.

      This article covers a number of possible sources and how to troubleshoot.

      Windows File System and background services

      Systems running Microsoft Windows Servers are often running Virus Scanners, Corporate Security tools and non-MarkLogic backup programs (for example: Windows Shadow copy). To avoid file access conflicts, MarkLogic recommends that all MarkLogic data files be excluded from access by any background services. As a general rule, *only* MarkLogic Server should be maintaining MarkLogic Server data files.

      Troubleshooting Suggestions

      If MarkLogic Server is reporting file system or semaphore errors, here are some troubleshooting suggestions: 

      1. Make sure that MarkLogic Server is running under an account with adequate file-system permissions.  MarkLogic recommends that MarkLogic Server run under the SYSTEM account (the default).

      2. If Anti-virus software is installed, configure it so that it excludes all MarkLogic Server Data files from being scanned;

      3. If a shadow backup c