Knowledgebase : Administration
INTRODUCTION This article will show you how to add a Fast Data Directory (FDD) to an existing forest. DETAILS The fast data directory stores transaction journals and stands. When the directory becomes full, larger stands will be merged into the data...
SUMMARY Provide an answer to the question "What ports need to be open in my Security Group in order to run MarkLogic Server on Amazon's EC2?" DETAILS To run MarkLogic Server on Amazon's EC2, you'll need to open a port range from 7998-8002 in the a...
SUMMARY MarkLogic recommends that all production servers be monitored for system health. RECOMMENDATIONS For production MarkLogic Server clusters, the system monitoring solution should include the following features: * Enable [http://docs.marklog...
BEST PRACTICE FOR ADDING AN INDEX IN PRODUCTION SUMMARY It is sometimes necessary to remove or add an index to your production cluster. For a large database with more than a few GB of content, the resulting workload from reindexing your database can be ...
INTRODUCTION Sometimes you may find that there are one or more tasks that are taking too long to complete or are hogging too many server resources, and you would like to remove them from the Task Server. This article presents a way to cancel active ta...
INTRODUCTION This article discusses some of the issues you should think about when preparing to change the IP addresses of a MarkLogic Server. DETAIL: If the hostnames stay the same, then changing IP addresses should not have any adverse side effe...
INTRODUCTION If you have an existing MarkLogic Server instance running on EC2, there may be circumstances where you need to change the size of available storage. This article discusses approaches to ensure a safe increase in the amount of available sto...
INTRODUCTION MarkLogic Server has shipped with full support for the W3C XML Schema [https://www.w3.org/XML/Schema] specification and schema validation capabilities since version 4.1 (released in 2009). These features allow for the validation of complet...
INTRODUCTION There may be situations where a MarkLogic cluster has evolved over time and there may be a need to create a _clone_ of that setup in order to create another separate development environment. In this Knowledgebase article we will outline th...
INTRODUCTION HAProxy (http://www.haproxy.org/) is a free, fast and reliable solution offering high availability, load balancing and proxying for TCP and HTTP-based applications. MarkLogic 8 (8.0-8 _and above_) and MarkLogic 9 ...
SUMMARY This article describes how to create a MarkLogic Support Request (commonly known as a _Support Dump_). To create a Support Request: CREATING A SUPPORT REQUEST 1. Use a web browser to log in to the server's Admin interface, which is typically f...
SUMMARY MarkLogic Server clusters are built on a distributed, shared nothing architecture. Typical query loads will maximize resource utilization only when database content is evenly distributed across D-node hosts in a cluster. That is, optimal perform...
INTRODUCTION From the documentation: > Indexing information is not replicated by the Master database and is > instead regenerated on the Replica system. If you want the option to > switch over to the Replica database after a disaster, the index > setti...
SUMMARY This article will provide steps to debug applications using the Alerting API that are not triggering an alert. DETAILS 1) Check that all required components are present in the database where alerting is setup: config, actions, rules. Run the...
SUMMARY This article will help MarkLogic Administrators to monitor the health of their MarkLogic cluster. By studying the attached scripts, you will learn how to find out which hosts are down and which forests have failed over, enabling you to take the n...
INTRODUCTION MarkLogic Server provides a variety of disaster recovery (DR) facilities including full backup, incremental backup, and journal archiving that when combined with other ML features can create a complete disaster recovery strategy. This paper...
INTRODUCTION This article talks about best practices for use of external proxies vs using rewriter rules in the Enhanced HTTP server. DETAILS Whether to use external proxies versus using rewriter rules in the Enhanced HTTP application server is an app...
[DEPRECATED] MarkLogic no longer supplies entity enrichment libraries.  
INTRODUCTION A "fast data directory [http://docs.marklogic.com/guide/performance/disk-storage#id_17115]" is configurable for each forest, and can be set to a directory built on a fast file system, such as one using SSDs. Refer to Using a mix of SSD and ...
Introduction Attached to this article is an XQuery module: "appserver-status.xqy", which will generate a report on all requests currently "in-flight" across ALL application servers in your cluster Usage Run this in Query Console (be sure to displa...
Introduction When configuring database replication, it is important to note that the _Connect Forests by Name_ field is true by default. This works great because, when new forests of the same name are later added to the Master and Replica databases, t...
SUMMARY This article discusses the MarkLogic group-level caches and Linux Huge Page configurations. GROUP CACHES MarkLogic utilizes caches to increase retrieval performance of frequently-accessed objects. In particular, MarkLogic caches: 1. Expand...
MARKLOGIC DEFAULT GROUP LEVEL CACHE AND HUGE PAGES SETTINGS The table below shows the default (and recommended) group level cache settings based on a few common RAM configurations for the 9.0-9.1 release of MarkLogic Server: TOTAL RAM LIST CACHE ...
Introduction MarkLogic Server has a notion of groups, which are sets of similarly configured hosts within a cluster. Application servers (and their respective ports) are scoped to their parent group. Therefore, you need to make sure that the hos...
INTRODUCTION OK, so you have written an amazing "killer App" using XQuery on MarkLogic Server and you are ready to make it available to the world. Before pressing the deploy button, you may want to verify that your application is not susceptible to hacke...
Introduction: Getting More Information About The Bugs Fixed Between Releases As a general recommendation, we encourage customers to keep the server up-to-date with patch releases in any case. If you would like a list of some of the published bugs tha...
SUMMARY MarkLogic Server has several different features that can help manage data across multiple database instances. Those features differ from each other in several important ways - this article will focus on high-level distinctions and will provide po...
INTRODUCTION Upgrading individual MarkLogic instances and clusters is generally very easy to do and in most cases requires very little downtime. In most cases, shutting down the MarkLogic instance on each host in turn, uninstalling the current release, i...
BACKGROUND A database consists of one or more forests. A forest is a collection of documents (mostly XML trees, thus the name), implemented as a physical directory on disk. Each forest holds a set of documents and all their indexes. When a new d...
Introduction With the release of MarkLogic 5, a new feature - "Journal Archiving" - was added to the product. This feature allows for point-in-time recoveries to be made to a given database (forest) at any time; essentially, this option allows the res...
INTRODUCTION Sometimes, when a host is removed from a cluster in an improper manner -- e.g., by some means other than the Admin UI or Admin API -- a remote host can still try to communicate with its old cluster, but the cluster will recognize it as a "for...
INTRODUCTION For hosts that don't use a standard US locale (_en_US_) there are instances where some lower level calls will return data that cannot be parsed by MarkLogic Server. An example of this is shown with a host configured with a different locale w...
Summary Diagnostic trace events can be particularly useful in situations where you need access to more internal diagnostic information than is available in the standard MarkLogic ErrorLog or in the Operating System logs. The host / cluster can be c...
INTRODUCTION This article discusses the effects of the incremental backup implementation on Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO). DETAILS With MarkLogic 8 you can have multiple daily incremental backups with minimal impac...
SUMMARY The MarkLogic Admin GUI is a convenient place to deploy the Normal Certificate infrastructure or use the Temporary Certificate generated by MarkLogic. However, for certain advanced solutions or deployments, we need XQuery-based admin operations to configure M...
SUMMARY This article contains a high level overview of _TRANSPARENT HUGE PAGES_ and _HUGE PAGES_. It covers the configuration of _Huge Pages_ and offers advice as to when _Huge Pages_ should be used and how they can be configured. To the Linux kernel, ...
SUMMARY MarkLogic performs best if swap space is not used. There are other knowledge base articles that discuss sizing recommendations when configuring your MarkLogic Server. This article discusses the Linux swappiness setting that can limit the amount o...
SUMMARY MarkLogic Server maintains an access log for logging each HTTP application server request. However, the access log only contains summary information. In order to log additional HTTP request detail along with parameters, you can do so in the ...
INTRODUCTION If you have an existing MarkLogic Server cluster running on EC2, there may be circumstances where you need to upgrade the existing AMI with the latest MarkLogic rpm available. You can also add a custom OS configuration. This article assum...
SUMMARY All hosts in a MarkLogic cluster of two or more servers must run the same MarkLogic Server installation package. OPERATING SYSTEM ARCHITECTURE MarkLogic Server installation packages are created for each supported operating system architectur...
SUMMARY There are scenarios where you may want to restore a database from a MarkLogic Server backup that was taken from a database on a different cluster. EXAMPLES Two example scenarios where this may be appropriate: - For development or testing ...
INTRODUCTION This article details changes to the upgrade procedures for MarkLogic 9 AMIs. MarkLogic 9 now supports 1-click deployment in AWS Marketplace. This is an addition to existing options of manual launch of an AMI and launching MarkLogic cluster...
SUMMARY This article will help MarkLogic Administrators and System Architects who need to understand how to provision the I/O capacity of their MarkLogic installation. MARKLOGIC DISK USAGE Databases in MarkLogic Server are made up of forests. Indivi...
INTRODUCTION This article provides a list of IP ports that MarkLogic Server uses. MARKLOGIC SERVER PORTS The following IP ports should be open and accessible on every host in the cluster: PORT 7997 (TCP/HTTP) is the default _HealthCheck_ application...
SUMMARY Version downgrades are NOT supported by MarkLogic Server. BACKUP YOUR CONFIGURATION FILES BEFORE YOU DO ANYTHING ELSE Please ensure you have all your current configuration files backed up. Each host in a MarkLogic cluster is configured usin...
INTRODUCTION The MarkLogic Monitoring History [http://docs.marklogic.com/guide/monitoring] feature allows you to capture and view critical performance data from your cluster. By default, this performance data is stored in the Meters database [http://doc...
MONITORING HISTORY The Monitoring History feature allows you to capture and view critical performance data from your cluster. Monitoring History capture is enabled at the group level. Once the performance data has been collected, you can view the data i...
SUMMARY: Prior to MarkLogic 4.1-5, role-ids were randomly generated. We now use a hash algorithm that ensures that roles created with the same name will be assigned the same role-id. When attempting to migrate data from a forest created prior to MarkLog...
INTRODUCTION In this article, we discuss use of xdmp:cache-status in monitoring cache status, and explain the values returned. DETAILS Note that this is a relatively expensive operation, so it's not something to run every minute, but it may be valuabl...
UPDATE: Since the time this article was originally written, MarkLogic included Forest Rebalancing [http://docs.marklogic.com/guide/admin/database-rebalancing#chapter] and Forest Retiring [http://docs.marklogic.com/guide/admin/database-rebalancing#id_230...
SUMMARY Installation of Ops Director 2.0.0 on MarkLogic 9.0-9 fails with the error: This installer cannot install Ops Director on MarkLogic Server 9.0-9. This article will explain how to resolve the error. WHAT THE ERROR MEANS The Ops Director 2.0...
INTRODUCTION Administrators can achieve very fine granularity on restores when incremental backups are used in conjunction with log archiving. DETAILS Journal archiving can enable a restore to be performed to any timestamp since the last incremental b...
INTRODUCTION Seeing too many "stand limit" messages in your logs frequently? This article explains what this message means to your application and what actions you should take. WHAT ARE STANDS AND HOW CAN THEIR NUMBERS INCREASE? A stand holds a subs...
SUMMARY When used as a file system, GFS needs to be tuned for optimal performance with MarkLogic Server. RECOMMENDATIONS Specifically, we recommend tuning the demote_secs and statfs_fast parameters. The demote_secs parameter determines the amount of t...
SUMMARY Disk utilization is an important part of the host's ecosystem. The results of filling the file system can have disastrous effects on server performance and data integrity. It is very important to ensure that your host always has an appropriate am...
INTRODUCTION In some situations an existing cluster node needs to be replaced. There are multiple reasons for this activity like hardware failure or hardware replacement. In this Knowledgebase article we will outline the steps necessary to replace the ...
INTRODUCTION In a multiple node cluster with local disk failover configured, there may be a need to replace a server with new hardware. This article explains how to do that while preserving the failover configuration. SAMPLE CONFIGURATION Consider a 3...
BACKWARDS COMPATIBILITY Newer versions of MarkLogic will support backups taken from older versions of the software. This restore may cause a reindex of the data in order to upgrade the database to the current feature release version. Information on backi...
INTRODUCTION There have been a number of reported incidents where database replication has been configured and where the main Schema database on the replica has been used alongside database replication; in a situation where MarkLogic's default Schema dat...
SUMMARY This article explores fragmentation policy decisions for a MarkLogic database, and how search results may be influenced by your fragmentation settings. DISCUSSION FRAGMENTS VERSUS DOCUMENTS Consider the below example. 1) Load 20 test docume...
SUMMARY Some MarkLogic Server sites are installed in a 1GB network environment. At some point, your cluster growth may require an upgrade to 10GB ethernet. Here are some hints for knowing when to migrate up to 10GB ethernet, as well as some ways to work a...
INTRODUCTION We discuss why MarkLogic server should be started with root privileges. DETAILS It is possible to install MarkLogic Server in a directory that does not require root privileges. There's also a section in our Installation Guide (Configu...
SUMMARY When an SSL certificate is expired or out of date, it is necessary to renew the SSL certificates applied to a MarkLogic Application Server. The following general steps are required to apply an SSL certificate. * Create a certificate request...
SUMMARY It is important to have swap space configured on local disk as recommended on the computers on which MarkLogic runs. The general recommendation is 2x physical memory allocated for swap, although that can be relaxed for Linux systems (see below). ...
INTRODUCTION This article gives a brief summary of the MarkLogic Telemetry feature [https://docs.marklogic.com/guide/monitoring/telemetry#chapter] available in MarkLogic Server version 9 WHAT IS TELEMETRY USED FOR? Telemetry is a communication channel...
INTRODUCTION XQuery modules can be imported from other XQuery modules in MarkLogic Server. This article describes how modules are resolved in MarkLogic when they are imported in XQuery. DETAILS HOW MODULES ARE IMPORTED IN CODE Modules can be impor...
SUMMARY There are a number of options for transferring data between MarkLogic Server clusters. The best option for your particular circumstances will depend on your use case. DETAILS DATABASE BACKUP AND RESTORE To transfer the data between two indepe...
SUMMARY: MarkLogic allows SSL certificates to be used when securing application servers. This article explains some common issues seen when importing certificates, as well as methods to troubleshoot problems. IMPORTING A CERTIFICATE INTO MAR...
SUMMARY Sometimes, following a manual merge, a number of deleted fragments -- usually a small number -- are left behind after the merge completes. In a system that is undergoing steady updates, one will observe that the number of deleted fragments will go ...
UNDERSTANDING FOREST STATE TRANSITIONS WHILE PUTTING FOREST IN FLASH-BACKUP MODE When we transition a forest into flash-backup mode, the forest is unmounted and then remounted in _read-only_ mode so no updates can be made. During that process, the forest...
SUMMARY An XDMP-DBDUPURI error will occur if the same URI occurs in multiple forests of the same database. This article explains how this condition can occur and describes a number of strategies to help prevent and fix them. Under normal operating cond...
INTRODUCTION There are two ways of leveraging SSDs that can be used independently or simultaneously. FAST DATA DIRECTORY In the forest configuration for each forest, you can configure a Fast Data Directory. The Fast Data Directory is designed for fast...
Introduction Several customers have contacted support with questions regarding the use of a tool such as CoRB to "post-process" a large amount of data which they have already stored in MarkLogic. Here's a brief CoRB tutorial based on some of the qu...
INTRODUCTION MarkLogic Server is a highly scalable, high performance Enterprise NoSQL database platform. Configuring a MarkLogic cluster to run as virtual machi...
SUMMARY Below are some recommendations for packages that should be installed on all Linux hosts (RHEL/SUSE/CentOS) when installing MarkLogic Server, plus a brief description of what each package does and why we recommend installing it. RECOMMENDED PACKAG...
In MarkLogic Server 5.0, database replication is compatible with local-disk failover, while flexible replication is compatible with both local- and shared-disk failover. In MarkLogic Server 4.2, flexible replication is compatible with both local- an...
SUMMARY Each node in a cluster communicates with all of the other nodes in the cluster at periodic intervals. This periodic communication, known as a heartbeat, circulates key information about host status and availability between the nodes in a cluster....