Knowledgebase:
MarkLogic on AWS FAQ
10 March 2022 03:41 PM

Question Answer Further Reading

How do I stand up a MarkLogic instance on AWS?

Launching a MarkLogic AMI via Cloud Formation Templates (CFTs) is the best way to stand up MarkLogic instances on AWS, as it lets you use the Managed Cluster feature, which is designed for easy and reliable cloud deployment.

You can also run MarkLogic without the Managed Cluster feature (with or without MarkLogic AMIs), but this is not recommended due to the additional administrative complexity.

Documentation:

KB Article:

Video Tutorial:

What deployment or provisioning tools are supported?

MarkLogic supports Cloud Formation Templates

While not officially supported, we do have customers using tools like Terraform, Ansible and Packer

KB Articles:

What are the recommended instance types for MarkLogic deployments?

  • Unfortunately, no single instance type works for all MarkLogic deployments
  • Do note, however, that MarkLogic deployments generally have higher memory and storage I/O bandwidth requirements than legacy RDBMS deployments - so you'll likely want to start with Memory Optimized, Storage Optimized, or General Purpose instance types
  • The best instance type for your deployment will depend on your application code, workload, networking / system / cluster configurations, storage options, cloud architecture, etc. (not to mention the fact that AWS itself changes quickly and often)
  • We recommend doing extensive testing in lower environments before using a specific instance type in production
  • MarkLogic AMIs will not run on micro instances

Documentation:

Can we use Nitro instances?

  • One feature of AWS Nitro System instances is that multiple EBS volumes can be attached to the instance in any order
  • Unfortunately, this behavior doesn't work reliably with MarkLogic’s Cloud Formation Templates
  • While not recommended due to the additional administrative complexity, if you were to use multiple EBS volumes per node, you should set up additional monitoring to ensure that the hosts rejoin the cluster correctly and that the multiple volumes are mounted correctly after an EC2 node termination

Does MarkLogic support AWS Graviton instances?

MarkLogic does not currently support ARM-based processors, so AWS Graviton instances are not supported

Documentation:

What is our recommendation around volume management? 

  • Use large EBS volumes as opposed to multiple smaller ones
    • Larger gp2 EBS volumes have faster I/O, as described in the Amazon EBS volume types documentation
    • You have to keep enough spare capacity on each EBS volume to allow for merges
    • The recommendation is to have one large EBS data volume per node - while it’s possible to have multiple volumes per instance, we’ve found that’s not typically worth the additional administrative complexity
  • When resizing, adopt a vertical scaling approach (so growing into a single bigger EBS volume vs. adding multiple smaller volumes per node)
  • Note that S3 storage is eventually consistent, therefore S3 can only be used for backups or read-only forests in MarkLogic Server (otherwise you risk the possibility of data loss)
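
To illustrate why a larger gp2 volume has faster I/O, here is a minimal Python sketch of the gp2 baseline rule as published by AWS at the time of writing (3 IOPS per GiB, with a floor of 100 and a cap of 16,000). The function name is hypothetical, burst credits and throughput limits are ignored, and the numbers should be checked against the current Amazon EBS documentation before sizing anything.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB, floor 100, cap 16,000.

    Assumption: this encodes the published gp2 rule only; burst behavior
    and throughput (MiB/s) limits are not modeled.
    """
    if size_gib < 1:
        raise ValueError("a gp2 volume must be at least 1 GiB")
    return min(max(100, 3 * size_gib), 16_000)


# Compare a few candidate data-volume sizes.
for size in (100, 500, 2000, 6000):
    print(f"{size:>5} GiB -> {gp2_baseline_iops(size):>6} baseline IOPS")
```

Whatever size you pick still needs the spare capacity for merges mentioned above, so treat the result as a lower bound on performance, not as a sizing recommendation.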

Documentation:

KB Article:

How do I change the size of EBS volumes attached to MarkLogic AWS EC2 instances?

In general, the best strategy is to follow Amazon's user guides and best practices when increasing storage size or making any other kind of system change.

Specific to the MarkLogic deployments stood up via the Cloud Formation templates provided by MarkLogic, the following approaches ensure a safe operation:

  • The recommended approach is to shut down the cluster, do the resize using snapshots and restart the cluster
  • You could also use multiple volumes and rebalance, if you wish to avoid downtime

It’s important to remember that if your cluster has grown enough to need more disk space, it will likely need additional resources as well, such as CPU, RAM, and storage and network bandwidth.

KB Article:

What is the typical architecture for ensuring high availability (HA) and disaster recovery (DR) for MarkLogic on AWS?

 

  • Use high availability to protect against availability zone failure, and disaster recovery to protect against region failure
  • For high-availability:
    • Within a cluster, spread your nodes across three different availability zones within a single region, then use local disk failover to have copies of your data in each availability zone
  • For disaster recovery:
    • Place two different clusters in two different regions, then use database replication to have copies of your database in both regions
    • Be aware that cross-region traffic is expensive - it may be more cost-effective to have both primary and replica clusters in the same region, but keep in mind that you’re then vulnerable to a failure of that region

Documentation:

Can I run MarkLogic Server in just two Availability Zones?

  • The best practice is to distribute your nodes across three different Availability Zones (AZs) within a single region
  • If a region has only two AZs, you can’t spread your nodes across enough AZs to survive an AZ failure - so consider placing all your nodes in a single AZ to save on inter-zone networking costs
  • Note that two AZs are not a supported configuration for MarkLogic Cloud Formation Templates
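
The reasoning behind the three-AZ recommendation can be sketched with a small Python check. It assumes the cluster stays available only while more than half of its hosts are online (a quorum rule). The function name and the zone labels are hypothetical, and the sketch ignores forest placement and failover configuration.

```python
from typing import Mapping


def survives_any_az_failure(nodes_per_az: Mapping[str, int]) -> bool:
    """Return True if losing any single AZ still leaves a host majority.

    Assumption: the cluster needs strictly more than half of all hosts
    online to keep quorum.
    """
    total = sum(nodes_per_az.values())
    return all((total - lost) * 2 > total for lost in nodes_per_az.values())


print(survives_any_az_failure({"az-a": 1, "az-b": 1, "az-c": 1}))  # three AZs
print(survives_any_az_failure({"az-a": 2, "az-b": 1}))             # two AZs, uneven
print(survives_any_az_failure({"az-a": 3, "az-b": 3}))             # two AZs, even
```

Under that assumption, no two-AZ layout passes the check, because one of the two zones always holds at least half of the hosts.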

KB Articles:

 

For my AWS security group, what ports do I need if I’m using a Cloud Formation Template?

MarkLogic Server needs the same ports open as it would in an on-premise deployment.

KB Articles:

 

Best Practices for resizing a MarkLogic cluster on AWS

  • Resizing a MarkLogic Cluster on AWS can be done vertically or horizontally, similar to how it's done with on-premise deployments
  • Vertical scaling - changes the type of instances
    • You can change the instance type by using the update stack feature
    • Make sure you hibernate the cluster before and restart the cluster after the procedure
  • Horizontal scaling - changes the number of instances
    • Use the update stack feature to change the NodesPerZone setting on the CFT
    • Alternatively, use auto-scaling groups
  • Similarly, data capacity can be resized in two different ways:
    • Resizing using AWS snapshots
    • Resizing using MarkLogic’s rebalancing feature
  • While vertical scaling is significantly easier on AWS vs. on-premise deployments, note that MarkLogic requires at least some degree of horizontal scaling as high availability (HA) requires at least three nodes in a cluster
  • Whether you are scaling nodes or data capacity, horizontally or vertically, it is recommended to:
    • Test your scaling procedure thoroughly before implementing
    • Take full backups of your data before making changes to your cluster
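
As a rough illustration of the update stack approach, the Python sketch below builds the parameter list for a stack update that changes only selected settings and keeps every other parameter at its previous value. The list shape (ParameterKey with either ParameterValue or UsePreviousValue) is the one the CloudFormation UpdateStack API accepts. The helper name is hypothetical, and every parameter name other than NodesPerZone is an assumption about what your template defines. No AWS call is made here.

```python
def build_update_parameters(current_keys, changes):
    """Build a CloudFormation parameter list for a stack update.

    current_keys: names of all parameters on the existing stack.
    changes: mapping of parameter name -> new value.
    Unchanged parameters are marked UsePreviousValue so they are not reset.
    """
    unknown = set(changes) - set(current_keys)
    if unknown:
        raise KeyError(f"not a parameter of this stack: {sorted(unknown)}")
    params = []
    for key in current_keys:
        if key in changes:
            params.append({"ParameterKey": key, "ParameterValue": str(changes[key])})
        else:
            params.append({"ParameterKey": key, "UsePreviousValue": True})
    return params


# "InstanceType" and "AdminUser" are hypothetical parameter names.
stack_keys = ["InstanceType", "NodesPerZone", "AdminUser"]
for entry in build_update_parameters(stack_keys, {"NodesPerZone": 2}):
    print(entry)
```

With an AWS SDK such as boto3, a list like this would be passed as the Parameters argument of the CloudFormation client's update_stack call, together with the stack name and UsePreviousTemplate=True. For a change of instance type, the hibernate and restart steps above still apply.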

Documentation:

How do I upgrade MarkLogic on AWS?

  • In general, it’s important to understand that in on-premise deployments, you keep your machines, but change/upgrade the MarkLogic binary. In contrast, in AWS you keep your data/configuration, and instead change to a new instance with the new binary
  • If you have a MarkLogic AMI launched via a Cloud Formation Template (CFT):
    • If you want to upgrade MarkLogic alone, you must update the AMI IDs in your original CFT, as you cannot upgrade your CFT to a different version
    • If you want to upgrade both the MarkLogic and the CFT versions, set up a new cluster, move your data and configuration to the new template, and switch to the new cluster only after thorough testing
  • If you have a custom AMI:
    • You’ll need to perform a manual upgrade or update your custom MarkLogic AWS AMI

Documentation:

KB Articles:

Which load balancers are used for MarkLogic deployments on AWS?

Classic Load Balancer - Used with MarkLogic Server 9.x, and with 10.x up to 10.0-6.1. Also used for single availability zone deployments

Application Load Balancer - Used starting with 10.0-6.2, provided the deployment spans multiple availability zones (the recommended configuration)

Network Load Balancer - Needed for ODBC connections
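
The rules above can be read as a small decision function. The Python sketch below is one interpretation only: it assumes ODBC takes precedence over the other two rules and that versions compare numerically component by component. The function names are hypothetical.

```python
import re


def version_key(version: str) -> tuple:
    """Turn a version string such as '10.0-6.2' into (10, 0, 6, 2)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def pick_load_balancer(version: str, multi_az: bool, needs_odbc: bool = False) -> str:
    """Map the FAQ's rules to a load balancer type (one possible reading)."""
    if needs_odbc:
        return "network"  # ODBC connections need a Network Load Balancer
    if multi_az and version_key(version) >= (10, 0, 6, 2):
        return "application"
    return "classic"  # 9.x, 10.x up to 10.0-6.1, or a single AZ


print(pick_load_balancer("9.0-13", multi_az=True))
print(pick_load_balancer("10.0-6.1", multi_az=True))
print(pick_load_balancer("10.0-6.2", multi_az=True))
print(pick_load_balancer("10.0-8", multi_az=False))
print(pick_load_balancer("10.0-8", multi_az=True, needs_odbc=True))
```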

Documentation:

How do I monitor EC2 instances, EBS volumes, etc.?

 

Documentation:

How do I secure my Admin password for an AWS deployment?

  • It is not secure to store the MarkLogic admin password in the marklogic.conf file
  • Use a secure S3 bucket in combination with an IAM role that grants the EC2 instances in the cluster read-only access to it
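
As a hedged illustration of read-only access, the Python sketch below builds an IAM policy document that allows only s3:GetObject on objects under one bucket prefix. The bucket name, prefix and function name are hypothetical placeholders. Attaching the policy to the role used by the cluster's EC2 instances is not shown.

```python
import json


def read_only_bucket_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document granting read-only access to objects.

    Assumption: reading the stored secret needs only s3:GetObject on
    objects under the given prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


# "example-marklogic-secrets" and "cluster-a/" are placeholder names.
print(json.dumps(read_only_bucket_policy("example-marklogic-secrets", "cluster-a/"), indent=2))
```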

 

Documentation:

KB Articles:

How do we push data from MarkLogic to an AWS SQS queue?

There are no direct functions to send messages to AWS SQS, but it should be possible to use the xdmp:http-post function to make the POST request described in the AWS documentation below

https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-making-api-requests.html#structure-post-request
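
To show the shape of such a request, here is a minimal Python sketch that builds the form-encoded body of an SQS SendMessage call using the query-style API (Action, MessageBody and Version fields). It is not MarkLogic code, and the function name is hypothetical. Request signing (AWS Signature Version 4), which SQS requires, is deliberately left out.

```python
from urllib.parse import parse_qs, urlencode


def build_send_message_body(message: str) -> str:
    """Form-encode an SQS SendMessage request body (unsigned).

    Assumption: the query-style API with Version 2012-11-05. Signing the
    request is required in practice and is not shown here.
    """
    return urlencode(
        {
            "Action": "SendMessage",
            "MessageBody": message,
            "Version": "2012-11-05",
        }
    )


body = build_send_message_body("document /orders/1.xml was updated")
print(body)
print(parse_qs(body)["MessageBody"])
```

The resulting string would be the payload of a POST to the queue URL with a Content-Type of application/x-www-form-urlencoded. The same payload is what an xdmp:http-post call would have to send, along with the signed authentication headers.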

 

What do we do about the SVC-AWSCRED error?

What are 504 Timeout errors? How do I resolve them?

Refer to the KB article: MarkLogic Fundamentals FAQ - Common Error Messages
