Customizing MarkLogic on AWS with Packer and Terraform
31 August 2020 10:05 AM


Packer from HashiCorp is an open source tool that automates the creation of machine images, extending infrastructure management to the images themselves. Packer supports a number of different image types, including AWS, Azure, Docker, VirtualBox and VMware.

Packer and Terraform can be used together to deploy a MarkLogic cluster to AWS from a customized Amazon Machine Image (AMI) using the MarkLogic CloudFormation Template. The template is the method MarkLogic recommends for building out MarkLogic clusters within AWS. By default, it uses the official MarkLogic AMIs.

While this guide covers some portions of Terraform, the primary focus is using Packer to customize an official MarkLogic AMI. For more detail on Terraform, as well as the example files referenced later in this article, we recommend reading Deploying MarkLogic to AWS with Terraform.

Setting Up Packer

For the purpose of this example, I will assume that you have already installed the AWS CLI, with the correct credentials, and you have installed Packer.

Packer Templates

A Packer template is a JSON configuration file that is used to define the image that we want to build. Templates have a number of keys available for defining the machine image, but the most commonly used ones are builders, provisioners and post-processors.

  • builders are responsible for creating the images for various platforms.
  • provisioners install and configure software on the machines before they are turned into images.
  • post-processors are actions applied to the images after they are created.

Creating a Template

For our example, we are going to take the official MarkLogic AMI and apply some customizations before creating a new image.

Defining Variables

Variables help make the build more flexible, so we will use a separate variables file, vars.json, to define parts of our build.

{
  "vpc_region": "us-east-1",
  "vpc_id": "vpc-06d3506111cea30d0",
  "vpc_public_sn_id": "subnet-03343e69ae5bed127",
  "vpc_public_sg_id": "sg-07693eb077acb8635",
  "ami_filter": "release-MarkLogic-10*",
  "ami_owner": "679593333241",
  "instance_type": "t3.large",
  "ssh_username": "ec2-user"
}

Creating Our Template

Now that we have some of the specific build details defined, we can create our template, base_ami.json. In this case we are going to use the builders and provisioners keys in our build.

{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "{{user `vpc_region`}}",
      "vpc_id": "{{user `vpc_id`}}",
      "subnet_id": "{{user `vpc_public_sn_id`}}",
      "associate_public_ip_address": true,
      "security_group_id": "{{user `vpc_public_sg_id`}}",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "{{user `ami_filter`}}",
          "root-device-type": "ebs"
        },
        "owners": ["{{user `ami_owner`}}"],
        "most_recent": true
      },
      "instance_type": "{{user `instance_type`}}",
      "ssh_username": "{{user `ssh_username`}}",
      "ami_name": "ml-{{isotime \"2006-01-02-1504\"}}",
      "tags": {
        "Name": "ml-packer"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "script": "./"
    },
    {
      "destination": "/tmp/",
      "source": "./marklogic.conf",
      "type": "file"
    },
    {
      "type": "shell",
      "inline": [ "sudo mv /tmp/marklogic.conf /etc/marklogic.conf" ]
    }
  ]
}

In the builders section we have defined the network and security group configurations and the source AMI details. We have also defined the naming convention (ml-YYYY-MM-DD-HHMM) for our new AMI with ami_name and added a Name tag, ml-packer. Both of those will make it easier to find our AMI when it is time to use it with Terraform.
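
The isotime function takes a Go-style layout string, in which 2006-01-02-1504 stands for year, month, day, and 24-hour time in UTC. As a rough sketch, the equivalent name can be previewed locally with date; the function name preview_ami_name is made up for this illustration and is not part of Packer.

```shell
# Preview the AMI name that the "ml-{{isotime ...}}" pattern would yield right now.
# The Go layout 2006-01-02-1504 corresponds to strftime %Y-%m-%d-%H%M, in UTC.
preview_ami_name() {
  date -u +"ml-%Y-%m-%d-%H%M"
}

preview_ami_name
```

Because the name only has minute resolution, two builds started within the same minute would try to use the same AMI name.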


In our example, we are using the shell provisioner to execute a script against the machine, the file provisioner to copy the marklogic.conf file to the machine, and the shell provisioner to move the file to /etc/, all of which will be run prior to creating the image. There are also provisioners available for Ansible, Salt, Puppet, Chef, and PowerShell, among others.

Provisioning Script

For our custom image, we've determined that we need an additional piece of software installed, which we will do inside a script. We've named the script, and it is stored in the same directory as our packer template.

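#!/bin/bash
# Stop at the first failing command so that a failed install fails the Packer build.
set -e

echo "**** Starting ****"
echo "Installing Git"
sudo yum install -y git
echo "**** Finishing ****"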

Executing Our Build

Now that we've completed setting up our build, it's time to use packer to create the image.

packer build -debug -var-file=vars.json base_ami.json

Here we tell Packer to run a build using base_ami.json, referencing our variables file with the -var-file flag. We've also added the -debug flag, which disables parallelism and enables debug mode. In debug mode, Packer stops after each step and prompts you to press Enter before moving to the next step.
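
Packer also has a validate subcommand that checks a template without launching an instance, which can catch a malformed template before any AWS resources are created. The wrapper below is only a sketch: the function name validate_then_build is hypothetical, and the default file names are the ones used in this article.

```shell
# Validate the template first, and only start the build if validation succeeds.
# $1 is the variables file and $2 is the template.
validate_then_build() {
  vars="${1:-vars.json}"
  template="${2:-base_ami.json}"
  packer validate -var-file="$vars" "$template" &&
    packer build -var-file="$vars" "$template"
}
```

Calling validate_then_build with no arguments runs both steps against vars.json and base_ami.json. The -debug flag is left out here so that the build does not pause for input.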

The last part of the build output will print out the details of our new image:

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
us-east-1: ami-0100....
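
If the AMI ID is needed in a later step, it can be pulled out of this output. The following is a minimal sketch that assumes the artifact line is the last line mentioning an AMI ID; the function name extract_ami_id and the sample AMI ID are invented for illustration.

```shell
# Read Packer build output on stdin and print the last AMI ID it mentions.
extract_ami_id() {
  grep -Eo 'ami-[0-9a-f]+' | tail -n 1
}

# Example using a made-up AMI ID in place of real build output:
printf '%s\n' \
  '==> Builds finished. The artifacts of successful builds are:' \
  '--> amazon-ebs: AMIs were created:' \
  'us-east-1: ami-0123456789abcdef0' | extract_ami_id
```

With a real build, the output of packer build (run without -debug so that it does not wait for input) could be piped into extract_ami_id in the same way.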

Terraform and the MarkLogic CloudFormation Template

At this point we have our image and want to use it when deploying the MarkLogic CloudFormation Template. Unfortunately there is no simple way to do this, as the MarkLogic CloudFormation Template does not have the option to specify a custom AMI. Fortunately Terraform has some functions available that we can use to make the changes to the Template.


First we want to add a couple of entries to our existing Terraform variables file.

variable "ami_tag" {
  type = string
  default = "ml-packer"
}

variable "search_string" {
  type = string
  default = "ImageId: "
}

The first variable, ami_tag, is the tag we added to the AMI when it was built. The second variable, search_string, will be described in the Updates to Terraform Root Module section below.

Data Source

To retrieve the AMI, we need to define a data source. In this case it will be an aws_ami data source. We are going to call the file

data "aws_ami" "ml_ami" {
  filter {
    name   = "state"
    values = ["available"]
  }

  filter {
    name   = "tag:Name"
    values = [var.ami_tag]
  }

  owners      = ["self"]
  most_recent = true
}

Here we filter the available AMIs down to those that are owned by our own account (self) and tagged with the value that we defined in our variables file. If more than one AMI matches, the most recent one is used.
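
To check which AMI such a lookup is likely to return before running Terraform, roughly the same query can be made with the AWS CLI. This is a sketch rather than part of the original deployment: the function name latest_ami_id is hypothetical, the defaults are the tag and region used in this article, and it assumes the AWS CLI is configured with credentials for the account that built the image.

```shell
# Print the ID of the newest available AMI owned by this account
# that carries the given Name tag.
latest_ami_id() {
  tag="${1:-ml-packer}"
  region="${2:-us-east-1}"
  aws ec2 describe-images \
    --region "$region" \
    --owners self \
    --filters "Name=tag:Name,Values=$tag" "Name=state,Values=available" \
    --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
    --output text
}
```

Running latest_ami_id with no arguments looks for the ml-packer tag in us-east-1. Sorting by CreationDate and taking the last element plays the same role as most_recent = true in the data source.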

Updates to Terraform Root Module

Now we are ready to make a couple of updates to our Terraform root module file to integrate the new AMI into our deployment. In our last example, we used the MarkLogic CloudFormation template from its S3 bucket. For this deployment, we are going to use a local copy of the template, mlcluster-template.yaml.

Replace the template_url line with the following line:

template_body = replace(file("./mlcluster-template.yaml"), "/${var.search_string}.*/","${var.search_string} ${}")

When we updated the variables in our Terraform variable file, we created the variable search_string. In the MarkLogic CloudFormation Template, the value for the Image ID is determined by the region and whether you are running the Essential Enterprise or Bring Your Own License version of MarkLogic Server. Here we use the replace function with a regular expression to overwrite everything that follows the search string on that line, so that it references the AMI we just created with Packer and retrieved with the data source.
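
The effect of that regular expression can be tried outside Terraform with sed, which also matches .* only up to the end of the line. This is an illustrative sketch, not the Terraform code itself: the function name pin_image_id, the template fragment, and the AMI ID are all invented.

```shell
# Rewrite every "ImageId: ..." line so that it points at a single AMI.
# Reads a template on stdin; $1 is the AMI ID to substitute.
pin_image_id() {
  sed "s|ImageId: .*|ImageId: $1|"
}

# Invented template fragment and AMI ID, for illustration only:
printf '%s\n' \
  '      ImageId: !FindInMap [ExampleAmiMap, us-east-1, Enterprise]' \
  '      InstanceType: !Ref InstanceType' | pin_image_id ami-0123456789abcdef0
```

Only the text from the search string to the end of the line changes, so indentation and the other properties are left as they were. An ImageId value that spans several lines would not be fully replaced by this pattern.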

Deploying with Terraform

Now we are ready to run Terraform to deploy our cluster. First we want to double-check that the template looks correct before we attempt to create the CloudFormation stack. The output of terraform plan will show the CloudFormation template that will be deployed. Check the output to make sure that the value for ImageId shows our desired AMI.
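
Scanning a long plan by eye is error-prone, so the check can be scripted. The sketch below assumes the plan output contains the template lines with the ImageId: key intact; the function name plan_uses_ami is hypothetical.

```shell
# Read plan output on stdin and succeed only if every ImageId line
# mentions the expected AMI ID given as $1.
plan_uses_ami() {
  lines=$(grep 'ImageId:') || { echo "no ImageId lines found"; return 1; }
  if printf '%s\n' "$lines" | grep -v -- "$1" >/dev/null; then
    echo "some ImageId lines do not reference $1"
    return 1
  fi
  echo "all ImageId lines reference $1"
}
```

One possible use is to pipe terraform plan -no-color into plan_uses_ami, passing the AMI ID reported at the end of the Packer build.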

Once we have confirmed our new AMI is being referenced, we can then run terraform apply to create a new stack using the template. This can be validated by opening a command line on one of the new hosts and checking that Git is installed and that /etc/marklogic.conf exists.
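
One way to run both checks on a host is sketched below. The function name check_customizations is made up for this example, and its defaults are the two customizations made in this article.

```shell
# Report whether a command is on the PATH and a file exists.
# Defaults match the customizations baked into the image in this article.
check_customizations() {
  cmd="${1:-git}"
  conf="${2:-/etc/marklogic.conf}"
  command -v "$cmd" >/dev/null 2>&1 || { echo "missing command: $cmd"; return 1; }
  [ -f "$conf" ] || { echo "missing file: $conf"; return 1; }
  echo "ok: $cmd and $conf are present"
}
```

Running check_customizations with no arguments on one of the new hosts prints an ok line when both customizations are in place and names the first missing item otherwise.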

Wrapping Up

At this point, we have customized the official MarkLogic AMI to create our own AMI using Packer. We have then used Terraform to update the MarkLogic CloudFormation Template and to deploy a CloudFormation stack based on the updated template.
