Best Practices for Use of MarkLogic Server and Amazon's VPC
16 February 2021 02:41 PM
Although not exhaustive, this article lists some best practices for using MarkLogic Server with Amazon's VPC.
- Nodes within a MarkLogic cluster need to communicate with one another directly, with no load balancer between them.
- Whether in the context of a VPC or not, before attempting to join a node to a cluster, verify that each node can ping or ssh to the other, in both directions. If you cannot ping or ssh from one machine to another, then issues seen during a MarkLogic cluster join are very likely to be localized to the network configuration and should be diagnosed at the network layer.
- The following items should be double-checked when using VPCs:
- If a private subnet is used for any MarkLogic instance, that subnet needs access to the public internet for the following situations:
- If Managed Cluster support is used, MarkLogic requires access to AWS services which require outbound connectivity to the internet (at minimum to the AWS service web sites).
- If foreign clusters are used, then MarkLogic needs to connect to all hosts in the foreign cluster.
- If Amazon S3 is used then MarkLogic needs to communicate with the S3 public web services.
- It is assumed that the creator of the VPC has properly configured every subnet in which MarkLogic is to be installed so that it has outbound internet access. There are many ways that private subnets can be configured to communicate outbound to the public internet. NAT instances are one example [AWS VPC NAT]. Another option is using DirectConnect to route outbound traffic through the organization's internet connection.
- All subnets which host instances running MarkLogic in the same cluster need to be able to communicate via port 7999.
- Inbound ssh connectivity is required for command line administration of each server, so port 22 must be accessible from either a VPN or a public subnet.
- Application traffic (as opposed to intra-cluster traffic, such as that seen during cluster joining) must be able to reach the MarkLogic server(s) from every application that requires it. Application traffic can be sent through an internal or external load balancer, a VPN, direct access from applications in the same subnet, or routing through another subnet.
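
The connectivity checks described above (ssh on port 22 and intra-cluster communication on port 7999) can be approximated with a small script before attempting a cluster join. The following is a minimal sketch, not a MarkLogic or AWS tool: it uses only the Python standard library, the function names `can_connect` and `check_hosts` are illustrative, and it only tests whether a TCP connection can be opened from the machine where it runs. A successful result in one direction says nothing about the reverse direction, so it would need to be run from each node in turn.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_hosts(hosts, ports=(22, 7999), timeout=3.0):
    """Map each (host, port) pair to whether a TCP connection succeeded.

    The default ports are the ones named in this article: 22 for ssh
    and 7999 for intra-cluster communication.
    """
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

For example, calling `check_hosts(["10.0.1.15", "10.0.2.27"])` from one node (the addresses are placeholders for the private IPs of your other nodes) returns a dictionary whose `False` entries point to a security group, network ACL, or routing rule worth inspecting at the network layer.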