Using maps and server fields
23 August 2016 01:49 PM

Introduction: MarkLogic's shared nothing architecture

Each node in your cluster maintains:

  • Its own set of group-level caches;
  • A stack of application servers;
  • All the configuration files it needs to understand the topology of the entire cluster.

If you execute a query on a given host in that cluster, that host will use its own resources (CPU, RAM) to run that query.

Server fields

In addition to these resources, each host can hold data in server fields: named values kept in memory on that host.

This is useful when you want to "pre-compute" data that will be needed again but is expensive to create up front (for example, building a large number of lookups, which may mean loading large numbers of documents from disk).

Maps are excellent for fast lookup and retrieval, and a map stored in a server field lets you use MarkLogic to hold intermediate data. This is especially useful when a query has to work through many steps and may need to resolve the same pieces of information several times along the way, as often happens when building a report.
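
As a minimal sketch of the idea, the query below builds a small map and places it in a server field using xdmp:set-server-field, then reads it back with xdmp:get-server-field. The field name "lookup-cache" and the keys and values are hypothetical and chosen only for illustration; in practice the map would hold whatever was expensive to compute.

```xquery
xquery version "1.0-ml";

(: Build a map of lookups once. The keys and values here are
   placeholders standing in for expensive, pre-computed data. :)
let $lookups := map:map()
let $_ := map:put($lookups, "group-1", "first precomputed value")
let $_ := map:put($lookups, "group-2", "second precomputed value")

(: Store the map in a server field on the host evaluating this query.
   "lookup-cache" is a hypothetical field name. :)
let $_ := xdmp:set-server-field("lookup-cache", $lookups)

(: A later request evaluated on the same host can read the map back
   without rebuilding it. :)
let $cached := xdmp:get-server-field("lookup-cache")
return map:get($cached, "group-1")
```

Here the write and the read happen in one request to keep the sketch short; the benefit comes when later requests on the same host call xdmp:get-server-field and skip the expensive build step.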

Caveats

However, if you're planning to use server fields, there are some important points worth noting:

  • Server fields exist only on the host that evaluates the query;
  • Anything stored in a map held in a server field will not survive a restart of the MarkLogic process.
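
One way to live with both caveats is to treat the server field as a cache that may be empty, and to rebuild it on demand. The sketch below assumes a hypothetical field name ("lookup-cache") and a hypothetical helper function (local:get-cache); it records the name of the host that built the map, purely so you can see which host answered.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: return the cached map if this host has one,
   otherwise build a fresh map and store it before returning it. :)
declare function local:get-cache() as map:map
{
  let $cached := xdmp:get-server-field("lookup-cache")
  return
    if ($cached instance of map:map) then $cached
    else
      (: The field is empty after a restart, or if this host has
         never populated it, so rebuild the data here. :)
      let $fresh := map:map()
      let $_ := map:put($fresh, "built-on", xdmp:host-name(xdmp:host()))
      (: xdmp:set-server-field returns the value it stored :)
      return xdmp:set-server-field("lookup-cache", $fresh)
};

map:get(local:get-cache(), "built-on")
```

Run against different hosts in a cluster, each host would build and report its own copy, which illustrates why the cached data should always be something that can be recomputed.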

Here's a simple example of how a MarkLogic host can be used to store data in server-side maps:

Generate some test data

The following example demonstrates the creation of 1,000,000 documents. These will be loaded into 20 separate groups.