Creating a Web Service for Monitoring MarkLogic Backups
15 April 2020 04:42 AM
If you look at the MarkLogic Admin UI on port 8001, you may have noticed that the status page for a given database displays the dateTime of its last backup.
We have been asked in the past how this value is computed, so that the same check can be performed from your own code.
This Knowledgebase article shows examples that use XQuery to get this information, and explores the possibility of retrieving it using the MarkLogic ReST API.
XQuery: How does the code work?
The short answer is that the value comes from the forest status of each forest in the database (note that these values only appear once a backup has been taken). For the sake of these examples, let's say we have a database (called "test") which contains 12 forests (test-1 to test-12). We can get the backup status for these using a call to our ReST API:
In the results returned, you should see something like this:
To generate that status page, the MarkLogic code computes an aggregate. In MarkLogic, a database does not contain documents directly; it contains forests, and those forests contain the documents.
Continuing the example above (with a database called "test" containing 12 forests) if I run the following:
This will return the forest statuses for all forests in the database "test" and extract the forest names using XPath, so in this case, we would see:
The Admin UI interrogates each forest in that database in turn to find the metrics for its last backup. To put that into context, if we ran the following:
This gives us:
The code (or the status report) doesn't need the values for all 12 forests; it only needs the time at which the last forest completed its backup (because that is when the backup as a whole actually completed), so our code makes a call to fn:max:
This gives us the max value (as these are all xs:dateTimes, it finds the most recent one), which in the case of this example is:
The same is true for the last incremental backup (note that all we're changing here is the XPath to get to the correct element):
So we can get the max value for this by getting the most recent time across all forests:
This would give us 2016-02-12T12:37:29.161Z
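For readers who consume forest status documents from outside MarkLogic, the same "most recent value across all forests" aggregation can be sketched in Python. This is only an illustration under stated assumptions: the element names `last-backup` and `last-incr-backup`, the forest status namespace, the `statuses` wrapper element and every timestamp in the sample except 2016-02-12T12:37:29.161Z are assumptions or made-up values, and the parser only handles UTC dateTimes with a fractional part.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumed namespace of MarkLogic forest status documents.
FS = "{http://marklogic.com/xdmp/status/forest}"


def parse_datetime(text):
    """Parse a UTC xs:dateTime such as 2016-02-12T12:37:29.161Z."""
    parsed = datetime.strptime(text.strip(), "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def latest(status_xml, element_name):
    """Return the most recent element_name value across all forests, or None."""
    root = ET.fromstring(status_xml)
    values = [
        parse_datetime(element.text)
        for element in root.iter(FS + element_name)
        if element.text and element.text.strip()
    ]
    # Equivalent in spirit to fn:max over the xs:dateTime values.
    return max(values) if values else None


# Hypothetical sample: three forests wrapped in a made-up <statuses> element.
SAMPLE = """<statuses xmlns="http://marklogic.com/xdmp/status/forest">
  <forest-status>
    <forest-name>test-1</forest-name>
    <last-backup>2016-02-12T11:00:01.250Z</last-backup>
    <last-incr-backup>2016-02-12T12:37:28.004Z</last-incr-backup>
  </forest-status>
  <forest-status>
    <forest-name>test-2</forest-name>
    <last-backup>2016-02-12T11:00:03.500Z</last-backup>
    <last-incr-backup>2016-02-12T12:37:29.161Z</last-incr-backup>
  </forest-status>
  <forest-status>
    <forest-name>test-3</forest-name>
    <last-backup>2016-02-12T11:00:02.750Z</last-backup>
    <last-incr-backup>2016-02-12T12:37:27.900Z</last-incr-backup>
  </forest-status>
</statuses>"""

if __name__ == "__main__":
    print(latest(SAMPLE, "last-backup"))
    print(latest(SAMPLE, "last-incr-backup"))
```

Returning `None` when no forest reports a value mirrors the note above that these elements only appear once a backup has been taken.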
Using the ReST API
The ReST API also exposes this information, but you need to jump through a few hoops to get to it. The ReST API status for a given database gives you the names of all the forests attached to that database:
And from there you could GET the information for all of those forests:
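As a rough sketch of that second step, the Python below requests the status view of each forest in turn. It assumes the Management API is listening on localhost:8002 with HTTP digest authentication and that the forests are named test-1 to test-12 as in the example above; the function names are illustrative only, and the shape of the returned payload should be checked against your own server before anything is parsed out of it.

```python
import urllib.request
from urllib.parse import quote


def forest_status_url(forest, host="localhost", port=8002):
    """Build the (assumed) Management API status URL for one forest."""
    name = quote(forest, safe="")
    return f"http://{host}:{port}/manage/v2/forests/{name}?view=status&format=xml"


def make_opener(host, port, user, password):
    """Create an opener that answers HTTP digest authentication challenges."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, f"http://{host}:{port}/", user, password)
    handler = urllib.request.HTTPDigestAuthHandler(manager)
    return urllib.request.build_opener(handler)


def fetch_forest_statuses(forests, user, password, host="localhost", port=8002):
    """Return a dict mapping each forest name to its raw status document."""
    opener = make_opener(host, port, user, password)
    statuses = {}
    for forest in forests:
        url = forest_status_url(forest, host, port)
        with opener.open(url, timeout=30) as response:
            statuses[forest] = response.read().decode("utf-8")
    return statuses


# Forest names from the running example: test-1 to test-12.
FORESTS = [f"test-{n}" for n in range(1, 13)]

if __name__ == "__main__":
    # Printing the URLs needs no server; fetching them does.
    for forest in FORESTS:
        print(forest_status_url(forest))
```

Making one request per forest is what makes this approach chatty: twelve forests means twelve round trips before any aggregation can happen.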
Once you'd got all those values, you could do what MarkLogic's admin code does and get the max values for them - although at this stage, it might make more sense to write a custom endpoint that returns this information, something like:
Where you could make a call to that module to get the aggregates (e.g.):
This would return the database status for any given parameter-name that is passed in.