Programmatically finding the last backup date
26 July 2016 02:47 PM
Looking at the MarkLogic Admin UI, you may have noticed that the status page for a given database displays the date and time of its last backup. We have been asked in the past how this is computed so that the same check can be performed from your own code. This Knowledgebase article shows examples that utilise XQuery to get this information and explores the possibility of retrieving it using the MarkLogic ReST API.
XQuery: How does the code work?
The simple answer is in the forest status for each of the forests in the database (note that these values only appear if you have already created a backup). For the sake of these examples, let's say I have a database (called "test") which contains 12 forests (test-1 to test-12). The backup status for a single forest is available through a call to our ReST API.
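A minimal sketch of such a call, written in XQuery so that it can be run from Query Console. It assumes the Management API is listening on port 8002 of localhost, that digest authentication is in use, and that the status view of the forest resource is the one carrying the backup times; the host, port and credentials are placeholders rather than values from this article.

```xquery
xquery version "1.0-ml";

(: Assumption: Management API on localhost:8002 with digest authentication.
   The username and password below are placeholders. :)
let $uri := "http://localhost:8002/manage/v2/forests/test-1?view=status&amp;format=xml"
let $options :=
  <options xmlns="xdmp:http">
    <authentication method="digest">
      <username>admin-user</username>
      <password>admin-password</password>
    </authentication>
  </options>
(: xdmp:http-get returns the response metadata first and the body second :)
return xdmp:http-get($uri, $options)[2]
```

The same URL can also be requested from a browser or any other HTTP client.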
In the results returned, you should see the time of the last backup for that forest (and of the last incremental backup, if one has been taken).
To generate that status page, the MarkLogic Admin UI code creates an aggregate: in MarkLogic, a database doesn't contain documents directly; it contains forests, and those forests contain the documents.
Continuing the example above (with a database called "test" containing 12 forests), I can query the status of every forest in the database in one go.
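A sketch of such a query, assuming the forest status elements live in the `http://marklogic.com/xdmp/status/forest` namespace and that the executing user has the privileges needed to read forest status:

```xquery
xquery version "1.0-ml";

declare namespace forest = "http://marklogic.com/xdmp/status/forest";

(: Look up the database by name, list its forests, and fetch the status of each :)
let $forest-ids := xdmp:database-forests(xdmp:database("test"))
let $statuses := xdmp:forest-status($forest-ids)
(: Use XPath to pull out just the forest names :)
return $statuses//forest:forest-name/fn:string()
```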
This returns the forest status(es) for all forests in the database "test" and extracts the forest names using XPath, so in my case, I would see the names of all 12 forests (test-1 to test-12).
The MarkLogic Admin UI interrogates each forest in turn for that database and finds the metrics for the last backup. To put that into context, we can ask every forest in the database for its last backup time.
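One way this might look, assuming the element holding the value is named `last-backup` in the forest status namespace (an assumption to check against the output of `xdmp:forest-status` on your own cluster):

```xquery
xquery version "1.0-ml";

declare namespace forest = "http://marklogic.com/xdmp/status/forest";

(: One status element per forest in the "test" database :)
let $statuses := xdmp:forest-status(xdmp:database-forests(xdmp:database("test")))
(: Assumed element name: last-backup. Forests that have never been
   backed up contribute nothing to the result. :)
return $statuses//forest:last-backup/fn:data()
```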
This gives us one xs:dateTime value per forest, so 12 values in this example.
The code (or the status report) doesn't want values for all 12 forests; it just wants the time at which the last forest completed its backup (because that's when the backup as a whole really completed), so our code wraps the query in a call to fn:max.
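A sketch of that aggregate, under the same assumptions about the namespace and the `last-backup` element name. It relies on the values being typed as xs:dateTime, as described in this article, so that fn:max compares them as dates rather than as strings or numbers.

```xquery
xquery version "1.0-ml";

declare namespace forest = "http://marklogic.com/xdmp/status/forest";

let $statuses := xdmp:forest-status(xdmp:database-forests(xdmp:database("test")))
(: The most recent per-forest completion time is the time the
   database backup as a whole completed :)
return fn:max($statuses//forest:last-backup)
```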
This gives us the max value; as these are all xs:dateTimes, that is the most recent date and time across the forests.
The same is true for the last incremental backup (note that all we're changing here is the XPath, so that it selects the correct element).
So we can get the max value for this by finding the most recent time across all forests.
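The incremental variant can be sketched the same way. The element name `last-incr-backup` is an assumption here and should be confirmed against the forest status output on your own cluster.

```xquery
xquery version "1.0-ml";

declare namespace forest = "http://marklogic.com/xdmp/status/forest";

let $statuses := xdmp:forest-status(xdmp:database-forests(xdmp:database("test")))
(: Only the XPath step changes: select the incremental backup time instead :)
let $incremental-times := $statuses//forest:last-incr-backup
(: Most recent incremental backup time across all forests :)
return fn:max($incremental-times)
```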
In this example, this would give us 2016-02-12T12:37:29.161Z
Using the ReST API
The ReST API does allow you to get this information, but you'd need to jump through a few hoops to get to it.
The ReST API status for a given database would give you the names of all the forests attached to that database.
And from there you could GET the information for each of those forests.
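The two steps above can be sketched as follows. This assumes the Management API is on localhost:8002 with digest authentication, uses placeholder credentials, and builds the forest names test-1 to test-12 by hand instead of parsing them out of the database status response. The `local:status` helper is purely illustrative.

```xquery
xquery version "1.0-ml";

(: Assumptions: Management API on localhost:8002, digest authentication,
   placeholder credentials. :)
declare variable $BASE := "http://localhost:8002/manage/v2";
declare variable $OPTIONS :=
  <options xmlns="xdmp:http">
    <authentication method="digest">
      <username>admin-user</username>
      <password>admin-password</password>
    </authentication>
  </options>;

(: Illustrative helper: fetch the body of the status view for one
   resource path, e.g. "databases/test" or "forests/test-1" :)
declare function local:status($path as xs:string) as item()*
{
  xdmp:http-get(fn:concat($BASE, "/", $path, "?view=status&amp;format=xml"), $OPTIONS)[2]
};

(: Step 1: the database status, which lists the attached forests :)
local:status("databases/test"),
(: Step 2: one status request per forest; the names are built by hand here :)
for $i in 1 to 12
return local:status(fn:concat("forests/test-", $i))
```

That is 13 HTTP requests for one database, before any of the responses have been parsed or the maximum computed.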
Once you had all those values, you could calculate the max values for them, but at this point I think it would make more sense to write a custom endpoint (an XQuery module) that returns this information.
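A minimal sketch of such a module, not a definitive implementation. The request field name `db`, the `backup-status` wrapper element and its children are all hypothetical names chosen for illustration, and the `last-backup` and `last-incr-backup` element names are assumed as in the earlier queries.

```xquery
xquery version "1.0-ml";

declare namespace forest = "http://marklogic.com/xdmp/status/forest";

(: Hypothetical request field "db": the database to report on.
   Falls back to the example database "test" when it is not supplied. :)
let $db-name := xdmp:get-request-field("db", "test")
let $statuses := xdmp:forest-status(xdmp:database-forests(xdmp:database($db-name)))
return
  (: Hypothetical response format: one aggregate value per backup type :)
  <backup-status>
    <database>{$db-name}</database>
    <last-backup>{fn:max($statuses//forest:last-backup)}</last-backup>
    <last-incr-backup>{fn:max($statuses//forest:last-incr-backup)}</last-incr-backup>
  </backup-status>
```

Saved under a hypothetical name such as backup-status.xqy on an HTTP application server, it could be requested as /backup-status.xqy?db=test by a user with the privileges needed to read forest status.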
You could then make a call to that module to get the aggregates.
This would return the backup status for whichever database is identified by the parameter that is passed in.