Knowledgebase:
Programmatically finding the last backup date
26 July 2016 02:47 PM

Introduction

Looking at the MarkLogic Admin UI, you may have noticed that the status page for a given database displays the date and time of its last backup. We have been asked in the past how this is computed so that the same check can be performed using your own code. This Knowledgebase article shows examples that utilise XQuery to get this information and explores the possibility of retrieving it using the MarkLogic ReST API.

XQuery: How does the code work?

The simple answer is that the information is in the forest status for each of the forests in the database (note that these values only appear if you have already created a backup). For the sake of these examples, let's say I have a database (called "test") which contains 12 forests (test-1 to test-12). I can get the backup status for a single forest using a call to our ReST API:

http://localhost:8002/manage/LATEST/forests/test-1?view=status&format=html

In the results returned, you should see something like:

last-backup : 2016-02-12T12:30:39.916Z datetime
last-incr-backup : 2016-02-12T12:37:29.085Z datetime

To generate that status page, the MarkLogic Admin UI code computes an aggregate: a database in MarkLogic doesn't contain documents directly; it contains forests, and those forests contain the documents.

Continuing the example above (with a database called "test" containing 12 forests), if I run the following:
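A minimal sketch of such a query, assuming the built-ins xdmp:database, xdmp:database-forests and xdmp:forest-status, with the fs prefix bound to the forest status namespace that appears in the output:

```xquery
xquery version "1.0-ml";

(: Prefix for the forest status namespace seen in the results. :)
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Resolve the database name to an ID, list its forests, fetch the
   status of each forest, then select the forest names with XPath. :)
xdmp:forest-status(
  xdmp:database-forests(xdmp:database("test"))
)//fs:forest-name
```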

This retrieves the forest status for every forest in the database "test" and uses XPath to select the forest names, so in my case, I would see:

<forest-name xmlns="http://marklogic.com/xdmp/status/forest">test-1</forest-name>
[...]
<forest-name xmlns="http://marklogic.com/xdmp/status/forest">test-12</forest-name>

The MarkLogic Admin UI interrogates each forest in turn for that database and finds the metrics for the last backup. To put that into context, if we ran the following:
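As a hedged sketch, the same forest status call could be pointed at the last-backup element instead (the database name "test" and the fs prefix are the assumptions of this example):

```xquery
xquery version "1.0-ml";

declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: One last-backup element per forest that has been backed up. :)
xdmp:forest-status(
  xdmp:database-forests(xdmp:database("test"))
)//fs:last-backup
```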

This gives us:

<last-backup xmlns="http://marklogic.com/xdmp/status/forest">2016-02-12T12:30:39.946Z</last-backup>
[...]
<last-backup xmlns="http://marklogic.com/xdmp/status/forest">2016-02-12T12:30:39.925Z</last-backup>

The code (or the status report) doesn't need values for all 12 forests; it just needs the time at which the last forest completed its backup (because that's when the backup as a whole really completed), so our code calls fn:max:
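A sketch of that aggregation, under the same assumptions as before. The explicit xs:dateTime cast is a defensive addition for this example, so that fn:max compares date-times even if the elements come back untyped; it is not necessarily what the Admin UI code does:

```xquery
xquery version "1.0-ml";

declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Most recent last-backup time across all forests of the database. :)
fn:max(
  for $t in xdmp:forest-status(
              xdmp:database-forests(xdmp:database("test"))
            )//fs:last-backup
  return xs:dateTime($t)
)
```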

This gives us the maximum value (as these are all xs:dateTime values, that is the most recent one), which in this example is:

2016-02-12T12:30:39.993Z

The same is true for the last incremental backup (note all that we're changing here is the XPath to get to the correct element):
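For illustration, assuming the element is named last-incr-backup in the forest status XML (the name shown in the ReST output earlier), the query might look like this:

```xquery
xquery version "1.0-ml";

declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Same call as for full backups; only the final XPath step differs. :)
xdmp:forest-status(
  xdmp:database-forests(xdmp:database("test"))
)//fs:last-incr-backup
```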

So we can get the max value for this by getting the most recent time across all forests:
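A sketch of that step, again assuming the last-incr-backup element name and adding a defensive xs:dateTime cast before calling fn:max:

```xquery
xquery version "1.0-ml";

declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Most recent incremental backup time across all forests. :)
fn:max(
  for $t in xdmp:forest-status(
              xdmp:database-forests(xdmp:database("test"))
            )//fs:last-incr-backup
  return xs:dateTime($t)
)
```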

This would give us 2016-02-12T12:37:29.161Z

Using the ReST API

The ReST API does allow you to get this information but you'd need to jump through a few hoops to get to it:

The ReST API status for a given database would give you the names of all the forests attached to that database:

http://localhost:8002/manage/LATEST/databases/test

And from there you could GET the information for all of those forests:

http://localhost:8002/manage/LATEST/forests/test-1?view=status&format=html
[...]
http://localhost:8002/manage/LATEST/forests/test-12?view=status&format=html

Once you had all those values, you could calculate the maximum of each; but at this point, I think it would make more sense to write a custom endpoint that returns this information, something like:
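One possible shape for such a module is sketched below. The response element names (backup-status, last-backup, last-incr-backup), the local:latest helper and the default database name "test" are all hypothetical choices made for this example; only the db request field comes from the example URL.

```xquery
xquery version "1.0-ml";

declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Database name from the "db" request field; "test" is a hypothetical default. :)
declare variable $db := xdmp:get-request-field("db", "test");

(: Status of every forest attached to that database. :)
declare variable $status :=
  xdmp:forest-status(xdmp:database-forests(xdmp:database($db)));

(: Hypothetical helper: most recent time among the given elements,
   or the empty sequence if there are none. :)
declare function local:latest($times as element()*) as xs:dateTime?
{
  fn:max(for $t in $times return xs:dateTime($t))
};

xdmp:set-response-content-type("application/xml"),
<backup-status database="{$db}">
  <last-backup>{local:latest($status//fs:last-backup)}</last-backup>
  <last-incr-backup>{local:latest($status//fs:last-incr-backup)}</last-incr-backup>
</backup-status>
```

If no backup has been taken yet, the corresponding element in this sketch would simply come back empty.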

You could then make a call to that module to get the aggregates, for example:

http://[server]:[port]/[modulename.xqy]?db=test

This would return the backup status for whichever database name is passed in as the db parameter.
