Understanding Locking in MarkLogic Using Examples
14 July 2022 02:03 PM
On a typical online transactional project, it’s not uncommon to discover at the end of the project, when running at scale, that simple tasks take much longer than expected. You’re surprised because your team knows how to avoid writing ‘bad’ queries that retrieve lots of data from disk, and when you run the relevant requests through a profiler they seem to run efficiently. What’s going on?
It’s at this point that you may well start having conversations about locking. Although you may have been told, or read, how locking works in MarkLogic at the start of the project, you pushed it to the back of your mind as there were lots of other things to think about. Now it has your attention.
Suddenly you can see that locking is something you need to know about and, given that it seems to be causing problems, it becomes something to be avoided. You may well start going through invoke or eval related contortions to avoid it. Those in turn may create a fresh set of problems, which give rise to workarounds and so on, leading to a crisis of confidence and paranoia concerning the platform itself.
The extremes of ignoring locking at the outset because it is not really understood, and overcompensating later, can both be avoided by having a sound understanding at the start.
First of all, it’s worth taking the time to read the relevant section in the documentation.
From a performance point of view, the following key points are worth emphasizing.
You will only get contention between requests if they are both update requests and at least one of those requests updates a document that is either read or updated by another running request.
The converse of that statement is that you will not get contention if the documents you are updating are not being read or updated by other concurrently running requests. If you bear this principle in mind you should be able to build an application that runs just as well at scale as it does on a laptop.
Example: Locking without contention
Examples are instructive. We base ours in the ‘Documents’ database, although any database will do.
First clear your database. Then add a single document:
We’ll use this to show that locking is fine so long as there’s no lock contention.
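The setup step might look like the following sketch. The URI /for-read-lock.xml is the one used in the rest of the article; the element name and content are assumptions chosen purely for illustration, since only the existence of the document matters here.

```xquery
xquery version "1.0-ml";

(: Sketch of the setup step. The URI comes from the article;
   the <root> element and its text are illustrative assumptions. :)
xdmp:document-insert("/for-read-lock.xml", <root>read me</root>)
```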
In a Query Console window add this code:
This will update /thread-1-output.xml (requiring an exclusive write lock), and read /for-read-lock.xml, requiring a non-exclusive read lock. We deliberately hold the transaction open with a sleep statement for 20 seconds so we can see the effects of locking if they occur.
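A block matching this description could be sketched as follows. This is a reconstruction under assumptions rather than the exact listing: the element names are made up, and the value read from /for-read-lock.xml is copied into the output only so that the read is certain to be evaluated.

```xquery
xquery version "1.0-ml";

(: Thread 1 (sketch). Because the module calls an update function,
   it runs as an update transaction, so the read below takes a read lock. :)
let $doc := fn:doc("/for-read-lock.xml")
return (
  (: Takes an exclusive write lock on /thread-1-output.xml.
     The <out> element name is an illustrative assumption. :)
  xdmp:document-insert("/thread-1-output.xml",
    <out>{fn:string($doc/*)}</out>),
  (: Hold the transaction, and its locks, open for 20 seconds. :)
  xdmp:sleep(20000),
  (: Report how long the request took. :)
  xdmp:elapsed-time()
)
```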
In a second Query Console window add:
This will update /thread-2-output.xml (requiring an exclusive write lock) and read /for-read-lock.xml, again requiring a non-exclusive read lock.
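Under the same assumptions as before (illustrative element names, not the exact original listing), the second block could be sketched like this. It has no sleep, so its elapsed time reflects only the work itself plus any time spent waiting for locks.

```xquery
xquery version "1.0-ml";

(: Thread 2 (sketch). Reads the shared document and writes its own output. :)
let $doc := fn:doc("/for-read-lock.xml")
return (
  (: Exclusive write lock on /thread-2-output.xml only. :)
  xdmp:document-insert("/thread-2-output.xml",
    <out>{fn:string($doc/*)}</out>),
  xdmp:elapsed-time()
)
```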
Now run the first block in the first window and, as quickly as you can, run the second block in the second window so that the two overlap. You will see the second block returns almost immediately with something like:
The elapsed time shows the update returned almost instantly. However, the first update will not return for around 20 seconds. The point of this is that although they’re both updates, and they are both reading the same document, /for-read-lock.xml, there is no contention. If there were, the second update would have to wait until the first update completed, and would therefore also take around 20 seconds to complete.
Example: Locking with contention
Now we do the same thing, but using a different second thread.
Here the server will take a read lock on /thread-1-output.xml and will require an exclusive write lock on /thread-2-output.xml. This time, however, we will have contention: thread 2 is trying to read a document that thread 1 is updating, so it must wait for thread 1’s write lock to be released.
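A contending second block could be sketched as below. As with the earlier sketches, the element names are assumptions; the only essential change is that the block now reads /thread-1-output.xml, the document that thread 1 holds a write lock on.

```xquery
xquery version "1.0-ml";

(: 'Bad' thread 2 (sketch). Reads the document that thread 1 is updating. :)
let $doc := fn:doc("/thread-1-output.xml")
return (
  (: Still writes only its own output document. :)
  xdmp:document-insert("/thread-2-output.xml",
    <out>{fn:string($doc/*)}</out>),
  (: Includes any time spent blocked waiting for the read lock. :)
  xdmp:elapsed-time()
)
```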
If we again run the first block in the first window, and quickly run the second block in the second window, the second block will this time take around 20 seconds to complete:
The elapsed time shows it took around 20 seconds to complete. This is because thread two blocks, waiting for read access to the exclusively locked /thread-1-output.xml.
Using xdmp:transaction-locks to identify blocking locks
By inspecting the code above, we can see that there is contention on /thread-1-output.xml. Sometimes the contention is less obvious. In version 9, MarkLogic introduced xdmp:transaction-locks, which can help in troubleshooting such problems. It takes a host ID and a transaction ID as arguments. Add a small amount of XQuery and we can quickly use it to get more insight into locking problems.
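As a rough sketch of what such a query might look like, the block below walks the transactions reported by xdmp:host-status for every host and calls xdmp:transaction-locks on each. The element names taken from the host status output (transaction, transaction-id, start-time) and its namespace are assumptions to check against your server version, and the wrapper element is made up for illustration. The user running it also needs sufficient privileges to view status and locks.

```xquery
xquery version "1.0-ml";

(: Assumed namespace of the xdmp:host-status output. :)
declare namespace hs = "http://marklogic.com/xdmp/status/host";

for $host-id in xdmp:hosts()
for $txn in xdmp:host-status($host-id)//hs:transaction
(: Earliest start time first, so the longest-running transaction is on top. :)
order by xs:dateTime($txn/hs:start-time) ascending
return
  (: <transaction-report> is a hypothetical wrapper for readability. :)
  <transaction-report>
    { $txn/hs:transaction-id }
    { $txn/hs:start-time }
    { xdmp:transaction-locks($host-id,
        xs:unsignedLong($txn/hs:transaction-id)) }
  </transaction-report>
```

Sorting on start time is one simple way to surface long-held locks first; on a busy cluster you may prefer to filter to a single host or database instead.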
As before, we run thread 1 and the ‘bad’ thread 2, followed by (in another window):
This iterates over all running transactions to show us our locks, sorting the longest running to the top. My output is:
The item <waiting>/thread-1-output.xml</waiting> in the second section shows I have a thread blocking on /thread-1-output.xml. Knowing this will aid me in diagnosing the source of my locking problem. Note that the problem could have been more subtle: for example, both threads might read every document in a collection while each updates a different document in it.
Whatever your requirements, with a little planning it should be possible to prevent locking from causing unexpected performance issues. Should you run into problems, however, diagnostic tools should help you identify where the difficulties are. Finally, locks are ultimately a good thing, as without them we would not be able to write consistent and predictable applications. Understanding them allows you to benefit from their use while avoiding unnecessary side effects.