Knowledgebase:
When submitting lots of parallel queries, some subset of those queries take much longer - why?
27 March 2014 03:09 PM

MarkLogic Server is designed to scale horizontally, and goes to great effort to make sure queries can be parallelized independently of one another. Nevertheless, users occasionally find that, when queries are invoked in parallel, some subset of them takes much longer than usual to execute. These longer-running parallel invocations are almost always explained by the fact that a query's runtime is determined not only by

a. the runtime of the query in isolation (in other words, absent any other activity on the database) but also

b. the time that query instance spends waiting on resources necessary for it to make forward progress.

Resolving this kind of performance issue requires a careful analysis of how queries use resources as they work their way through your application stack. For example, if you have a web or application server in front of MarkLogic Server, how many threads do you have configured? How does that number compare to the number of threads configured on the MarkLogic application server to which it's connected? If the number of MarkLogic application server threads is much smaller than the number of potential incoming requests, then some of your queries will be fast because all they need to do is execute - and some will be slower because, in addition to the time needed to execute, they'll also need to wait for a MarkLogic application server thread to free up. In this case, you'll want to bring the two thread counts into better alignment with one another - either by reducing the number of threads on the web or application server in front of MarkLogic, or by increasing the number of MarkLogic application server threads - or both.
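
The effect of a thread mismatch can be illustrated with a small simulation. This is only a rough sketch and is not MarkLogic code: the function name `simulate_latencies`, the fixed one-second service time, and the assumption that all requests arrive at the same instant are hypothetical simplifications chosen for illustration.

```python
import heapq

def simulate_latencies(num_requests, server_threads, service_time):
    """Return each request's latency when all requests arrive at time 0.

    Hypothetical model: every request needs exactly `service_time` seconds
    of work and must first obtain one of `server_threads` worker threads.
    """
    # Each heap entry is the time at which one server thread becomes free.
    free_at = [0.0] * server_threads
    heapq.heapify(free_at)

    latencies = []
    for _ in range(num_requests):
        start = heapq.heappop(free_at)   # wait for the earliest free thread
        finish = start + service_time    # then do the actual work
        heapq.heappush(free_at, finish)
        latencies.append(finish)         # arrival time is 0, so latency == finish
    return latencies

# Eight incoming requests against eight server threads: nobody waits.
balanced = simulate_latencies(num_requests=8, server_threads=8, service_time=1.0)

# Eight incoming requests against two server threads: most requests queue.
starved = simulate_latencies(num_requests=8, server_threads=2, service_time=1.0)

print(balanced)
print(starved)
```

In this model the query itself never gets slower. The requests that look slow are the ones that spent time queued for a thread, which is the pattern described above.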

In general, you'll want to minimize the amount of time queries spend waiting for resources. Application server threads are just one example of a resource on which queries can wait. Queries can also wait for all sorts of other resources to free up - threads, RAM, storage I/O, read or write locks, etc. Ultimately, if you're seeing a performance profile where a single query invocation is fast but, among parallel invocations, some are fast and some are slow (often visible as a higher average query runtime and a larger standard deviation of query runtimes), then you're very likely to have a resource bottleneck somewhere in your application stack. Resolving such a bottleneck will involve some combination of increasing the amount of the constrained resource, reducing the amount of parallel traffic, or improving the overall efficiency of any one instance of your query.
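
One way to check for this profile is to compare the mean and standard deviation of query runtimes measured in isolation against those measured under parallel load. The sketch below assumes you have already collected runtimes in seconds. The sample values and the helper name `summarize` are hypothetical and are not taken from any real system.

```python
import statistics

def summarize(runtimes):
    """Return (mean, population standard deviation) for a list of runtimes."""
    return statistics.mean(runtimes), statistics.pstdev(runtimes)

# Hypothetical measurements, in seconds.
isolated_runtimes = [1.0, 1.0, 1.0, 1.0]                      # one query at a time
parallel_runtimes = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]  # same query, run in parallel

isolated_mean, isolated_stdev = summarize(isolated_runtimes)
parallel_mean, parallel_stdev = summarize(parallel_runtimes)

print(f"isolated: mean={isolated_mean:.2f}s stdev={isolated_stdev:.2f}s")
print(f"parallel: mean={parallel_mean:.2f}s stdev={parallel_stdev:.2f}s")
```

If the fastest parallel runtime matches the isolated runtime while the parallel mean and standard deviation are both noticeably higher, that suggests the extra time is being spent waiting rather than executing.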
