22 January 2024 10:26 AM
Update and delete operations can be resource-intensive and can degrade search performance when they are done the conventional way, with data updated or deleted in place. To avoid these impacts, MarkLogic Server performs updates and deletes "lazily."
In MarkLogic Server, deleting a document does not remove it from disk immediately. Instead, the document's fragments are marked as "obsolete," which tags them for later removal and hides them from subsequent query results. Updates work in a similar way: rather than updating in place, MarkLogic Server marks the old versions of the fragments in the old stand as "obsolete" for later deletion and creates new versions of those fragments in a new stand (initially an in-memory stand, which is eventually written to disk as a new on-disk stand).
Eventually, merges move the unchanged fragments from old stands into a new stand. Once the merge that creates the new stand finishes, the old stands used as its input are removed from disk, and the fragments marked obsolete are deleted along with them. Merging is very important: it is the mechanism by which MarkLogic Server frees up disk space, optimizes its on-disk data structures, and reduces the number of fragments evaluated during queries and searches.
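The lazy delete, update, and merge behavior described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not MarkLogic Server's actual implementation: the names `Fragment`, `Stand`, `ToyForest`, `flush`, and `stored_fragments` are hypothetical, each document is treated as a single fragment, and a merge is assumed to combine all stands at once.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    uri: str
    content: str
    obsolete: bool = False  # set instead of removing the fragment in place


@dataclass
class Stand:
    fragments: list = field(default_factory=list)


class ToyForest:
    """Hypothetical toy model of lazy deletes; not MarkLogic's real design."""

    def __init__(self):
        # The last stand plays the role of the current in-memory stand.
        self.stands = [Stand()]

    def delete(self, uri):
        # A delete only marks fragments obsolete; nothing is removed yet.
        for stand in self.stands:
            for frag in stand.fragments:
                if frag.uri == uri:
                    frag.obsolete = True

    def insert(self, uri, content):
        # An update marks any old version obsolete, then writes a new
        # version into the newest stand.
        self.delete(uri)
        self.stands[-1].fragments.append(Fragment(uri, content))

    def flush(self):
        # Start a fresh stand, as if the in-memory stand were written to disk.
        self.stands.append(Stand())

    def query(self):
        # Obsolete fragments are hidden from query results.
        return {f.uri: f.content
                for s in self.stands for f in s.fragments if not f.obsolete}

    def stored_fragments(self):
        # Counts every fragment still stored, obsolete or not.
        return sum(len(s.fragments) for s in self.stands)

    def merge(self):
        # Copy live fragments into one new stand, then drop the input stands.
        live = [f for s in self.stands for f in s.fragments if not f.obsolete]
        self.stands = [Stand(live)]


forest = ToyForest()
forest.insert("/a.xml", "v1")
forest.insert("/b.xml", "v1")
forest.flush()
forest.insert("/a.xml", "v2")      # update: old /a.xml is marked obsolete
forest.delete("/b.xml")            # delete: /b.xml is marked obsolete
print(forest.query())              # {'/a.xml': 'v2'}
print(forest.stored_fragments())   # 3 (two obsolete fragments still stored)
forest.merge()
print(forest.stored_fragments())   # 1 (space reclaimed only after the merge)
```

In this model, queries see the correct result immediately after the update and delete, but the storage cost of the obsolete fragments remains until `merge()` runs.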
Note that for a merge-min-ratio of n, you can expect up to 1/(n+1) of a stand to be deleted fragments before the stand is automatically merged. See Overview of the Merge Policy Controls.
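As a quick illustration of the 1/(n+1) bound, the short sketch below evaluates it for a few sample values of n. The function name `max_obsolete_fraction` is hypothetical, the sample values are chosen only for illustration, and the restriction to positive integers is an assumption of this sketch rather than a statement about the values the setting accepts.

```python
from fractions import Fraction


def max_obsolete_fraction(merge_min_ratio: int) -> Fraction:
    """Return the 1/(n+1) upper bound for a merge-min-ratio of n."""
    if merge_min_ratio < 1:
        # Assumption of this sketch: only positive ratios are considered.
        raise ValueError("merge_min_ratio must be a positive integer")
    return Fraction(1, merge_min_ratio + 1)


for n in (1, 2, 4):
    frac = max_obsolete_fraction(n)
    print(f"merge-min-ratio={n}: up to {frac} ({float(frac):.0%}) deleted fragments")
# merge-min-ratio=1: up to 1/2 (50%) deleted fragments
# merge-min-ratio=2: up to 1/3 (33%) deleted fragments
# merge-min-ratio=4: up to 1/5 (20%) deleted fragments
```

A larger merge-min-ratio therefore lowers the share of a stand that can consist of deleted fragments before an automatic merge.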
While lazy deletion results in faster updates and deletes, be aware that obsolete fragments continue to occupy disk space and can affect query performance if merges are not done in a timely manner.