Solutions

MarkLogic Data Hub Service

Fast data integration + improved data governance and security, with no infrastructure to buy or manage.

Learn More

Learn

Stay On Top Of Everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

Community

Stay On Top Of Everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

Company

Stay On Top Of Everything MarkLogic

Be the first to know! News, product information, and events delivered straight to your inbox.

Sign Me Up

 
Knowledgebase:
How much free space is needed for the Archive Journal files in a backup?
25 September 2019 11:05 AM

Introduction

This Knowledgebase article is a general guideline for backups using the journal archiving feature for both free space requirements and expected file sizes written to the archive journaling repository when archive journaling is enabled and active.

The MarkLogic environment used here was an out-of-the box version 9.x with one change of adding a new directory specific to storing the archive journal backup files.

It is assumed that the reader of this article already has a basic understanding of the role of Journal Archiving in the Backup and Restore feature of MarkLogic Server. See references below for further details(below).

How much free space is needed for the Archive Journal files in a backup?

MarkLogic Server uses the forest size of the active forest to confirm whether the journal archive repository has enough free space to accommodate that forest, but if additional forests already exist on the same volume, then there may be an issue in the Server's "free-space" calculation as the other forests are never used in the algorithm that calculates the free space available for the backup and/or archive journal repositories. Only one forest is used in the free-space calculation.

In other words, if multiple forests exist on the same volume, there may not be enough free space available on that specific volume due to the additional forests; especially during a high rate of ingestion. If that is the case, then it is advised to provide enough free space on that volume to accommodate the sizes of all the forests. Required Free Space(approximately) = (Number of Forests) x (Size of largest Forest).

What can we expect to see in the journal archiving repository in terms of files sizes for specific ingestion types and sizes? That brings us to the other side.

How is the Journal Archive repository filling up?

1 MByte of raw XML data loaded into the server (as either a new document ingestion or a document update) will result in approximately 5 to 6 MBytes of data being written to the corresponding Journal Archive files.  Additionally, adding Range Indexes will contribute to a relatively small increase in consumed space.

Ingesting/updating RDF data results in slightly less data being written to the journal archive files.

In conclusion, for both new document ingestion and document updates, the typical expansion ratio of Journal Archive size to Input file size is between 5 an 6 but can be higher than that depending on the document structure and any added range indexes.

References:

(0 vote(s))
Helpful
Not helpful

Comments (0)