Knowledgebase:
Using mlstat to gather information on MarkLogic performance
03 February 2014 04:20 PM

Introduction

mlstat is a command-line tool that monitors various aspects of MarkLogic Server performance. It runs on the MarkLogic node itself and is modeled on classic Unix tools such as vmstat and mpstat. It is designed to be always on, running in the background and redirecting its output to a file. It can tag each line of output with an epoch or timestamp so that the data can be correlated with an event.

Note: this command-line tool has been replaced by the Monitoring History functionality in MarkLogic 7 (and subsequent releases of the product). As such it is no longer under active development by MarkLogic.

You can learn more about the new Monitoring History features by following this link: http://docs.marklogic.com/guide/monitoring/history

Design

mlstat is a bash script that calls other tools at regular intervals, compares each sample with the previous one and normalizes the difference to a per-second rate. The tools it uses are:

  • XQuery script stats.xqy to get docs inserted, forests, stands, active_merges, merge_read_bytes, merge_write_bytes, journal_write_bytes, journal_write_rate, save_write_rate, save_write_bytes, in-memory MB and on-disk size
  • XQuery script http-server-status.xqy to get stats about an HTTP or XDBC server
  • XQuery script get-hosts.xqy to get a list of hosts
  • iostat to get disk and cpu performance data
  • vmstat to get runnable and blocked processes, swap in/out and context switch performance data
  • pmap to get memory sizes for anon and mapped files
  • /proc to get memory sizes for the MarkLogic process
  • ifconfig to calculate network bandwidth
  • The MarkLogic log for interesting events such as saves and merges
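
The compare-and-normalize step described above can be illustrated with a minimal sketch. It is not taken from the mlstat source; the function name rate_per_sec and the sample counter values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of mlstat: convert two samples of a
# cumulative counter into a per-second rate over the measured interval.
rate_per_sec() {
    prev=$1      # counter value at the previous sample
    cur=$2       # counter value at the current sample
    elapsed=$3   # seconds that actually elapsed between the two samples
    awk -v p="$prev" -v c="$cur" -v e="$elapsed" \
        'BEGIN { if (e <= 0) { print "0.0"; exit } printf "%.1f\n", (c - p) / e }'
}

# Example: a counter that grew from 1000 to 1600 over 10 seconds
rate_per_sec 1000 1600 10   # prints 60.0
```

Dividing by the measured elapsed time rather than the requested interval keeps the rate accurate when a sample takes longer than expected to collect.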

Assumptions

  • mlstat currently only runs on Linux machines
  • mlstat assumes that iostat, pmap and vmstat are available on the system (iostat is provided by the sysstat package)
  • mlstat assumes that iostat, vmstat and ifconfig are in the user's $PATH
  • mlstat assumes that the XQuery files stats.xqy, get-hosts.xqy and http-server-status.xqy have been installed in the MarkLogic/Admin directory. If not, mlstat will exit.
  • MarkLogic must be running for database statistics to be displayed
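
Before starting mlstat it may be worth confirming that the external tools are reachable. The following is a minimal sketch of such a check; the function name missing_tools is hypothetical and the check is not part of mlstat itself.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of mlstat: print the name of each
# tool given as an argument that cannot be found in $PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools named in the assumptions above
missing=$(missing_tools iostat vmstat pmap ifconfig)
if [ -n "$missing" ]; then
    echo "Not found in PATH:" $missing
else
    echo "All required tools found"
fi
```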

Options

Use the -h flag for options

Database stats
-d <database>		Database to monitor
-j			Journal stats
-s			Save Stats
-m			Merge stats
-a			In-memory and disk sizes
-g			Docs Ingested, Deletes, Re-indexes, stands
-q			Query mb
-c			Forest cache stats
-v			Verbose cache stats
-I			ML view of I/O
-B			Backup and Restore stats
-R			Replication Send and Receive stats
-l <file location>	Location of log for scraping
-b <filename>		Dump log events to a separate file
-L			Dump log events to stdout
-o <http-server>	Http server stats
-x <xdbc-server>	xdbc server stats
-r			Dump stats for Replica forests, not regular forests
System stats
-y			Linux stats - cpu, runnables, blocked,swap
-n <network name>	Network interface to monitor
-k <disk name>		Stripe or disk to monitor
-A			Aggregate all the disk stats into 1 number
-M			Dump memory stats from pmap of the MarkLogic process (requires root)
-S			Dump memory stats from /proc of MarkLogic process
Control
-U <name>		ML User name other than admin
-P <passwd>		ML Passwd other than admin
-f			Dump stats in comma-separated list for csv
-e			Include Epoch per line
-t			Include Timestamp per line
-i <interval>		Set interval, default 10
-p <count>		Number of samples to take
-H <hostname>		Dumps stats for just one host in cluster
-N <hostname>		Run mlstat on this node, default is localhost
-C <comment>		Prepend this comment to each line
-X			Suppress headers

Running mlstat

The only required flag is -d if you are tracking one of the database statistics. The -d parameter specifies which database to extract performance data from.

However, no flags means no data; there is no default set of statistics.

By default mlstat prints on a 10 second interval. Use the -i flag to change this. mlstat measures the actual interval taken and uses this value for all of the rate stats it calculates.

It is recommended to add a timestamp to each line of mlstat output, making it possible to plot results later and pinpoint performance issues (use the -e, -t or -D flags).
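
Output collected with -e can be converted to readable timestamps afterwards. The following is a minimal sketch, assuming GNU date (mlstat runs on Linux) and assuming the epoch is the first whitespace-separated field on each line; the function name stamp_lines and the sample line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of mlstat: replace a leading Unix epoch on
# each input line with a UTC timestamp. Relies on GNU date's "-d @<epoch>".
stamp_lines() {
    while read -r epoch rest; do
        case $epoch in
            ''|*[!0-9]*)
                # First field is not a plain integer (for example a header
                # line): print the line without conversion
                printf '%s %s\n' "$epoch" "$rest" ;;
            *)
                printf '%s %s\n' "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$rest" ;;
        esac
    done
}

# Made-up sample line: epoch followed by two values
printf '%s\n' '1388534400 120 4' | stamp_lines   # prints 2014-01-01 00:00:00 120 4
```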

Due to the potential size of the Error Log, checking the log is not enabled by default. However, if you specify the -s (saves) or -m (merges) flags, the ErrorLog file will be scraped to get save and merge counts on this node.

Like other tools, mlstat dumps a header every 10 samples. To suppress this header, specify the -X flag.

Also like other tools, you can restrict the number of samples to collect with the -p flag.

mlstat can be run on multiple nodes at the same time. In this mode it is highly recommended to use -H <node name> to collect the data for that particular node.

Generating CSV output from mlstat

mlstat with the -f option produces comma-delimited (CSV) output that can be used to generate graphs in Excel or other tools.
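
CSV output can also be summarized from the command line. The following is a minimal sketch that averages one column; the function name csv_column_avg, the column names and the sample values are hypothetical, since the exact column layout depends on the flags passed to mlstat.

```shell
#!/bin/sh
# Hypothetical helper, not part of mlstat: average one numeric column of a
# comma-separated file. Lines whose field is not a number (for example a
# header line) are skipped.
csv_column_avg() {
    file=$1
    col=$2   # 1-based column index
    awk -F',' -v c="$col" '
        $c ~ /^[ ]*-?[0-9]+(\.[0-9]+)?[ ]*$/ { sum += $c; n++ }
        END { if (n > 0) printf "%.2f\n", sum / n; else print "n/a" }
    ' "$file"
}

# Made-up sample data: an epoch column and one rate column
sample=$(mktemp)
printf '%s\n' 'epoch,docs_per_sec' '1391444400,100' '1391444410,140' '1391444420,120' > "$sample"
csv_column_avg "$sample" 2   # prints 120.00
rm -f "$sample"
```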

Sections of mlstat output

  • If the -e flag is specified, mlstat prints the Linux epoch at the start of every line. This is extremely useful for plotting data
  • Specifying -t converts this epoch to a timestamp using the Linux date command
  • Specifying -D emits both a date and a time, which is handy for tests that run over a number of days
  • By using the -b or -L flags, Save and Merge events written to the MarkLogic log can be printed by mlstat
    • Use the -b option to redirect this output to a file
    • Use the -L option to print this output to standard out
  • If the ErrorLog is being written to a different location, use -l to indicate this to mlstat
  • If the -m or -s flags are used then the log file will be scraped to get counts of merges and saves respectively
  • By specifying -H, just the ML stats for the specified node will be printed. It is important to use the fully qualified node name, e.g. foobar.example.com, as defined in the cluster or the data cannot be extracted
  • The -j option simply prints the MB/s of journal files written to disk. By default this is for all nodes; -H specifies a particular node
  • The -s option prints the number of saves of in-memory stands to disk and the MB/s of Save data for the cluster. Again, by default this is for all forests; use -H for a single node
  • The -m option prints the number of Merges currently active (A-Mergs), completed in the last period (C-Mergs) and the MBs per second Reads and Writes for merges across the entire cluster (note that Merge-rMB/s usually does not equal Disk I/O, because a good percentage of the reads will be satisfied by the Linux filesystem cache)
  • The -a option dumps the size of the in-memory stands and the current size of the stands on disk. Again, if -H is used then only the space for that node is displayed
  • The -g flag dumps Docs ingested, Deleted and Re-indexed per second and current Stands in the database. The stand count will include both in-memory stands and on-disk stands
  • The -q flag dumps the MB per second read from disk for queries. This is an approximation of query processing load
  • With the -c flag you can dump the hit rates of the List cache (LC) and Compressed Tree Cache (CTC)
  • The -I flag gives a view of I/O from inside the MarkLogic Server
  • The -B flag measures the MB/s for backup and restore. It dumps the 512KB Read and Write ops per second for both backup and restore. It also dumps MarkLogic's internal measurement of load for these operations. Finally, based on the cpu time spent on the operation, it calculates latency
  • The -R flag dumps the send and receive KB per second for database replication. Note this does not represent network traffic for local disk replication
  • With -o or -x, stats from an HTTP or XDBC server can be dumped. By default the query rate, current count of outstanding requests, number of outstanding update requests, active threads in the server and the Expanded Tree Cache (ETC) hit rate are dumped. Note that the name of the HTTP/XDBC server is added to the heading of each field
  • Adding the -v flag dumps statistics for the other caches (fs program cache, db program cache, env program cache, fs main module sequence cache, db main module sequence cache, fs library module cache, db library module cache). These caches do not tend to be an issue and are included for completeness
  • The -y flag dumps the breakdown of cpu time spent, runnable and blocked processes, swap in and out and context switches per second for this node. The cpu breakdown shows percent user, nice, system, idle, iowait and steal. The nice percentage is an indication of how much cpu is being spent on merges
  • By specifying -k <disk-name> mlstat will dump the I/O statistics on this node. The device can be a stripe such as md0 or individual disks such as /dev/xvdl. (Users can specify multiple disks using the | character. Note the | must be escaped on Linux so the command would be -k xvdl\|xvdm)
  • If multiple disks are specified then the -A flag can be included to aggregate the data from all these disks.
  • By specifying -n mlstat will dump the Network Bandwidth in Kbits per second
  • The -M flag uses the Linux utility pmap to determine how much memory has been allocated to Anon and Memory mapped files in the MarkLogic process. For each it has two fields, memory in MB requested and the current Resident Set Size (RSS) of the allocation. The RSS indicates how much memory Linux has actually assigned
  • On some Linux systems, notably Red Hat, you need root permission to use pmap on the MarkLogic daemon process. As an alternative, most of the same data can be acquired via /proc. The -S flag uses /proc to collect RSS and process size information
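
The escaping needed for the -k option can be illustrated with a minimal sketch. It assumes that the disk names are used as a regular-expression alternation, which is a guess about how mlstat treats the argument; the function name select_disks and the sample lines are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration, not taken from mlstat: keep only the lines
# whose first field matches one of several device names.
select_disks() {
    grep -E "^($1)[[:space:]]"
}

# An unescaped, unquoted | would start a shell pipeline. Escaping it (or
# quoting the whole argument) passes the literal string xvdl|xvdm instead.
pattern=xvdl\|xvdm
printf '%s\n' "$pattern"   # prints xvdl|xvdm

# Made-up device lines: only the xvdl and xvdm lines are printed
printf '%s\n' 'xvda 1 2' 'xvdl 3 4' 'xvdm 5 6' | select_disks "$pattern"
```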

Example usage

./mlstat -d Documents -g
Monitor the ingest rate and stand count for the Documents database
./mlstat -d Documents -g -t
Monitor the ingest rate and stand count for the Documents database with a timestamp on each line
./mlstat -d Documents -g -t -j -m -s
Monitor the ingest rate, Journal MB/s, Merge read and write MB/s, Save MB/s and stand count for the Documents database with a timestamp on each line
./mlstat -d Documents -g -t -j -m -s -y
Monitor the ingest rate, Journal MB/s, Merge read and write MB/s, Save MB/s and stand count for the Documents database. Add the cpu stats for this node. With a timestamp on each line
./mlstat -d Documents -g -t -j -m -s -y -i 60
Monitor the ingest rate, Journal MB/s, Merge read and write MB/s, Save MB/s and stand count for the Documents database. Add the cpu stats for this node. With a timestamp on each line and set the interval to every 60 seconds
./mlstat -d Documents -g -t -j -m -s -y -i 60 -n eth0
Monitor the ingest rate, Journal MB/s, Merge read and write MB/s, Save MB/s and stand count for the Documents database. Monitor the cpu stats and network eth0 for this node. With a timestamp on each line and set the interval to every 60 seconds
./mlstat -d Documents -B
Monitor the Backup and Restore I/O for the Documents database
./mlstat -d Documents -R
Monitor the Database replication network traffic for the Documents database
./mlstat -d Documents -I
Dump MarkLogic's view of I/O for the Documents database
./mlstat -k xvdl\|xvdm
Monitor two disks, xvdl and xvdm on this server
./mlstat -k xvdl\|xvdm -A
Monitor two disks, xvdl and xvdm on this server but accumulate their stats
./mlstat -M
Monitor the memory usage of the MarkLogic daemon using pmap
./mlstat -S
Monitor the memory usage of the MarkLogic daemon using /proc
./mlstat -d Documents -x 9000-xcc
Monitor the XDBC server 9000-xcc on the Documents database
./mlstat -d Documents -x 9000-xcc -v
Monitor the XDBC server 9000-xcc on the Documents database and add its cache hit rates
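
Because mlstat is designed to run continuously in the background with its output redirected to a file, a small wrapper may be convenient. The following is a minimal sketch; the function name start_mlstat and the file paths are hypothetical, and only flags listed under Options are used.

```shell
#!/bin/sh
# Hypothetical wrapper, not shipped with mlstat: start mlstat in the
# background, append its output to a file and print the process ID.
start_mlstat() {
    mlstat_bin=$1   # path to the mlstat script
    out_file=$2     # file that receives the mlstat output
    if [ ! -x "$mlstat_bin" ]; then
        echo "mlstat not found or not executable: $mlstat_bin" >&2
        return 1
    fi
    # Database Documents, ingest stats, journal stats, epoch on each line,
    # 60 second interval
    nohup "$mlstat_bin" -d Documents -g -j -e -i 60 >> "$out_file" 2>&1 &
    echo $!
}
```

A call such as start_mlstat ./mlstat /var/tmp/mlstat.out would then print the process ID, which can be passed to kill when monitoring is no longer needed.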

Download

You can download all the required files for mlstat in the zip file (mlstat.zip) attached to this KnowledgeBase article (see below)



Attachments 
 
 mlstat.zip (18.79 KB)