Knowledgebase:
Should I upgrade to a 10GB network?
28 April 2015 01:13 PM

SUMMARY

Some MarkLogic Server sites are intalled in a 1GB network environment. At some point, your cluster growth may require an upgrade to 10GB ethernet. Here are some hints for knowing when to migrate up to 10GB ethernet, as well as some ways to work around it prior to making the move to 10GB.

General Approach

A good way to check if you need more network bandwidth is to monitor the network packet retransmission rate on each host.  To do this, use the "sar -n EDEV 5" shell command. [For best results, make sure you have an updated version of sar]

Sample results:

# sar -n EDEV 5 3
... 10:41:44 AM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s 10:41:49 AM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10:41:49 AM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10:41:49 AM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s 10:41:54 AM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10:41:54 AM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10:41:54 AM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s 10:41:59 AM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 10:41:59 AM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Average: IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s Average: lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Average: eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00


Explanation of terms:

FIELDDESCRIPTION
IFACE LAN interface
rxerr/s Bad packets received per second
txerr/s Bad packets transmitted per second
coll/s Collisions per second
rxdrop/s Received packets dropped per second because buffers were full
txdrop/s Transmitted packets dropped per second because buffers were full
txcarr/s Carrier errors per second while transmitting packets
rxfram/s Frame alignment errors on received packets per second
rxfifo/s FIFO overrun errors per second on received packets
txfifo/s FIFO overrun errors per second on transmitted packets

If the value of txerr/s and txcarr/s is none zero, that means that the packets sent by this host are being dropped over the network, and that this host needs to retransmit.  By default, a host will wait for 200ms to see if there is an acknowledgment packet before taking this retransmission step. This delay is significant for MarkLogic Server and will factor into overall cluster performance.  You may use this as an indicator to see that it's time to upgrade (or, debug) your network. 

Other Considerations

10 gigabit ethernet requires special cables.  These cables are expensive, and easy to break.  If a cable is just slightly bent improperly, you will not get 10 gigabit ethernet out of it. So be sure to work with your IT department to insure that everything is installed as per the manufaturer specification. Once installed, double-check that you are actually getting 10GB from the installed network.

Another option is to use bonded ethernet to increase network bandwidth from 1GB to 2GB and to 4GB prior to jumping to 10GB.  A description of Bonded ethernet lies beyond the scope of this article, but your IT department should be familiar with it and be able to help you set it up.

 

(0 vote(s))
Helpful
Not helpful

Comments (0)