Feature #1894

Replication Speed over WAN

Added by Paul Bucher about 9 years ago. Updated over 7 years ago.




Replication setup via the GUI over a WAN is slow.


I've set up replication via the GUI for 3 different servers: 2 PUSH and 1 PULL. One PUSH server is local to the PULL server and is able to clock over 800Mbps. The second PUSH is remote and can only push 6Mbps (I've got a 100Mbps internet connection locally, and the remote is in a data center with multiple 40Gbps fiber connections). I do nightly snapshots on this server that produce about 50GB of data a night, so this isn't going to work out well at 6Mbps. With this setup I'm using NAT on my pfSense appliance to hit FreeNAS. I do have the High Speed Cipher checkbox checked. top is showing virtually no CPU usage.
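To put the nightly 50GB figure in perspective, here is a minimal sketch of the transfer-time arithmetic. It assumes 50GB means 50 * 10**9 bytes and that every rate quoted in this ticket is in megabits per second; the function name is illustrative only.

```python
# Rough transfer-time estimate for the nightly snapshot delta.
# Assumptions: 1 GB = 10**9 bytes, rates are megabits per second,
# and protocol overhead is ignored.

def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mbps megabits/s."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (rate_mbps * 1e6)
    return seconds / 3600


# 6 = GUI replication over the WAN, 24 = netcat over the VPN,
# 100 = the local internet connection, 800 = the local PUSH server.
for rate in (6, 24, 100, 800):
    print(f"{rate:>4} Mbit/s: {transfer_hours(50, rate):6.2f} h")
```

Under these assumptions the 6Mbps rate needs roughly 18.5 hours per night, which leaves very little slack in a 24 hour cycle.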

If I manually replicate using netcat from the CLI and run it through the VPN tunnel that exists between the 2 networks, going through the same pfSense appliances, I'm able to average about 24Mbps, so that rules out most networking issues. Here are the commands I'm using to do this:

On Push:
zfs send -i san/softlayer/esxi_store@auto-20121016.2000-2m san/softlayer/esxi_store@auto-20121017.2000-2m |  nc 8023

On Pull:
nc -l 8023 |  zfs recv san/lancaster/backup/softlayer/esxi_store

Also, here is a set of pings showing the latency to the remote network:

64 bytes from icmp_seq=0 ttl=51 time=60.951 ms
64 bytes from icmp_seq=1 ttl=51 time=74.206 ms
64 bytes from icmp_seq=2 ttl=51 time=65.651 ms
64 bytes from icmp_seq=3 ttl=51 time=57.853 ms
64 bytes from icmp_seq=4 ttl=51 time=77.802 ms
64 bytes from icmp_seq=5 ttl=51 time=108.533 ms
64 bytes from icmp_seq=6 ttl=51 time=95.087 ms
64 bytes from icmp_seq=7 ttl=51 time=62.734 ms
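As a rough illustration of what this latency means for a single TCP stream, the sketch below averages the round-trip times from the ping lines above and computes the bandwidth-delay product for the 100Mbps local connection. The function names are illustrative, and it assumes these eight samples are representative of the path.

```python
import re
from statistics import mean

# Ping lines copied from this ticket (the host address is not shown there).
PING_OUTPUT = """\
64 bytes from icmp_seq=0 ttl=51 time=60.951 ms
64 bytes from icmp_seq=1 ttl=51 time=74.206 ms
64 bytes from icmp_seq=2 ttl=51 time=65.651 ms
64 bytes from icmp_seq=3 ttl=51 time=57.853 ms
64 bytes from icmp_seq=4 ttl=51 time=77.802 ms
64 bytes from icmp_seq=5 ttl=51 time=108.533 ms
64 bytes from icmp_seq=6 ttl=51 time=95.087 ms
64 bytes from icmp_seq=7 ttl=51 time=62.734 ms
"""


def rtts_ms(text: str) -> list:
    """Extract every 'time=<n> ms' value as a float."""
    return [float(value) for value in re.findall(r"time=([\d.]+) ms", text)]


def bdp_bytes(link_mbps: float, rtt_ms: float) -> float:
    """Bytes that must be in flight to keep a link of link_mbps full."""
    return link_mbps * 1e6 / 8 * (rtt_ms / 1000)


samples = rtts_ms(PING_OUTPUT)
avg = mean(samples)
print(f"min/avg/max RTT: {min(samples):.1f}/{avg:.1f}/{max(samples):.1f} ms")
print(f"BDP at 100 Mbit/s: {bdp_bytes(100, avg) / 1024:.0f} KiB")
```

With an average near 75 ms, filling a 100Mbps path takes on the order of 900 KiB in flight, so any buffer or window much smaller than that somewhere in the chain would cap throughput regardless of CPU.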

Work Around:

Ideally it would be great if netcat could be a check box option for going over secured networks, especially because my local replication is going over a patch cable between the servers using 10Gb ethernet and I'm CPU bound on the SSH.

Replication.PNG (40.1 KB) pfsense graph, Paul Bucher, 09/05/2013 05:06 PM


#1 Updated by Xin Li about 9 years ago


Using netcat would require a little bit more work, as we need a way to start it at the remote side -- doable, of course. My impression however is that the encryption is not the bottleneck here; maybe the HPN changes that are available with newer FreeBSD versions would help? It does require both the client and server side to support it, but it would be worth trying as it provides a more generic way to do things (and an option to disable encryption as well). Just my $0.02 here.

#2 Updated by Paul Bucher about 9 years ago

I sort of figured that about netcat, because I considered even trying to add it myself but alas I've got other programming projects. :(

Anyways HPN might be a very viable option and I could probably limp along with my manual CLI stuff until 9.1 hits the streets.

My setup is 100% virtualized so I'd be happy to do some testing to see if HPN would do the trick. I can build a new VM and just attach the drive array to it to test. While these are production SANs, PUSH is a SAN holding backups and PULL is my backup/test/develop SAN, so I can knock it down without too many issues.

#3 Updated by Paul Bucher about 9 years ago

I did a test using iperf to double check things and it shows the bandwidth is available. I did this both over the VPN and via the NAT I'm using for SSH. It does show that my VPN is my bottleneck when using netcat.

iperf -p 2201 -c
Client connecting to, TCP port 2201
TCP window size: 32.9 KByte (default)
[  3] local port 10670 connected with port 2201
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   110 MBytes  92.1 Mbits/sec

 iperf -p 2201 -c
Client connecting to, TCP port 2201
TCP window size: 32.9 KByte (default)
[  3] local port 15733 connected with port 2201
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  29.0 MBytes  24.3 Mbits/sec
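As a sanity check on the two iperf runs above, the sketch below converts the reported transfer size and interval back into a rate. It assumes iperf reports MBytes as 2**20 bytes and Mbits as 10**6 bits; the helper name is illustrative.

```python
# Recompute iperf's Bandwidth column from its Transfer and Interval columns.
# Assumption: "MBytes" means 2**20 bytes and "Mbits" means 10**6 bits.

def mbits_per_sec(mbytes: float, seconds: float) -> float:
    """Convert a transfer of `mbytes` over `seconds` into megabits/s."""
    return mbytes * 2**20 * 8 / seconds / 1e6


# (label, transfer in MBytes, interval in seconds, reported Mbits/sec)
runs = (
    ("first run", 110.0, 10.0, 92.1),
    ("second run", 29.0, 10.0, 24.3),
)

for label, mbytes, seconds, reported in runs:
    computed = mbits_per_sec(mbytes, seconds)
    print(f"{label}: computed {computed:.1f} Mbit/s, reported {reported} Mbit/s")
```

The computed values land within rounding of what iperf printed, so under that assumption the two columns are consistent with each other.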

#4 Updated by Paul Bucher over 8 years ago

Since this goes with this request, I'm not going to open a 2nd ticket. You guys have enough tickets as it is.

For the 9.1.x series could you make sure that the AES-NI module gets included? It will add hardware crypto acceleration to OpenSSL running on newer Intel CPUs with the AES instruction set.

As a side note: my VPN is now using this module along with OpenVPN, which sits on top of OpenSSL, and I can get 20Mbps using the netcat replication approach.

If I get time I'm going to try to build AES-NI into the current FreeNAS and do a new benchmark using the current version.

#5 Updated by Paul Bucher over 8 years ago

Never mind, I see you built aesni into the kernel rather than as a standalone module (at least in the 8.3.1 version). I will try some fresh benchmarks with 8.3.1 and will report back when I have some numbers.

#6 Updated by Paul Bucher over 8 years ago

OK, got some results from some fresh testing... Not much has changed with 8.3.1. I was able to boost the speed some by tweaking sshd. Add the following options to the ssh service config:

Compression yes
Ciphers aes128-cbc,aes192-cbc,aes256-cbc,blowfish-cbc

This had the positive effect of boosting performance to a full 8Mbps and it dropped CPU utilization on both boxes in top to about 2% on average.

No amount of tweaking of various tuning parameters has yielded anything. I'll try out the 9.1 alpha at some point and see what that buys me.

#7 Updated by Paul Bucher over 8 years ago

Upgraded both boxes to the most recent 9.1 Alpha build, removed my above tweaks, and just clicked the new compression check box (great addition btw). I'm still only seeing a steady 6Mbps of WAN traffic and, looking at the iostat of the pool, I'm guessing about 8Mbps worth of actual data transferred. The debug output from doing a manual ssh connection shows the new HPN stuff kicking in.

debug1: Final hpn_buffer_size = 2097152
debug1: HPN Disabled: 0, HPN Buffer Size: 2097152
debug1: channel 0: new [client-session]
debug1: Enabled Dynamic Window Scaling
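For reference, here is a minimal sketch of the throughput ceiling a fixed window or buffer imposes at a given round-trip time (ceiling = window / RTT). It uses the 2097152 byte HPN buffer from the debug output above and assumes an RTT of about 75 ms, roughly the average of the pings in the original report. The 64 KiB comparison value is a hypothetical reference point, not something measured in this ticket.

```python
# Throughput ceiling for one stream limited by a fixed window: window / RTT.
# Assumption: RTT of ~75 ms, taken from the ping samples in this ticket.

def window_limited_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Maximum megabits/s one stream can carry with window_bytes in flight."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


RTT_MS = 75.0
HPN_BUFFER = 2097152      # from "Final hpn_buffer_size = 2097152"
SMALL_WINDOW = 64 * 1024  # hypothetical comparison point, not from the ticket

print(f"2 MiB buffer: {window_limited_mbps(HPN_BUFFER, RTT_MS):.1f} Mbit/s")
print(f"64 KiB window: {window_limited_mbps(SMALL_WINDOW, RTT_MS):.1f} Mbit/s")
```

Under these assumptions a 2 MiB buffer would allow well over 200Mbps, so the HPN buffer itself should not be what holds the transfer at 6Mbps. A window in the 64 KiB range somewhere else in the path would produce a ceiling close to the observed rate, but that is only a hypothesis.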

For what it's worth here's what top looks like on both boxes:


last pid: 64445;  load averages:  0.04,  0.14,  0.14                                     up 2+14:19:39  09:32:05
443 processes: 2 running, 428 sleeping, 13 waiting
CPU:  3.1% user,  0.0% nice,  3.9% system,  0.0% interrupt, 92.9% idle
Mem: 122M Active, 160M Inact, 15G Wired, 1116K Cache, 250M Buf, 4334M Free
ARC: 14G Total, 3157M MFU, 11G MRU, 3360K Anon, 119M Header, 28M Other
Swap: 62G Total, 62G Free

   11 root       155 ki31     0K    16K RUN     58.9H 83.98% idle
64012 root        21    0 69892K  6028K select   0:56  1.95% sshd
    0 root       -92    0     0K  4848K -        0:53  0.98% kernel{vmx3f0 taskq


last pid: 59408;  load averages:  0.04,  0.05,  0.06                                        up 2+13:41:16  09:32:54
62 processes:  2 running, 59 sleeping, 1 waiting
CPU:  1.2% user,  0.0% nice,  0.0% system,  0.0% interrupt, 98.8% idle
Mem: 132M Active, 159M Inact, 12G Wired, 3052K Cache, 198M Buf, 3486M Free
ARC: 11G Total, 717M MFU, 11G MRU, 1040K Anon, 69M Header, 25M Other

   11 root          1 155 ki31     0K    16K RUN     56.6H 98.97% idle

At this point I'm getting to the end of my rope. As per my original posting I'm able to see much higher throughput using netcat. Next I may try using the high speed cipher option and pushing it down my VPN tunnel to see what happens, but I'm going to let the initial replication finish at this point.

#8 Updated by Anonymous over 8 years ago

Can you describe your hardware configuration for the record?

I suspect your system is simply underpowered for your use case and you need to build a system with more CPU to get more encrypted data throughput. If that's the case, then this is not a bug and it should be closed.

#9 Updated by Paul Bucher over 8 years ago

Both ends are VMs running under ESXi 5.1u1. vSphere is showing an average of 8% CPU utilization; given that along with the top results I posted above, I doubt it's CPU related. Plus when I use netcat over my VPN tunnel I'm hitting pfSense based VMs doing the same crypto algorithm on the same box as the FreeNAS VMs.

Here are the results of sysctl -a | egrep -i 'hw.machine|hw.model|hw.ncpu':


hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
hw.ncpu: 1
hw.machine_arch: amd64


hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
hw.ncpu: 1
hw.machine_arch: amd64

#10 Updated by Anonymous over 8 years ago

Please try increasing your VMs to at least 2 cores, and I recommend using 4 cores. ZFS relies on having lots of parallelism and a single core may be creating too much contention and this may be killing your throughput.

Also ensure the backing store that the source & target filesystems are on (and the path to it) has sufficient bandwidth and is not constrained due to other VM traffic.

#11 Updated by Paul Bucher over 8 years ago

I'll give it a try, though I'm doubtful because my performance matches what I was getting from my 8.3 install when I was using 4 cores. I've been having an issue with interrupt storms since I upgraded to 9.1-alpha and I haven't had much luck with it (the tweaks I used on 8.3 to fix them aren't working, but going to a single core solves the issue and I haven't noticed too much of a performance hit at this point).

I should maybe mention I do replication from my production SAN onto the same Pull over my local network and I'm able to clock 440Mbs on that(which is actually CPU constrained and I should get my other cores back online). Also locally on Pull I've got 2 zpools and I can do send | recv between them and hit over 400MBytes per second so the underlying hardware is good for pushing some serious data around(haven't tried it with just 1 core yet).

#12 Updated by Anonymous over 8 years ago

Can you post the "interrupt storm" messages you are receiving?

#13 Updated by Paul Bucher over 8 years ago

Replying to [comment:12 dwhite]:

Can you post the "interrupt storm" messages you are receiving?

Sure I'm seeing:

interrupt storm detected on "irq18:"; throttling interrupt source
interrupt storm detected on "irq18:"; throttling interrupt source

irq18 has my LSI SAS controller card on it. See my ticket [] for my workaround to make the LSI SAS controller work under ESXi. The workaround is still needed for 9.1. Looking at the output below, I think some of the issue could be that MSI interrupts are not getting handed out under 9.1 to my devices that can use them.

Here is the output of vmstat with 1 CPU:

vmstat -i
interrupt                          total       rate
irq1: atkbd0                           6          0
irq15: ata1                       166083          0
irq16: vmx3f1 ehci0             18816026         75
irq17: mpt0                        12849          0
irq18: uhci0 mps0               25062153        100
irq19: vmx3f0                   27989135        112
cpu0:timer                      28829849        115
Total                          100876101        404

vmstat with 4 CPU:

vmstat -i
interrupt                          total       rate
irq1: atkbd0                          10          0
irq0: attimer0                         1          0
irq15: ata1                          195          0
irq16: vmx3f1 ehci0                13440         51
irq17: mpt0                        10625         40
irq18: uhci0 mps0               10126515      38503
irq19: vmx3f0                       3856         14
cpu0:timer                         36007        136
cpu2:timer                         11015         41
cpu3:timer                         42037        159
cpu1:timer                         16120         61
Total                           10259821      39010

vmstat from my production box which is a twin of this box but is running 8.3:

vmstat -i
interrupt                          total       rate
irq1: atkbd0                           6          0
irq15: ata1                           41          0
irq17: mpt0                       134446          0
irq18: uhci0                    22216661          1
cpu0: timer                   5078077390        400
irq256: mps0                  4242405543        334
irq257: vmx3f0                  50777836          3
irq258: vmx3f1                6953842301        547
cpu1: timer                   5078137363        400
cpu3: timer                   5078131940        400
cpu2: timer                   5078131782        400
Total                        31581855309       2487
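To make the difference between these samples easier to see, here is a small sketch that parses `vmstat -i` output and lists the sources whose rate exceeds a threshold. It uses the 4 CPU sample from this comment as input. The threshold of 1000 interrupts per second is an arbitrary value chosen for illustration, and the function names are hypothetical.

```python
# Parse `vmstat -i` output and flag unusually busy interrupt sources.
# The sample is the 4 CPU output quoted in this ticket.
VMSTAT_4CPU = """\
interrupt                          total       rate
irq1: atkbd0                          10          0
irq0: attimer0                         1          0
irq15: ata1                          195          0
irq16: vmx3f1 ehci0                13440         51
irq17: mpt0                        10625         40
irq18: uhci0 mps0               10126515      38503
irq19: vmx3f0                       3856         14
cpu0:timer                         36007        136
cpu2:timer                         11015         41
cpu3:timer                         42037        159
cpu1:timer                         16120         61
Total                           10259821      39010
"""


def parse_vmstat_i(text: str) -> dict:
    """Return {source name: (total, rate)}, skipping header and Total rows."""
    rows = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)  # name may itself contain spaces
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()) or name.strip() == "Total":
            continue
        rows[name.strip()] = (int(total), int(rate))
    return rows


def noisy_sources(rows: dict, threshold: int) -> list:
    """Names of sources whose per-second rate is at or above threshold."""
    return sorted(name for name, (_, rate) in rows.items() if rate >= threshold)


rows = parse_vmstat_i(VMSTAT_4CPU)
# Threshold is an illustrative assumption, not a value taken from the ticket.
print(noisy_sources(rows, threshold=1000))
```

On this sample only the shared irq18 line (uhci0 and mps0) crosses the threshold, which matches the interrupt storm messages quoted earlier in the comment.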

#14 Updated by Anonymous over 8 years ago

Just in case it's related, please try disabling the interrupt storm suppression by setting the following tunable and rebooting:


It can also be applied at runtime via sysctl.

The storm suppression may not work properly in VM environments, which run with a low tick rate clock to lower VM CPU use but with a fast processor/hypervisor that can generate a high rate of legitimate events.

If it makes things worse then back it out and let us know.

I'm going to slot some more ESXi testing into my test schedule in case there is something lingering here. I've done lots of testing on ESXi 5.0 and 5.1 and can't say I've seen results like yours. We do have some anecdotal evidence of problems but their resolutions have been traced to NIC issues (bad NIC, bad cabling, bad tuning, etc.) or "went away" when shifting to a new VM host.

#15 Updated by Paul Bucher over 8 years ago

OK, that didn't work out so well... Setting it and rebooting caused a kernel dump right after it mounted the FreeNAS file systems, probably while it was trying to mount the pools.

The FreeNAS file systems are on a plain old ESXi data store; the zpools are all drives attached to an LSI 2308 based SAS controller that is passed through to FreeNAS.

It's fairly common to have issues with LSI cards being passed through under ESXi. A quick google will turn up plenty of hits. With a simple hw.pci.enable_msix=0 I've been running several VMed boxes very stably on 8.3 since the beta days of 8.2, including my twin production & testing servers which are connected via 10GB and throw around a few terabytes of data. My other twist is that I load the vmxnet3 driver (mount the VMware Tools CD, copy off the vmxnet3.ko to /boot/modules and add the load statement) so I can have 10GB performance out of FreeNAS (at least to the local VMs). I also updated my LSI firmware to 14.0 to match the driver in 9.1, to no avail.

Something is definitely different with 9.1, because I'm able to boot up the box without my msix tweaks and I'm not seeing the MSI interrupt numbers in vmstat.

Keep feeding me ideas. I'm trying not to nuke my box here, but I'm game for heavy testing of it. Next up I'm going to load up a stock FreeBSD 9.1 VM and pass it the LSI card to see what happens.

#16 Updated by Anonymous over 8 years ago

There's a LOT different in FreeNAS 9.1 -- it's FreeBSD 9 instead of FreeBSD 8. Major kernel changes are present.

At this point we need a reproduction case. It's going to take some time to get this set up on my end.

The PCI passthrough is a big change and an important point. I thought it was an emulated mps device (i.e., targets backed by .vmdk files), not a real HBA.

Here is the system config as I understand it now, please correct it if I am wrong since I'm piecing it together from your posts (A summary post with your exact config would be nice to have):

  • 2 ESXi hosts using E5-26xx CPUs connected back-to-back by 10GbE for the transit network
  • 1 VM on each host running FreeNAS 9.1-ALPHA with both the root disk and the storage served from .vmdk's on locally attached storage via SAS HBA configured for PCI passthrough

#17 Updated by Paul Bucher over 8 years ago

You got it close, I'll spell it out more here:

Hosts 1, 2, 3: All are Supermicro servers of current vintage (X9 motherboards) with E5-26xx CPUs.

All FreeNAS VMs are virtual SANs on ESXi 5.1 servers with passed through HBAs and vmxnet3 virtual NICs. All VMs also have the floppy drive removed, and all serial, parallel & floppy disk controllers are disabled in the BIOS of the VM along with the primary disk controller (since ESXi hangs the CDROM off the secondary). All LSI SAS cards are on Firmware 13, except host #2 which I moved to Firmware 14 when I saw 9.1 had driver level 14 in it and was trying to cure my problems.

Host #1: Production: ESXi 5.1 patched & FreeNAS 8.3p1, LSI SAS HBA in pass-through with the equivalent of an ix Titan 445J hung off it (including a STEC ZeusRAM SAS drive for a ZIL device) and a 10GB connection to host #2. Also has a USB device passed in for an APC SmartUPS (4 CPU cores currently).

Host #2: Test: twin of #1 but with ESXi 5.1u1 & FreeNAS 9.1 (1 CPU core currently).

Host #3: Remote: ESXi 5.1u1 & FreeNAS 9.1, Adaptec 58xx HBA with the back half of a 4U Supermicro server box connected to it (1 CPU core currently).

Host #4: House Server: Consumer Core i5 Ivy Bridge whitebox with ESXi 5.1u1 & FreeNAS 8.3.1p2 & LSI SAS HBA with the SAS/SATA bays of a Supermicro box hooked to it (2 CPU cores currently).

I use the FreeNAS replication between Host #1 & #2 with no problem over the 10GB dedicated link between them.

Host 1 & 2 have a cable 100 down/20 up internet connection. Host #3 is in a huge data center that can easily push a full 1GB directly from the NIC on host #3.

Replication from host #3 to #2 is my issue. In summary I can push almost a full 100Mb from #3 using netcat & the public internet. I get 20Mb going over my pfSense based OpenVPN tunnel and finally only 6Mb using the FreeNAS replication.

Back on the Interrupt Storm:
All 4 of my hosts had interrupt storm problems under 8.3, which were solved by adding hw.pci.enable_msix=0 to loader.conf (using the FreeNAS GUI). All servers have run without issue since the 8.3 beta stages and #4 was brought up on a beta of 8.2. I also noticed that adding the above caused my vmxnet & mps drivers to use a 3 digit interrupt (aka an MSI one). They used traditional interrupts before and would storm when multiple CPU cores were added to the VM.

There are a lot of folks doing the same thing I'm doing and it appears that it works with FreeBSD 9.1 (see []). One thing I observed was that even with my traditional hack the mps & vmxnet3 drivers all show a 1 digit interrupt # on Host #2 & #3 under 9.1.

I've got a default install FreeBSD 9.1 VM set up on #2 and I'll shut down FreeNAS tonight and try connecting the LSI card to it and see what I find. Gotta love the world of ESXi and being able to move hardware between different servers for testing without having to physically pull the card and move it, or change the boot drive to try out a different OS or version of something.

#18 Updated by Anonymous over 8 years ago

This ticket is starting to track two issues: the bandwidth issue and the interrupt count issue. Let's keep it strictly on the bandwidth issue and open a new ticket for the interrupt issue if a clear reproduction case can be produced.

The bandwidth numbers you quote don't sound so unreasonable based on the available b/w and assumed latency.

The bandwidth issue is going to take a lot more work to simulate since I will have to build a bandwidth simulator in addition to the endpoints. Unfortunately, this might put work on that issue out of reach of the resources I have available for FreeNAS testing.

#19 Updated by Paul Bucher over 8 years ago

Agreed I was thinking the same thing, let's use my Ticket #2056 for the interrupt issue, I had opened it to cut down the forum posts I answer on how to make LSI cards work in VMs and to save me some keystrokes when I upgrade my VMs.

#20 Updated by Paul Bucher over 8 years ago

On troubleshooting the bandwidth issue I'd be happy to put up a FreeNAS VM and open some ports to the net for you to use as a target to play with. Much easier than trying to simulate it.

I'd agree on the bandwidth/latency issue if I didn't get the 20Mbps over my OpenVPN tunnel when using netcat to pipe the replication between the servers, and if the new HPN stuff in 9.x hadn't been expected to boost things to some degree.

#21 Updated by Paul Bucher about 8 years ago

Just an update. I've got 9.1.1 running on both ends now and I'm still clocking the same 6Mbps. I'm going to try the high speed cipher option next and see if that makes any difference.

#22 Updated by Paul Bucher about 8 years ago

The high speed cipher option made no difference at all. Most likely that's because I've got an Intel CPU with AES-NI on it, so the crypto is using almost no CPU (the system is hanging around 98% idle while the replication is in process with the regular cipher).

#23 Updated by Paul Bucher about 8 years ago


After giving up waiting for the initial replication to take place, I tried some good old TCP tuning as described here, including building the congestion control modules with the FreeNAS 9.1.1 source base and installing them. It might be helpful for some folks if this was done, but alas I didn't get any improvement from the tuning or the modules. So I fell back to using nc to get my initial replication done. I'm hoping that I can re-enable the built-in replication once the initial is done and it will simply pick up and move forward.

Anyways I'm attaching a snapshot from my pfsense box showing the default ssh replication, the unchanged ssh replication with the tuning, and then finally the nc based replication. The difference isn't even funny.

#24 Updated by Paul Bucher almost 8 years ago

9.2RC2 Update:

Still no joy :( I'm running pretty much stock 9.2 and I'm struggling to hit 6Mbps on my link. With the replication in process I was able to run the below iperf test, which shows I've got some headroom even with the 2 processes causing major congestion on the link. The ssh daemon is using so little CPU time that it doesn't even show up in top, except for a blip once in a while when I hide the idle processes (see below, I caught the blip).

[root@sanbox] ~# iperf -c -p 2201
Client connecting to, TCP port 2201
TCP window size: 32.9 KByte (default)
[  3] local port 50098 connected with 174.x.x.x port 2201
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  15.2 MBytes  12.8 Mbits/sec

last pid: 15751;  load averages:  0.07,  0.11,  0.10                                          up 0+19:16:30  08:55:41
588 processes: 5 running, 566 sleeping, 17 waiting
CPU 0:  0.4% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.6% idle
CPU 1:  0.4% user,  0.0% nice,  1.6% system,  0.0% interrupt, 98.0% idle
CPU 2:  1.9% user,  0.0% nice,  1.1% system,  0.0% interrupt, 96.9% idle
CPU 3:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Mem: 167M Active, 109M Inact, 15G Wired, 1152K Cache, 166M Buf, 4284M Free
ARC: 14G Total, 4591M MFU, 9669M MRU, 4512K Anon, 54M Header, 30M Other
Swap: 30G Total, 30G Free

   11 root       155 ki31     0K    64K RUN     0  19.0H 100.00% idle{idle: cpu0}
   11 root       155 ki31     0K    64K CPU3    3  18.9H 100.00% idle{idle: cpu3}
   11 root       155 ki31     0K    64K CPU1    1  18.9H 100.00% idle{idle: cpu1}
   11 root       155 ki31     0K    64K CPU2    2  18.9H 100.00% idle{idle: cpu2}
 9036 root        20    0 72084K  6424K select  2   8:59  0.98% sshd

#25 Updated by Cyber Jock almost 8 years ago

One thing I've noticed is that zfs replication seems to be fairly sensitive to connections with more than about 20-30ms of latency. In your case, you are more than twice that. You have no control over latency obviously, but there may not be a good fix for you coming. :(
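One way to reason about this latency sensitivity: if a stream is limited by a fixed amount of data in flight, throughput scales inversely with round-trip time. The sketch below derives the in-flight window implied by the 6Mbps observed in this ticket at an assumed 75 ms RTT, then shows what that same window would yield at lower latencies. Both helper names are illustrative, and the fixed-window model is an assumption rather than a confirmed diagnosis.

```python
# Fixed-window model: rate = window / RTT, so window = rate * RTT.
# Assumptions: ~75 ms RTT (from the pings in this ticket) and a transfer
# limited purely by a constant in-flight window.

def implied_window_kib(rate_mbps: float, rtt_ms: float) -> float:
    """In-flight data (KiB) needed to sustain rate_mbps at rtt_ms."""
    return rate_mbps * 1e6 / 8 * (rtt_ms / 1000) / 1024


def rate_for_window_mbps(window_kib: float, rtt_ms: float) -> float:
    """Megabits/s a fixed window of window_kib can sustain at rtt_ms."""
    return window_kib * 1024 * 8 / (rtt_ms / 1000) / 1e6


window = implied_window_kib(6, 75)
print(f"window implied by 6 Mbit/s at 75 ms: {window:.1f} KiB")

for rtt in (20, 30, 75):
    rate = rate_for_window_mbps(window, rtt)
    print(f"same window at {rtt} ms RTT: {rate:.1f} Mbit/s")
```

In this model the window that gives 6Mbps at 75 ms would give 22.5Mbps at 20 ms, which is consistent with replication looking acceptable on short paths and poor on long ones.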

#26 Updated by Paul Bucher almost 8 years ago

Christmas Comes Early:

I took advantage of the TCP CC modules included with 9.2 and I did the following tuning to both ends:


Under extra options for ssh I added the following:

I'm now seeing an average of 25Mbs. I'm going to let this run and get my initial replication completed before I poke at the settings any further.

#27 Updated by Jordan Hubbard almost 8 years ago

  • Status changed from Unscreened to Resolved

Sounds like this was resolved with some tuning - marking as such.

#28 Updated by Jordan Hubbard over 7 years ago

  • Status changed from Resolved to Closed
