Slow Download After Draytek 167 Migration — PPPoE Throughput on VyOS in Proxmox

After the Draytek 167 migration, download throughput dropped: ~26 MB/s vs. ~32 MB/s previously on the FritzBox (Telekom Super-Vectoring 250/40, line synced at 292/46 Mbit/s). Single-flow downloads from US servers capped near 22 MB/s. Sync, SNR margin (7.7 dB down), attenuation (3.8 dB), and CRC counters were all clean — the line itself was healthy. The bottleneck was the VyOS-on-Proxmox forwarding path, specifically the host's onboard e1000e NIC.


Symptoms

  • DSL line trains correctly. Actual Rate 292121 kbps down / 46719 kbps up, profile 35b, SNR margin 7.7 dB / 10.3 dB, 0 CRC errors.
  • PPPoE session up; public IP assigned; routing and NAT functional.
  • wget from VyOS itself to a German Hetzner mirror caps near 180–200 Mbit/s instead of the expected ~250 Mbit/s.
  • A LAN client behind VyOS sees even less (~26 Mbit/s in initial tests over a degraded path).
  • No core inside the VyOS VM hits 100%; CPU usage looks idle.
  • mpstat -P ALL 1 on the Proxmox host shows one core (CPU5 in this box) sustaining ~11–14% %soft while every other core sits at 0% — softirq is pinned to a single core.

Root Cause

The Proxmox host bridges WAN traffic through its onboard NIC:

ethtool -i nic0
driver: e1000e

e1000e (Intel I218/I219-class onboard NICs) is single-queue: ethtool -l nic0 returns Operation not supported because there is exactly one RX/TX queue and no RSS hash indirection table. Every packet on the WAN bridge is processed by one host CPU's softirq context. PPPoE adds per-packet overhead (encapsulation, no hardware offload on the pppoe0 interface), so a single TCP flow saturates that one core long before it saturates the 1 Gbit/s link.
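As a cross-check that does not depend on ethtool -l, the kernel exposes one rx-N directory per receive queue under /sys/class/net/<iface>/queues/. The sketch below counts them. The function name count_rx_queues and its optional second argument (an alternative sysfs root, useful for dry runs) are illustrative and not part of any existing tool.

```shell
#!/bin/sh
# count_rx_queues IFACE [SYSFS_ROOT]
# Prints the number of rx-N queue directories for IFACE.
# SYSFS_ROOT defaults to /sys/class/net (hypothetical override for testing).
count_rx_queues() {
    root=${2:-/sys/class/net}
    # Expand the glob into the positional parameters.
    set -- "$root/$1"/queues/rx-*
    # If nothing matched, $1 is still the literal pattern and does not exist.
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}

# Example (on the Proxmox host): count_rx_queues nic0
```

A result of 1 on the WAN NIC is consistent with the single-queue behaviour described above; a multiqueue NIC would report one directory per RX queue.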

Compounding factors found on the same box:

  • Default ring buffers RX 256 / TX 256 — far below the 4096 maximum the driver allows.
  • No multiqueue on the virtio NICs attached to the VyOS VM (net0/net2 had no queues= parameter).
  • No RPS (Receive Packet Steering) configured on the host, so no software fan-out across cores either.
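To spot virtio NICs that lack a queues= parameter, the VM configuration can be filtered. This is a minimal sketch that reads qm config output on stdin; the function name nics_without_queues is hypothetical, and it assumes the usual "netN: key=value,..." line format.

```shell
#!/bin/sh
# nics_without_queues: read `qm config <vmid>` output on stdin and print
# the name of every netN device whose definition has no queues= option.
nics_without_queues() {
    awk -F: '/^net[0-9]+:/ && !/queues=/ { print $1 }'
}

# Example (VM ID 101 as used in this article):
#   qm config 101 | nics_without_queues
```

An empty result means every NIC attached to the VM already has a queues= setting.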

The FritzBox didn't expose this because it is a dedicated appliance with no virtualization layer and a router ASIC handling PPPoE in hardware.


Fix

Apply on the Proxmox host unless noted otherwise.

1. Bump physical NIC ring buffers

ethtool -G nic0 rx 4096 tx 4096

Verify with ethtool -g nic0. The pre-set maximums were 4096/4096 on this hardware.

2. Enable RPS to spread softirq across cores

The Proxmox host has 6 logical CPUs in this deployment. The CPU mask is hex; each bit is one core. 0x3e = 0b00111110 selects cores 1–5 and skips core 0 (which handles other host work).

echo 3e > /sys/class/net/nic0/queues/rx-0/rps_cpus

Pitfall: echo fe > .../rps_cpus returns Value too large for defined data type on a 6-core box. 0xfe sets bits 6 and 7, but the host only has CPUs 0–5; the mask must not set bits beyond the CPU count.
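The mask arithmetic can be checked before writing to sysfs. The sketch below derives the hex mask from a list of CPU indices and tests whether a mask fits a given CPU count. Both helper names (cpus_to_mask, mask_fits) are illustrative, and the sketch assumes masks small enough for shell integer arithmetic.

```shell
#!/bin/sh
# cpus_to_mask CPU...: print the hex bitmask with one bit set per CPU index.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# mask_fits HEXMASK NCPUS: succeed only if no bit at position NCPUS or
# higher is set, i.e. the mask references existing CPUs only.
mask_fits() {
    [ $(( 0x$1 >> $2 )) -eq 0 ]
}

# Example for the 6-CPU host in this article:
#   cpus_to_mask 1 2 3 4 5      # prints 3e
#   mask_fits fe 6 || echo "fe references CPUs the host does not have"
```

On the host, nproc can supply the CPU count for the second argument of mask_fits.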

3. Enable RFS (flow-aware steering)

echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
echo 4096 > /sys/class/net/nic0/queues/rx-0/rps_flow_cnt

4. Raise softirq budget

sysctl -w net.core.netdev_budget=600
sysctl -w net.core.netdev_budget_usecs=8000

5. Multiqueue on the VyOS virtio NICs

For each WAN-side and LAN-side virtio NIC, set queues=N (match the VM's vCPU count, typically 4):

qm set 101 -net2 virtio=BC:24:11:D1:B1:14,bridge=vmbr0,queues=4
qm set 101 -net0 virtio=BC:24:11:CD:AF:6B,bridge=vmbr2,queues=4

Reboot the VM. Inside VyOS, verify:

ethtool -l eth1
# Combined: 4
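ethtool -l prints the Combined value twice, once under the pre-set maximums and once under the current settings, so a plain grep can be misleading. The sketch below extracts only the current value. The function name current_combined is hypothetical, and it assumes the usual ethtool -l section headings.

```shell
#!/bin/sh
# current_combined: read `ethtool -l <iface>` output on stdin and print the
# Combined channel count from the "Current hardware settings" section only.
current_combined() {
    awk '/^Current hardware settings/ { cur = 1 }
         cur && $1 == "Combined:"     { print $2; exit }'
}

# Example (inside VyOS):
#   ethtool -l eth1 | current_combined
```

If the printed value is lower than the queues= setting, the guest is not yet using all configured queues.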

Persisting the Changes

The ethtool, echo, and sysctl commands above are not persistent across host reboots. Persist them:

/etc/sysctl.d/99-net-tuning.conf:

net.core.rps_sock_flow_entries = 32768
net.core.netdev_budget = 600
net.core.netdev_budget_usecs = 8000

/etc/network/interfaces — the WAN-bridge slave (nic0) needs an auto line so the stanza activates at boot, and the post-up hooks must be indented (tab or 4 spaces) under iface. The echo redirect needs bash -c because ifupdown runs hooks via /bin/sh, where stdout redirection from a built-in echo into /sys/... can fail under restricted shells:

auto nic0
iface nic0 inet manual
    post-up ethtool -G nic0 rx 4096 tx 4096 || true
    post-up bash -c 'echo 3e > /sys/class/net/nic0/queues/rx-0/rps_cpus'
    post-up bash -c 'echo 4096 > /sys/class/net/nic0/queues/rx-0/rps_flow_cnt'

Apply without a host reboot:

ifdown nic0 && ifup nic0
sysctl --system

The ifdown nic0 briefly flaps the WAN bridge — PPPoE will reconnect within a few seconds. If that's disruptive, defer activation until the next planned host reboot; the file edits alone are enough to make the change persistent.

Verify after applying:

ethtool -g nic0 | grep -A4 "Current hardware" # RX and TX should both show 4096
cat /sys/class/net/nic0/queues/rx-0/rps_cpus # should print: 3e
cat /sys/class/net/nic0/queues/rx-0/rps_flow_cnt
sysctl net.core.netdev_budget # 600
sysctl net.core.netdev_budget_usecs # 8000
sysctl net.core.rps_sock_flow_entries # 32768

The qm set ... queues=4 change is already persistent (lives in /etc/pve/qemu-server/101.conf).


Verification

From the VyOS shell, single-flow test against a topologically close server (avoid trans-Atlantic — long RTT caps single-flow TCP regardless of router):

wget -O /dev/null https://fsn1-speed.hetzner.com/10GB.bin

For an aggregate-throughput test, run four parallel streams:

for i in 1 2 3 4; do wget -q -O /dev/null https://ash-speed.hetzner.com/1GB.bin & done
wait

While the test runs, on the host:

mpstat -P ALL 1

Expectation after the fix: %soft is spread across cores 1–5 instead of pinned to one. Per-flow throughput recovers to within a few percent of the contracted rate.
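As a complement to mpstat, the per-CPU NET_RX counters in /proc/softirqs can be compared before and after a test run. This is a rough sketch: the function name net_rx_delta and the snapshot file paths in the example are illustrative.

```shell
#!/bin/sh
# net_rx_delta BEFORE AFTER
# Each argument is a saved copy of /proc/softirqs. Prints, per CPU, how much
# the NET_RX softirq counter grew between the two snapshots.
net_rx_delta() {
    awk '$1 == "NET_RX:" {
             if (NR == FNR) {
                 # First file: remember the baseline counters.
                 for (i = 2; i <= NF; i++) before[i] = $i
             } else {
                 # Second file: print the increase per CPU column.
                 for (i = 2; i <= NF; i++)
                     printf "CPU%d %d\n", i - 2, $i - before[i]
             }
         }' "$1" "$2"
}

# Example (on the Proxmox host):
#   cat /proc/softirqs > /tmp/softirqs.before
#   ... run the download test ...
#   cat /proc/softirqs > /tmp/softirqs.after
#   net_rx_delta /tmp/softirqs.before /tmp/softirqs.after
```

Before the fix, nearly all of the increase should land on one CPU; with RPS active, the increase should be distributed across the CPUs in the mask.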


Diagnostic Cheatsheet

Question | Command (where)
Is the DSL line healthy? | Draytek WUI → Online Status → Physical Connection. Look for a sync rate near the attainable rate, SNR margin > 6 dB, 0 CRC errors.
Is the PPPoE session up and clean? | show interfaces pppoe pppoe0 (VyOS). Check the error and dropped counters.
Is the bottleneck the router or the line? | wget from VyOS itself. If VyOS itself is slow, suspect the line, the contract, or the router. If VyOS is fast but LAN clients are slow, suspect the LAN or the forwarding path.
Is softirq saturating one host core? | mpstat -P ALL 1 on the Proxmox host during a sustained transfer.
Does the physical NIC support multiqueue? | ethtool -l nic0. Operation not supported ⇒ single queue.
Are virtio NICs multiqueue? | ethtool -l eth1 inside VyOS. Combined should equal the queues= value in qm config.

Notes on Single-Flow TCP and Distance

A speedtest from a German connection to a US server (Hetzner Ashburn, RTT ~100 ms) is bandwidth-delay-product-limited: a single TCP flow with default window sizes tops out near 25 MB/s regardless of the local router. Always benchmark against the closest mirror (fsn1-speed.hetzner.com, nbg1-speed.hetzner.com) or use parallel streams when the goal is to characterise router throughput rather than wide-area TCP behaviour.
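The ceiling follows from throughput ≈ window / RTT. The sketch below does that arithmetic. The function name max_single_flow_mbyte_s is illustrative, and the 2.5 MB window in the example is an assumed value chosen to match the ~25 MB/s figure above, not a measured one.

```shell
#!/bin/sh
# max_single_flow_mbyte_s WINDOW_BYTES RTT_MS
# Prints the approximate upper bound for one TCP flow in MB/s,
# computed as window / round-trip time.
max_single_flow_mbyte_s() {
    awk -v w="$1" -v rtt="$2" \
        'BEGIN { printf "%.1f\n", w / (rtt / 1000) / 1000000 }'
}

# Assumed 2.5 MB effective window, ~100 ms RTT to Ashburn:
#   max_single_flow_mbyte_s 2500000 100    # prints 25.0
# Same assumed window against a nearby mirror at ~5 ms RTT:
#   max_single_flow_mbyte_s 2500000 5      # prints 500.0
```

With a short RTT the window stops being the limit, which is why a nearby mirror isolates the router's own forwarding performance.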


Long-Term Recommendation

e1000e is a known weak driver/NIC family for routing workloads. A multiqueue NIC (the Intel I350-T2 is a cheap, well-supported choice) eliminates the single-softirq-core ceiling without needing RPS workarounds. Plan a future hardware change if sustained single-flow PPPoE throughput above ~250 Mbit/s is needed.