A saturated WAN feels like the entire network is broken — but the cause is usually one app, one host, or one runaway backup. This step-by-step guide takes you from "everything is slow" to the exact source using interface stats, NetFlow, DPI, and Wireshark.
$ show interfaces GigabitEthernet0/0 | include rate
5 minute input rate 942,184,000 bits/sec
5 minute output rate 47,221,000 bits/sec
$ show ip cache flow | top 5
SrcAddr        DstAddr          Pkts   Bytes    App
10.10.42.31 →  151.101.65.69    1.2M   8.4 GB   HTTPS
10.10.42.31 →  13.107.42.14     820K   5.1 GB   HTTPS
10.10.40.15 →  140.82.121.4     94K    540 MB   HTTPS
[!] One host pushing 13 GB inbound on a 1 Gbps link.
[→] Find the host. Find the app. Then decide what to do.
WAN saturation is almost always caused by one host or one application: a runaway backup, a Windows Update wave, or a cloud sync consuming the link. To find it, run show interfaces on your WAN-facing router and compare the 5-minute input and output rates against the link’s rated capacity. If either is near line rate, run show ip cache flow to identify the top talkers by IP. One source almost always dominates. This guide walks through each step from symptom to source.
A WAN link is saturated when the offered traffic exceeds what the link can carry. When that happens, the device’s egress queue fills, and once the queue overflows, packets get dropped. TCP retransmits, latency spikes, real-time apps stutter.
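The queue behaviour described above can be sketched as a toy tail-drop simulation. This is a simplified model, not how any particular router schedules packets: the per-tick service rate, the queue limit of 40 packets, and the arrival patterns are all hypothetical values chosen for illustration.

```python
def simulate_queue(arrivals, service_per_tick, queue_max):
    """Toy tail-drop FIFO. Returns (final_queue_depth, total_dropped)."""
    depth = 0
    dropped = 0
    for arriving in arrivals:
        depth += arriving
        # The link sends as much as it can carry in this tick.
        depth -= min(depth, service_per_tick)
        # Whatever does not fit in the egress queue is dropped.
        if depth > queue_max:
            dropped += depth - queue_max
            depth = queue_max
    return depth, dropped

# Offered load below capacity: the queue stays empty and nothing is dropped.
print(simulate_queue([90, 90, 90], service_per_tick=100, queue_max=40))
# Offered load above capacity: the queue fills first, then drops begin.
print(simulate_queue([120, 120, 120, 120], service_per_tick=100, queue_max=40))
```

In the second run the queue absorbs the excess for two ticks before overflowing, which is why latency rises before the drop counter does.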
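A few useful numbers to keep in mind: a link sustained above 80% of its rated capacity is effectively saturated, and paths carrying real-time traffic (voice, video) start to suffer above about 70%.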
Before chasing top talkers, prove the link is full. Pull live interface utilization:
# Cisco IOS / IOS-XE
show interfaces GigabitEthernet0/0 | include rate
# Look for these lines:
5 minute input rate 942000000 bits/sec, 78000 packets/sec
5 minute output rate 47000000 bits/sec, 31000 packets/sec
Convert to a percentage: 942 Mbps / 1000 Mbps = 94%. That’s saturated.
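If you collect this output from several routers, the conversion is easy to script. The sketch below parses the two rate lines shown above and divides by the link speed; the `utilization` helper and the 1 Gbps `link_bps` value are assumptions for this example, and the regular expression only targets the line format shown here.

```python
import re

# Text copied from the `show interfaces ... | include rate` sample above.
SAMPLE = """\
5 minute input rate 942000000 bits/sec, 78000 packets/sec
5 minute output rate 47000000 bits/sec, 31000 packets/sec
"""

RATE_LINE = re.compile(r"5 minute (input|output) rate ([\d,]+) bits/sec")

def utilization(show_output, link_bps):
    """Return percent utilization per direction, e.g. {'input': 94.2, ...}."""
    result = {}
    for direction, bits in RATE_LINE.findall(show_output):
        result[direction] = 100 * int(bits.replace(",", "")) / link_bps
    return result

util = utilization(SAMPLE, link_bps=1_000_000_000)
for direction, pct in util.items():
    print(f"{direction}: {pct:.1f}%")
```

Checking both directions matters because a link can be full inbound while nearly idle outbound, as in this sample.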
For a quick sanity check, also pull queue drops:
show interfaces GigabitEthernet0/0 | include drops|queue
Output queue: 0/40 (size/max)
Total output drops: 184,592
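Output drops that keep climbing between readings mean packets are being dropped because the link can’t keep up. That’s the smoking gun. The counter is cumulative since it was last cleared, so a nonzero value on its own only proves that drops happened at some point.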
If utilization is low but drops are high, you don’t have a saturation problem — you have a microburst or QoS misconfiguration problem. Different troubleshooting path.
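The two checks combine into a small decision rule. The sketch below is one way to encode it, assuming you take two readings of the output-drop counter a few minutes apart; the `diagnose` function and its 80% default threshold (the figure this guide uses for sustained saturation) are illustrative, not a vendor-defined rule.

```python
def diagnose(util_pct, drops_before, drops_after, saturation_pct=80.0):
    """Classify a WAN interface from utilization and two drop-counter readings."""
    drops_rising = drops_after > drops_before
    if util_pct >= saturation_pct and drops_rising:
        return "saturated: link is full and dropping packets"
    if util_pct >= saturation_pct:
        return "saturated: link is full, no new drops yet"
    if drops_rising:
        return "not saturated: check for microbursts or QoS misconfiguration"
    return "healthy: low utilization and no new drops"

# 94% utilization with a rising drop counter (readings are hypothetical).
print(diagnose(94.2, drops_before=184_592, drops_after=185_310))
# Low utilization but drops still rising: a different troubleshooting path.
print(diagnose(22.0, drops_before=184_592, drops_after=185_310))
```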
Once you know the link is saturated, the next question is “by what?” NetFlow (Cisco) and sFlow (Juniper, Arista, HP) give you per-flow visibility — source, destination, port, byte count.
If you already have a flow collector (PRTG, ntopng, Plixer, SolarWinds), open it, filter to the WAN interface, and sort by bytes descending. The top 5 flows are your suspects.
If you don’t, you can still get useful data straight from the device:
# Cisco IOS — view active flows on a router with NetFlow enabled
show ip cache flow
# Top talkers (newer IOS / IOS-XE)
show flow monitor [name] cache aggregate ipv4 source address top 10
SrcIf SrcIPaddress DstIf DstIPaddress Pr Bytes
Gi0/1 10.10.42.31 Gi0/0 151.101.65.69 06 8.4G
Gi0/1 10.10.42.31 Gi0/0 13.107.42.14 06 5.1G
Gi0/1 10.10.40.15 Gi0/0 140.82.121.4 06 540M
If your routers don’t speak NetFlow/sFlow, a free port-mirrored capture into ntopng will give you the same picture in 10 minutes.
What you’re looking for:
| Pattern in the flow data | Likely cause |
|---|---|
| One source IP responsible for >50% of traffic | A user, server, or runaway process |
| Many sources, one destination | Inbound CDN burst, video stream, or update storm |
| Many sources, many destinations, all on port 443 | Generic “the internet is busy” — probably normal traffic at peak |
| Heavy traffic to a cloud storage IP range | Backup or sync job |
| Traffic on unusual ports | Exfiltration, P2P, or misconfigured app |
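The first pattern in the table is the easiest to check programmatically. The sketch below sums bytes per source IP and flags any source above 50% of the total. The flow tuples are the three sample rows shown earlier, with 8.4G read as 8.4 × 10^9 bytes (an assumption about the unit); `top_talkers` is a hypothetical helper, not part of any collector's API.

```python
from collections import defaultdict

# (source IP, destination IP, bytes) from the sample flow output above.
FLOWS = [
    ("10.10.42.31", "151.101.65.69", 8_400_000_000),
    ("10.10.42.31", "13.107.42.14", 5_100_000_000),
    ("10.10.40.15", "140.82.121.4", 540_000_000),
]

def top_talkers(flows):
    """Return [(source, bytes, share_of_total)] sorted by bytes, largest first."""
    per_source = defaultdict(int)
    for src, _dst, nbytes in flows:
        per_source[src] += nbytes
    total = sum(per_source.values())
    ranked = sorted(per_source.items(), key=lambda kv: kv[1], reverse=True)
    return [(src, nbytes, nbytes / total) for src, nbytes in ranked]

for src, nbytes, share in top_talkers(FLOWS):
    flag = "  <-- dominates (>50%)" if share > 0.5 else ""
    print(f"{src:15} {nbytes / 1e9:6.2f} GB {share:6.1%}{flag}")
```

Grouping by destination instead of source (swap the key in the loop) checks the "many sources, one destination" pattern the same way.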
A top-talker IP gives you “host 10.10.42.31 is the problem” — but you still need to know what 10.10.42.31 is doing. Most modern WAN traffic is HTTPS, so just looking at port 443 doesn’t tell you which app.
A few ways to identify the application without DPI gear:
# Reverse DNS the destination
dig -x 151.101.65.69
# → fastly CDN — probably software updates or web app
# Whois the IP
whois 13.107.42.14
# → Microsoft — Teams, OneDrive, Exchange Online
# Check the TLS SNI (the server name sent in the ClientHello) in a live capture (works for HTTPS)
tshark -i <iface> -Y "tls.handshake.extensions_server_name" \
  -T fields -e ip.src -e tls.handshake.extensions_server_name
If you have a real DPI-capable firewall (Palo Alto, Fortinet, Meraki), you already have an Application Visibility view — go look there. Filter to the offending source IP and you’ll see the app name directly.
For BYOD or unmanaged endpoints, sometimes the only way to know what’s running is to walk over and look. Cloud backup clients, OS update services, video conferencing apps, and game updaters are the four most common silent saturators.
If NetFlow points at a host but you still don’t know what it’s running, mirror the WAN port to a span port and capture for a few minutes.
Capture filter:
host 10.10.42.31
Display filters worth knowing:
# What domains is this host hitting? (TLS SNI)
tls.handshake.extensions_server_name
# DNS queries — shows you what the host is looking up before it connects
dns.qry.name
# HTTP user-agent (rare on WAN now, but useful when present)
http.user_agent
Within a couple of minutes you’ll have a list of the destinations and SNI hostnames the host is talking to. That tells you the app.
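If the capture is long, tallying the hostnames is quicker than scrolling. The sketch below counts SNI hostnames for one host from tab-separated `ip.src` and server-name pairs, which is the shape `tshark -T fields` produces for two `-e` fields. The sample lines and `example.*` hostnames are made up for illustration, and the count reflects TLS handshakes seen, not bytes transferred.

```python
from collections import Counter

# Hypothetical tshark field output: "src_ip<TAB>server_name", one per line.
SAMPLE = (
    "10.10.42.31\tbackup.example.com\n"
    "10.10.42.31\tbackup.example.com\n"
    "10.10.42.31\tupdates.example.net\n"
    "10.10.40.15\tgit.example.org\n"
)

def sni_counts(tshark_output, host):
    """Return [(server_name, handshake_count)] for one source IP, most frequent first."""
    counts = Counter()
    for line in tshark_output.splitlines():
        parts = line.split("\t")
        # Skip malformed lines and handshakes without a server name.
        if len(parts) == 2 and parts[0] == host and parts[1]:
            counts[parts[1]] += 1
    return counts.most_common()

for name, hits in sni_counts(SAMPLE, "10.10.42.31"):
    print(f"{hits:4d}  {name}")
```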
Some saturation is constant. Some is scheduled and invisible until you look for it. Pull the interface utilization graph over a 24-hour and a 7-day window.
If you can correlate the user-reported “slow” times with a graph spike, you’ve found the window. Then a flow query during that window points at the source.
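When the history is available as raw samples rather than a graph, the same correlation can be done in a few lines. The sketch below averages samples per hour of day and returns the hours at or above a threshold. The `(hour, bits_per_sec)` input shape, the sample values, and the `busy_hours` name are assumptions for this example; a real export from your monitoring tool will need its own parsing.

```python
from collections import defaultdict

def busy_hours(samples, link_bps, threshold_pct=80.0):
    """samples: iterable of (hour_of_day, bits_per_sec).

    Returns [(hour, average_utilization_pct)] for hours at or above the threshold.
    """
    by_hour = defaultdict(list)
    for hour, bps in samples:
        by_hour[hour].append(bps)
    result = []
    for hour in sorted(by_hour):
        readings = by_hour[hour]
        avg_pct = 100 * sum(readings) / len(readings) / link_bps
        if avg_pct >= threshold_pct:
            result.append((hour, round(avg_pct, 1)))
    return result

# Hypothetical samples from a 1 Gbps link: busy at 02:00, quiet otherwise.
SAMPLES = [
    (2, 950_000_000), (2, 930_000_000),
    (9, 400_000_000), (14, 300_000_000),
]
print(busy_hours(SAMPLES, link_bps=1_000_000_000))
```

Hours that show up here and also match the user complaints are the window to run the flow query against.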
If your monitoring doesn’t graph history yet, that’s the gap to fix first, because you can’t troubleshoot saturation without history. LibreNMS or PRTG will do this by polling the interface counters over SNMP.
Once you know who, what, and when, the fix usually falls into one of four buckets:
Backups, replication jobs, large pushes — move them to off-peak windows or cap their bandwidth at the source. This is the cheapest, fastest fix and works for ~60% of saturation cases.
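Before capping a job, it helps to check that it still finishes inside the off-peak window. A rough sketch of that arithmetic follows; the 13.5 GB size is the sum of the two largest sample flows above, while the 100 Mbps cap and the 6-hour window are hypothetical. It uses decimal gigabytes and ignores protocol overhead.

```python
def transfer_hours(size_gb, cap_mbps):
    """Hours needed to move size_gb (decimal GB) at a rate cap of cap_mbps."""
    bits = size_gb * 8 * 1000**3
    seconds = bits / (cap_mbps * 1_000_000)
    return seconds / 3600

job_gb = 13.5        # 8.4 GB + 5.1 GB from the sample flows
cap_mbps = 100       # hypothetical rate limit at the source
window_hours = 6     # hypothetical off-peak window

needed = transfer_hours(job_gb, cap_mbps)
print(f"{job_gb} GB at {cap_mbps} Mbps takes {needed:.2f} h")
print("fits the window" if needed <= window_hours else "does not fit the window")
```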
If the saturator is legitimate but you need to keep voice and Teams smooth:
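# Cisco IOS QoS: basic example that prioritizes voice and reserves a share for bulk
# BULK_TRAFFIC is a named ACL you define to match the bulk flows
class-map match-any VOICE
 match dscp ef
class-map match-any BULK
 match access-group name BULK_TRAFFIC
policy-map WAN-OUT
 class VOICE
  priority percent 30
 class BULK
  # bandwidth percent is a minimum guarantee under congestion, not a cap;
  # use police or shape in this class if bulk must be limited
  bandwidth percent 20
 class class-default
  fair-queue
interface GigabitEthernet0/0
 service-policy output WAN-OUT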
QoS doesn’t add bandwidth — it decides who suffers when there isn’t enough.
For Microsoft 365, Windows updates, or any large repeated download: a local caching server (BranchCache, Connected Cache, WSUS, Squid) can eliminate the duplication.
Upgrade only once the cheaper fixes above are exhausted and the utilization graph still shows the link running at capacity.
If you can’t show a graph that justifies the upgrade, your boss won’t approve it — and rightly so.
| Pattern | Most likely cause | Fix path |
|---|---|---|
| One host >50% of traffic | Backup, sync, or rogue process | Find process, reschedule or rate-limit |
| Spike at 2 AM | Off-hours backup job | Reschedule or shape |
| Spike at 8–9 AM | Mass update / sync at login | Stagger updates, cache locally |
| Outbound saturated, inbound idle | Upload-heavy job (cloud sync, file dump) | Source rate-limit |
| Inbound saturated, outbound idle | Downloads, streaming, CDN burst | Identify source, cache if repeated |
| Heavy traffic to one cloud IP range | M365, AWS, OneDrive sync | App-aware QoS or caching |
| Many small flows, all 443, low per-flow | Genuine user demand at peak | QoS + plan capacity |
| Output drops with low utilization | Microburst | Per-class queue tuning |
User reports "WAN is slow / VPN choppy / Teams cutting out"
│
├─ show interfaces rate → above 80% sustained?
│ └─ Yes → saturation confirmed
│
├─ show interfaces drops → output drops increasing?
│ └─ Yes → packets being dropped, real impact
│
├─ NetFlow / sFlow → top 5 flows on the WAN
│ └─ Pin the source(s)
│
├─ Reverse DNS / whois / SNI capture → which app
│ └─ Now you know who and what
│
├─ Graph over 24h / 7d → when does it happen?
│ └─ Now you know when
│
└─ Fix: reschedule, rate-limit, QoS, cache, or upgrade
Saturation isn’t mysterious. It’s one of: a host, an app, a schedule, or a capacity gap — and a flow report plus a 24-hour graph almost always tells you which.
How do I know if my WAN is actually saturated?
Run show interfaces <WAN-interface> on your router and check the “5 minute input rate” and “5 minute output rate.” If either consistently exceeds 80% of the link’s rated capacity, treat the link as saturated: the 5-minute figure is an average, and short bursts that offer more traffic than the link can carry cause drops even when that average looks fine.
What’s the fastest way to find which device is consuming the most bandwidth?
On Cisco IOS, run show ip cache flow and look at the top entries by bytes. The leading source IP is almost always the culprit. If you have a NetFlow collector, filter by WAN interface and sort by bytes per second over the last 15 minutes for the same answer with more history.
Can I identify the top bandwidth user without a NetFlow collector?
Yes. show ip cache flow on Cisco IOS gives a live snapshot of active flows straight from the router. It needs NetFlow enabled on the interface, but no dedicated collector. If NetFlow isn’t available at all, mirror the WAN port to a SPAN port, capture with Wireshark, and sort Statistics > Conversations by bytes to see the same picture.
How much WAN utilization is too much?
A link consistently above 80% is effectively saturated. On a path carrying real-time traffic (voice, video), call quality problems begin above about 70%, even if the average looks acceptable. A link averaging 80% regularly sees short bursts of offered traffic above line rate, and those bursts are what get dropped.
Once I find the source, what should I do?
In order of effort: (1) reschedule high-volume jobs (backups, updates) to off-hours; (2) rate-limit the offending traffic source; (3) apply QoS to protect real-time traffic; (4) deploy a local cache for repeated downloads (WSUS, BranchCache); (5) upgrade the link once all other options are exhausted and you have a graph that justifies it.
Q: How do I find what is saturating my WAN link?
A: Enable NetFlow or sFlow on your edge router and collect data in a flow collector (LibreNMS, ntopng, or PRTG). Sort by top talkers over a 5–15 minute window during the saturation event. If you cannot enable NetFlow, check `show interfaces` output drops on your WAN interface as a quick congestion indicator.
Q: What tool shows top bandwidth consumers on my network?
A: LibreNMS with flow collection, ntopng, PRTG, or ManageEngine NetFlow Analyzer are the main options. For a quick free option, ntopng community edition gives per-host and per-protocol bandwidth breakdown in real time without requiring a license.
Q: How do I use NetFlow to identify bandwidth hogs?
A: Configure your router to export NetFlow v5 or v9 to a collector IP and port. In the collector, sort flows by bytes during the peak congestion window. Look for a single source IP consuming disproportionate bandwidth — nightly backups, Windows Update distribution, and rogue VMs running cloud syncs are the most common culprits.
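To make the export format less abstract, here is a minimal sketch of decoding a NetFlow v5 datagram with Python's standard library. It follows the fixed v5 layout (a 24-byte header followed by 48-byte flow records) and builds a synthetic datagram so it can run without a router. `parse_v5` is a hypothetical helper, only a few fields are extracted, and NetFlow v9 is template-based and needs a different parser.

```python
import socket
import struct

HEADER = struct.Struct("!HHIIIIBBH")                # NetFlow v5 header, 24 bytes
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")  # NetFlow v5 flow record, 48 bytes

def parse_v5(datagram):
    """Decode one NetFlow v5 export datagram into a list of flow dicts."""
    version, count = HEADER.unpack_from(datagram, 0)[:2]
    if version != 5:
        raise ValueError(f"not NetFlow v5 (version={version})")
    flows = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        flows.append({
            "src": socket.inet_ntoa(f[0]),
            "dst": socket.inet_ntoa(f[1]),
            "packets": f[5],
            "bytes": f[6],      # 32-bit counter in v5
            "dst_port": f[10],
            "protocol": f[13],  # 6 = TCP
        })
    return flows

# Synthetic datagram with one record; all values are made up for the demo.
header = HEADER.pack(5, 1, 0, 0, 0, 0, 0, 0, 0)
record = RECORD.pack(
    socket.inet_aton("10.10.42.31"), socket.inet_aton("151.101.65.69"),
    socket.inet_aton("0.0.0.0"), 1, 2, 1200, 1_500_000, 0, 0,
    51000, 443, 0, 0, 6, 0, 0, 0, 0, 0, 0,
)
flows = parse_v5(header + record)
print(flows)
```

In live use the datagram would come from a UDP socket bound to whatever collector port you configured on the router; sorting the decoded flows by `bytes` gives the top-talker view described above.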