When DNS breaks, everything looks broken — but the real cause is rarely obvious. This step-by-step guide takes you from "the internet is down" to root cause using nslookup, dig, and a handful of resolver checks.
$ nslookup app.example.com
Server: 192.168.1.10
Address: 192.168.1.10#53
** server can't find app.example.com: SERVFAIL

$ nslookup app.example.com 8.8.8.8
Non-authoritative answer:
Name: app.example.com
Address: 203.0.113.42

[!] Internal resolver fails. Public resolver works.
[→] The record is fine. Your DNS server isn't.
DNS failures show up as app errors, Outlook outages, and VPN authentication failures — almost never as “DNS is broken.” The fastest diagnosis: run nslookup <hostname> against your internal resolver, then against 8.8.8.8. If internal fails but public resolves, your DNS server is the problem — not the record. If both fail, the record is wrong or the authoritative nameserver is down. This guide walks the full step-by-step workflow from complaint to root cause.
Every time something on your network connects to a name — office.com, update.windows.com, an internal app — it makes a sequence of DNS lookups:
Any one of those four steps can fail. The trick is knowing which.
Before chasing resolvers, prove the symptom is name resolution. Ping the destination by name and by IP:
# Ping by name
ping app.example.com
# Ping by IP (use one you know works, like the gateway or a known internal host)
ping 192.168.1.1
Interpreting results:
| What you see | What it means |
|---|---|
| Name fails, IP works | DNS — your problem |
| Name works, IP works | Not DNS — look elsewhere |
| Both fail | Not DNS — connectivity issue, go troubleshoot that first |
| Name resolves but app still broken | DNS resolved, but maybe to the wrong record — keep reading |
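If ping <name> returns “could not find host” but the IP responds, you’ve confirmed DNS; move on to the next step. If it prints an address and then “request timed out,” the name resolved and the failure is connectivity, not DNS.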
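The interpretation table can also be expressed as a small triage helper. This is a minimal Python sketch, not part of the original workflow: `name_resolves` and `classify_symptom` are hypothetical names, and the resolution check goes through the system resolver (`socket.getaddrinfo`) rather than parsing ping output.

```python
import socket


def name_resolves(hostname: str) -> bool:
    """Return True if the system resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def classify_symptom(name_ok: bool, ip_ok: bool) -> str:
    """Map the two test outcomes onto the interpretation table."""
    if not ip_ok:
        # No reachability by IP: DNS is not the first thing to fix.
        return "Connectivity: fix basic reachability first"
    if not name_ok:
        return "DNS: name fails, IP works"
    return "Not DNS (or the name resolves to the wrong record): look elsewhere"
```

For example, `classify_symptom(name_resolves("app.example.com"), ip_ok=True)` reports a DNS problem when the name fails while a known-good IP still answers.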
The client may be asking the wrong server. Check what resolvers it’s using:
# Windows
ipconfig /all | findstr /C:"DNS Servers"
# Linux / Mac
cat /etc/resolv.conf
# Or, on systemd-resolved systems:
resolvectl status
You’re looking for two things: that the listed resolvers are the ones you expect, and that each of them is reachable:
ping 192.168.1.10
A common failure mode: the client has a stale DHCP lease pointing at a DNS server that no longer exists, or a manual override set during a long-ago troubleshooting session. If a listed resolver doesn’t ping, that’s already the problem.
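On Linux, pulling the configured resolvers out of `/etc/resolv.conf` is easy to script. The sketch below is illustrative only: `parse_nameservers` is a hypothetical helper, and it assumes the standard `nameserver <address>` line format.

```python
from typing import List


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        # Comment lines and other directives (search, options) are skipped
        # because their first field is not 'nameserver'.
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers
```

Typical use would be `parse_nameservers(open("/etc/resolv.conf").read())`. On systemd-resolved systems this usually returns only the local stub address `127.0.0.53`, which is why `resolvectl status` is the better source there.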
Also flush the local cache before going further — stale negative caches will lie to you:
# Windows
ipconfig /flushdns
# Linux (systemd-resolved)
sudo resolvectl flush-caches
# Mac
sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder
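If you flush caches from a script that runs on mixed fleets, the per-platform commands can be kept in one lookup. This is a sketch under assumptions: `flush_commands` is a hypothetical helper, the Linux entry assumes systemd-resolved, and nothing is executed unless you pass the result to `subprocess.run` yourself.

```python
import platform
from typing import List, Optional

# Keys match the values returned by platform.system().
FLUSH_COMMANDS = {
    "Windows": [["ipconfig", "/flushdns"]],
    "Linux": [["sudo", "resolvectl", "flush-caches"]],
    "Darwin": [
        ["sudo", "dscacheutil", "-flushcache"],
        ["sudo", "killall", "-HUP", "mDNSResponder"],
    ],
}


def flush_commands(system: Optional[str] = None) -> List[List[str]]:
    """Return the cache-flush commands for the given (or current) platform."""
    system = system or platform.system()
    if system not in FLUSH_COMMANDS:
        raise ValueError(f"no known flush command for {system!r}")
    return FLUSH_COMMANDS[system]
```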
This is the fastest way to isolate “is it the record or is it my DNS server?”
# Ask Google's resolver directly
nslookup app.example.com 8.8.8.8
# Ask Cloudflare's resolver
nslookup app.example.com 1.1.1.1
What the answers tell you:
| Result | Meaning |
|---|---|
| Public resolvers answer, internal doesn’t | Your internal DNS server is broken or filtering |
| Both return the same wrong IP | The record itself is wrong (authoritative side) |
| Both return SERVFAIL | The authoritative nameservers are down, unreachable, or failing DNSSEC validation |
| Internal answers, public doesn’t | Internal-only record (expected behavior) |
| Public answers, internal returns NXDOMAIN | Internal resolver has a bad zone or stale forwarder config |
If public works and internal doesn’t, you’ve narrowed it to your resolver. Move to Step 4.
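The comparison table reduces to a few branches. The following Python sketch encodes it; `diagnose` is a hypothetical function, and its inputs are assumed to be outcome strings such as `"NOERROR"`, `"NXDOMAIN"`, `"SERVFAIL"`, or `"TIMEOUT"` that you have already collected from the internal and public lookups.

```python
def diagnose(internal: str, public: str) -> str:
    """Interpret the outcome of the same query against two resolvers."""
    internal_ok = internal == "NOERROR"
    public_ok = public == "NOERROR"

    if public_ok and internal == "NXDOMAIN":
        return "internal resolver has a bad zone or stale forwarder config"
    if public_ok and not internal_ok:
        return "internal DNS server is broken or filtering"
    if internal_ok and not public_ok:
        return "internal-only record (expected behavior)"
    if internal == "SERVFAIL" and public == "SERVFAIL":
        return "authoritative side is failing"
    if internal_ok and public_ok:
        return "both answer: compare the returned IPs with the expected record"
    return "inconclusive: query the authoritative nameservers directly"
```

The NXDOMAIN branch is checked first because it is the more specific case of "public works, internal doesn't".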
dig +trace to Find the Failing Step
dig shows you the full resolution chain. +trace makes it walk the tree manually (root, TLD, authoritative) so you see exactly which step fails.
dig +trace app.example.com
dig ships with most Linux distros and Mac. On Windows, install BIND tools or use Resolve-DnsName:
Resolve-DnsName app.example.com -DnsOnly
Resolve-DnsName app.example.com -Server 8.8.8.8
What +trace output reveals:
Each level (., com., example.com.) returns the nameservers for the next level.
If you see connection timed out; no servers could be reached partway through, that’s a network/firewall issue between you and that nameserver, not a DNS configuration problem.
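To know which levels to expect in the trace, it helps to list the names a resolver may be referred through. This is a small illustrative sketch: `delegation_chain` is a hypothetical helper, and it assumes every label boundary could be a zone cut, which is not always true in practice (several labels can live in one zone).

```python
from typing import List


def delegation_chain(hostname: str) -> List[str]:
    """List the candidate zones from the root down to the full name."""
    stripped = hostname.rstrip(".")
    if not stripped:
        return ["."]
    labels = stripped.split(".")
    chain = ["."]
    # Walk from the rightmost label (the TLD) toward the full hostname.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain
```

For `app.example.com` this yields the root, `com.`, `example.com.`, and the full name, which is the order in which `dig +trace` prints its referrals.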
DNS traffic uses UDP/53 for normal queries, TCP/53 for large responses (and zone transfers), and TCP/853 for DNS-over-TLS (DoT). A firewall blocking any of these breaks resolution in ways that aren’t always obvious.
Test port reachability from the client:
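# Windows: Test-NetConnection only probes TCP, so this checks TCP/53 to your resolver
Test-NetConnection -ComputerName 192.168.1.10 -Port 53
# Windows: exercise UDP/53 with a real query (Resolve-DnsName uses UDP unless -TcpOnly is given)
Resolve-DnsName app.example.com -Server 192.168.1.10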
# Linux — TCP/53 test
nc -zv 192.168.1.10 53
# DNS-over-TLS check (used by Android, modern resolvers)
nc -zv 1.1.1.1 853
Common firewall failures:
| Symptom | Likely cause |
|---|---|
| Most names resolve, big ones (DNSSEC) fail | TCP/53 blocked outbound (large responses fall back to TCP) |
| Some clients work, some don’t | Per-VLAN or per-host ACL blocking 53 |
| Resolution works on Wi-Fi but not VPN | Split-tunnel or DNS leak protection routing queries oddly |
| Mobile devices fail, desktops work | Android/iOS DoT (port 853) blocked |
If your firewall logs show drops to UDP/53 from the affected client, the answer is in front of you.
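Where `nc` or `Test-NetConnection` is not available, the TCP side of the check can be reproduced with the Python standard library. This is a minimal sketch with a hypothetical helper name, `tcp_port_open`. It only covers TCP; UDP/53 has no handshake, so it can only be verified by sending a real query and waiting for an answer.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the addresses used in this guide that would be `tcp_port_open("192.168.1.10", 53)` for the internal resolver and `tcp_port_open("1.1.1.1", 853)` for DNS-over-TLS.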
If the record itself is wrong, no amount of resolver troubleshooting fixes it. Query the authoritative nameservers directly:
# Find the authoritative nameservers for the zone
dig NS example.com
# Query each one directly
dig @ns1.example.com app.example.com
What to check:
A caching resolver (such as 8.8.8.8) may still hold the old value until the TTL expires.
For zones you control: lower the TTL (300 seconds is a common pre-change value) at least 24 hours before any planned record change, or longer if the old TTL was above 24 hours, so that every cache has picked up the short TTL by then. For zones you don’t control: wait, or query an authoritative server directly to bypass cache.
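The lead time follows from simple arithmetic: a cache that fetched the record just before you lowered the TTL may keep it for the full old TTL. A worked sketch, with a hypothetical helper name and illustrative dates:

```python
from datetime import datetime, timedelta


def earliest_safe_change(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Earliest time at which no cache should still hold the old, long TTL."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


# Illustrative values: old TTL of 86400 s (24 h), lowered on 1 Jan at 09:00.
lowered = datetime(2024, 1, 1, 9, 0)
change_window = earliest_safe_change(lowered, 86400)
```

Here `change_window` is 2 Jan at 09:00; from then on, a record change propagates within the new short TTL.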
To see what’s currently cached at a public resolver:
dig app.example.com @8.8.8.8 +noall +answer
The TTL in the answer line shows how much time is left on the cached record.
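If you collect these answers in a script, the remaining TTL is the second whitespace-separated field of each answer line. A minimal parsing sketch follows; `parse_answer_line` is a hypothetical helper, and the sample line in the usage note is made up for illustration.

```python
def parse_answer_line(line: str) -> dict:
    """Split one 'dig +noall +answer' line into its fields."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"not an answer line: {line!r}")
    name, ttl, rclass, rtype = fields[:4]
    return {
        "name": name,
        "ttl": int(ttl),  # seconds left on the cached record
        "class": rclass,
        "type": rtype,
        "data": " ".join(fields[4:]),
    }
```

Given a line such as `app.example.com. 287 IN A 203.0.113.42`, the `ttl` field comes back as the integer 287.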
| Symptom | Most likely cause | Where to look |
|---|---|---|
| Name fails, IP works | Resolver config or resolver itself | ipconfig /all, ping the resolvers |
| Public works, internal doesn’t | Internal DNS server / forwarder | DNS server logs, forwarder config |
| One zone broken, others fine | Zone replication or stale forwarder | Authoritative servers, conditional forwarders |
| Works on some clients, fails on others | Firewall ACL or DHCP option mismatch | Firewall logs, DHCP scope DNS option |
| Worked yesterday, fails today | TTL expired with bad upstream record | Query authoritative directly |
| Random intermittent failures | UDP fragmentation or rate limiting | Try TCP/53, check ISP rate-limit policy |
| App works locally, fails on VPN | DNS leak / split-horizon mismatch | VPN client DNS settings |
User reports "site/app/service is down"
│
├─ ping <name> vs ping <IP> → name fails, IP works?
│ └─ Yes → confirmed DNS, continue
│
├─ ipconfig /all → are the configured resolvers reachable?
│ └─ No → fix resolver config or DHCP
│
├─ nslookup <name> 8.8.8.8 → does a public resolver answer?
│ ├─ Yes → your internal resolver is the problem
│ └─ No → record itself is broken (go to authoritative)
│
├─ dig +trace → which level fails?
│ └─ Pinpoint the broken layer
│
├─ Test port 53 (UDP and TCP) → firewall blocking?
│ └─ Yes → firewall rule
│
└─ Query authoritative nameservers directly → record correct?
  └─ No → zone file or replication problem
DNS has fingerprints. Follow them, and you’ll have your answer before the user finishes typing the ticket.
How do I tell if DNS is causing my problem?
Run nslookup <failing-hostname>. If it returns an error but resolves when you add 8.8.8.8 as the server (nslookup <hostname> 8.8.8.8), your internal DNS resolver is broken. If both fail, the record or authoritative nameserver is the problem — not your resolver.
What’s the difference between NXDOMAIN and SERVFAIL?
NXDOMAIN means the name doesn’t exist. SERVFAIL means the resolver couldn’t complete the lookup, often because it can’t reach the authoritative nameserver or has a misconfigured zone. NXDOMAIN is a record problem; SERVFAIL is a server or network problem.
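Both codes travel in the low four bits of the flags field in the DNS response header (RCODE 2 is SERVFAIL, 3 is NXDOMAIN). The Python sketch below builds a bare A-record query and decodes the response code, using only the standard library. The function names are hypothetical, and the code skips checks a real client would need, such as matching the response ID or retrying over TCP on truncation.

```python
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode_name(response: bytes) -> str:
    """Decode the RCODE from the low 4 bits of the header flags."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def query_rcode(server: str, hostname: str, timeout: float = 3.0) -> str:
    """Send the query over UDP/53 and return the response code name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (server, 53))
        response, _ = sock.recvfrom(512)
    return rcode_name(response)
```

Calling `query_rcode("192.168.1.10", "app.example.com")` and then the same name against `"8.8.8.8"` gives the two outcomes to compare; a timeout surfaces as a `socket.timeout` exception rather than a code.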
Why does nslookup work but my browser still fails?
Browsers cache DNS independently. Flush the OS DNS cache (ipconfig /flushdns on Windows, sudo resolvectl flush-caches on Linux) and clear the browser’s own DNS cache — in Chrome: chrome://net-internals/#dns. If the browser is using a different resolver than the system (some do), that’s also a cause.
How do I flush the DNS cache on Windows?
Run ipconfig /flushdns from an elevated command prompt. This clears the Windows resolver cache. If the problem returns immediately after flushing, the resolver itself is returning the wrong answer — the issue is upstream of the client.
Why does DNS work for some users but not others?
Different machines may be configured to use different resolvers. Run ipconfig /all (Windows) or cat /etc/resolv.conf (Linux) on both an affected and an unaffected machine and compare the “DNS Servers” entries. If they differ, the broken resolver is the cause.
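Once you have the resolver lists from both machines, the comparison is a set difference. A small sketch, with a hypothetical function name and made-up addresses in the usage note:

```python
from typing import Dict, Iterable, List


def compare_resolvers(affected: Iterable[str], unaffected: Iterable[str]) -> Dict[str, List[str]]:
    """Show which resolvers are unique to each machine and which are shared."""
    a, u = set(affected), set(unaffected)
    return {
        "only_affected": sorted(a - u),
        "only_unaffected": sorted(u - a),
        "shared": sorted(a & u),
    }
```

For instance, `compare_resolvers(["10.0.0.53", "192.168.1.10"], ["192.168.1.10", "192.168.1.11"])` puts `10.0.0.53` under `only_affected`, which makes it the first resolver to test.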
Q: How do I diagnose DNS failures on my network?
A: Start with `nslookup <hostname> <dns-server>` or `dig @<dns-server> <hostname>` to test the specific resolver. If the resolver responds but slowly (over 100ms query time in the `+stats` output), the forwarder chain is the issue. If it does not respond at all, check that UDP and TCP port 53 are not blocked by a firewall rule.
Q: Why is DNS resolution slow on my corporate network?
A: Slow DNS is almost always a misconfigured or overloaded upstream forwarder. Run `dig @<your-resolver> corp.internal +stats` to measure query time at the resolver level. If the resolver is fast but clients are slow, verify clients are actually using the correct DNS server with `ipconfig /all` on Windows or `resolvectl status` on Linux.
Q: What is the dig command to test DNS from Linux?
A: Use `dig @<nameserver-ip> <hostname> +short` for a quick answer, or `dig @<nameserver-ip> <hostname> +stats` to include query latency and full response details. Example: `dig @192.168.1.1 corp.local +stats` tests your internal resolver directly and shows TTL, answer, and query time in milliseconds.
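To track resolver latency over time, the query time can be scraped from dig's statistics section, which reports it on a line of the form `;; Query time: N msec`. The sketch below uses hypothetical helper names and takes the 100 ms threshold from the answer above as its default.

```python
import re
from typing import Optional

_QUERY_TIME = re.compile(r"^;; Query time: (\d+) msec", re.MULTILINE)


def parse_query_time_ms(dig_output: str) -> Optional[int]:
    """Return the query time in milliseconds, or None if the line is absent."""
    match = _QUERY_TIME.search(dig_output)
    return int(match.group(1)) if match else None


def is_slow(dig_output: str, threshold_ms: int = 100) -> bool:
    """True when dig reported a query time above the threshold."""
    elapsed = parse_query_time_ms(dig_output)
    return elapsed is not None and elapsed > threshold_ms
```

Feed it the captured text of `dig @<nameserver-ip> <hostname> +stats`. If the query timed out there is no statistics line, so `parse_query_time_ms` returns `None` and `is_slow` returns `False`; handle that case separately.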