Lots of people have come up with alternate L3 protocols for the Internet. For example, check out [1] which goes up to 999.999.999.999.999, or [2] which gets updated with some typo fixes every 6 months each time it's about to expire.
Someone did post another one of these recently, but it's just a random person on the Internet. The IETF's draft database is more or less a pastebin service -- anyone can upload something there without it being a thing that's being taken seriously.
Sometimes you get drafts like [3] as a demonstration of that.
> we could have baked address extensions into the existing packet format's option fields and had a gradual upgrade that relied on that awful bodge that was (and is) NAT
We did and do have this. I wrote about the option fields part in [1] but we also have NAT as part of the migration, in the form of NAT64.
Not only was doing these things not enough for us to be done by now, they weren't even enough to stop you from moaning that we didn't do them! How could anything have been good enough if these are the standards it's judged by?
My point was meant purely as an intellectual exercise, not a critique of engineering choices made in the face of adverse practical realities. My apologies if it came across otherwise.
With the luxury of hindsight, allowing an admixture of 32-bit and 64-bit addresses strikes me as an obviously clean solution to the one real problem IPv6 solves. But in 1992, that was a complete non-starter.
But mine was that you don't need to do this as an intellectual exercise, because we got basically all the things you're asking for.
We have address extensions in v4 packets, we have NAT to help with partial upgrades, and we have a mix of 32-bit and 128-bit addresses (which should be just as obviously clean as a mix of 32-bit and 64-bit addresses, or rather more so due to 64-bits being too small). You don't need to think about whether any of this would have been doable, because we already went and did it.
You would also need something like O(N²) routing update messages to keep those tables updated, instead of the current... I'm guessing it grows more like O(log N) in the number of hosts. So everyone would need vast amounts of CPU and bandwidth to keep up with the announcements.
AAAA records have lower priority than A records if you don't have a v6 address assigned on your system. (Link-locals don't count for this).
You would only see a timeout to an AAAA record if the connection attempt to the A record already failed. Some software (looking at you, apt-get) will only print the last connection failure instead of all of them, so you don't see the failure to connect to the A record. I've seen people blame v6 for this even though they don't have v6 and it's 100% caused by their v4 breaking.
Run `getent ahosts example.com` to see the order your system sorts addresses into. `wget example.com` (wget 1.x only though) is also nice, because it prints the addresses and tries to connect to them in turn, printing every error.
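If you'd rather check from code than from the shell, the same sorted list is what getaddrinfo() hands back. A minimal Python sketch (the "localhost" lookup is just a self-contained example; substitute any hostname):

```python
import socket

# getaddrinfo() returns candidates already run through the resolver's
# RFC 6724 destination sorting; a client should try them in this order.
def resolved_addresses(host, port=80):
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return [sockaddr[0] for _family, _type, _proto, _canon, sockaddr in infos]

# On a v4-only host, resolved_addresses("us.archive.ubuntu.com") would
# list every A-record address before any AAAA-record address.
print(resolved_addresses("localhost"))
```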
I mean... adding v6 is the right thing to do either way, but "AAAA is higher priority than A" isn't the reason.
An IPv6 AAAA timeout was shown to be the problem; adding `Acquire::ForceIPv4 "true";` fixed the problem on several hosts.
$ getent ahosts us.archive.ubuntu.com
91.189.91.81 STREAM us.archive.ubuntu.com
91.189.91.81 DGRAM
91.189.91.81 RAW
91.189.91.82 STREAM
91.189.91.82 DGRAM
91.189.91.82 RAW
91.189.91.83 STREAM
91.189.91.83 DGRAM
91.189.91.83 RAW
2620:2d:4002:1::101 STREAM
2620:2d:4002:1::101 DGRAM
2620:2d:4002:1::101 RAW
2620:2d:4002:1::102 STREAM
2620:2d:4002:1::102 DGRAM
2620:2d:4002:1::102 RAW
2620:2d:4002:1::103 STREAM
2620:2d:4002:1::103 DGRAM
2620:2d:4002:1::103 RAW
There are no non-link-local (`fe80::`) IPv6 addresses on the host.
$ ip a | grep inet6
inet6 ::1/128 scope host noprefixroute
inet6 fe80::786a:e338:3957:b331/64 scope link noprefixroute
inet6 fe80::a10c:eae9:9a49:c94d/64 scope link noprefixroute
So to be clear, I removed my temporary IPv4-only apt config, but there are a million places for this to be brittle, and you see people disabling it with sysctl net.ipv6.conf.*, netplan, systemd-networkd, NetworkManager, etc... plus the individual clients, etc....
> If an implementation is not configurable or has not been configured, then it SHOULD operate according to the algorithms specified here in conjunction with the following default policy table:
One could argue that GUAs should just be ignored on hosts without a non-link-local IPv6 address... and in a perfect world they would be.
But as covered in the first link in this post, this is not as easy or clear as expected, and people tend to err towards following RFC 6724, which states just below the above reference:
> Another effect of the default policy table is to prefer communication using IPv6 addresses to communication using IPv4 addresses, if matching source addresses are available.
I am not an IPv6 hater...just giving observations that when you introduce a breaking change, and add additional friction, it dramatically reduces adoption.
Many companies I have been at basically just implement enough to meet Federal Government requirements and often intentionally strip it out of the backend to avoid the brittleness it caused.
I am old enough to remember when I could just ask for an ASN and a portable class-c and how nice that was, in theory IPv6 should have allowed for that in some form...I am just frustrated with how it has devolved into an intractable 'wicked problem' when there was a path.
The fact that people don't acknowledge the pain for users, often due to situations beyond their control, is a symptom of that problem. Ubuntu should never have even requested an AAAA record on the above system, and yes, it only does because of politics and RFC requirements.
(Long post was long so I split it into two short... shortish... uh... here, enjoy two walls of text instead of one.)
> I am not an IPv6 hater...just giving observations that when you introduce a breaking change, and add additional friction, it dramatically reduces adoption.
It's not like we had a choice. We needed to increase the available address space but v4 doesn't support doing that, so there's your breaking change. (v6 did the work to introduce family-agnostic socket API calls, so applications can now use new address families without breaking, but those calls didn't exist before v6).
Also... v6 suffers from massive double standards. When people hit a problem in v4 they treat it as a problem to fix, but when they hit a problem in v6 -- or a problem with v4 that causes a colon to be printed -- they skip trying to find and fix the problem and just go "oh my god look how shit v6 is disable it now".
Computers break all the time. "It's always DNS" is a meme, so clearly things that aren't v6 break too. But if people are willing to forgive the other things for problems and fix them but refuse to do either with v6, and will blame v6 for problems it reveals in other things, then v6 could be far more reliable than v4 and people would still be moaning about it breaking all the time.
We're in this situation because the people who designed v4 made it too small. It sucks but we need to deal with it, and the sooner we do that the sooner we can stop being annoyed by it. Dragging our feet on v6 just maximizes the amount of time we need to deal with transitioning to it.
> Ubuntu should never have even requested an IPv6 aaaa in the above system, and yes it only does because of politics and RFC requirements.
getaddrinfo() has the AI_ADDRCONFIG flag for this. I don't know why it doesn't pass it here, but it could.
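For illustration, here's what passing that flag looks like from Python, where it's exposed as socket.AI_ADDRCONFIG (a sketch, not apt's actual resolver code):

```python
import socket

# With AI_ADDRCONFIG, the resolver only returns AAAA records if the
# system has an IPv6 address configured (and A records only if it has
# an IPv4 address), so v6-less hosts never see v6 candidates at all.
def usable_addresses(host, port=80):
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP,
                               flags=socket.AI_ADDRCONFIG)
    return [sockaddr[0] for *_rest, sockaddr in infos]
```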
The getent output shows that addresses are being sorted properly for a machine without v6. apt-get and other properly-written software will try the addresses in the order listed there, i.e. all v4 addresses first and only then the v6 addresses.
So... I don't think `Acquire::ForceIPv4 "true";` could fix the problem, because apt-get wouldn't have even tried the v6 addresses if any of the v4 ones were working. If you run `wget http://us.archive.ubuntu.com/ubuntu` while the problem is happening it should give you clearer log messages.
Another possibility is a DNS failure that causes your DNS queries to go missing sometimes. If this is the issue then it probably affects both your A and AAAA queries, but you wouldn't notice it on the AAAA queries if you don't have v6. You would only notice when the A queries go missing and the lookup only returns AAAAs; programs will try v6 "first" if this happens, since v6 is all there is to try.
> And how "::/0" > "::ffff:0:0/96"
It has higher precedence, but DNS results are sorted by "do the labels match?" first and by precedence second (rules 5 and 6 in section 6). The idea is to prefer connecting to addresses where the kernel would select the same type of address (as identified by the labels) for the source address. In your case, the algorithm is looking at something like this:
91.189.91.81          label matches the label of the best available source
91.189.91.82          label matches
91.189.91.83          label matches
2620:2d:4002:1::101   label differs (no non-link-local v6 source)
2620:2d:4002:1::102   label differs
2620:2d:4002:1::103   label differs
The first three go first because of the matching label, then the last 3 go last because of the differing label. The two groups of three would then each be sorted by precedence, which you can't see here because both groups are homogeneous.
Note the label sort is the only one that considers your source addresses. If that step wasn't there, the sort order would be the same on machines with and without v6, which would be bad.
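A toy Python sketch of that two-stage sort may help. This is not glibc's actual code: the label-match test is simplified down to "is a usable source of that family available?", and the precedence numbers are the RFC 6724 defaults for ::ffff:0:0/96 and ::/0.

```python
# Simplified model of RFC 6724 destination sorting rules 5 and 6:
# first group by whether the destination's label matches the label of
# the source address the kernel would pick, then sort by precedence.
PRECEDENCE = {"v4": 35, "v6": 40}  # ::ffff:0:0/96 vs ::/0 defaults

def sort_destinations(dests, have_v4, have_global_v6):
    # dests: list of (address, family) pairs, family in {"v4", "v6"}
    def key(dest):
        _addr, family = dest
        # Toy label-match test: a label can only match if we actually
        # have a usable source address of the same family.
        label_match = have_v4 if family == "v4" else have_global_v6
        return (0 if label_match else 1, -PRECEDENCE[family])
    return sorted(dests, key=key)

dests = [("2620:2d:4002:1::101", "v6"), ("91.189.91.81", "v4")]
# v4-only host: v4 destinations sort first (rule 5).
print(sort_destinations(dests, have_v4=True, have_global_v6=False))
# Dual-stack host: both labels match, so precedence puts v6 first (rule 6).
print(sort_destinations(dests, have_v4=True, have_global_v6=True))
```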
user@ubuntu-server:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 25.10
Release: 25.10
Codename: questing
user@ubuntu-server:~$ uname -a
Linux ubuntu-server 6.17.0-7-generic #7-Ubuntu SMP PREEMPT_DYNAMIC Sat Oct 18 10:10:29 UTC 2025 x86_64 GNU/Linux
user@ubuntu-server:~$ getent ahosts us.archive.ubuntu.com
91.189.91.82 STREAM us.archive.ubuntu.com
91.189.91.82 DGRAM
91.189.91.82 RAW
91.189.91.81 STREAM
91.189.91.81 DGRAM
91.189.91.81 RAW
91.189.91.83 STREAM
91.189.91.83 DGRAM
91.189.91.83 RAW
2620:2d:4002:1::102 STREAM
2620:2d:4002:1::102 DGRAM
2620:2d:4002:1::102 RAW
2620:2d:4002:1::101 STREAM
2620:2d:4002:1::101 DGRAM
2620:2d:4002:1::101 RAW
2620:2d:4002:1::103 STREAM
2620:2d:4002:1::103 DGRAM
2620:2d:4002:1::103 RAW
user@ubuntu-server:~$ ip --oneline link | grep -v lo: | awk '{ print $2 }'
enp0s3:
user@ubuntu-server:~$ ip addr | grep inet6
inet6 ::1/128 scope host noprefixroute
inet6 fe80::5054:98ff:fe00:64a9/64 scope link proto kernel_ll
user@ubuntu-server:~$ fgrep -r -e us.archive /etc/apt/
/etc/apt/sources.list.d/ubuntu.sources:URIs: http://us.archive.ubuntu.com/ubuntu/
user@ubuntu-server:~$ sudo apt-get update
Hit:1 http://us.archive.ubuntu.com/ubuntu questing InRelease
Get:2 http://security.ubuntu.com/ubuntu questing-security InRelease [136 kB]
<snip>
Get:43 http://security.ubuntu.com/ubuntu questing-security/multiverse amd64 c-n-f Metadata [252 B]
Fetched 2,602 kB in 3s (968 kB/s)
Reading package lists... Done
I didn't think to wrap that in 'time', but it only took a few seconds to run... more than two and less than thirty.
The IPv6 packet capture running during all that reveals that it never tried to reach out over v6 (but that my multicast group querier is happily running):
I even manually ran unattended-upgrade, which looks to have succeeded. Other than unanswered router solicitations and multicast group query membership chatter, there continued to be no IPv6 communication at all, and none of the messages you reported appeared either in /var/log/syslog or on the terminal.
You aren't running it during an external transitive failure that happened on April 15th.
The problem isn't the happy path; the problem is when things fail, and that Linux in particular made it really hard to reliably disable. [0]
Once that hits someone's vagrant or ansible code, it tends to stick forever, because they don't see the value until they try to migrate, then it causes a mess.
The last update on the original post link [1] explains this. The IPv4 host being down, not having a response, it being the third Tuesday while Aquarius is rising into whatever, etc... can invoke it. It causes pain, is complex and convoluted to disable when you aren't using it, and thus people are afraid to re-enable it.
> ...linux, in particular made it really hard to reliably disable
Section 10.1 of that Arch Wiki page says that adding 'ipv6.disable=1' to the kernel command line disables IPv6 entirely, and 'ipv6.disable_ipv6=1' keeps IPv6 running, but doesn't assign any addresses to any interfaces. If you don't like editing your bootloader config files, you can also use sysctl to do what it looks like 'ipv6.disable_ipv6=1' does by setting the 'net.ipv6.conf.all.disable_ipv6' sysctl knob to '1'.
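If you want a script to tell which of those states a box is in, the sysctl knob is just a file under /proc. A small sketch (the interpretation in the comments is my reading of that wiki page):

```python
from pathlib import Path

def ipv6_status(knob_text):
    # knob_text: contents of the disable_ipv6 sysctl, or None when the
    # file is missing (IPv6 compiled out, or booted with ipv6.disable=1)
    if knob_text is None:
        return "ipv6 stack absent"
    return "ipv6 disabled" if knob_text.strip() == "1" else "ipv6 enabled"

knob = Path("/proc/sys/net/ipv6/conf/all/disable_ipv6")
print(ipv6_status(knob.read_text() if knob.exists() else None))
```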
> You aren't running it during an external transitive failure...
I'll assume you meant "transient". Given that I've already demonstrated that the only relevant traffic that is generated is IPv4 traffic, let's see what happens when we cut off that traffic on the machine we were using earlier, restored to its state prior to the updates.
We start off with empty firewall rules:
root@ubuntu-server:~# iptables-save
root@ubuntu-server:~# ip6tables-save
root@ubuntu-server:~# nft list ruleset
root@ubuntu-server:~#
We prep to permit DNS queries and ICMP and reject all other IPv4 traffic:
And we do an apt-get update, which fails in less than ten seconds:
root@ubuntu-server:~# apt-get update
Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
Ign:2 http://us.archive.ubuntu.com/ubuntu questing InRelease
<snip>
Could not connect to security.ubuntu.com:80 (91.189.92.23). - connect (111: Connection refused) Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
<snip>
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/questing-security/InRelease Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
W: Some index files failed to download. They have been ignored, or old ones used instead.
root@ubuntu-server:~#
In this case, the IPv6 traffic I see is... an unanswered router solicitation, and the multicast querier chatter that I saw before. [0] What happens when we change those REJECTs into DROPs...
root@ubuntu-server:~# iptables -D OUTPUT -o enp0s3 -j REJECT
root@ubuntu-server:~# iptables -D INPUT -i enp0s3 -j REJECT
root@ubuntu-server:~# iptables -A OUTPUT -o enp0s3 -j DROP
root@ubuntu-server:~# iptables -A INPUT -i enp0s3 -j DROP
root@ubuntu-server:~#
...and then re-run 'apt-get update'?
root@ubuntu-server:~# apt-get update
Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
Err:1 http://security.ubuntu.com/ubuntu questing-security InRelease
Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4002:1::103). - connect (101: Network is unreachable) <v6 addrs snipped> Could not connect to security.ubuntu.com:80 (91.189.92.24), connection timed out <long line snipped>
<redundant output snipped>
W: Some index files failed to download. They have been ignored, or old ones used instead.
root@ubuntu-server:~#
Exactly the same thing, except it takes like two minutes to fail, rather than ~ten seconds, and the error for IPv4 hosts is "connection timed out", rather than "Connection refused". Other than the usual RS and multicast querier traffic, absolutely no IPv6 traffic is generated.
However. The output of 'apt-get' sure makes it seem like an IPv6 connection is what's hanging, because the last thing that its "Connecting to..." line prints is the IPv6 address of the host that it's trying to contact... despite the fact that it immediately got a "Network is unreachable" back from the IPv6 stack.
To be certain that my tcpdump filter wasn't excluding IPv6 traffic of a type that I should have accounted for but did not, I re-ran tcpdump with no filter and kicked off another 'apt-get update'. I -again- got exactly zero IPv6 traffic other than unanswered router solicitations and multicast group membership querier chatter.
I'm pretty damn sure that what you were seeing was misleading output from apt-get, rather than IPv6 troubles. Why? When you combine these facts:
* REJECTing all non-DNS IPv4 traffic caused apt-get to fail within ten seconds
* DROPping all non-DNS IPv4 traffic caused apt-get to fail after like two minutes.
* In both cases, no relevant IPv6 traffic was generated.
the conclusion seems pretty clear.
But, did I miss something? If so, please do let me know.
[0] I can't tell you why the last line in the 'apt-get update' output is only IPv6 hosts. But everywhere there were IPv6 hosts, the reported error was "Network is unreachable" and for IPv4 the error was "Connection refused".
This part is exactly the problem I was talking about:
root@ubuntu-server:~# apt-get update
...
Could not connect to security.ubuntu.com:80 (91.189.92.23). - connect (111: Connection refused) Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
<snip>
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/questing-security/InRelease Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
W: Some index files failed to download. They have been ignored, or old ones used instead.
Well... in this case the output does show the failure to connect to 91.189.92.23, but that looks like a different kind of message to the "W:" lines, so maybe it doesn't show up on all setups or didn't make it into the logs on disk, or got buried under other output.
If you look at just the W: lines, it mentions a v6 address but the machine doesn't have v6 and the actual problem is the Connection Refused to the v4 address. The output is understandably misleading but ultimately the problem here has nothing to do with v6.
> ...ultimately the problem here has nothing to do with v6.
I agree... more or less. The remainder of this message is a reply to nyrikki, but I'm sticking it under your comment because you might also appreciate how weird it looks like this guy's setup is.
nyrikki: The rest of this message is directed at you:
============================
Actually, what's up with your link-local addresses? They have really odd flags on them.
The only way I can figure that you got into that configuration was to remove the kernel-generated link-local address and add a new one with the arguments 'scope link noprefixroute'. Even if a router on your network advertised a fe80::/64 prefix, that does nothing at all, as hosts are supposed to [0] ignore advertised prefixes that are link-local.
Yeah. After playing around with this for a bit, I can see that your network is either at least as misconfigured as one would be if -say- your DHCP server was giving leases with an invalid default gateway, or it is very, very specially configured for very special reasons.
Starting with the ubuntu-server host in the "IPv4 traffic is REJECTed" configuration from my last comment, we do this on the host to delete the kernel-supplied link-local address and instruct the OS to create an address in the link-local address space that can be used for global addresses.
root@ubuntu-server:~# ip addr del fe80::5054:98ff:fe00:64a9/64 dev enp0s3
root@ubuntu-server:~# ip addr add fe80::5054:98ff:fe00:64aa/64 noprefixroute dev enp0s3
root@ubuntu-server:~#
We then configure our upstream router to either
* Send RAs on the local link without a prefix
or
* Send RAs on the local link with a link-local prefix (so they're ignored by the Ubuntu host)
or we hard-code the address of a next-hop router on our host. One (or more) of these three things sets up the host with a default route. If you do none of them, you don't get a default route, and global traffic goes nowhere.
Then -because either you or something running on the host deleted the kernel-provisioned link-local address, and then explicitly instructed the kernel to create a link-local address that can be used to reach global addresses- the local host starts emitting IPv6 traffic with a link-local source address and a global destination address.
When presented with this sort of traffic, my router immediately sends back a ICMP6 "destination unreachable, beyond scope", which immediately terminates the connection attempt on the host, so the behavior ends up being exactly the same as when the host didn't have a misconfigured link-local address. But. You claim to be having trouble.
So, there are one or more things that might be going on that explain your trouble.
1) You have a firewall on this host that is dropping important ICMP6 traffic, causing it to miss the "this destination address is beyond your scope" message from the router. Do. Not. Do. This. ICMP is network-management traffic which tells you important things. Dropping important ICMP traffic is how you have mysterious and annoying failures.
2) Your router is configured to ignore link-local traffic with non-link-local destination addresses, rather than replying that the destination is out of scope. On the one hand, this seems stupid to me, but on the other hand, we got here through a misconfiguration that seems very unlikely to me to happen often, [1] so the router admin might not have thought about it when making "locked down" firewall rules.
3) There's some middlebox on the path to the router that's dropping your traffic because not all that many folks would expect to see link-local source and global destination, and middleboxes are widely known for dropping stuff that's even a little bit abnormal.
Investigating your misconfigured host (and maybe also connected network) has been interesting. I'd love to try to figure out if SystemD can be misconfigured to produce the host configuration that we're seeing (or if this misconfiguration is 100% bespoke), but I hear a hot burrito calling my name. Maybe I'll get bored and do more investigation later.
Also, you might object to my conclusion with "But this couldn't happen on IPv4! Clearly IPv6 is too complicated!". I would reply with "What would happen if your host couldn't get a lease from a DHCPv4 server, autoconfigured an address in the IPv4 link-local (169.254.0.0/16) address range, and the network's upstream router was configured to silently drop traffic from that subnet? At least the IPv6 link-local address range is prohibited from sending traffic off the local link [2] and fails the transmission attempt immediately."
[0] ...and Ubuntu questing does ignore such prefixes...
[1] ...that is, a link-local address that has been configured to handle global traffic...
[2] ...unless -as we've discovered- you specifically tell the OS otherwise...
> Actually, what's up with your link-local addresses? They have really odd flags on them.
They were probably configured by one of the fancy network config daemons (systemd-networkd, dhcpcd or similar). They like to take over RA processing, and they add IPs with "noprefixroute" so they can add the route themselves separately.
RAs have nothing to do with link-locals, but I bet one or the other of those daemons also takes over configuring link-local addresses and does the same thing there. If you looked in the routing table, there'll be a prefix route for fe80::/64 that was added by the daemon.
This wouldn't affect how DNS replies are sorted though. On machines without non-link-local v6, AAAA records aren't handled by trying them first and then expecting them to quickly fail. They're handled by pushing them to the bottom of the list so that the A records are tried first.
> They were probably configured by one of the fancy network config daemons (systemd-networkd, dhcpcd or similar). They like to take over RA processing, and they add IPs with "noprefixroute" so they can add the route themselves separately.
Makes sense, yeah.
While I don't see a way to do this with dhcpcd, I have no clue what Lovecraftian horrors systemd-networkd generates, so maybe it's the culprit. And whatever is doing this, this behavior is not configured by default on Ubuntu Server version Questing. Out of the box, I get regular kernel-assigned link-local addresses.
But I don't understand why you'd want to do this for link-local addresses... not automatically, anyway. It looks like doing this has the disadvantage that it erases the baked-in "This shouldn't be used for global-scope transmissions. Send back 'Network is unreachable' in those cases." rule that you get for free with the kernel-generated address. Sheesh. I wonder if there's some additional logic in a stupid daemon somewhere that manages a firewall rule that restores the "Network is unreachable" ICMPv6 response to outbound global-scope packets that come from the link-local address... just to add more moving parts that can get out-of-sync.
> This wouldn't affect how DNS replies are sorted though.
Yeah.
It's a pity that I don't work with OP. I'd rather like to take a look at this system and the network it's hooked to.
> It looks like doing this has the disadvantage that it erases the baked-in "This shouldn't be used for global-scope transmissions.
I tried with the kernel-generated LL and my kernel does attempt to use a link-local source when connecting to GUA addresses if it has no other address to connect from. And it works:
(...so long as the destination is on the local network. In this case I assigned 2001:db8::1 to the router, but the router will issue an ICMPv6 redirect for other IPs on the network, which is awkward for me to test but should also work.)
I note that you didn't run `ip route add fe80::/64 dev enp0s3` after adding the LL with noprefixroute, which... seems to break surprisingly little? Because the packet gets sent to the router, which does still have a route for fe80::/64 to the same network, so it issues an ICMPv6 redirect and the client ends up doing NDP anyway.
> So, there are one or more things that might be going on that explain your trouble.
Ah, there's secret option #4:
4) This rather weird configuration has been deliberately set up by the sysadmin that manages this system and network and ordinarily works fine, but the "external transitive failure that happened on April 15th" affected both IPv4 and IPv6 traffic (which, duh, happens frequently)... but it was an intermittent failure, so unrelated changes made by OP caused him to come to the wrong conclusions and point the blame cannon at the wrong part of the system.
Are you talking about reaching the devices from inside the network, or outside?
If inside then you don't need NAT66 and ULA, you just need ULA. Use both ULA and the ISP GUAs on the network, and do your internal connections over ULA. If outside, then NAT66+ULA doesn't help because connections from outside will still fail until you update DNS for the new prefix.
NAT66 doesn't help in either situation, so why do you think you need to use it here?
> automatically updating the firewall rules
You can probably structure your firewall rules to not rely on the prefix, e.g. by doing "connections from WAN to LAN where the address matches ::42/-64" -- you might need to write it with a mask instead (::42/::ffff:ffff:ffff:ffff), which looks awful but works fine. There's no point in putting a specific prefix into the rule if you're just going to change it to match the network anyway.
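To illustrate what that mask version actually matches, a small Python sketch of the comparison such a rule performs (the addresses are documentation examples, not real allocations):

```python
import ipaddress

# A rule written with the mask ::ffff:ffff:ffff:ffff compares only the
# low 64 bits (the interface identifier), so "address ends in ::42"
# keeps matching even after the ISP hands out a new prefix.
SUFFIX = int(ipaddress.IPv6Address("::42"))
MASK = int(ipaddress.IPv6Address("::ffff:ffff:ffff:ffff"))

def matches_rule(addr):
    return int(ipaddress.IPv6Address(addr)) & MASK == SUFFIX

print(matches_rule("2001:db8:1:2::42"))     # host ::42, old prefix
print(matches_rule("2001:db8:ffff:2::42"))  # host ::42, new prefix
print(matches_rule("2001:db8:1:2::43"))     # different host bits
```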
It'll prefer ULAs when connecting to hosts without A records. Programs will use ULAs if you connect to an IP literal, or if connecting to the A records fails. Also, Linux/glibc will prefer ULAs if you have a ULA assigned to the machine, and so will anything using the update to RFC 6724. So "ULA won't be used at all" is definitely not correct.
I don't think v6 is the absolute pinnacle of protocol design, but whenever anybody says it's bad and tries to come up with a better alternative, they end up coming up with something equivalent to IPv6. If people consistently can't do better than v6, then I'd say v6 is probably pretty decent.
> they end up coming up with something equivalent to IPv6
Not just that. Almost every single thing people think up that's "better" is something that was considered and rejected by the IPv6 design process, almost always for well-considered reasons.
The converse also happens: people look at something IPv6 supports and say "that's crazy, why would that be allowed/designed for", without knowing that IPv4 does it too.
In retrospect I think just adding another 16 or 32 bits to V4 would have been fine, but I don’t disagree with you. V6 is fine and it works great.
All the complaints I hear are pretty much all ignorance except one: long addresses. That is a genuine inconvenience and the encoding is kind of crap. Fixing the human readable address encoding would help.
If you add new bits to v4 you invent an incompatible protocol, and you should add a lot of bits so you'll never have to invent another incompatible protocol again. You can also fix the minor annoyances in v4.
True future-proofing would require representing address length as an arbitrary-precision nonzero unsigned integer.
Since allowing a zero-length network address format would serve no purpose other than to pointlessly complicate standards definitions, you could trivially and without loss of generality interpret zero to denote some extended-length address length representation to be defined in a future version of the standard.
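As a toy sketch of that escape-value idea (the two-byte extended form here is an arbitrary choice for illustration, not anything standardized):

```python
# A one-byte length field where 0 -- otherwise useless -- escapes to a
# wider (here: two-byte big-endian) length, so the format can always
# grow later without a new incompatible version.
def encode_length(n):
    if not 1 <= n < 2**16:
        raise ValueError("toy encoder only handles 1..65535")
    return bytes([n]) if n < 256 else b"\x00" + n.to_bytes(2, "big")

def decode_length(buf):
    # Returns (length, bytes consumed).
    if buf[0] != 0:
        return buf[0], 1
    return int.from_bytes(buf[1:3], "big"), 3

print(decode_length(encode_length(16)))   # (16, 1)
print(decode_length(encode_length(300)))  # (300, 3)
```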
Hardware implementations typically do not like variable-size fields. Not just because the total header size becomes unpredictable, but because it means any following fields no longer have a fixed offset, and that complicates parsing.
IPv4 was designed with an options mechanism for extension: it boggles my mind that simply using options to extend the address was never seriously considered. It was proposed: https://www.rfc-editor.org/rfc/rfc1365.html
It still would have been a ton of work, but we could have just had what IPv6 claimed to be: IPv4 with bigger addresses. Except after the upgrade, there'd be no parallel system. And all of DJB's points apply: https://cr.yp.to/djbdns/ipv6mess.html
The people involved in core Internet protocol design were used to the net being a largely walled garden of governments, corporations, universities, and a small number of BBSes and niche ISPs.
Major protocol upgrades had happened before, not just for the core protocol but all kinds of other then-core services.
It had been a while but not that long, I think less than 20 years, and last time it was pretty easy. They assumed they could design something better and phase it in and all the members of the Internet community would just do the right thing.
That’s probably what made them feel they could push a more radical upgrade.
Unfortunately they started this right as the massive tsunami of Internet commercialization hit. Since V6 was too new, everyone went with V4. Now all of a sudden you had thousands of times more nodes, sites, and personnel, all of them steeped in IPv4 and rushing to ship on top of it. You also lost the small-town atmosphere of the early net, where admins were a club and could coordinate things.
Had V6 launched five years earlier V4 would probably be dead.
V6 usage will probably keep creeping up, but as it stands we will likely be dual-stack forever. Once the installed user base and sunk cost is this high, the design is fixed and can never be changed without a hard-core, heavy-handed measure like a government mandate.
> Had V6 launched five years earlier V4 would probably be dead.
Not a chance. IPv6 ate way more memory than IPv4, and memory was expensive back in 1995. Even IPv4 proliferation was chewing up memory, which is why the IETF introduced Classless Inter-Domain Routing (CIDR) in 1993, replacing classful allocation with variable-length prefixes.
Memory cost was a problem in routing tables until after both the DotBomb and the TeleBomb.
They weren't all that wrong. NAT was an incompatible protocol upgrade - that's why it broke protocols that made pre-NAT assumptions, like FTP - but it kept most of them working. DNS64 is also an incompatible protocol upgrade that breaks protocols that make pre-DNS64 assumptions, like hardcoding addresses - but it keeps most of them working.
In DNS64, whenever your DNS resolver encounters an IPv4-only site, it translates it to an IPv6 address under a translator prefix, and returns that address to the client. The client connects to the translator server via that address, and the translator server opens an IPv4 connection to the website. Your side of the network is IPv6-only, not even running tunneled v4.
This only breaks things to about the same small extent that the introduction of NAT did.
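The address synthesis step described above is mechanical: the resolver embeds the 32-bit IPv4 address in the low bits of a /96 translator prefix (RFC 6052 defines the well-known prefix 64:ff9b::/96 for this). A minimal sketch:

```python
import ipaddress

def synthesize_nat64(ipv4_str: str, prefix: str = "64:ff9b::/96") -> ipaddress.IPv6Address:
    """Embed an IPv4 address in a NAT64/DNS64 translator prefix (RFC 6052)."""
    net = ipaddress.IPv6Network(prefix)
    v4 = ipaddress.IPv4Address(ipv4_str)
    # Place the 32-bit IPv4 address in the low 32 bits of the /96 prefix.
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))

print(synthesize_nat64("192.0.2.1"))  # 64:ff9b::c000:201
```

The client then connects to that synthesized v6 address, and the NAT64 box recovers the original v4 address from the low 32 bits.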
iOS benefitted from a heavy-handed mandate, so it and all of its apps work on IPv6-only networks. The network just needs to expose the IPv4 Internet as IPv6 addresses.
I said "whenever anybody says it's bad and tries to come up with a better alternative, they end up coming up with something equivalent to IPv6", and that's what you did here. And as predicted, it was 6to4 you reinvented.
v4 extension headers are well known to get your packets dropped on the Internet, so they're a non-starter, but there's another extension mechanism you can use: you can set the "next protocol" field to a special value, then put the extended address at the start of the payload, followed by the original payload. This is functionally identical to using extension headers, but using a mechanism that doesn't get your packets dropped.
Far from not being seriously considered, this approach was adopted in v6 as RFC 3056.
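To make the "next protocol" trick concrete, here is a hedged sketch; the shim layout and field widths are invented for illustration (only the use of an experimental protocol number, per RFC 3692's reservation of 253/254, reflects real practice). The v4 header's protocol field gets the experimental value, and the payload begins with a shim carrying the real next-protocol number plus the extended address bits:

```python
import struct

EXPERIMENTAL_PROTO = 253  # reserved for experimentation (RFC 3692)

def wrap_payload(real_proto: int, extra_src: bytes, extra_dst: bytes,
                 payload: bytes):
    """Return (new payload, protocol number to put in the v4 header).

    Hypothetical shim: 1 byte real protocol, 3 bytes padding, then 96 extra
    address bits for each of source and destination.
    """
    assert len(extra_src) == len(extra_dst) == 12  # 96 extra bits per side
    shim = struct.pack("!B3x12s12s", real_proto, extra_src, extra_dst)
    return shim + payload, EXPERIMENTAL_PROTO

body, proto = wrap_payload(6, b"\x00" * 12, b"\xff" * 12, b"hello")
```

Routers that don't understand the experimental protocol just forward the packet on the v4 header; upgraded endpoints peel off the shim and recover the full addresses.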
> Except after the upgrade, there'd be no parallel system.
No. You get a parallel system because v6 addresses are too big to work with v4. Even if you used extension headers, v6 addresses would still be too big to work with v4. Whatever you do, v6 addresses are too big to work with v4. You WILL get a parallel system, and there's no way around this other than not making the addresses bigger.
The hopes were for a converged software stack, but the candidates were all parallel protocols competing with IPv4. A full transition would end with the extinction of IPv4. Upgrading IPv4, quite apart from the brass tacks of the wire format, would have entailed variable-length addresses and even the idea of starting a new protocol with 64-bit addresses with an upgrade path was considered far too scary at the time. That was only one of a slew of non-technical requirements imposed from above for future proofing, NIH paranoia, vague security promises and politics in general.
A decade later, when IPv6 had real-world deployments, it was far too late for 6to4 to save the day: entirely because a swath of non-6to4 addresses existed and needed to be reachable. Given no strategic gain apparent for upgrading the commercial core, aligning financial interests by upgrading past the edge instead would absolutely have made sense. Unfortunately the hard parts the engineers anticipated in the early 90s were not the ones that held IPv6 back.
Yes, of course they were all parallel protocols -- because your problem here is that v4 doesn't _have_ variable-length addresses. It's trivial to imagine a version of v4 that does, but that version would also be a parallel protocol to the version of v4 we actually have.
> even the idea of starting a new protocol with 64-bit addresses with an upgrade path was considered far too scary at the time
No it wasn't? Every proposal had an upgrade path. Having one was a mandatory requirement.
You can read the requirements document yourself: https://datatracker.ietf.org/doc/html/rfc1726. To me, it looks like these requirements were decided by the community rather than being imposed from above, but either way you can see that having a simple transition from v4 is listed right there.
> A decade later, when IPv6 had real-world deployments was far to late for 6to4 to save the day: entirely because a swath of non-6to4 addresses existed and needed to be reachable
What I'm hearing is that the compatibility with v4 that 6to4 provides wasn't considered important, and not by people in any position of authority but rather by the actual people choosing what to deploy on their own networks. Even though there were more 6to4 hosts than non-6to4 ones, and even though 6to4 doesn't prevent you from reaching those non-6to4 hosts, people still didn't want it.
Blame the WHATWG for that. They're the reason that v6 addresses in URLs are such a mess. http://[fe90::6329:c59:ad67:4b52%8]:8081/ should work, but doesn't because they refuse to allow a % there. (This is really damned frustrating, because link-locals are excellent for setting up routers or embedded machines, or for recovering from network misconfigurations.)
If it's on the same machine then just use http://[::1]:8081/. Dropping the interface specifier (http://[fe90::6329:c59:ad67:4b52]:8081/) works if the OS picks a default, which some will. curl seems to be happy to work. Or just use one of the non-link-local addresses on the machine, if you have any.
The other frustrating part of this is that it makes it impossible to come up with your own address syntax. An NSS plugin on Linux could implement a custom address format, and it's kind of obvious that the intention behind the URL syntax is that "[" and "]" enter and exit a raw address mode where other URI metacharacters have no special meaning. In general you can't syntax validate the address anyway because you don't know what formats it could be in (including future formats or ones local to a specific machine), so the only sane thing to do is pass the contents verbatim to getaddrinfo() and see if you get an error.
But no, they wrote the spec to only allow a subset of v6 addresses and nothing else.
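The "pass it verbatim to the resolver" approach argued for above can be sketched in a few lines; `AI_NUMERICHOST` keeps `getaddrinfo()` from doing DNS lookups, so the OS simply judges whether the bracketed text is an address it understands:

```python
import socket

def parse_bracketed_host(host: str):
    """Treat the text between '[' and ']' as an opaque address string and
    let the OS resolver decide whether it's valid, instead of syntax-checking
    it in the URL parser. Returns the canonical address, or None on failure."""
    try:
        infos = socket.getaddrinfo(host, None, flags=socket.AI_NUMERICHOST)
        return infos[0][4][0]
    except socket.gaierror:
        return None
```

On platforms that support it, this also accepts zone-qualified link-locals like `fe80::1%eth0`, and any future or machine-local formats the resolver knows about, with no changes to the URL parser.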
I very much didn't test it, but this patch might do the job on Firefox (provided there's no code in the UI doing extra validation on top):
On Mozilla Firefox, after re-enabling the separation of URL bar and search bar, it reports: "Invalid URL – Hmm. That address doesn't look right. \n Please check that the URL is correct and try again." What does the '%' mean in there?
For link-local addresses, the part after % identifies the link. It's platform-specific - in Linux it's the interface name and in Windows it's an ID number.
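For URLs specifically, RFC 6874 does define a syntax for zone IDs, but the '%' separator has to be percent-encoded as '%25' since '%' is the URI escape character. A small sketch of the encoding (the function name is mine):

```python
from urllib.parse import quote

def bracket_v6(addr: str) -> str:
    """Format an IPv6 address, optionally with a %zone suffix, for a URL.

    Per RFC 6874, fe80::1%eth0 becomes [fe80::1%25eth0]: the '%' separator
    is itself percent-encoded, and the zone ID is escaped as needed.
    """
    if "%" in addr:
        ip, zone = addr.split("%", 1)
        return "[%s%%25%s]" % (ip, quote(zone, safe=""))
    return "[%s]" % addr

print(bracket_v6("fe80::1%eth0"))  # [fe80::1%25eth0]
```

Even in this encoded form, most browsers reject the zone ID, which is the complaint upthread.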
You would have ended up with a protocol identical to IPv6, but with fewer address bits.
If you add *any* address bits you've already broken protocol compatibility and you need to upgrade the entire world. While you're already upgrading the entire world, you should add so many address bits that we'll never need more, because it costs the same, and you may as well fix those other niggling problems as well, right?
IPv4 is absolutely fine. Consumers can be behind NAT. That's fine. Servers can be behind reverse proxies, routing by DNS hostname. That's also fine. IPv4 address might be a valuable resource, shared between multiple users. Nothing wrong with it.
Yes, it denies simple P2P connectivity. World doesn't need it. Consumers are behind firewalls either way. We need a way for consumers to connect to a server. That's all.
No, they're not. That's other weird policies specific to your ISP.
With IPv4 + NAT, you have a public IP address. That public address goes to your router. Your router can forward any port to any machine on your LAN. I used to run Minecraft servers from a residential connection on IPv4, it was fine. Never had to call the ISP.
That's a fair point. In my mind, residential ISPs give out public IP addresses and CGNAT is just for cell phones. But I recognize that the philosophy of, "we don't need to solve IP address exhaustion, we just need to keep people able to access Facebook" leads to CGNAT or multi level NAT.
Still, I do think that the solution of, "one IPv4 address per household + NAT" is a perfectly good system. I view the IPv6 mentality of giving each computer in the world a globally unique IPv6 address as a non-goal.
Even if you go with one IPv4 per household + 1 per company, you're going to be hard-pressed to find room for that in 32 bits, at least after you add the routing infrastructure.
For one, businesses and other entities also need Internet access. Cloud companies in particular need a ton of addresses. That's gonna eat up a fair chunk of the remaining 50%.
Two, humanity is still growing, governments across the world are building new housing. That's gonna eat up another chunk.
Three, routing is hierarchical, and infrastructure organisations and ISPs are assigned blocks of addresses, not individual addresses. We can't just have a pool of free IP addresses and assign any address to any house in the world as needed. So even having 50% of IP addresses free wouldn't really be enough.
So in my mind, an IP addresses to household ratio of 0.5 means residential CGNAT is inevitable, even if we ignore legacy issues like individual universities and other institutions owning gigantic /8 or /16 ranges.
Hm? The ISP gives one IP address to a router in a house, that router uses NAT to let all the computers inside that house use the Internet through the one single shared public IP address. That's NAT, isn't it?
Well, in a strict sense, it is "you" who chooses to run a nat'ing router there, you could just have one single computer per ISP connection.
Or have it run a proxy for you, or nat.
I mean, I understand that this feels normal today, that 10-20-50 devices need internet and that the way to manage that is to nat the connections, but your ISP isn't doing nat, it is you.
Nope, CGNAT means I need to call my ISP. We now have 2 levels of NAT because the IPv4 address situation has gotten so bad they can't even give every residence its own public IP. If your ISP hasn't adopted it yet, it's likely they got lucky and bought a ton of IPv4 addresses a long time ago when they were cheap, and have decided using them is cheaper than upgrading their network to support CGNAT.
Nope. If you get assigned a routable IPv4 IP, you just have a shit ISP. I led the rollout of one of the larger O365 implementations. Outlook and the office stack needed like 10-16 ports per user. We served like 150k people with 30 outbound IPs. If you have an IP, you have ~64k ports to use.
I also deployed it as a pilot on an internal network. Other than getting direct IPv6 connectivity to some services, which sometimes gave us better performance, it conferred no advantage to us.
IPv6 is great for phones where you don't expect any inbound traffic. Even then, every US carrier is using Carrier NAT to route and proxy traffic for their own purposes.
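A back-of-the-envelope check on the port math a few comments up (numbers taken from that comment, not measured): 30 public IPs at ~64k ports each against 150k users who can each peak at 16 ports. The worst case exceeds capacity, so the scheme relies on nowhere near all users holding their peak port count simultaneously.

```python
# Figures from the comment above; purely illustrative arithmetic.
ips, ports_per_ip = 30, 64_000
users, ports_per_user = 150_000, 16

capacity = ips * ports_per_ip         # simultaneous NAT mappings available
worst_case = users * ports_per_user   # if every user peaked at once

print(capacity, worst_case)           # 1920000 2400000
```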
The “don’t” was missing. Honestly, I give up with Siri dictation. Either my voice has changed or it’s changed in a way that it doesn’t like my cadence or diction.
IPv4 usage in its current state would've been much more limited and annoying in a world without IPv6. Therefore, IPv4 exists as-is thanks to others adopting IPv6.
> Yes, it denies simple P2P connectivity. World doesn't need it.
Worth pointing out that this article was written by the now-CEO of Tailscale. I don't know if "The world doesn't need P2P connectivity" is a compelling take.
With the obligatory caveat that I am but a single datapoint, I use various P2P apps through multiple levels of NAT without issue and I very intentionally prevent devices on my local LAN from being publicly reachable. So it rings true to me.
I do wish ISPs would refrain from intentionally breaking things though. It ought to be illegal for them to block specific ports or filter specific sorts of traffic absent a pressing and active security concern.
This comment exemplifies my worst fear and reinforces my somewhat incomplete idea that IPv4 is perhaps overall safer for the world, and that "worse is better" depending on what you're optimizing for.
Roughly, it's my belief that an IPv6 world makes it easier for centralizing forces and harder for local p2p or p2p-esque ones; e.g. an IPv6 world would have likely made it easier to do bad things like "charge for individual internet user in a home."
The decentralization of "routing power" is more a good thing than bad, what you pay for in complexity you get back in "power to the people."
> easier to do bad things like "charge for individual internet user in a home."
This idea comes up in every HN conversation about IPv6, and so I suppose this time it's my turn to point out RFC 8981[0]. tl;dr: typically, machines which receive IPv6 address assignment via SLAAC (functional equivalent of DHCP) periodically cycle their addresses. Supposed to offer pretty effective protection against host-counting.
You know that's not what he meant. The world is always changing. It was designed in 1998 by networking gear companies, with their own company needs in mind. It wasn't engineered with end users, or even network administrators and app developers, in mind.
The only reason it's around is the sunk cost fallacy and people stuck in decades-old tech debt. A new protocol designed today would be different, much the same as how Rust is different from Ada. SD-WAN wasn't a thing in 1998; the cost of chips and the demands of mobile customers weren't a thing. Supply/demand economics have changed the very requirements behind the protocol.
Even concepts like source and destination addressing should be re-thought. The very concept of a network layer protocol that doesn't incorporate 0RTT encryption by default is ridiculous in 2026. Even protocols like ND, ARP, RA, DHCP and many more are insecure by default. Why is my device just trusting random claims that a neighbor has a specific address without authentication? Why is it connecting to a network (any! wired,wireless, why does it matter, this is a network layer concern) without authenticating the network's security and identity authority? I despise the corporatized term "zero trust" but this is what it means more or less.
People don't talk about security, trust, identity and more, because ipv6 was designed to save networking gear vendors money, and any new costly features better come with revenue streams like SD-WAN hosting by those same companies. There are lots and lots of new things a new layer-3 protocol could bring to the scene. But security aside, the main thing would be replacing numbered addressing with identity-based addressing.
It all comes down to how much money it costs the participants of the RFC committees. given how dependent the world is on this tech, I'm hoping governments intervene. It's sad that this is the tech we're passing to future generations. We'll be setting up colonies on mars, and troubleshooting addressing and security issues like it's 2005.
> it was designed in 1998 by networking gear companies
That's false. Firstly, rfc1883 was published in 1995, which means work started some time before that, and the RFC process included operating system vendors and RIR administrators. The primary author of rfc1883 worked at Xerox PARC, and the primary author of rfc1885 worked at DEC. Neither were networking gear companies.
No, I think proposed, draft and internet standard all have specific meanings we don't need to debate over. Your claim that IPv6 was first proposed in 1995 is correct, as is my claim that it was first accepted in 1998. No one actually uses a proposed standard, but my understanding is that once it's a draft, people start implementing it and giving feedback on issues until it is fully ratified (correct me if that's wrong, please).
>There are lots and lots of new things a new layer-3 protocol could bring to the scene. But security aside, the main thing would be replacing numbered addressing with identity-based addressing
I don't know much about MPLS and only know IP routing, but that quote above sounds very hand-waving. How do you route "identity based addressing"?
Not to mention authenticated identity-based routing would mean embedding trusted centralized authorities into even deeper network layers. That is such a mess for TLS, after CAs started going rogue we've basically ended up with Google, a shitty ad company, deciding who should be trusted because they control Chrome.
Not at all, it doesn't even need to be PKI. But if it was, your routers would be the CA. Or more practically, whatever device is responsible for addressing, also responsible as the authority over those addresses. Your DHCP server would also be the CA for your LAN. Even a simple ND/ARP would require a claim (something like a short byte value end-devices can lookup/cache) that allows it to make that "the address x.x.x.x is at <mac>" statement. Smarter schemes might allow the network forwarder (router) to translate claims to avoid end devices looking up and caching lots of claims locally (and it would need to be authorized to do so).
You wouldn't need TLS. This scheme I just thought up would actually decentralize/federate PKI a lot more. If you have a public address assigned, your ISP is the IP-CA. I don't want to get into the details of my DNS replacement idea, but similar to network operators being authorities over the addresses they're responsible for, whoever issued you a network name is also the identity authority over that name (so DNS registrars would be CAs).

Ideally, every device would be named, and the people that have logical control over the address would also be responsible for the name and all identity authentication and claims over those addresses and names. You won't have freaking Google and browsers dictating which CA root to trust; it will instead be the network you're joining that does that (whether that's your DHCP server or your ISP is up for debate, but I prefer the former). Ideally, your public key hash is your address. Others reach you by resolving your public key from your identity, and traffic is sent to your public key (or see my sibling comment for the concept of cryptographic identity). All names would of course be free, but what we call "DNS" today would survive as an alias to those proper names: your device might be guelo.lan123.yourisp.country, but a registrar might sell you a guelo.com alias that points to the former name.
The implications of this scheme are wild, think about it!
Rogue trust providers will be a problem, but only within their own domain. Right now random CA roots can issue certificates for any domain. With the scheme I proposed, your country can mess with its own traffic, as can your ISP, as can you over your LAN. You won't be able to spoof traffic for a different LAN or ISP using their name.
It wouldn't be a good idea to spell out an entire protocol in a comment section, but the key part is that it would cost a lot.
It is far from hand-waving. Right now we have numeric addressing, where routers look at bits and perform ASIC-friendly bitwise (and other) operations on that number to forward a lot of traffic really fast for cheap.
Identity and trust establishment won't be part of the regular data flow, but at network connection time, each end-device will discover the network authority it has connected to, and build trust that allows it to validate identities in that network, including address assignments, neighbor discovery, name resolution and verification, authorized traffic forwarders (routers) and more.
After the connection is established and the network is trusted, as part of the connection establishment, the network authority designates how addressing should be done. If Alice's iPhone wants to connect to Bob's server, it will encrypt the data, and as part of a very slim header designate Bob's server's cryptographic identifier, the destination service identifier, and its own cryptographic identifier for the first packet. To reduce overhead, subsequent traffic can use a simple hash of the connection identifiers mentioned earlier.
When devices come online in the network, their cryptographic identifiers will become known to the entire network, including intermediate routers. Routing protocols work with the identity authority of the network to build forwarding tables based on cryptographic identifiers, and for established sessions, session IDs.
"Cryptographic identifier" is also not a hand-wavy term. what it means must be dynamic, so as to avoid protocol updates like v4 and v6 over addressing. V6 presumed just having lots of bits is enough. An ideal protocol will allow the network itself to communicate the identifier type and byte-size. For example an FQDN, or an IPv4 address alike could be used directly, or a public key hash using a hash algorithm of the network's choice can be used. So long as the devices in the network support it, and the end device supports it, it should work fine.
Internet addressing can use a scheme like this, but it doesn't need to. IPv6 took the wrong approach with NAT: it got rid of it instead of formalizing it. We'll always need to translate addresses. But the Internet is actually well-positioned for this, due to the prevalence of certificate authorities, though it would require rethinking foundational protocols like DNS, BGP, and the PKI infrastructure.
But my original point wasn't this, it was that tech has come far, our requirements today are different than 30 years ago. Even the OSI layered model is outdated, among other things.
This is just a proposal I thought of as I was typing; smarter people who can sit down and think through the problem can come up with better protocols. I only proposed it to demonstrate that the concept isn't hand-wavy or ridiculous.
IPv6 was relatively rushed to meet the address shortage issue of IPv4 while at the same time solve lots of other problems. The next network layer protocol (and we do need one) should have the goal of making networking as a whole adaptable to new and unforeseen requirements (that's why I suggested the network authority be the one to dictate the addressing scheme, and with it, be responsible for translating it if needed). We're being held back, not just in tech but as a species, because of this short-sighted protocol design! exaggerated as that statement might sound, it is true.
I'll reserve further discussion on the topic for when it is required, but I hope this prevents more dismissive responses.
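The "public key hash as address" piece of the proposal above is the most concrete part, so here is a minimal sketch of it. Everything here is hypothetical (the function name, the placeholder key bytes, the 16-byte identifier size); the one design point it illustrates is that the network, not the protocol, picks the hash algorithm and identifier width:

```python
import hashlib

def crypto_identifier(pubkey: bytes, algo: str = "sha256", size: int = 16) -> bytes:
    """Derive a fixed-size network identifier from a public key.

    The network advertises which hash algorithm and identifier size it uses,
    so the scheme isn't tied to one fixed address width the way v4/v6 are.
    """
    digest = hashlib.new(algo, pubkey).digest()
    return digest[:size]

# Placeholder bytes standing in for a real public key.
addr = crypto_identifier(b"example public key bytes")
```

Verifying ownership of such an address reduces to a signature check against the key that hashes to it, which is essentially how cryptographically generated addresses (RFC 3972) already work for IPv6 SEND.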
It's honestly not that hard. Tell your router to reject new inbound connections from the WAN interface, and you're done.
You have to do the exact same thing to make sure inbound connections aren't possible on v4 (even with NAT in the picture), so you might well have already done this or got it from the default ruleset. Plus it's trivial to test, by attempting to connect from another network.
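For a concrete (and hedged) example of the rule in question, something like the following ip6tables pair covers forwarded traffic; "wan0" is a placeholder for whatever your WAN interface is actually called, and most router firmwares ship an equivalent default:

```shell
# Allow replies to connections initiated from inside...
ip6tables -A FORWARD -i wan0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# ...and drop everything else arriving new from the WAN side.
ip6tables -A FORWARD -i wan0 -j DROP
```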
[1] https://datatracker.ietf.org/doc/html/draft-eromenko-ipff-05
[2] https://datatracker.ietf.org/doc/draft-chen-ati-adaptive-ipv...
[3] https://datatracker.ietf.org/doc/html/draft-meow-mrrp