AfNOG - Advanced Routing - Day 2

Diarmuid O'Briain
11/06/2019, version 1.0

Last updated: 11-06-2019 23:45



The IPv6 Protocol & IPv6 Standards

IPv6 does not interoperate with IPv4:

It is a separate protocol that works independently of IPv4; this was a deliberate design decision. The IPv6 header is simplified, dropping IPv4 fields that were unused or unnecessary, and it has a fixed length, which makes processing easier for chip designers and software engineers.

IPv6 has an expanded address space, with the address length quadrupled to 16 bytes. The header format is simplified: it has a fixed length, optional headers are daisy-chained, and the IPv6 header (40 bytes) is twice as long as an IPv4 header without options (20 bytes). There is no checksum at the IP network layer and no hop-by-hop fragmentation; the sender relies on path MTU discovery instead.

IPv6 Header

Extension Headers

0 Hop-by-hop option (extension)
6 TCP (payload)
17 UDP (payload)
43 Source routing (extension)
44 Fragmentation (extension)
50 Encapsulating Security Payload (extension, IPsec)
51 Authentication (extension, IPsec)
58 ICMPv6 (payload)
59 Null (No next header)
60 Destination option (extension)

Order is important because:

At the destination fragmentation has to be processed before other headers. This makes header processing easier to implement in hardware.

How was the IPv6 Address Size Chosen?

Some wanted fixed-length, 64-bit addresses. This is adequate for 10^12 sites and 10^15 nodes at 0.0001 allocation efficiency, which is three orders of magnitude more than the IPv6 requirement. It also minimizes the growth of per-packet header overhead and is efficient for software processing.

Some wanted variable-length addresses of up to 160 bits, to be compatible with Open Systems Interconnection (OSI) Network Service Access Point (NSAP) addressing plans. This is big enough for auto-configuration using IEEE 802 addresses, and deployments could start with addresses shorter than 64 bits and grow later.

The compromise settled on was fixed-length, 128-bit addresses.

IPv6 Address Representation

16 bit fields in case insensitive colon hexadecimal representation.

  2031:0000:130F:0000:0000:09C0:876A:130B
  

Leading zeros in a field are optional.

  2031:0:130F:0:0:9C0:876A:130B
  

Successive fields of 0 represented as ::, but only once in an address.

  2031:0:130F::9C0:876A:130B   # OK
  2031::130F::9C0:876A:130B    # Not OK
  

Loopback and unspecified addresses.

  0:0:0:0:0:0:0:1 → ::1      
  0:0:0:0:0:0:0:0 → ::      
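
The representation rules above can be checked with a short Python sketch. It uses only the standard library `ipaddress` module; the helper names `canonical_forms` and `is_valid` are illustrative and not part of the course material. Note that Python prints hexadecimal digits in lower case, which is equivalent because the representation is case insensitive.

```python
import ipaddress


def canonical_forms(text):
    """Return the (exploded, compressed) text forms of an IPv6 address."""
    addr = ipaddress.IPv6Address(text)
    return addr.exploded, addr.compressed


def is_valid(text):
    """Return True when the string parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ipaddress.AddressValueError:
        return False


if __name__ == "__main__":
    print(canonical_forms("2031:0000:130F:0000:0000:09C0:876A:130B"))
    print(is_valid("2031:0:130F::9C0:876A:130B"))  # a single '::' is accepted
    print(is_valid("2031::130F::9C0:876A:130B"))   # '::' used twice is rejected
```

The compressed form replaces the longest run of zero fields with ::, which is why the second field keeps its explicit 0.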
  

In a URL, it is enclosed in brackets (Request For Comment (RFC) 3986)

  http://[2001:db8:4f3a::206:ae14]:8080/index.html
  

This is cumbersome for users and is used mostly for diagnostic purposes, so the use of Fully Qualified Domain Names (FQDN) via the Domain Name System (DNS) is essential.

IPv6 prefix representation is always in slash (/) notation, e.g. /48 or /64.

IPv6 Addressing

IPv6 addressing rules are covered by multiple RFCs, with the architecture defined by RFC 4291.

Address types are unicast, anycast and multicast.

A single interface may be assigned multiple IPv6 addresses of any type (unicast, anycast, multicast). There is no broadcast address; multicast is used instead.

Type                          Binary        Hex
Unspecified                   000...0       ::/128
Loopback                      000...1       ::1/128
Global Unicast Address        001           2000::/3
Unique Local Unicast Address  1111 110      FC00::/7
Link Local Unicast Address    1111 1110 10  FE80::/10
Multicast Address             1111 1111     FF00::/8
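
As a rough illustration, the prefixes in the table can be used to classify an address. This is a minimal Python sketch using the standard `ipaddress` module; the `classify` helper and the "Other" label are assumptions made for the example.

```python
import ipaddress

# Prefixes copied from the table above; labels are plain display strings.
ADDRESS_TYPES = [
    ("Unspecified", ipaddress.ip_network("::/128")),
    ("Loopback", ipaddress.ip_network("::1/128")),
    ("Global Unicast", ipaddress.ip_network("2000::/3")),
    ("Unique Local Unicast", ipaddress.ip_network("fc00::/7")),
    ("Link Local Unicast", ipaddress.ip_network("fe80::/10")),
    ("Multicast", ipaddress.ip_network("ff00::/8")),
]


def classify(text):
    """Return the label of the first table prefix that contains the address."""
    addr = ipaddress.IPv6Address(text)
    for label, network in ADDRESS_TYPES:
        if addr in network:
            return label
    return "Other"  # outside every block listed in the table
```

For example, `classify("fd00::1")` returns "Unique Local Unicast" because FD00 falls inside FC00::/7.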

Address blocks are delegated by the IETF to IANA for distribution to the Regional Internet Registries (RIRs) and on to the users of the public Internet. The Global Unicast Address block is 2000::/3, which is one eighth of the entire IPv6 address space.

Unique-Local Addresses (ULAs)

Unique-Local Addresses (ULA) are NOT routable on the Internet. They are intended for isolated IPv6 networks that never need public Internet connectivity and do not need an assignment from an RIR or Internet Service Provider (ISP). The L-bit is set to 1, which means the address is locally assigned. ULAs are used for:

Local devices such as printers, telephones, etc. that are connected to networks using the public Internet but do not themselves communicate outside the local network. They are also useful for site network management systems connectivity, for infrastructure addressing using dual Global and Unique-Local addressing, and for public networks experimenting with IPv6-to-IPv6 Network Prefix Translation (NPTv6) (RFC6296), which is a one-to-one IPv6 to IPv6 address mapping.

Link-Local Addresses

Link-Local Addresses are used for communication between two IPv6 devices on the same link (like Address Resolution Protocol (ARP) but at Layer 3) and for Next-Hop calculation in Routing Protocols. A Link-Local address is assigned automatically by the router as soon as IPv6 is enabled; it is a mandatory address. It has link-specific scope only, and the remaining 54 bits can be zero or any manually configured value.

Multicast

Multicast Addresses are used for one-to-many communication. The 2nd octet is reserved for Lifetime and Scope. The remainder of the address represents the Group ID. This is a substantially larger range than for IPv4, which only had 224.0.0.0/4 for Multicast.
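
The layout described above can be unpacked with a few bit operations. The following Python sketch splits the second octet into its flags (lifetime) nibble and scope nibble and returns the remaining 112 bits as the group ID. The `multicast_fields` helper is hypothetical; the scope names in `SCOPES` follow RFC 4291 and are included only for readability.

```python
import ipaddress

# Scope values as listed in RFC 4291.
SCOPES = {
    0x1: "interface-local",
    0x2: "link-local",
    0x4: "admin-local",
    0x5: "site-local",
    0x8: "organization-local",
    0xE: "global",
}


def multicast_fields(text):
    """Return (flags, scope, group_id) for an IPv6 multicast address."""
    addr = ipaddress.IPv6Address(text)
    if not addr.is_multicast:
        raise ValueError(f"{text} is not in FF00::/8")
    second_octet = addr.packed[1]
    flags = second_octet >> 4      # high nibble: lifetime flags
    scope = second_octet & 0x0F    # low nibble: scope
    group_id = int(addr) & ((1 << 112) - 1)
    return flags, scope, group_id
```

For the all-nodes address ff02::1 this gives flags 0, scope 2 (link-local) and group ID 1.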

Global Unicast

The Internet Assigned Numbers Authority (IANA) is allocating out of 2000::/3 for initial IPv6 unicast use. The following are the allocation blocks for the various levels on the Internet.

EUI-64 conversion of 48 bit MAC

  00 90 27 17 FC 0F                                # 48-bit MAC address
  00 90 27<--   -->17 FC 0F                        # Split in the centre
  00 90 27<-- FF FE  -->17 FC 0F                   # Insert FF FE
  00 90 27 FF FE 17 FC 0F
  
  00 --> 0000 0000 --> 0000 0010 --> 02            # Universal scope
  
  02 90 27 FF FE 17 FC 0F                          # EUI-64 address
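
The conversion steps above can be sketched in a few lines of Python. The function name `mac_to_eui64` is illustrative. The sketch inverts the universal/local bit of the first octet, which for a universally administered MAC address such as the one above is the same as setting it.

```python
def mac_to_eui64(mac):
    """Convert a 48-bit MAC address string to its modified EUI-64 form."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected six octets in the MAC address")
    octets[0] ^= 0x02  # invert the universal/local bit (00 becomes 02)
    # Split in the centre and insert FF FE between the two halves.
    eui = octets[:3] + bytearray([0xFF, 0xFE]) + octets[3:]
    return ":".join(f"{octet:02X}" for octet in eui)


if __name__ == "__main__":
    print(mac_to_eui64("00:90:27:17:FC:0F"))
```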
  

Autoconfiguration in IPv6

The eui-64 keyword tells the router to construct the host portion (interface identifier) of the IPv6 address from the interface MAC address, using the EUI-64 conversion, and to append it to the configured prefix.

  Router(config)# interface Ethernet0
  Router(config-if)# ipv6 address 2001:db8:213:1::/64 eui-64
  

Duplicate Address Detection (DAD)

When an IPv6 address is configured, the router carries out a Duplicate Address Detection (DAD) process using the Solicited-Node Multicast address to check whether the address is already in use on the network. If it is, an error like the one below is reported and the interface is automatically disabled for IPv6 traffic (IPv4 traffic is unaffected). This has an impact on backbone links (IPv6 traffic takes an alternative path) and external peering links (IPv6 peering down, IPv4 peering okay).

  Aug 23 09:18:41.263: %IPV6_ND-6-DUPLICATE_INFO: DAD attempt
  detected for 2001:DB8:0:3:: on Serial1/1
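
DAD probes are sent to the solicited-node multicast address of the address being tested, which is formed by appending the low-order 24 bits of that address to ff02::1:ff00:0/104. The Python sketch below shows the derivation; the helper name `solicited_node` is an assumption for illustration.

```python
import ipaddress

# Base of the solicited-node multicast range, ff02::1:ff00:0/104.
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(text):
    """Return the solicited-node multicast address for a unicast address."""
    addr = ipaddress.IPv6Address(text)
    low24 = int(addr) & 0xFFFFFF  # keep only the last 24 bits
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)
```

For example, an interface address ending in 17:FC0F maps to ff02::1:ff17:fc0f, so only nodes sharing those last 24 bits need to process the probe.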
  

Nibble Boundaries

In IPv6 it makes sense to use nibble boundaries. Consider as an example the address prefix 2001:db8:0:10::/61.

The range of addresses in this block is:

  2001:0db8:0000:0010:0000:0000:0000:0000
  2001:0db8:0000:0017:ffff:ffff:ffff:ffff
  

This subnet only runs from 0010 to 0017. The adjacent block is 2001:db8:0:18::/61.

  2001:0db8:0000:0018:0000:0000:0000:0000
  2001:0db8:0000:001f:ffff:ffff:ffff:ffff
  

The address blocks do not use the entire nibble range. Now consider the address block 2001:db8:0:10::/60, the range of addresses in this block are:

  2001:0db8:0000:0010:0000:0000:0000:0000
  2001:0db8:0000:001f:ffff:ffff:ffff:ffff
  

In this case the subnet uses the entire nibble range, 0 to f, which makes the numbering plan for IPv6 simpler. This range can have a particular meaning within the ISP block (for example, infrastructure addressing for a particular Point of Presence (POP)).
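
The ranges quoted above can be reproduced with the Python standard library, which is a convenient way to check a numbering plan. In this sketch the helpers `block_range` and `on_nibble_boundary` are illustrative names; a prefix sits on a nibble boundary when its length is a multiple of 4.

```python
import ipaddress


def block_range(prefix):
    """Return the first and last address of a prefix in fully expanded form."""
    net = ipaddress.ip_network(prefix)
    return net.network_address.exploded, net.broadcast_address.exploded


def on_nibble_boundary(prefix):
    """Return True when the prefix length falls on a 4-bit (nibble) boundary."""
    return ipaddress.ip_network(prefix).prefixlen % 4 == 0


if __name__ == "__main__":
    for prefix in ("2001:db8:0:10::/61", "2001:db8:0:10::/60"):
        print(prefix, block_range(prefix), on_nibble_boundary(prefix))
```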

Stateless Address Auto-Configuration (SLAAC)

Stateless Address AutoConfiguration (SLAAC) is defined in RFC4862. First the booting node builds its own Link-Local address: it generates an EUI-64 interface identifier from its MAC address and prepends FE80:: to it. It then carries out the DAD process to confirm that the address is not already in use. Taking the earlier example:

  FE80::0290:27FF:FE17:FC0F/64 
  

The booting node uses this address to send a router solicitation message to request a router advertisement in order to obtain global scope prefix information. Assuming it receives the prefix 2001:0db8::/64 it generates an interface address using the EUI-64 address.

  2001:0db8::0290:27FF:FE17:FC0F/64
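
Both addresses in this walk-through can be derived with one small Python sketch: the interface identifier is the EUI-64 form of the MAC address, and it is combined with a /64 prefix. The helper names `interface_id` and `slaac_address` are assumptions for illustration, and the sketch covers only the EUI-64 style of interface identifier described in this section.

```python
import ipaddress


def interface_id(mac):
    """Return the 64-bit EUI-64 interface identifier for a MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six octets in the MAC address")
    # Invert the universal/local bit and insert FF FE in the centre.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")


def slaac_address(prefix, mac):
    """Combine a /64 prefix with the EUI-64 interface identifier."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch expects a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | interface_id(mac))


if __name__ == "__main__":
    mac = "00:90:27:17:FC:0F"
    print(slaac_address("fe80::/64", mac))      # link-local address
    print(slaac_address("2001:db8::/64", mac))  # global address
```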

IPv6 Generation

The Perl script ipv6gen is very useful for assigning address blocks from a prefix. Assume the ISP is assigned the prefix 2001:db8::/32 by AfriNIC; it first breaks it down into its constituent /48 prefixes.

The /32 contains 65536 /48 prefixes.

  $ ipv6gen 2001:db8::/32 48 | wc -l
  65536
  
  $ ipv6gen 2001:db8::/32 48 | head -5
  2001:0DB8:0000::/48  # Loopbacks and infrastructure 
  2001:0DB8:0001::/48  # Customers
  2001:0DB8:0002::/48
  2001:0DB8:0003::/48
  2001:0DB8:0004::/48
  
  $ ipv6gen 2001:db8::/32 48 | tail -5
  2001:0DB8:FFFB::/48
  2001:0DB8:FFFC::/48
  2001:0DB8:FFFD::/48
  2001:0DB8:FFFE::/48
  2001:0DB8:FFFF::/48
  

Take the first /48, 2001:0DB8:0000::/48 for loopbacks and infrastructure and the remaining 65535 /48 prefix blocks are available for assignment to customers.

Now consider the 2001:0DB8:0000::/48 prefix reserved for loopbacks and infrastructure. Break it down into /64 prefixes. Again there are 65536 prefixes available. Extract the first for loopbacks and the remaining 65535 are available for infrastructure.

  $ ipv6gen 2001:db8:0::/48 64 | wc -l
  65536
  
  $ ipv6gen 2001:db8:0::/48 64 | head -5
  2001:0DB8:0000:0000::/64                # Extract for loopbacks
  2001:0DB8:0000:0001::/64
  2001:0DB8:0000:0002::/64
  2001:0DB8:0000:0003::/64
  2001:0DB8:0000:0004::/64
  
  $ ipv6gen 2001:db8:0::/48 64 | tail -5
  2001:0DB8:0000:FFFB::/64
  2001:0DB8:0000:FFFC::/64
  2001:0DB8:0000:FFFD::/64
  2001:0DB8:0000:FFFE::/64
  2001:0DB8:0000:FFFF::/64
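
Where ipv6gen is not to hand, the same breakdown can be approximated with the Python standard library. This is only a sketch of the arithmetic, not a reimplementation of ipv6gen; the helper names are illustrative, and Python prints prefixes in compressed lower-case form rather than the expanded form shown above.

```python
import ipaddress
from itertools import islice


def count_subnets(prefix, new_prefix):
    """Number of new_prefix-length blocks inside the given prefix."""
    net = ipaddress.ip_network(prefix)
    return 2 ** (new_prefix - net.prefixlen)


def first_subnets(prefix, new_prefix, count):
    """Return the first `count` sub-prefixes without generating them all."""
    net = ipaddress.ip_network(prefix)
    return [str(sub) for sub in islice(net.subnets(new_prefix=new_prefix), count)]


if __name__ == "__main__":
    print(count_subnets("2001:db8::/32", 48))
    print(first_subnets("2001:db8::/32", 48, 5))
    print(first_subnets("2001:db8:0::/48", 64, 5))
```

Counting by arithmetic rather than by enumeration matters for large ranges: a /64 contains 2^64 /128 addresses, which is far too many to list.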
  

Loopbacks

Loopbacks are assigned a /128 mask. There are more than enough host addresses in 2001:0DB8:0000:0000::/64 to accommodate an ISP of any size.

  $ ipv6gen 2001:0DB8:0000:0000::/64 128 | wc -l
  18446744073709551616
  
  $ ipv6gen 2001:0DB8:0000:0000::/64 128 | head -5
  2001:0DB8:0000:0000:0000:0000:0000:0000/128
  2001:0DB8:0000:0000:0000:0000:0000:0001/128
  2001:0DB8:0000:0000:0000:0000:0000:0002/128
  2001:0DB8:0000:0000:0000:0000:0000:0003/128
  2001:0DB8:0000:0000:0000:0000:0000:0004/128
  
  $ ipv6gen 2001:0DB8:0000:0000::/64 128 | tail -5
  2001:0DB8:0000:0000:FFFF:FFFF:FFFF:FFFB/128
  2001:0DB8:0000:0000:FFFF:FFFF:FFFF:FFFC/128
  2001:0DB8:0000:0000:FFFF:FFFF:FFFF:FFFD/128
  2001:0DB8:0000:0000:FFFF:FFFF:FFFF:FFFE/128
  2001:0DB8:0000:0000:FFFF:FFFF:FFFF:FFFF/128
  

Point to Point addresses

Point to Point (P2P) links typically follow the pattern used in IPv4: a /127, mimicking the IPv4 /31, or a /126, mimicking the IPv4 /30. All the prefixes from 2001:0DB8:0000:0001::/64 to 2001:0DB8:0000:FFFF::/64 are available for this in the example.
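
As an illustration of the /127 option, the following Python sketch carves the first few /127 links out of one of the infrastructure /64 prefixes and lists the two endpoint addresses of each. The helper name `p2p_127_links` and the choice of 2001:db8:0:1::/64 are assumptions for the example; an operator might equally reserve a whole /64 per link and configure only a /127 from it.

```python
import ipaddress
from itertools import islice


def p2p_127_links(prefix, count):
    """Return (link, end_a, end_b) for the first `count` /127s in a prefix."""
    net = ipaddress.ip_network(prefix)
    links = []
    for link in islice(net.subnets(new_prefix=127), count):
        # A /127 holds exactly two addresses, one for each end of the link.
        links.append((str(link), str(link[0]), str(link[1])))
    return links


if __name__ == "__main__":
    for link, end_a, end_b in p2p_127_links("2001:db8:0:1::/64", 3):
        print(link, end_a, end_b)
```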

IPv6 configuration

To enable IPv6 the following global command is required:

  Router(config)# ipv6 unicast-routing
  

Also enable IPv6 CEF (not on by default):

  Router(config)# ipv6 cef
  

Also disable IPv6 Source Routing (enabled by default):

  Router(config)# no ipv6 source-routing
  

To configure a global or unique-local IPv6 address the following interface command should be entered:

  Router(config-if)# ipv6 address X:X..X:X/prefix
  

To configure an EUI-64 based IPv6 address the following interface command should be entered:

  Router(config-if)# ipv6 address X:X::/prefix eui-64
  

EUI-64 is not helpful on a router and is not recommended.

If no global IPv6 address is required on an interface, yet it needs to carry IPv6 traffic:

Enable IPv6 on that interface using:

  Router(config-if)# ipv6 enable
  

Which will result in a link-local IPv6 address being constructed automatically

FE80:: is concatenated with the Interface ID to give:

  FE80::interface-id
  

Configuring an IPv6 address (whether global or unique-local) will also result in a link-local IPv6 address being created

  Router1# configure terminal
  Router1(config)# no ipv6 source-routing
  Router1(config)# ipv6 unicast-routing
  Router1(config)# ipv6 cef
  Router1(config)# interface FastEthernet 0/0
  Router1(config-if)# ipv6 enable
  Router1(config-if)# ^Z
  

Static Routing

The following is a static route for the network 2001:db8::/64 via a next-hop networking device at 2001:DB8:0:ABCD::1, with an administrative distance of 150.

  Router1(config)# ipv6 route 2001:DB8::/64 2001:DB8:0:ABCD::1 150
  

Dynamic Routing in IPv6

Dynamic Routing in IPv6 is unchanged from IPv4:

Configuring routing

Support for IPv6 in the major routing protocols.

Open Shortest Path First version 3 (OSPFv3)

Open Shortest Path First version 2 (OSPFv2) is an IGP for IPv4; it does not support IPv6. Open Shortest Path First version 3 (OSPFv3) was developed for IPv6, so where IPv4 and IPv6 coexist and OSPF is the IGP it is necessary to run both versions of the routing protocol.

  Router1(config)# ipv6 router ospf 1
  Router1(config-rtr)# router-id 1.1.1.1
  
  Router1(config)# interface Ethernet0
  Router1(config-if)# ipv6 address 2001:db8:1:1::1/64
  Router1(config-if)# ipv6 ospf 1 area 0
  
  Router1(config)# interface Ethernet1
  Router1(config-if)# ipv6 address 2001:db8:2:2::2/64
  Router1(config-if)# ipv6 ospf 1 area 1
  

IS-IS

RFC5308 adds IPv6 address family support to IS-IS. RFC5120 defines the Multi-Topology concept, which permits IPv4 and IPv6 topologies that are not identical; this allows IPv6 to be rolled out without impacting IPv4 operations.

  Router1(config)# interface ethernet 1
  Router1(config-if)# ip address 10.1.1.1 255.255.255.0
  Router1(config-if)# ipv6 address 2001:db8:1::a/64
  Router1(config-if)# ip router isis
  Router1(config-if)# ipv6 router isis
  
  Router1(config)# interface ethernet 2
  Router1(config-if)# ip address 10.2.1.1 255.255.255.0
  Router1(config-if)# ipv6 address 2001:db8:2::a/64
  Router1(config-if)# ip router isis
  Router1(config-if)# ipv6 router isis
  
  Router1(config)# router isis
  Router1(config-router)# net 42.0001.0000.0000.072c.00
  Router1(config-router)# metric-style wide
  

Multi-Protocol BGP for IPv6

Multiprotocol BGP is defined in RFC4760, and its use for IPv6 inter-domain routing in RFC2545. It is an enhanced BGP that is capable of carrying routing information for multiple network layer protocol address families, such as the IPv4 and IPv6 address families, and for IP multicast routes.

  Router1(config)# interface Ethernet0
  Router1(config-if)# ipv6 address 2001:db8:2:1::f/64
  
  Router1(config)# router bgp 65001
  Router1(config-router)# bgp router-id 10.10.10.1
  Router1(config-router)# no bgp default ipv4-unicast
  Router1(config-router)# neighbor 2001:db8:2:1::1 remote-as 65002
  Router1(config-router)# address-family ipv6
  Router1(config-router-af)# neighbor 2001:db8:2:1::1 activate
  Router1(config-router-af)# neighbor 2001:db8:2:1::1 prefix-list bgp65002in in
  Router1(config-router-af)# neighbor 2001:db8:2:1::1 prefix-list bgp65002out out
  Router1(config-router-af)# exit-address-family
  

BGP Scaling

The original BGP specification from the early 1990s did not scale.

Current Best Practice Scaling Techniques:

Deprecated Scaling Techniques:

Route Refresh

Without route refresh, a BGP peer reset is required after every policy change, because the router does not store prefixes that were rejected by policy. A hard BGP peer reset tears down the BGP peering, consumes CPU and severely disrupts connectivity for all networks. During a soft BGP peer reset (or Route Refresh) the BGP peering remains active and only those prefixes affected by the policy change are impacted, which facilitates non-disruptive policy changes. No configuration is needed as the capability is negotiated automatically at peer establishment, and no additional memory is used. Peering routers are required to support the route refresh capability (RFC2918).

Tell peer to resend full BGP announcement:

  Router1# clear ip bgp x.x.x.x [soft] in
  

Resend full BGP announcement to peer:

  Router1# clear ip bgp x.x.x.x [soft] out
  

Only hard-reset a BGP peering as a last resort. Consider the impact to be equivalent to a router reboot.

Cisco’s Peer Groups

The problem is how to scale iBGP: a large iBGP mesh is slow to build, iBGP neighbours receive the same update and router CPU is wasted on repeated calculations.

A solution is peer-groups: peers with the same outbound policy are grouped and updates are generated once per group.

The advantage of peer groups is:

Configure a Cisco peer group

Note how 2.2.2.2 has a different inbound filter from the peer-group. An explicit configuration overrides a configuration within the peer group.

Here is an example for iBGP.

  Router1(config)# router bgp 100
  Router1(config-router)# neighbor ibgp-peer peer-group
  Router1(config-router)# neighbor ibgp-peer remote-as 100
  Router1(config-router)# neighbor ibgp-peer update-source loopback 0
  Router1(config-router)# neighbor ibgp-peer send-community
  Router1(config-router)# neighbor ibgp-peer route-map outfilter out
  Router1(config-router)# neighbor 1.1.1.1 peer-group ibgp-peer
  Router1(config-router)# neighbor 2.2.2.2 peer-group ibgp-peer
  Router1(config-router)# neighbor 2.2.2.2 route-map infilter in
  Router1(config-router)# neighbor 3.3.3.3 peer-group ibgp-peer

Here is an example for eBGP.

  Router1(config)# router bgp 100
  Router1(config-router)# neighbor external-peer peer-group
  Router1(config-router)# neighbor external-peer send-community
  Router1(config-router)# neighbor external-peer route-map set-metric out
  Router1(config-router)# neighbor 160.89.1.2 remote-as 200
  Router1(config-router)# neighbor 160.89.1.2 peer-group external-peer
  Router1(config-router)# neighbor 160.89.1.4 remote-as 300
  Router1(config-router)# neighbor 160.89.1.4 peer-group external-peer
  Router1(config-router)# neighbor 160.89.1.6 remote-as 400
  Router1(config-router)# neighbor 160.89.1.6 peer-group external-peer
  Router1(config-router)# neighbor 160.89.1.6 filter-list infilter in
  

Configure a Cisco update group

Update-groups are an internal IOS mechanism that takes over the performance gains introduced by peer-groups.

  Router1# show ip bgp 10.0.0.0/26
  BGP routing table entry for 10.0.0.0/26, version 2
  Paths: (1 available, best #1, table default)
    Advertised to update-groups:
       1
    Refresh Epoch 1
    Local
      0.0.0.0 from 0.0.0.0 (10.0.15.241)
      Origin IGP, metric 0, localpref 100, weight 32768, valid...
  

Always configure peer-groups for iBGP, even if there are only a few iBGP peers, as they make it easier to scale the network in the future and make the configuration easier to read.

Consider using peer-groups for eBGP too. They are especially useful for multiple BGP customers using the same AS (RFC2270), at Exchange Points where ISP policy is generally the same to each peer, and on a Route Server where all peers receive the same routing updates.

Route Reflectors

Route reflectors are used for scaling the iBGP mesh. They help large networks avoid a full iBGP mesh of ½n(n-1) sessions. For example, if n=1000 there are nearly half a million iBGP sessions.
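
The arithmetic behind the full-mesh figure is easy to check in Python. The second function is only a rough comparison under an assumed design, in which the reflectors are fully meshed with each other and every remaining router is a client of every reflector; real topologies, such as clustered or hierarchical reflectors, will give different counts.

```python
def full_mesh_sessions(n):
    """Number of iBGP sessions in a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2


def route_reflector_sessions(n, reflectors):
    """Session count for an ASSUMED design: meshed reflectors, and every
    other router peering with each reflector as a client."""
    clients = n - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


if __name__ == "__main__":
    print(full_mesh_sessions(1000))
    print(route_reflector_sessions(1000, 2))
```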

Solutions:

The reflector receives paths from clients and non-clients and selects the best path. If the best path is from a client, it is reflected to other clients and non-clients. If the best path is from a non-client, it is reflected to clients only. Clients do not need to be meshed with each other. Route reflection is described in RFC4456.
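
The reflection rules can be expressed as a small decision function. The Python sketch below is a simplified model for teaching purposes only: the peer names and the `reflect_targets` helper are hypothetical, and it ignores best-path selection and the loop-avoidance attributes.

```python
def reflect_targets(source, peers):
    """Return the iBGP peers a reflector advertises a best path to.

    `peers` maps each peer name to "client" or "non-client"; `source`
    is the peer the best path was learned from.
    """
    source_role = peers[source]
    targets = []
    for name, role in peers.items():
        if name == source:
            continue  # never send the path back to the peer it came from
        # From a client: send to everyone. From a non-client: clients only.
        if source_role == "client" or role == "client":
            targets.append(name)
    return sorted(targets)


if __name__ == "__main__":
    peers = {"c1": "client", "c2": "client", "n1": "non-client", "n2": "non-client"}
    print(reflect_targets("c1", peers))
    print(reflect_targets("n1", peers))
```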

Route Reflector Topology

Loop Avoidance

Route Reflector: Benefits

Most new service provider networks now deploy Route Reflectors from day one.

Configuring a Route Reflector

  Router1(config)# router bgp 100
  ...
  Router1(config-router)# neighbor 1.2.3.4 remote-as 100
  Router1(config-router)# neighbor 1.2.3.4 route-reflector-client
  Router1(config-router)# neighbor 1.2.3.5 remote-as 100
  Router1(config-router)# neighbor 1.2.3.5 route-reflector-client
  Router1(config-router)# neighbor 1.2.3.6 remote-as 100
  Router1(config-router)# neighbor 1.2.3.6 route-reflector-client
  ...

BGP Confederations

BGP Confederations are described in RFC5065. The AS is divided into sub-ASes with eBGP between the sub-ASes, but some iBGP information is kept: NEXT_HOP is preserved across the sub-ASes (the IGP carries this information), as are LOCAL_PREF and MED. There is usually a single IGP.

The confederation is visible to the outside world as a single AS, the Confederation Identifier. Each sub-AS uses a number from the private AS space (64512-65534). iBGP speakers within a sub-AS are fully meshed, so the total number of neighbors is reduced by limiting the full mesh requirement to only the peers in the sub-AS. Route reflectors can also be used within a sub-AS.

  Router1(config)# router bgp 65532
  Router1(config-router)# bgp confederation identifier 200
  Router1(config-router)# bgp confederation peers 65530 65531
  Router1(config-router)# neighbor 141.153.12.1 remote-as 65530
  Router1(config-router)# neighbor 141.153.17.2 remote-as 65531
  

Principles:

Loop avoidance:

Caveats

Confederations: Benefits

Confederations are useful for linking an acquired network into the main network.