VPN & SD-WAN Zero Downtime Failover - Best Practice Guide
Quote from mpachmann on 13 December 2023, 11:53
https://community.sophos.com/sophos-xg-firewall/f/recommended-reads/143009/sophos-firewall-vpn-sd-wan-zero-downtime-failover---best-practice-guide
Disclaimer: This information is provided as-is for the benefit of the Community. Please contact Sophos Professional Services if you require assistance with your specific environment.
Overview
This Recommended Read is a best-practice guide to building connectivity across multiple connections with resilient failover and a zero-downtime approach, using real-world scenarios and examples.
We expand on new and existing features of SFOS (Sophos Firewall OS) and show how to use them to modernize network connectivity. This guide offers guidance on building and scaling your own network. The configuration may differ depending on your specific environment.
All steps in this guide can be automated with Sophos Central Orchestration for larger deployments. For customization and learning purposes, we walk through the entire configuration in this guide.
Product and Prerequisites
- Read-write permissions on the SFOS web admin for the relevant features.
- SFOS version 19.5 or later; no specific subscription is required.
- Version 20 was used in this guide.
Network Diagram
To showcase the network design and the features in action, we built an example network.
- Wiesbaden is our HQ with Dual WAN connectivity.
- Berlin and Hamburg are our branch offices with a single WAN.
- We use a star topology, and all locations connect to our HQ.
Additionally, we are connecting to a Third-Party VPN Site, which is our responsibility.
Best Practice: Use the location name as the hostname of the SFOS Appliance to tell the web admin consoles apart during configuration.
Note: In the following screenshots, we move between Wiesbaden and Hamburg. The hostname is shown in the top right corner.
Configuration
VPN
Our goal in the VPN section is to fully utilize both WAN connections.
We will be using a Route Based VPN approach with XFRM interfaces to utilize 4 tunnels. Each Location will build up 2 tunnels.
As an example, we will go through the configuration of one of the tunnels.
For ease of deployment, we build the tunnels on both peers at the same time, so we can copy and paste the configuration between the firewalls and avoid mistakes.
We are using a Route Based VPN Tunnel (called XFRM or Tunnel Interface).
- Wiesbaden as the HQ is in Respond only mode.
- The IKEv2 (Internet Key Exchange Protocol Version 2) profiles are predefined and can be selected on every SFOS installation.
- RSA Key is set as the Authentication Type on the tunnel, because RSA keys allow quick configuration by copy/paste and do not require external key management.
In Hamburg, we initiate the connection and select the same settings.
- By default, each appliance generates its own RSA key and expects the remote key of the peer appliance.
The last part is the combination of the listening interface on the local appliance and the (remote) gateway address.
Listening interfaces are WAN interfaces on the local firewall. For this tunnel, we select WAN1 (Fiber). The gateway address reflects the WAN IP / FQDN of the remote location (WAN1 of Hamburg).
Note: We are using DNS hostnames for the remote peers. Based on your setup, you will have different WAN IPs (Static IP or Dynamic DNS).
Best Practice: You can also use Dynamic DNS records for the Gateway address, or a Wildcard (*) if you have a dynamic peer.
Additionally, we set the Local ID Type to DNS and give both IDs meaningful names. This helps to identify the tunnels. This step is not mandatory but recommended.
We repeat this four times for the 4 tunnels at HQ, and twice at each branch location.
To verify this, we can check the status of the tunnels in the IPsec overview section by going to CONFIGURE > Site-to-site VPN > IPsec tab.
HQ
Separate Site Location
XFRM Interface mapping recommendation
XFRM Interfaces are like a long cable between two appliances, and both ends of the tunnel require an IP Address.
- You should not reuse the same network range on multiple tunnels
- Segmentation is needed.
A common approach is to slice a /24 subnet into multiple /30 subnets, for example:
Tunnel1: 10.252.0.1/30 + 10.252.0.2/30
Tunnel2: 10.252.0.5/30 + 10.252.0.6/30
Tunnel3: 10.252.0.9/30 + 10.252.0.10/30
[…]
Tunnel64: 10.252.0.253/30 + 10.252.0.254/30
This approach does not require much planning, but it is difficult to identify the correct interface in case of an issue.
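The /30 slicing described above can be reproduced with a few lines of Python. This is only a planning sketch using the standard `ipaddress` module; the function name `slice_transfer_networks` is made up for illustration and is not part of SFOS.

```python
import ipaddress


def slice_transfer_networks(parent: str):
    """Yield (tunnel_number, local_ip, peer_ip) for every /30 inside parent."""
    network = ipaddress.ip_network(parent)
    for number, subnet in enumerate(network.subnets(new_prefix=30), start=1):
        # A /30 has exactly two usable host addresses: one per tunnel end.
        local_ip, peer_ip = subnet.hosts()
        yield number, f"{local_ip}/30", f"{peer_ip}/30"


plan = list(slice_transfer_networks("10.252.0.0/24"))
for number, local_ip, peer_ip in plan[:3]:
    print(f"Tunnel{number}: {local_ip} + {peer_ip}")
# Tunnel1: 10.252.0.1/30 + 10.252.0.2/30
# Tunnel2: 10.252.0.5/30 + 10.252.0.6/30
# Tunnel3: 10.252.0.9/30 + 10.252.0.10/30
```

A /24 yields 64 such transfer networks, which matches the Tunnel1 to Tunnel64 range in the list.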
The alternative approach used in this guide is to create a readable segment per tunnel. We give every tunnel its own assignment.
XFRM Interface
A Route Based Tunnel (Tunnel Interface) creates an XFRM Tunnel interface. In Configuration > Network > Interfaces you will see the XFRM Interface under each WAN Interface.
Note: You can click the blue line to expand the WAN Interface.
XFRM Interfaces are transfer networks. You need an IP Address on both ends of the tunnel, which must be in the same subnet range.
Best Practice: For better visibility, the XFRM Interface name can match the tunnel name.
We strongly encourage you to plan your XFRM Interfaces in advance.
There are multiple ways to approach this setup; ours is only one idea and may not suit your environment. You should lay out the XFRM Interface mapping before starting the configuration to avoid misconfiguration.
SD-WAN gateway
With the IPsec tunnels in place, we want to route traffic through those tunnels.
For routing, we have multiple approaches such as the following:
- Static Routing
- Dynamic Routing
- SD-WAN Routing
To build zero downtime and high utilization of all WAN interfaces, we decided to use SD-WAN routes.
The following components are needed:
- Gateway
- SD-WAN Profiles
- SD-WAN Route
One gateway is needed for every XFRM tunnel.
We choose the XFRM Interface which we want to use and the peer gateway IP address.
In this case, the interface is 10.20.10.1 and the peer gateway IP address is 10.20.10.2.
- By selecting the Interface, it is easy to identify the local IP address.
- With a /30 XFRM assignment, the other usable IP address in the subnet is the gateway.
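To illustrate the /30 rule, the following sketch derives the peer gateway address from the local XFRM interface address. It assumes the interface uses a /30 mask, as in the example above; `peer_gateway` is a hypothetical helper name, not an SFOS function.

```python
import ipaddress


def peer_gateway(local_interface: str) -> str:
    """Return the other usable host address in the /30 of local_interface."""
    iface = ipaddress.ip_interface(local_interface)
    if iface.network.prefixlen != 30:
        raise ValueError("this sketch only handles /30 transfer networks")
    # A /30 has two usable hosts; the peer is the one that is not ours.
    peers = [host for host in iface.network.hosts() if host != iface.ip]
    if len(peers) != 1:
        raise ValueError("local address is not a usable host in its /30")
    return str(peers[0])


print(peer_gateway("10.20.10.1/30"))  # 10.20.10.2
```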
SD-WAN profiles
Create one SD-WAN Profile per location that you want to make reachable.
In the SD-WAN Profiles, we select all applicable locations and choose “Load Balancing”.
The Load Balancing method determines when SFOS selects a specific VPN tunnel.
You will find more information about Load Balancing in the Appendix.
For our example, we set gateway weights of 1 and 1, which results in 50% / 50% load balancing. Your setup may use different weights based on throughput or other criteria such as cost.
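The relationship between weights and percentages is simple arithmetic: each gateway's nominal share is its weight divided by the sum of all weights. The sketch below only shows that calculation. It does not model how SFOS assigns individual connections internally, and the gateway names are placeholders.

```python
from fractions import Fraction


def traffic_shares(weights: dict[str, int]) -> dict[str, Fraction]:
    """Return each gateway's nominal share: weight / sum of all weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one gateway needs a positive weight")
    return {name: Fraction(weight, total) for name, weight in weights.items()}


# Placeholder gateway names; weights 1 and 1 as in the example above.
shares = traffic_shares({"GW_Hamburg_WAN1": 1, "GW_Hamburg_WAN2": 1})
for name, share in shares.items():
    print(f"{name}: {float(share):.0%}")
# GW_Hamburg_WAN1: 50%
# GW_Hamburg_WAN2: 50%
```

With weights of 3 and 1, for instance, the same calculation gives 75% and 25%.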
The SLA strategy can vary in your setup. You can leave all settings at their defaults or customize them based on your needs.
We are using the peer Location LAN IP as a probe target (192.168.130.1 is LAN Port1 of the Hamburg Site).
SD-WAN Routes
Our last step is to make networks reachable. We use SD-WAN Routes to build up the routes between our locations.
We created one SD-WAN Route per location so that each location is reachable.
In the SD-WAN Routes, we use IP Host Groups as the destination and leave the rest as ANY. You can customize this if you want to route only specific traffic.
Best Practice: By using IP Host Groups, we can always make sure all network ranges of all locations are routed correctly.
Best Practice: We are publishing and maintaining those objects via Sophos Central Firewall Management. Hamburg Networks, Berlin Networks and Wiesbaden Networks are automatically published on all Firewalls to ensure the routes are published. Changing the object in Central will change the routing as well.
You can use network objects as well; we found IP host groups simpler and less error-prone.
Firewall Rules
We are using VPN to LAN and LAN to VPN Firewall rules in this Guide.
You should build your firewall rule concept based on your security needs.
Redundancy and Zero Downtime
Proof of Concept
If a client from Berlin tries to reach a client in Hamburg, we can see the connection in the Diagnostic – Packet capture section.
In Wiesbaden you will see packets coming from XFRM3, going out on XFRM1 and the response from XFRM1 going back to XFRM3.
Note: Packet capture shows the newest packet at the top. To read a connection, start with the last entry.
Failover scenario
Our WAN1 fails in Wiesbaden, which results in the failure of 2 of the 4 tunnels. Everything remains reachable through the other connection.
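The effect of a WAN failure on the tunnels can be sketched as a small lookup. The tunnel-to-WAN mapping below is an assumption inferred from the packet capture description in this guide (XFRM1 and XFRM3 before the failover, XFRM2 and XFRM4 after it); adapt it to your own interface plan.

```python
# Assumed mapping at the Wiesbaden HQ: tunnel -> (HQ WAN link, branch).
TUNNELS = {
    "XFRM1": ("WAN1", "Hamburg"),
    "XFRM2": ("WAN2", "Hamburg"),
    "XFRM3": ("WAN1", "Berlin"),
    "XFRM4": ("WAN2", "Berlin"),
}


def surviving_tunnels(failed_wan: str) -> dict[str, list[str]]:
    """Return, per branch, the tunnels that stay up when failed_wan goes down."""
    result: dict[str, list[str]] = {branch: [] for _, branch in TUNNELS.values()}
    for tunnel, (wan, branch) in TUNNELS.items():
        if wan != failed_wan:
            result[branch].append(tunnel)
    return result


print(surviving_tunnels("WAN1"))
# {'Hamburg': ['XFRM2'], 'Berlin': ['XFRM4']}
```

Every branch keeps one tunnel, which is why both locations stay reachable after the WAN1 failure.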
Zero Downtime failover
By default, our setup supports zero-downtime failover in the failover scenario above. Our clients will not notice a failover to the other tunnel, and all connections remain active.
We can see this in the packet capture as well.
Before the failover we were using XFRM1 and XFRM3.
After the failover, SFOS automatically switched to XFRM4 and XFRM2. Important to note: the source ports of the connection are the same. The client is still connected to the server and did not stop (or rebuild) the connection.
Note: In case of an SD-WAN failover, the “IN Interface” in packet capture is shown incorrectly in the web admin. This is a known issue and only cosmetic.
SFOS will automatically fail back to the old VPN connection when WAN1 comes back online. The source port is still unchanged.
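The check performed by eye in the packet capture can be expressed as a comparison: two captured packets belong to the same connection if source IP, source port, destination IP, destination port and protocol match, regardless of the tunnel interface. The sketch below uses made-up addresses and ports; `CaptureRow` is a hypothetical structure, not an SFOS export format.

```python
from typing import NamedTuple


class CaptureRow(NamedTuple):
    out_interface: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def same_connection(a: CaptureRow, b: CaptureRow) -> bool:
    """Compare everything except the interface the packet left on."""
    return a[1:] == b[1:]


# Hypothetical rows: one before and one after the WAN1 failure.
before = CaptureRow("XFRM1", "192.168.120.10", 51514, "192.168.130.10", 443, "TCP")
after = CaptureRow("XFRM2", "192.168.120.10", 51514, "192.168.130.10", 443, "TCP")

print(same_connection(before, after))  # True
```

A changed source port would indicate that the client had to open a new connection.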
Third Party VPN
Connecting to a Third Party can vary based on the product used and how much influence one has on the tunnel. Often a customer just gets a rule set of policies to use.
In our example, the Third Party requires us to NAT our network within the Tunnel (masquerade), and it offers a network range we want to reach. Additionally, we get a rule set of IPsec requirements to follow. This section is also in the Appendix.
We are building up the Tunnel with Site-to-Site Type (Policy Based) and initiate the connection.
In the Gateway Settings we must specify the Local and remote subnet.
Based on the requirements, we use “ThirdPartyNetwork” as stated by the ThirdParty and use the required Translated IP in Local Subnet. To reduce the complexity of this guide, we are not giving examples of those IPs.
We need to create a NAT Rule as well.
The NAT rule matches all traffic going to the ThirdPartyNetwork and masquerades (MASQ) it to the Translated IP.
As the last step, we need to generate an IPsec Route on the CLI for this destination network.
The command is: system ipsec_route add net <remote subnet> tunnelname <ipsec_tunnel>
system ipsec_route add net 192.168.3.0/255.255.255.0 tunnelname ThirdPartyTunnel
Note: You can use “TAB” for auto-completion.
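The command expects the remote subnet as network address and dotted netmask. If you keep your subnets in CIDR notation, a small helper can format the command line shown above. This is only a string-formatting sketch for copy/paste into the CLI; `ipsec_route_command` is a hypothetical helper and does not talk to the firewall.

```python
import ipaddress


def ipsec_route_command(remote_subnet: str, tunnel_name: str) -> str:
    """Format the ipsec_route command from a CIDR subnet and a tunnel name."""
    # IPv4Network raises ValueError if host bits are set, e.g. 192.168.3.1/24.
    network = ipaddress.IPv4Network(remote_subnet)
    return (
        f"system ipsec_route add net "
        f"{network.network_address}/{network.netmask} "
        f"tunnelname {tunnel_name}"
    )


print(ipsec_route_command("192.168.3.0/24", "ThirdPartyTunnel"))
# system ipsec_route add net 192.168.3.0/255.255.255.0 tunnelname ThirdPartyTunnel
```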
The third-party resource is now reachable from Wiesbaden. To make it reachable from Berlin/Hamburg, we need an additional SD-WAN Route.
This SD-WAN Route will route the traffic going to the ThirdPartyNetwork through Wiesbaden to the ThirdParty.
Scalability
Our guide reflects a smaller setup with 2 remote locations, and we want to review the options to scale this system to larger networks.
Steps to attach a new location
Creating a new location includes the following configuration steps:
- IPsec Tunnel on both Appliances
- IPsec XFRM Interfaces on both Appliances
- Gateways on both Appliances
- SD-WAN Profile on both Appliances
- SD-WAN Routes on both Appliances
Using SD-WAN Orchestration in Sophos Central
Sophos Central with the xStream Protection License supports SD-WAN Orchestration and generates all of the items above for all managed appliances.
You'll find more Information here: Sophos Firewall: Managing Firewall and SD-WAN Orchestration
Dynamic Routing
In larger network deployments, administrators might want to use dynamic routing instead of SD-WAN routes. SD-WAN routes offer an easy way to deploy latency, jitter and packet loss measurement and routing decisions without requiring experience with a routing protocol. You may find more information about dynamic routing in the appendix.
Appendix