Troubleshooting IPsec Connections
IPsec connection names
IPsec tunnels follow a consistent naming pattern when forming connection names
used in the strongSwan configuration. These names are printed in the IPsec
status and can also be found in the IPsec configuration file
(/var/etc/ipsec/swanctl.conf), the IPsec log, and the output of various
swanctl commands.
Non-mobile tunnels all use an IKE connection named conX, where X is the phase 1 IKE ID.
Phase 2 child definitions use slightly different names based on the tunnel settings:
- For normal IKEv2 tunnels without Split Connections enabled, all phase 2 entries are combined into a single child definition. In this case the child is named conX, where X is the phase 1 IKE ID. This is identical to the name of the IKE portion of the connection.
- For IKEv1 tunnels and for IKEv2 tunnels with Split Connections enabled, each phase 2 entry is defined as a separate child. In this case the child definitions are named conX_Y, where X is the phase 1 IKE ID and Y is the phase 2 reqid.
Note
The phase 1 IKE ID and phase 2 reqid are printed in the IPsec tunnel list and on the page when editing those entries.
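The naming rules above can be illustrated with a small shell helper that splits a connection name back into its parts. This is only a sketch: the function name describe_conn_name is hypothetical and is not part of pfSense software or strongSwan, and it only recognizes the conX and conX_Y forms described in this section.

```shell
# Split a connection name such as "con3" or "con3_2" into its parts.
# describe_conn_name is a hypothetical helper, not a pfSense or swanctl command.
describe_conn_name() {
    case "$1" in
        con[0-9]*_[0-9]*)
            # conX_Y: X is the phase 1 IKE ID, Y is the phase 2 reqid
            rest=${1#con}
            echo "ike_id=${rest%%_*} reqid=${rest#*_}"
            ;;
        con[0-9]*)
            # conX: the IKE connection, or the combined child of the same name
            echo "ike_id=${1#con}"
            ;;
        *)
            echo "not a conX or conX_Y name: $1" >&2
            return 1
            ;;
    esac
}
```

For example, describe_conn_name con3_2 reports IKE ID 3 and reqid 2, while describe_conn_name con1 reports only IKE ID 1.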
To see a list of current connections, run the following command from the shell:
# swanctl --list-conns
The output of that command lists the IKE connection name first (e.g. con1)
with no indentation. Child definitions are listed at the end of a tunnel entry
and are indented.
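Since IKE connection names are the only unindented lines in that output, they can be filtered out with awk. The following is a minimal sketch under the assumption that each unindented line starts with the connection name followed by a colon (for example "con1: ..."). The function name ike_conn_names is hypothetical.

```shell
# Print only the top-level (unindented) IKE connection names found in
# `swanctl --list-conns` output supplied on stdin.
# Assumption: each unindented line begins with "<name>:".
ike_conn_names() {
    awk '/^[^[:space:]]/ { sub(/:.*/, "", $1); print $1 }'
}
```

Usage from the shell: swanctl --list-conns | ike_conn_names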
Manually connect IPsec from the shell
Connections can be manually initiated and terminated from the shell using the
swanctl command.
Tip
When initiating a tunnel in this way, swanctl will output only the
relevant logs to the terminal. This is much easier than attempting to follow
the log file contents in other ways.
The connection name for a tunnel must be used in this case, such as con1 or con2_1.
Note
To locate the correct con identifier, see IPsec connection names.
The following command will attempt to initiate the IKE portion of a tunnel (phase 1):
# swanctl --initiate --ike conX
The following command will attempt to initiate the child SA portion of a tunnel (phase 2) as well as IKE if it is not already connected:
# swanctl --initiate --child conX
Terminating a tunnel uses similar syntax.
Terminate IKE connection (also terminates all child connections):
# swanctl --terminate --ike conX
Terminate a child connection:
# swanctl --terminate --child conX
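When testing, the terminate and initiate commands above are often run back to back. The sketch below wraps them in one function. The function name restart_tunnel and the SWANCTL override variable are hypothetical and exist only to allow a dry run. The swanctl options are the same ones shown above.

```shell
# Restart one tunnel: terminate the IKE connection (which also terminates
# its children), then initiate the child, which brings IKE up again as needed.
# Arguments: IKE connection name, optional child name (defaults to the IKE name).
# SWANCTL is a hypothetical override, e.g. SWANCTL=echo for a dry run.
restart_tunnel() {
    ike_name=$1
    child_name=${2:-$1}
    swanctl_cmd=${SWANCTL:-swanctl}
    # Termination can fail when the tunnel is already down; continue regardless.
    $swanctl_cmd --terminate --ike "$ike_name" || true
    $swanctl_cmd --initiate --child "$child_name"
}
```

For a tunnel with separate children, pass both names, such as restart_tunnel con2 con2_1. Running it with SWANCTL=echo prints the commands instead of executing them.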
Tunnel does not establish
First check the service status at Status > Services. If the IPsec service is stopped, check if there is at least one configured and enabled IPsec tunnel (IPsec Tunnels Tab).
If the service is running, check the firewall logs at Status > System Logs,
Firewall tab. Look for entries that indicate that the connection is being
blocked. If the tunnel is not establishing, check for UDP entries for ports
500 and 4500. Rules are normally added automatically for IPsec
(IPsec and firewall rules), but that feature can be disabled or there
may be edge cases where the firewall cannot identify the remote IPsec gateway.
Add rules to pass traffic if needed.
The single most common cause of failed IPsec tunnel connections is a configuration mismatch. Often it is something small, such as a DH group set differently, or perhaps a subnet mask of /24 on one side and /32 on the other in the phase 2 networks. Some routers (Linksys, for one) also like to hide certain options behind “Advanced” buttons or make assumptions. A lot of trial and error may be involved, and a lot of log reading, but ensuring that both sides match precisely will help the most.
Depending on the Internet connections on either end of the tunnel, it is also
possible that a router involved on one side or the other does not properly
handle IPsec traffic. This is a larger concern with mobile clients and networks
where NAT is involved outside of the actual IPsec endpoints. The problems
generally involve ESP traffic being blocked or mishandled along the way.
NAT Traversal (NAT-T) encapsulates ESP in UDP port 4500 traffic to work
around these issues. Typically this situation is detected automatically but in
some edge cases it can help to force NAT traversal for IKEv1 tunnels.
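A packet capture can help confirm whether IKE (UDP 500), NAT-T (UDP 4500), or ESP traffic from the remote gateway is arriving at all. The sketch below only builds a capture filter for tcpdump. The function name ipsec_capture_filter is hypothetical, and the interface name and address in the usage comment are placeholders, not values from this document.

```shell
# Build a tcpdump/pcap filter matching IKE (UDP 500), NAT-T (UDP 4500),
# and ESP traffic to or from one remote gateway.
# ipsec_capture_filter is a hypothetical helper name.
ipsec_capture_filter() {
    echo "host $1 and (udp port 500 or udp port 4500 or esp)"
}

# Example use (em0 and 203.0.113.5 are placeholders):
#   tcpdump -ni em0 "$(ipsec_capture_filter 203.0.113.5)"
```

If only UDP 500 packets appear and the tunnel never completes, that suggests UDP 4500 or ESP is being dropped somewhere along the path, though the logs on both sides should still be checked.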
“Random” tunnel disconnects/DPD failures on low-end routers
If IPsec tunnels are dropped on low-end hardware that is pushing the limits of its CPU, DPD on the tunnel may need to be disabled. Such failures tend to correlate with times of high bandwidth usage. This happens when the CPU on a low-power system is tied up with sending IPsec traffic or is otherwise occupied. Due to the CPU overload, it may not respond to DPD requests in time or may miss a response to a request of its own. As a consequence, the tunnel will fail a DPD check and be disconnected. This is a clear sign that the hardware is being driven beyond its capacity. If this happens, consider replacing the firewall with a more powerful model.
Tunnels establish and work but fail to renegotiate
In some cases a tunnel will function properly but once the phase 1 or phase 2 lifetime expires the tunnel will fail to renegotiate properly. This can manifest itself in a few different ways, each with a different resolution.
DPD is unsupported and one side drops while the other remains
Consider this scenario, which DPD is designed to prevent but which can still happen where DPD is unsupported:
A tunnel is established from Site A to Site B, from traffic initiated at Site A.
Site B expires the phase 1 or phase 2 before Site A.
Site A will believe the tunnel is up and continue to send traffic as though the tunnel is working properly.
Only when the Site A phase 1 or phase 2 lifetime expires will it renegotiate as expected.
In this scenario, the likely resolutions are:
Check to make sure all of the settings match on both sides, especially the phase 1 DH Group and phase 2 PFS values.
Enable DPD, or Site B must send traffic to Site A which will cause the entire tunnel to renegotiate. The easiest way to make this happen is to enable a keep alive mechanism on both sides of the tunnel.
Enable the periodic check keep alive method on one end (Configuring IPsec Keep Alive).
Tunnel establishes when initiating but not when responding
If a tunnel establishes sometimes, but not always, there is generally a settings mismatch. The tunnel may still establish in one direction because when the settings presented by one side are more secure, the other side may accept them, but not the other way around.
Lifetime mismatches do not cause a failure in phase 1 or phase 2.
To track down these failures, configure the logs as shown in Troubleshooting IPsec Logs and attempt to initiate the tunnel from each side, then check the logs.
Tunnel establishes at start but not when disconnected
An IPsec tunnel can be disconnected for a variety of reasons. For example, connectivity to the far side may be interrupted, the remote may be down or offline for an extended time, or a manual or policy action may be taken on the far side.
Note
This is not the same scenario as a rekey or reauthentication event, which will rebuild the appropriate parts of the tunnel and remain active.
A tunnel mode IPsec instance will connect at start and when it disconnects, will connect again on demand. This happens due to trap policies which trigger initiation when traffic attempts to use the tunnel. A tunnel mode IPsec connection can be reconnected without manual intervention by the automatic ping keep alive function on a phase 2 entry.
VTI mode IPsec cannot support trap policies so it is not capable of using this tactic. As such, a VTI tunnel may need help to stay up and running at all times.
There are two workarounds that may help in this case:
- Keep Alive - Periodic Check:
The IPsec phase 2 Keep Alive option to perform a periodic IPsec status check is ideally suited to this case. When enabled, if a given phase 2 is down it will trigger an initiation directly.
This works with VTI because it does not rely on trap policies.
Note
This feature is new in pfSense® Plus software version 22.01 and CE 2.6.0.
- Child SA Actions:
Another tactic to keep a tunnel up is to set it to initiate immediately at start and automatically reconnect if it gets disconnected. This should only be set on one side of a tunnel.
- Child SA Start Action:
Set the start action to Initiate at start. This will trigger a tunnel initiation when the IPsec daemon starts, such as at boot time.
Note
This does not trigger when the IPsec configuration is changed and reloaded, only when the daemon loads the configuration the first time at startup.
- Child SA Close Action:
Set the close action to Restart/Reconnect which will attempt to immediately reconnect the child SA if it gets disconnected.
Depending on the reason the tunnel was disconnected, this may or may not be helpful. For example, if the reason the tunnel disconnected was a local cause, these events may not trigger. The periodic check keep alive method is much more reliable, but only available on current versions of pfSense software.
Tunnel stops attempting connections after timeout
If the remote end of an IPsec tunnel is down when the tunnel attempts to initiate at start, the attempt fails and the tunnel may eventually time out and stop trying to connect.
The solution here is similar to the previous scenario: enable keep alive options for the tunnel, which will trigger a fresh initiation periodically if the tunnel is down.
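For manual testing from the shell, the idea behind a periodic check can be approximated with swanctl. The sketch below is not how the built-in keep alive feature is implemented. It assumes that swanctl --list-sas prints each child SA on an indented line beginning with "<name>:" and containing the word INSTALLED when the child is up. The function names and the SWANCTL override variable are hypothetical.

```shell
# Succeed if `swanctl --list-sas` output on stdin shows the named child SA
# on an indented line marked INSTALLED (assumed output format).
child_is_installed() {
    awk -v name="$1" '
        /^[[:space:]]/ && $1 == (name ":") && /INSTALLED/ { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Initiate the child only when it is not currently installed.
# SWANCTL is a hypothetical override that allows testing with a stub command.
ensure_child_up() {
    swanctl_cmd=${SWANCTL:-swanctl}
    if $swanctl_cmd --list-sas | child_is_installed "$1"; then
        echo "$1 is up"
    else
        $swanctl_cmd --initiate --child "$1"
    fi
}
```

Running ensure_child_up con1 by hand shows whether a fresh initiation succeeds once the remote end is reachable again. For ongoing use, the built-in keep alive options remain the better choice.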