Troubleshooting IPsec Connections

IPsec connection names

IPsec tunnels follow a consistent naming pattern when forming connection names used in the strongSwan configuration. These names are printed in the IPsec status and can also be found in the IPsec configuration file (/var/etc/ipsec/swanctl.conf), the IPsec log, and the output of various swanctl commands.

Non-mobile tunnels all use an IKE connection named conX where X is the phase 1 IKE ID.

Phase 2 child definitions use slightly different names based on the tunnel settings:

For normal IKEv2 tunnels without Split Connections enabled, all phase 2 entries are combined into a single child definition. In this case the child definition is named conX, where X is the phase 1 IKE ID, which is identical to the name of the IKE portion of the connection.

For IKEv1 tunnels and for IKEv2 tunnels with Split Connections enabled, each phase 2 entry is defined as a separate child. In this case the child definitions are named conX_Y, where X is the phase 1 IKE ID and Y is the phase 2 reqid.

Note

The phase 1 IKE ID and phase 2 reqid are printed in the IPsec tunnel list and on the page when editing those entries.
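The naming rules above can be summarized in a small shell helper. This is only an illustrative sketch: the function name child_conn_name and its "separate"/"combined" argument are hypothetical and are not part of pfSense software or strongSwan.

```shell
# Hypothetical helper: print the expected child definition name.
# Usage: child_conn_name IKE_ID REQID separate|combined
#   separate = IKEv1, or IKEv2 with Split Connections enabled
#   combined = IKEv2 without Split Connections
child_conn_name() {
    ike_id=$1
    reqid=$2
    mode=$3
    case "$mode" in
        separate) printf 'con%s_%s\n' "$ike_id" "$reqid" ;;
        combined) printf 'con%s\n' "$ike_id" ;;
        *) echo "mode must be 'separate' or 'combined'" >&2; return 1 ;;
    esac
}
```

For example, child_conn_name 2 1 separate prints con2_1, the form used for an IKEv1 tunnel with IKE ID 2 and reqid 1.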

To see a list of current connections, run the following command from the shell:

# swanctl --list-conns

The output of that command lists the IKE connection name first (e.g. con1) with no indentation. Child definitions are listed at the end of a tunnel entry and are indented.
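Because IKE connection names are the only lines without indentation, they can be picked out of the listing with a short filter. The sketch below assumes the name ends at the first colon on the line; the function name ike_conn_names is hypothetical.

```shell
# Print only the IKE connection names from `swanctl --list-conns` output
# supplied on stdin. Assumes IKE connection lines start in column 0 and
# that the name ends at the first colon; indented lines are skipped.
ike_conn_names() {
    awk '/^[^ \t]/ { sub(/:.*/, ""); print }'
}

# Usage on the firewall:
#   swanctl --list-conns | ike_conn_names
```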

Manually connect IPsec from the shell

Connections can be manually initiated and terminated from the shell using the swanctl command.

Tip

When initiating a tunnel in this way, swanctl will output only the relevant logs to the terminal. This is much easier than attempting to follow the log file contents in other ways.

The connection name for a tunnel must be used in this case, such as con1 or con2_1.

Note

To locate the correct con identifier, see IPsec connection names.

The following command will attempt to initiate the IKE portion of a tunnel (phase 1):

# swanctl --initiate --ike conX

The following command will attempt to initiate the child SA portion of a tunnel (phase 2) as well as IKE if it is not already connected:

# swanctl --initiate --child conX

Terminating a tunnel uses similar syntax.

Terminate IKE connection (also terminates all child connections):

# swanctl --terminate --ike conX

Terminate a child connection:

# swanctl --terminate --child conX
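When testing repeatedly, the terminate and initiate steps can be combined. The following is a minimal sketch rather than a supported tool: the function name bounce_tunnel and the SWANCTL override variable are hypothetical, and only the swanctl options shown above are used.

```shell
# SWANCTL can be overridden (for example SWANCTL=echo) to preview the
# commands without touching any tunnel.
SWANCTL=${SWANCTL:-swanctl}

# Hypothetical wrapper: tear down a tunnel, then bring it back up.
# Usage: bounce_tunnel IKE_NAME [CHILD_NAME]
# CHILD_NAME defaults to IKE_NAME, which fits IKEv2 tunnels without
# Split Connections.
bounce_tunnel() {
    ike=$1
    child=${2:-$1}
    # Terminating the IKE connection also terminates all child connections.
    # A failure here (such as the tunnel already being down) is ignored.
    $SWANCTL --terminate --ike "$ike" || true
    # Initiating the child also initiates IKE if it is not connected.
    $SWANCTL --initiate --child "$child"
}
```

For example, bounce_tunnel con2 con2_1 terminates con2 and then initiates the child con2_1.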

Tunnel does not establish

First check the service status at Status > Services. If the IPsec service is stopped, check if there is at least one configured and enabled IPsec tunnel (IPsec Tunnels Tab).

If the service is running, check the firewall logs at Status > System Logs, Firewall tab. Look for entries that indicate that the connection is being blocked. If the tunnel is not establishing, check for UDP entries for ports 500 and 4500. Rules are normally added automatically for IPsec (IPsec and firewall rules), but that feature can be disabled or there may be edge cases where the firewall cannot identify the remote IPsec gateway. Add rules to pass traffic if needed.
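As a complement to the firewall logs, a packet capture from the shell can show whether IKE traffic from the peer reaches the firewall at all. The sketch below only builds a capture filter for the two UDP ports mentioned above; the function name ike_filter is hypothetical, and the interface name and peer address in the usage comment are placeholders.

```shell
# Build a tcpdump filter matching IKE (UDP 500) and NAT-T (UDP 4500)
# traffic exchanged with a single peer address.
ike_filter() {
    printf 'host %s and (udp port 500 or udp port 4500)\n' "$1"
}

# Usage on the firewall (igb0 and 203.0.113.2 are placeholders):
#   tcpdump -ni igb0 "$(ike_filter 203.0.113.2)"
```

If no packets from the peer appear while it is attempting to connect, the traffic is likely being dropped before it reaches this firewall.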

The single most common cause of failed IPsec tunnel connections is a configuration mismatch. Often it is something small, such as a DH group set differently, or perhaps a subnet mask of /24 on one side and /32 on the other in the phase 2 networks. Some routers (Linksys, for one) also like to hide certain options behind “Advanced” buttons or make assumptions. A lot of trial and error may be involved, and a lot of log reading, but ensuring that both sides match precisely will help the most.

Depending on the Internet connections on either end of the tunnel, it is also possible that a router involved on one side or the other does not properly handle IPsec traffic. This is a larger concern with mobile clients and networks where NAT is involved outside of the actual IPsec endpoints. The problems generally involve ESP traffic being blocked or mishandled along the way. NAT Traversal (NAT-T) encapsulates ESP in UDP port 4500 traffic to work around these issues. Typically this situation is detected automatically, but in some edge cases it can help to force NAT traversal for IKEv1 tunnels.

“Random” tunnel disconnects/DPD failures on low-end routers

If IPsec tunnels are dropped on low-end hardware that is pushing the limits of its CPU, DPD on the tunnel may need to be disabled. Such failures tend to correlate with times of high bandwidth usage. This happens when the CPU on a low-power system is tied up with sending IPsec traffic or is otherwise occupied. Due to the CPU overload, it may not respond to DPD requests in time or may miss the response to a request of its own. As a consequence, the tunnel will fail a DPD check and be disconnected. This is a clear sign that the hardware is being driven beyond its capacity. If this happens, consider replacing the firewall with a more powerful model.

Tunnels establish and work but fail to renegotiate

In some cases a tunnel will function properly but once the phase 1 or phase 2 lifetime expires the tunnel will fail to renegotiate properly. This can manifest itself in a few different ways, each with a different resolution.

DPD is unsupported and one side drops while the other remains

Consider this scenario, which DPD is designed to prevent but which can still happen where DPD is unsupported:

  • A tunnel is established from Site A to Site B, from traffic initiated at Site A.

  • Site B expires the phase 1 or phase 2 before Site A.

  • Site A will believe the tunnel is up and continue to send traffic as though the tunnel is working properly.

  • Only when the Site A phase 1 or phase 2 lifetime expires will it renegotiate as expected.

In this scenario, the likely resolutions are:

  • Check to make sure all of the settings match on both sides, especially the phase 1 DH Group and phase 2 PFS values.

  • Enable DPD, or have Site B send traffic to Site A, which will cause the entire tunnel to renegotiate. The easiest way to make this happen is to enable a keep alive mechanism on both sides of the tunnel.

  • Enable the periodic check keep alive method on one end (Configuring IPsec Keep Alive).

Tunnel establishes when initiating but not when responding

If a tunnel establishes sometimes but not always, there is generally a settings mismatch. The tunnel may still establish in one direction because, if the settings presented by one side are more secure, the other side may accept them, but not the other way around.

Lifetime mismatches do not cause a failure in phase 1 or phase 2.

To track down these failures, configure the logs as shown in Troubleshooting IPsec Logs and attempt to initiate the tunnel from each side, then check the logs.

Tunnel establishes at start but not when disconnected

An IPsec tunnel can be disconnected for a variety of reasons. Examples include connectivity to the far side being interrupted, the remote being down or offline for an extended time, or even a manual or policy action on the far side.

Note

This is not the same scenario as a rekey or reauthentication event, which will rebuild the appropriate parts of the tunnel and remain active.

A tunnel mode IPsec instance will connect at start and, when it disconnects, will connect again on demand. This happens due to trap policies which trigger initiation when traffic attempts to use the tunnel. A tunnel mode IPsec connection can also be reconnected without manual intervention by the automatic ping keep alive function on a phase 2 entry.

VTI mode IPsec cannot support trap policies so it is not capable of using this tactic. As such, a VTI tunnel may need help to stay up and running at all times.

There are two workarounds that may help in this case:

Keep Alive - Periodic Check

The IPsec phase 2 Keep Alive option to perform a periodic IPsec status check is ideally suited to this case. When enabled, if a given phase 2 is down it will trigger an initiation directly.

This works with VTI because it does not rely on trap policies.

Note

This feature is new in pfSense® Plus software version 22.01 and CE 2.6.0.
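On versions without this option, the same idea can be approximated from the shell. The following sketch is not how the built-in feature is implemented. It assumes that child SA lines in swanctl --list-sas output are indented, begin with the child name followed by a colon, and contain the word INSTALLED when the SA is up. The function names and the SWANCTL override variable are hypothetical.

```shell
# Succeed if the named child SA appears as INSTALLED in `swanctl --list-sas`
# output read from stdin. Only indented lines are considered, so an IKE SA
# with the same name as the child does not produce a false match.
child_installed() {
    awk -v name="$1" '
        /^[ \t]/ && $1 == (name ":") && /INSTALLED/ { found = 1 }
        END { exit !found }
    '
}

# SWANCTL can be overridden for a dry run or for testing.
SWANCTL=${SWANCTL:-swanctl}

# Initiate the child only if it is not currently installed.
# Usage: ensure_child_up IKE_NAME CHILD_NAME
ensure_child_up() {
    ike=$1
    child=$2
    if $SWANCTL --list-sas --ike "$ike" | child_installed "$child"; then
        return 0
    fi
    $SWANCTL --initiate --child "$child"
}
```

A call such as ensure_child_up con1 con1_1 could be run periodically. Where the built-in periodic check is available, it remains the better choice.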

Child SA Actions

Another tactic to keep a tunnel up is to set it to initiate immediately at start and automatically reconnect if it gets disconnected. This should only be set on one side of a tunnel.

Child SA Start Action

Set the start action to Initiate at start. This will trigger a tunnel initiation when the IPsec daemon starts, such as at boot time.

Note

This does not trigger when the IPsec configuration is changed and reloaded, only when the daemon loads the configuration the first time at startup.

Child SA Close Action

Set the close action to Restart/Reconnect which will attempt to immediately reconnect the child SA if it gets disconnected.

Depending on the reason the tunnel was disconnected, this may or may not be helpful. For example, if the reason the tunnel disconnected was a local cause, these events may not trigger. The periodic check keep alive method is much more reliable, but only available on current versions of pfSense software.
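To confirm what was written to the generated configuration after changing these options, the relevant lines can be pulled from the configuration file. This sketch assumes the GUI options are rendered as the strongSwan start_action and close_action keys inside each child definition; the function name show_child_actions is hypothetical.

```shell
# Print the start_action and close_action lines, with line numbers, from a
# swanctl configuration file. Defaults to the path used on the firewall.
# Assumption: the GUI options appear under these strongSwan key names.
show_child_actions() {
    grep -nE '^[[:space:]]*(start_action|close_action)[[:space:]]*=' \
        "${1:-/var/etc/ipsec/swanctl.conf}"
}

# Usage on the firewall:
#   show_child_actions
```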

Tunnel stops attempting connections after timeout

If the remote end of an IPsec tunnel is down when the tunnel attempts to initiate at start, the attempt fails, and the tunnel may eventually time out and stop trying to connect.

The solution here is similar to the previous scenario: enable keep alive options for the tunnel, which will trigger a fresh initiation periodically if the tunnel is down.