Tip

This is the documentation for the 20.02 version.

Dataplane Configuration

For the majority of cases the default dataplane configuration is sufficient, but certain cases may require adjustments. These are often covered in more detail throughout the documentation, and relevant sections will be linked where appropriate.

These commands are all available in config mode (see Configuration Mode).

Warning

The dataplane service must be restarted before the configuration changes described in this section take effect. After making changes, restart the dataplane from config mode using the following command:

tnsr# configure
tnsr(config)# service dataplane restart

Buffers

The commands in this section control the amount of memory pre-allocated by the dataplane for buffers.

Buffers per NUMA

Systems with multiple CPU sockets and Non-uniform memory access (NUMA) capabilities may need specific tuning to ensure that enough buffer space is available for the number of separate NUMA nodes. The number of NUMA nodes is typically the number of populated CPU sockets. Specifically, the scenarios which require tuning typically involve a large number of interfaces combined with multiple CPU worker threads.

Note

This refers to separate hardware CPUs, not a single CPU with multiple cores.

The dataplane buffers buffers-per-numa <buffers-per-numa> command allocates the given number of buffers for each CPU socket (e.g. 16384).

Default Data Size

The dataplane buffers default-data-size <default-data-size> command controls the default size of each buffer, in bytes (e.g. 2048).
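As a rough illustration of how the two buffer settings combine, the following sketch multiplies the buffer count by the buffer data size. The function name and the numa_nodes parameter are hypothetical, and the result is only a lower bound on payload memory: it assumes no per-buffer metadata or allocator overhead, which the dataplane will add in practice.

```python
def buffer_payload_bytes(buffers_per_numa, default_data_size, numa_nodes=1):
    """Estimate buffer payload memory in bytes (assumption: overhead ignored)."""
    return buffers_per_numa * default_data_size * numa_nodes


# Example values taken from the text above: 16384 buffers of 2048 bytes each.
print(buffer_payload_bytes(16384, 2048))                # 33554432 (32 MiB)
print(buffer_payload_bytes(16384, 2048, numa_nodes=2))  # 67108864 (64 MiB)
```

An estimate like this can help judge whether a larger buffers-per-numa value is realistic for the memory available on each NUMA node.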

CPU Workers and Affinity

The dataplane has a variety of commands to fine-tune how it uses available CPU resources on the host. These commands control which CPU cores TNSR uses: both how many cores and which specific cores.

See also

Cores defined here may also be pinned to interface receive (RX) queues, provided that cores are defined using either the corelist-workers or coremask-workers methods. See Interface Configuration Options for details.

Worker Configuration

dataplane cpu corelist-workers <first> [- <last>]:

Defines a specific list of CPU cores to be used by the dataplane. The command accepts either a single core or a range of cores each time it is run. Run the command multiple times with different core numbers or ranges to define the full list of cores to utilize. When removing items with no, the command accepts a specific core to remove from the list.

dataplane cpu coremask-workers <mask>:

Similar to corelist-workers, but the cores are defined as a hexadecimal mask instead of a list. For example, 0x0000000000C0000C.

dataplane cpu main-core <n>:

Assigns the main dataplane process to a specific CPU core.

dataplane cpu scheduler-policy (batch|fifo|idle|other|rr):

Defines the scheduler policy that determines how processor time is allocated to worker threads.

batch:

Scheduling batch processes. Uses dynamic priorities based on nice values in the host OS, but always gives the thread a small scheduling penalty so that other processes take precedence.

fifo:

First in, first out scheduling. Will preempt other types of threads and threads with a lower priority.

idle:

Scheduling very low priority jobs.

other:

Default Linux time-sharing scheduling. Uses dynamic priorities based on nice values in the host OS, similar to batch but without the built-in penalty.

rr:

Round-robin scheduling. Similar to fifo, but each thread is time-limited.

dataplane cpu scheduler-priority <n>:

For the fifo and rr scheduler policies, this number sets the priority of processes for the dataplane. It can be any number between 1 (low) and 99 (high).

dataplane cpu skip-cores <n>:

Defines the number of cores to skip when creating additional worker threads, in the range of 1 to the highest available core number. The first <n> cores will not be used by worker threads.

Note

This does not affect the core used by the main thread, which is set by dataplane cpu main-core <n>.

Warning

This option is incompatible with interface RX queue core pinning. To utilize interface RX queue core pinning, define a list of cores using either corelist-workers or coremask-workers instead.

dataplane cpu workers <n>:

Defines the number of worker threads to create for the dataplane.

Note

The number of worker threads is in addition to the main process. For example, with a worker count of 4, the dataplane will use one main process with four worker threads, for a total of five threads.

Warning

This option is incompatible with interface RX queue core pinning. To utilize interface RX queue core pinning, define a list of cores using either corelist-workers or coremask-workers instead.
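The hexadecimal mask accepted by coremask-workers can be hard to read at a glance. The sketch below converts between a mask and a core list, assuming the conventional mapping in which bit n of the mask selects core n (this section does not spell out the mapping, so verify the result against show dataplane cpu threads after a restart). Both helper names are hypothetical.

```python
def cores_from_mask(mask):
    """Return the core numbers whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def mask_from_cores(cores):
    """Build a 16-digit hexadecimal mask string from a list of core numbers."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return f"0x{mask:016X}"


# The example mask from the coremask-workers description.
print(cores_from_mask(0x0000000000C0000C))  # [2, 3, 22, 23]
print(mask_from_cores([2, 3, 22, 23]))      # 0x0000000000C0000C
```

Under that assumption, the example mask selects the same cores as a corelist-workers list containing cores 2, 3, 22, and 23.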

Worker Example

This example sets four additional worker threads, and instructs the dataplane to skip one core when assigning worker threads to cores:

tnsr(config)# dataplane cpu workers 4
tnsr(config)# dataplane cpu skip-cores 1
tnsr(config)# service dataplane restart

Worker Status

The show dataplane cpu threads command displays the current dataplane process list, including the core usage and process IDs. This output corresponds to the example above:

tnsr(config)# show dataplane cpu threads
ID Name     Type    PID  LCore Core Socket
-- -------- ------- ---- ----- ---- ------
 0 vpp_main         2330     1    0      0
 1 vpp_wk_0 workers 2346     2    2      0
 2 vpp_wk_1 workers 2347     3    3      0
 3 vpp_wk_2 workers 2348     4    4      0
 4 vpp_wk_3 workers 2349     5    5      0

The output includes the following columns:

ID:

Dataplane thread ID.

Name:

Name of the dataplane process.

Type:

The type of thread, which will be blank for the main process.

PID:

The host OS process ID for each thread.

LCore:

The logical core used by the process.

Core:

The CPU core used by the process.

Socket:

The CPU socket associated with the core used by the process.

DPDK Configuration

Commands in this section configure hardware settings for DPDK devices.

dataplane dpdk dev <pci-id> (crypto|crypto-vf|network) [num-rx-queues [<rq>]] [num-tx-queues [<tq>]] [num-rx-desc [<rd>]] [num-tx-desc [<td>]]:

Configures a specific dataplane device for use by TNSR.

crypto|crypto-vf:

Configures QAT devices for cryptographic acceleration. See Setup QAT Compatible Hardware for details.

network:

Configures network interface devices. See Setup NICs in Dataplane for details.

num-rx-queues [<rq>] num-tx-queues [<tq>]:

Number of receive and transmit queues for this device.

num-rx-desc [<rd>] num-tx-desc [<td>]:

Number of receive and transmit descriptors for this device. Certain network cards, such as Fortville models, may need the descriptors set to 2048 to avoid dropping packets at high loads.

dataplane dpdk iova-mode (pa|va):

Manually configures the IO Virtual Addresses (IOVA) mode used by DPDK when performing hardware IO from user space. Hardware must use IO addresses, but it cannot utilize user space virtual addresses directly. These IO addresses can be either physical addresses (PA) or virtual addresses (VA). No matter which mode is set, these are abstracted to TNSR as IOVA addresses so it does not need to use them directly.

In most cases the default IOVA mode selected by DPDK is optimal.

Warning

When the vfio-pci UIO driver is active, IOVA must be explicitly set to pa since the automatic selection of va will fail with that driver.

See also

For more detail on IOVA, consult the DPDK documentation.

pa:

Physical Address mode. IOVA addresses used by DPDK correspond to physical addresses, and both physical and virtual memory layouts match. This mode is safest from the perspective of the hardware, and is the mode chosen by default. Most hardware supports PA mode at a minimum.

The primary downside of PA mode is that memory fragmentation in physical space must also be reflected in virtual memory space.

va:

Virtual Address mode. IOVA addresses do not follow the layout of physical memory; instead, physical memory is rearranged to match the virtual memory layout. Because virtual memory appears as one continuous segment, large memory allocations are more likely to succeed.

The primary downside of VA mode is that it relies on kernel support and the availability of IOMMU.

dataplane dpdk no-tx-checksum-offload:

Disables transmit checksum offloading for network devices.

dataplane dpdk no-multi-seg:

Disables multi-segment buffers for network devices. Can improve performance, but disables jumbo MTU support. Recommended for Mellanox devices.

dataplane dpdk num-crypto-mbufs <num>:

Sets the number of memory buffers used by the dataplane for cryptographic tasks, in the range 1-4294967295. Higher values can improve throughput when the dataplane encrypt/decrypt nodes are processing data.

dataplane dpdk uio-driver [<driver-name>]:

Configures the UIO driver for interfaces. See Setup NICs in Dataplane.

dataplane dpdk vdev <sw-dev-type>:

Defines a software device to be used by the dataplane, such as:

aesni_gcm:

AESNI GCM cryptodev

aesni_mb:

AESNI multibuffer cryptodev

Memory

Commands in this section configure memory allocation for the dataplane.

dataplane (ip|ip6) heap-size [<size>]:

Defines the amount of memory to be allocated for the dataplane FIB. The default is 32MB. For more information, see Working with Large BGP Tables.

Note

When tuning this value, also consider increasing the Statistics Segment heap-size.

dataplane ip6 hash-buckets [<size>]:

Defines the number of IPv6 forwarding table hash buckets. The default is 65536.

NAT

Commands in this section configure dataplane NAT behavior.

dataplane nat dslite-ce:

Enables DS-Lite CE mode.

dataplane nat max-translations-per-user <n>:

Defines the number of NAT translation entries to allow for each IP address. The default value is 10240, but it can be set to any integer value from 1 to 262144. The ideal value depends entirely on the environment and the number of sessions per IP address involved in NAT. This includes traffic sourced from TNSR itself, not only traffic from internal source IP addresses.

dataplane nat mode (deterministic|endpoint-dependent|simple):

Configures the operating NAT mode. See Dataplane NAT Modes.

dataplane nat mode-options simple (out2in-dpo|static-mapping-only):

Configures options for the NAT mode. See Dataplane NAT Modes.

NAT Memory

Memory available for NAT functions can also be tuned to scale for larger operations. The following parameters are available:

dataplane nat user hash buckets <size>:

Number of buckets in NAT user lookup hash table. Can be from 1-65535, default 128.

dataplane nat user hash memory <size>:

Memory size of NAT user lookup hash table. Can be from 1-4294967295, default 67108864 (64MiB).

dataplane nat translation hash buckets <size>:

Number of buckets in session lookup hash tables. Can be from 1-65535, default 1024.

dataplane nat translation hash memory <size>:

Memory size of session lookup hash tables. Can be from 1-4294967295, default 134217728 (128MiB).

With the default user hash memory, each user hash bucket can contain approximately 512 active elements (“sessions”). To determine the total number of supported NAT sessions, multiply:

128 (user hash buckets) x 512 (max elements per user hash bucket) = 65,536 NAT sessions

To support more than 65,536 NAT sessions, NAT user hash memory must be increased along with NAT user hash buckets. In the case of user hash, a single client may consume many elements/sessions, limited by the nat max-translations-per-user option mentioned previously in this section.

The nat translation options are similar to the nat user options, but are utilized for endpoint-dependent NAT lookup tables.
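The session capacity arithmetic above can be sketched as follows. The helper names are hypothetical, and the figure of 512 elements per bucket is the approximate value this section gives for the default user hash memory. How that figure changes when the hash memory is resized is not stated here, so the sketch treats it as a plain input rather than deriving it.

```python
def max_sessions(hash_buckets, elements_per_bucket=512):
    """Approximate session capacity: buckets times elements per bucket."""
    return hash_buckets * elements_per_bucket


def to_mib(size_bytes):
    """Convert a byte count to whole mebibytes."""
    return size_bytes // (1024 * 1024)


# Default user hash buckets (128) with roughly 512 elements per bucket.
print(max_sessions(128))  # 65536

# Default hash memory sizes from this section, expressed in MiB.
print(to_mib(67108864))   # 64  (user hash memory)
print(to_mib(134217728))  # 128 (translation hash memory)
```

Because capacity is the product of the two values, raising the bucket count alone is not enough to support more sessions. The hash memory must be increased with it, as noted above.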

Statistics Segment

These commands configure the statistics segment parameters for the dataplane. This feature enables local access to dataplane statistics via shared memory.

See also

For more information on how to make use of this feature, see the VPP documentation for the example stat_client.

dataplane statseg heap-size <heap-size>[kKmMgG]:

Size of shared memory allocation for stats segment, in bytes. This value can be suffixed with K (kilobytes), M (megabytes), or G (gigabytes) in upper or lowercase. Default value is 96M.

Note

This value may need to be increased to accommodate large numbers of routes in routing tables. The default value of 96M can safely accommodate approximately one million routes.

The statistics segment is used to maintain counters for routes, and when multiple worker threads are used, these counters are maintained in each thread. Each counter consumes 16 bytes, and there are two counters for each route. When computing these memory requirements, also keep in mind that the main thread counts in addition to each worker thread. For example, with two worker threads, there are actually three threads total.

The total memory required for route counters alone will be: <routes> * <threads> * 2 counters * 16 Bytes. Additionally, when new memory is being allocated, it must be in a contiguous segment approximately 1.5x the size calculated above. This can negatively impact memory allocation in cases where usage of the statistics segment has become fragmented after repeated allocations and reallocations. All these factors combined mean that when using a large number of routes with multiple worker threads, this value should be given a generous increase over expected normal values.

The dataplane may crash and state that it is out of memory if this value is set too low.

dataplane statseg per-node-counters enable:

Enables per-graph-node performance statistics.

dataplane statseg socket-name <socket-name>:

Absolute path to UNIX domain socket for stats segment. The default path is /run/vpp/stats.sock.
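The route counter formula given in the heap-size note can be worked through with a short sketch. The function names are hypothetical. The calculation covers route counters only, so other consumers of the statistics segment are not included, and the result is a starting point for sizing rather than an exact requirement.

```python
COUNTERS_PER_ROUTE = 2   # two counters for each route
BYTES_PER_COUNTER = 16   # each counter consumes 16 bytes


def route_counter_bytes(routes, workers):
    """Bytes needed for route counters; the main thread counts as one extra thread."""
    threads = workers + 1
    return routes * threads * COUNTERS_PER_ROUTE * BYTES_PER_COUNTER


def contiguous_bytes(routes, workers):
    """Approximate contiguous segment needed when growing: 1.5x, rounded up."""
    return (route_counter_bytes(routes, workers) * 3 + 1) // 2


# One million routes with two worker threads (three threads in total).
print(route_counter_bytes(1_000_000, 2))  # 96000000
print(contiguous_bytes(1_000_000, 2))     # 144000000
```

In this example the contiguous allocation alone approaches 144 million bytes, which is why a large routing table combined with multiple worker threads calls for a generous heap-size value.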