This is the documentation for TNSR version 20.02.

Dataplane Configuration

For the majority of cases the default dataplane configuration is sufficient, but certain cases may require adjustments. These are often covered in more detail throughout the documentation, and relevant sections will be linked where appropriate.

These commands are all available in config mode (Configuration Mode).


Note

The dataplane service requires a restart to apply the configuration changes described in this section. After making changes, restart the dataplane from config mode using the following commands:

tnsr# configure
tnsr(config)# service dataplane restart


Buffers

The commands in this section control the amount of memory pre-allocated by the dataplane for buffers.

Buffers per NUMA

Systems with multiple CPU sockets and Non-uniform memory access (NUMA) capabilities may need specific tuning to ensure that enough buffer space is available for the number of separate NUMA nodes. The number of NUMA nodes is typically the number of populated CPU sockets. Specifically, the scenarios which require tuning typically involve a large number of interfaces combined with multiple CPU worker threads.


Note

This refers to separate hardware CPUs, not a single CPU with multiple cores.

The dataplane buffers buffers-per-numa <buffers-per-numa> command allocates the given number of buffers for each CPU socket (e.g. 16384).
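For example, to pre-allocate 16384 buffers per NUMA node and restart the dataplane to apply the change:

tnsr(config)# dataplane buffers buffers-per-numa 16384
tnsr(config)# service dataplane restart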

Default Data Size

The dataplane buffers default-data-size <default-data-size> command controls the default size of each buffer, in bytes (e.g. 2048).
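For example, to set each buffer to hold 2048 bytes of data:

tnsr(config)# dataplane buffers default-data-size 2048
tnsr(config)# service dataplane restart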

CPU Workers and Affinity

The dataplane has a variety of commands to fine-tune how it uses available CPU resources on the host. These commands control which CPU cores TNSR will use, including both the number of cores and the specific cores.

See also

Cores defined here may also be pinned to interface receive (RX) queues, provided that cores are defined using either the corelist-workers or coremask-workers methods. See Interface Configuration Options for details.

Worker Configuration

dataplane cpu corelist-workers <first> [- <last>]

Defines a specific list of CPU cores to be used by the dataplane. The command adds either a single core or a range of cores to the list at a time. Run the command multiple times with different core numbers or ranges to build the full list of cores to utilize. When removing items with no, the command accepts a specific core to remove from the list.
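For example, to build a worker list from a range of cores plus one additional core, then remove a single core from that list (core numbers are illustrative):

tnsr(config)# dataplane cpu corelist-workers 2 - 5
tnsr(config)# dataplane cpu corelist-workers 8
tnsr(config)# no dataplane cpu corelist-workers 8
tnsr(config)# service dataplane restart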

dataplane cpu coremask-workers <mask>

Similar to corelist-workers, but the cores are defined as a hexadecimal mask instead of a list. For example, 0x0000000000C0000C.
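Each set bit in the mask enables the corresponding core. Decoding the example mask above:

0x0000000000C0000C = binary 1100 0000 0000 0000 0000 1100
                   = bits 2, 3, 22, and 23 set
                   = worker threads on cores 2, 3, 22, and 23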

dataplane cpu main-core <n>

Assigns the main dataplane process to a specific CPU core.

dataplane cpu scheduler-policy (batch|fifo|idle|other|rr)

Defines a specific scheduler policy for worker thread processor usage allocation.

batch

Scheduling batch processes. Uses dynamic priorities based on nice values in the host OS, but always gives the thread a small scheduling penalty so that other processes take precedence.

fifo

First in, first out scheduling. Will preempt other types of threads and threads with a lower priority.

idle

Scheduling of very low priority jobs.

other

Default Linux time-sharing scheduling. Uses dynamic priorities based on nice values in the host OS, similar to batch but without the built-in penalty.

rr

Round-robin scheduling. Similar to fifo, but each thread is time-limited.

dataplane cpu scheduler-priority <n>

For the fifo and rr scheduler policies, this number sets the priority of processes for the dataplane. It can be any number between 1 (low) and 99 (high).
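For example, to run worker threads with round-robin scheduling at a mid-range priority:

tnsr(config)# dataplane cpu scheduler-policy rr
tnsr(config)# dataplane cpu scheduler-priority 50
tnsr(config)# service dataplane restart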

dataplane cpu skip-cores <n>

Defines the number of cores to skip when creating additional worker threads, in the range of 1 to the highest available core number. The first <n> cores will not be used by worker threads.


Note

This does not affect the core used by the main thread, which is set by dataplane cpu main-core <n>.


Warning

This option is incompatible with interface RX queue core pinning. To utilize interface RX queue core pinning, define a list of cores using either corelist-workers or coremask-workers instead.

dataplane cpu workers <n>

Defines the number of worker threads to create for the dataplane.


Note

The number of worker threads is in addition to the main process. For example, with a worker count of 4, the dataplane will use one main process with four worker threads, for a total of five threads.


Warning

This option is incompatible with interface RX queue core pinning. To utilize interface RX queue core pinning, define a list of cores using either corelist-workers or coremask-workers instead.

Worker Example

This example sets four additional worker threads, and instructs the dataplane to skip one core when assigning worker threads to cores:

tnsr(config)# dataplane cpu workers 4
tnsr(config)# dataplane cpu skip-cores 1
tnsr(config)# service dataplane restart

Worker Status

The show dataplane cpu threads command displays the current dataplane process list, including the core usage and process IDs. This output corresponds to the example above:

tnsr(config)# show dataplane cpu threads
ID Name     Type    PID  LCore Core Socket
-- -------- ------- ---- ----- ---- ------
 0 vpp_main         2330     1    0      0
 1 vpp_wk_0 workers 2346     2    2      0
 2 vpp_wk_1 workers 2347     3    3      0
 3 vpp_wk_2 workers 2348     4    4      0
 4 vpp_wk_3 workers 2349     5    5      0

The output includes the following columns:

ID

Dataplane thread ID.

Name

Name of the dataplane process.

Type

The type of thread, which will be blank for the main process.

PID

The host OS process ID for each thread.

LCore

The logical core used by the process.

Core

The CPU core used by the process.

Socket

The CPU socket associated with the core used by the process.

DPDK Configuration

Commands in this section configure hardware settings for DPDK devices.

dataplane dpdk dev <pci-id> (crypto|crypto-vf|network) [num-rx-queues [<rq>]] [num-tx-queues [<tq>]] [num-rx-desc [<rd>]] [num-tx-desc [<td>]]

Configures a specific dataplane device for use by TNSR.


crypto

Configures QAT devices for cryptographic acceleration. See Setup QAT Compatible Hardware for details.

network

Configures network interface devices. See Setup NICs in Dataplane for details.

num-rx-queues [<rq>] num-tx-queues [<tq>]

The number of receive and transmit queues for this device.

num-rx-desc [<rd>] num-tx-desc [<td>]

The number of receive and transmit descriptors for this device. Certain network cards, such as Fortville models, may need the descriptor count set to 2048 to avoid dropping packets at high loads.
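For example, to raise the descriptor counts on a network device to 2048 (the PCI ID shown is illustrative):

tnsr(config)# dataplane dpdk dev 0000:06:00.0 network num-rx-desc 2048 num-tx-desc 2048
tnsr(config)# service dataplane restart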

dataplane dpdk iova-mode (pa|va)

Manually configures the IO Virtual Addresses (IOVA) mode used by DPDK when performing hardware IO from user space. Hardware must use IO addresses, but it cannot utilize user space virtual addresses directly. These IO addresses can be either physical addresses (PA) or virtual addresses (VA). No matter which mode is set, these are abstracted to TNSR as IOVA addresses so it does not need to use them directly.

In most cases the default IOVA mode selected by DPDK is optimal.


Warning

When the vfio-pci UIO driver is active, IOVA must be explicitly set to pa since the automatic selection of va will fail with that driver.
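For example, to force physical address mode explicitly:

tnsr(config)# dataplane dpdk iova-mode pa
tnsr(config)# service dataplane restart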

See also

For more detail on IOVA, consult the DPDK documentation.


pa

Physical Address mode. IOVA addresses used by DPDK correspond to physical addresses, and both physical and virtual memory layouts match. This mode is the safest from the perspective of the hardware, and is the mode chosen by default. Most hardware supports PA mode at a minimum.

The primary downside of PA mode is that memory fragmentation in physical address space must also be reflected in virtual memory space.

va

Virtual Address mode. IOVA addresses do not follow the layout of physical memory; physical memory is remapped to match the virtual memory layout instead. Because virtual memory appears as one contiguous segment, large memory allocations are more likely to succeed.

The primary downside of VA mode is that it relies on kernel support and the availability of an IOMMU.

dataplane dpdk no-tx-checksum-offload

Disables transmit checksum offloading for network devices.

dataplane dpdk no-multi-seg

Disables multi-segment buffers for network devices. Can improve performance, but disables jumbo MTU support. Recommended for Mellanox devices.

dataplane dpdk num-crypto-mbufs <num>

Sets the number of memory buffers used by the dataplane for cryptographic tasks, in the range 1-4294967295. Higher values can improve throughput when the dataplane encrypt/decrypt nodes are processing data.

dataplane dpdk uio-driver [<driver-name>]

Configures the UIO driver for interfaces. See Setup NICs in Dataplane.

dataplane dpdk vdev <sw-dev-type>

Defines a software device to be used by the dataplane, such as an AESNI GCM cryptodev or an AESNI multibuffer cryptodev.


Memory

Commands in this section configure memory allocation for the dataplane.

dataplane (ip|ip6) heap-size [<size>]

Defines the amount of memory to be allocated for the dataplane FIB. The default is 32MB. For more information, see Working with Large BGP Tables.


Note

When tuning this value, also consider increasing the Statistics Segment heap-size.
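For example, to grow the IPv4 FIB heap for a large routing table (the 1G value is illustrative; see Working with Large BGP Tables for sizing guidance):

tnsr(config)# dataplane ip heap-size 1G
tnsr(config)# service dataplane restart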

dataplane ip6 hash-buckets [<size>]

Defines the number of IPv6 forwarding table hash buckets. The default is 65536.


NAT

Commands in this section configure dataplane NAT behavior.

dataplane nat dslite-ce

Enables DS-Lite CE mode.

dataplane nat max-translations-per-user <n>

Defines the number of NAT translation entries to allow for each IP address. The default value is 10240, but it can be set to any integer value between 1-262144. The ideal value depends entirely on the environment and the number of sessions per IP address involved in NAT. This includes traffic sourced from TNSR itself, not only internal source IP addresses.
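For example, to double the default allowance to 20480 translation entries per IP address:

tnsr(config)# dataplane nat max-translations-per-user 20480
tnsr(config)# service dataplane restart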

dataplane nat mode (deterministic|endpoint-dependent|simple)

Configures the operating NAT mode. See Dataplane NAT Modes.

dataplane nat mode-options simple (out2in-dpo|static-mapping-only)

Configures options for the NAT mode. See Dataplane NAT Modes.

NAT Memory

Memory available for NAT functions can also be tuned to scale for larger operations. The following parameters are available:

dataplane nat user hash buckets <size>

Number of buckets in NAT user lookup hash table. Can be from 1-65535, default 128.

dataplane nat user hash memory <size>

Memory size of NAT user lookup hash table. Can be from 1-4294967295, default 67108864 (64MiB).

dataplane nat translation hash buckets <size>

Number of buckets in session lookup hash tables. Can be from 1-65535, default 1024.

dataplane nat translation hash memory <size>

Memory size of session lookup hash tables. Can be from 1-4294967295, default 134217728 (128MiB).

With the default user hash memory, each user hash bucket can contain approximately 512 active elements (“sessions”). To determine the total number of supported NAT sessions, multiply:

128 (user hash buckets) x 512 (max elements per user hash bucket) = 65,536 NAT sessions

To support more than 65,536 NAT sessions, NAT user hash memory must be increased along with NAT user hash buckets. In the case of user hash, a single client may consume many elements/sessions, limited by the nat max-translations-per-user option mentioned previously in this section.
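For example, to scale to four times the default session capacity, increase both values by a factor of four so that each bucket can still hold approximately 512 elements:

512 (user hash buckets) x 512 (max elements per user hash bucket) = 262,144 NAT sessions

tnsr(config)# dataplane nat user hash buckets 512
tnsr(config)# dataplane nat user hash memory 268435456
tnsr(config)# service dataplane restart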

The nat translation options are similar to the nat user options, but are utilized for endpoint-dependent NAT lookup tables.

Statistics Segment

These commands configure the statistics segment parameters for the dataplane. This feature enables local access to dataplane statistics via shared memory.

See also

For more information on how to make use of this feature, see the VPP documentation for the example stat_client.

dataplane statseg heap-size <heap-size>[kKmMgG]

Size of shared memory allocation for stats segment, in bytes. This value can be suffixed with K (kilobytes), M (megabytes), or G (gigabytes) in upper or lowercase. Default value is 96M.


This value may need to be increased to accommodate large amounts of routes in routing tables. The default value of 96M can safely accommodate approximately one million routes.

The statistics segment is used to maintain counters for routes, and when multiple worker threads are used, these counters are maintained in each thread. Each counter consumes 16 bytes, and there are two counters for each route. When computing these memory requirements, also keep in mind that the main thread counts in addition to each worker thread. For example, with two worker threads, there are actually three threads total.

The total memory required for route counters alone will be: <routes> * <threads> * 2 counters * 16 Bytes. Additionally, when new memory is being allocated, it must be in a contiguous segment approximately 1.5x the size calculated above. This can negatively impact memory allocation in cases where usage of the statistics segment has become fragmented after repeated allocations and reallocations. All these factors combined mean that when using a large number of routes with multiple worker threads, this value should be given a generous increase over expected normal values.
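As a worked example with illustrative numbers: 2,000,000 routes with two worker threads (three threads total, counting the main thread) requires 2,000,000 * 3 * 2 counters * 16 Bytes = 192 MB for route counters alone, and a contiguous reallocation may need roughly 1.5x that, or about 288 MB. A generously rounded-up configuration might be:

tnsr(config)# dataplane statseg heap-size 512M
tnsr(config)# service dataplane restart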

Warning

The dataplane may crash and state that it is out of memory if this value is set too low.

dataplane statseg per-node-counters enable

Enables per-graph-node performance statistics.

dataplane statseg socket-name <socket-name>

Absolute path to UNIX domain socket for stats segment. The default path is /run/vpp/stats.sock.