.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=================================================
BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH
=================================================

.. note::
   - ``BPF_MAP_TYPE_DEVMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_DEVMAP_HASH`` was introduced in kernel version 5.4

``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` are BPF maps primarily
used as backend maps for the XDP BPF helper call ``bpf_redirect_map()``.
``BPF_MAP_TYPE_DEVMAP`` is backed by an array that uses the key as
the index to look up a reference to a net device, whereas
``BPF_MAP_TYPE_DEVMAP_HASH`` is backed by a hash table that uses the key to
look up a reference to a net device.
The user provides either <``key``/ ``ifindex``> or <``key``/ ``struct bpf_devmap_val``>
pairs to update the maps with new net devices.

.. note::
   - The key to a hash map doesn't have to be an ``ifindex``.
   - While ``BPF_MAP_TYPE_DEVMAP_HASH`` allows for densely packing the net devices,
     it comes at the cost of hashing the key on every lookup.

The setup and packet enqueue/send code is shared between the two types of
devmap; only the lookup and insertion differ.

Usage
=====
Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

    long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` this map contains
references to net devices (for forwarding packets through other ports).

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller. The higher bits of ``flags``
can be set to ``BPF_F_BROADCAST`` or ``BPF_F_EXCLUDE_INGRESS`` as defined
below.

With ``BPF_F_BROADCAST`` the packet will be broadcast to all the interfaces
in the map. With ``BPF_F_EXCLUDE_INGRESS`` the ingress interface will be excluded
from the broadcast.

.. note::
   - The key is ignored if ``BPF_F_BROADCAST`` is set.
   - The broadcast feature can also be used to implement multicast forwarding:
     simply create multiple DEVMAPs, each one corresponding to a single multicast group.

This helper will return ``XDP_REDIRECT`` on success, or the value of the two
lower bits of the ``flags`` argument if the map lookup fails.

More information about redirection can be found in :doc:`redirect`.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

User space
----------
.. note::
   DEVMAP entries can only be updated/deleted from user space and not
   from an eBPF program. Trying to call these functions from a kernel eBPF
   program will result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

Net device entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value`` parameter
can be ``struct bpf_devmap_val`` or a simple ``int ifindex`` for backwards
compatibility.

.. code-block:: c

    struct bpf_devmap_val {
        __u32 ifindex;   /* device index */
        union {
            int   fd;  /* prog fd on map write */
            __u32 id;  /* prog id on map read */
        } bpf_prog;
    };

The ``flags`` argument can be one of the following:
  - ``BPF_ANY``: Create a new element or update an existing element.
  - ``BPF_NOEXIST``: Create a new element only if it did not exist.
  - ``BPF_EXIST``: Update an existing element.

DEVMAPs can associate a program with a device entry by adding a ``bpf_prog.fd``
to ``struct bpf_devmap_val``. Programs are run after ``XDP_REDIRECT`` and have
access to both Rx device and Tx device. The program associated with the ``fd``
must have type XDP with expected attach type ``xdp_devmap``.
When a program is associated with a device index, the program is run on an
``XDP_REDIRECT`` and before the buffer is added to the per-cpu queue. Examples
of how to attach/use xdp_devmap progs can be found in the kernel selftests:

- ``tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c``
- ``tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c``

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_lookup_elem(int fd, const void *key, void *value);

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_delete_elem(int fd, const void *key);

Net device entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or negative error in case of
failure.

Examples
========

Kernel BPF
----------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP``
called ``tx_port``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 256);
    } tx_port SEC(".maps");

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
        __type(key, __u32);
        __type(value, struct bpf_devmap_val);
        __uint(max_entries, 32);
    } forward_map SEC(".maps");

.. note::

    The value type in the DEVMAP above is a ``struct bpf_devmap_val``.

The following code snippet shows a simple xdp_redirect_map program. This program
would work with a user space program that populates the devmap ``forward_map`` based
on ingress ifindexes. The BPF program (below) redirects packets using the
ingress ``ifindex`` as the ``key``.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        /* Use the ifindex of the receiving device as the map key. */
        int index = ctx->ingress_ifindex;

        /* Lower two bits of flags are 0, so a failed lookup returns XDP_ABORTED. */
        return bpf_redirect_map(&forward_map, index, 0);
    }

The following code snippet shows a BPF program that broadcasts packets to
all the interfaces in the ``tx_port`` devmap.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        /* The key (0) is ignored because BPF_F_BROADCAST is set. */
        return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }

User space
----------

The following code snippet shows how to update a devmap called ``tx_port``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        int ret;

        /* The value is a plain ifindex; BPF_ANY creates or replaces the entry. */
        ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, BPF_ANY);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

The following code snippet shows how to update a ``BPF_MAP_TYPE_DEVMAP_HASH``
called ``forward_map``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        /* No program is attached: bpf_prog.fd is left zero-initialized. */
        struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, BPF_ANY);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }
        return ret;
    }

References
==========

- https://lwn.net/Articles/728146/
- https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=6f9d451ab1a33728adb72d7ff66a7b374d665176
- https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4106