.. _llext_debug:

Debugging extensions
####################

Debugging extensions is a complex task. Since the extension code is by
definition not built with the Zephyr application, the final Zephyr ELF file
does not contain the symbols for extension code. Furthermore, the extension is
dynamically relocated by :c:func:`llext_load` at runtime, so even if the
symbols were available, it would be impossible for the debugger to know the
final locations of the symbols in the extension code.

Setting up the debugger session properly in this case requires a few manual
steps. The following sections will provide some tips on how to do it with the
Zephyr SDK and the debug features provided by ``west``, but the instructions
can be adapted to any GDB-based debugging environment.

Extension debugging process
===========================

1. Make sure the project is set up to display the verbose LLEXT debug output
   (:kconfig:option:`CONFIG_LOG` and :kconfig:option:`CONFIG_LLEXT_LOG_LEVEL_DBG`
   are set).

2. Build the Zephyr application and the extensions.

   For each target ``name`` included in the current build, two files will be
   generated into the ``llext`` subdirectory of the build root:

   ``name_ext_debug.elf``

        An intermediate ELF file with full debugging information.

   ``name.llext``

        The final extension binary, stripped to the essential data required for
        loading into the Zephyr application.

   Other files may be present, depending on the target architecture and the
   build configuration.

3. Start a debugging session of the main Zephyr application. This is described
   in the :ref:`Debugging <west-debugging>` section of the documentation; on
   supported boards it is as easy as running ``west debug``, perhaps with some
   additional arguments.

4. Set a breakpoint just after the call to :c:func:`llext_load` in your code
   and let the application run. This will load the extension into memory and
   relocate it. The output logs will contain a line with
   ``gdb add-symbol-file flags:``, followed by one ``-s <section> <address>``
   line for each loaded section.

5. Type this command in the GDB console to load this extension's symbols:

   .. code-block::

      add-symbol-file <path-to-debug.elf> <load-addresses>

   where ``<path-to-debug.elf>`` is the full path of the ELF file with debug
   information identified in step 2, and ``<load-addresses>`` is a
   space-separated list of all the ``-s`` flags collected from the log in the
   previous step.

6. The extension symbols are now available to the debugger. You can set
   breakpoints, inspect variables, and step through the code as usual.

Steps 4-6 can be repeated for every extension that is loaded by the
application, if there are several.

Symbol lookup issues
====================

.. warning::

   It is almost certain that the loaded symbols will be shadowed by others in
   the main application; for example, they may be located inside the memory
   area of the ELF buffer or the LLEXT heap.

   In this case GDB chooses the first known symbol and therefore associates the
   addresses with something like ``elf_buffer+0x123`` instead of the expected
   ``ext_fn``. This further confuses its high-level operations like source
   stepping or inspecting locals, since they are meaningless in that context.

Two possible solutions to this problem are discussed in the following
paragraphs.

Discard all Zephyr symbols
--------------------------

The simplest option is to drop all the Zephyr application symbols from GDB by
invoking ``symbol-file`` with no arguments, before step 5. This will however
restrict the debugging session to the extension only, as all information about
the Zephyr application will be lost. For example, the debugger may not be able
to properly follow stack traces outside the extension code.

It is possible to use the same technique multiple times in the same session to
switch between the main and extension symbol tables as required, but it rapidly
becomes cumbersome.

Edit the ELF file
-----------------

This alternative is more complex but allows for a more seamless debugging
experience. The idea is to edit the main Zephyr ELF file to remove information
about the symbols that overlap with the extension that is to be debugged, so
that when the extension symbols are loaded, GDB will not have any ambiguity.
This can be done by using ``objcopy`` with the ``-N <symbol>``
(``--strip-symbol``) option.

Identifying the offending symbols is however an iterative trial-and-error
procedure, as there can be many different layers; for example, the ELF buffer
may itself be contained in a symbol for the data segment. Fortunately, this
knowledge can then be reused, as the list is unlikely to change for a given
project.

Example debugging session
=========================

This example demonstrates how to debug the ``detached_fn`` extension in the
``tests/subsys/llext`` project (specifically, the ``writable`` case), on an
emulated ``mps2/an385`` board which is based on an ARM Cortex-M3.

.. note::

   The logs below have been obtained using Zephyr version 4.1 and the Zephyr
   SDK version 0.17.0. However, the exact addresses may still vary between
   runs even when using the same versions. Adjust the commands below to
   match the results of your own session.

The following command will build the project and start the emulator in
debugging mode:

.. code-block::
   :caption: Terminal 1 (build, QEMU emulator, GDB server)

   zephyr$ west build -p -b mps2/an385 tests/subsys/llext/ -T llext.writable -t debugserver_qemu
   -- west build: generating a build system
   [...]
   -- west build: running target debugserver_qemu
   [...]
   [186/187] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: cortex-m3

On a separate terminal, set ``ZEPHYR_SDK_INSTALL_DIR`` to the directory of the
Zephyr SDK on your installation, then start the GDB client for the target:

.. code-block::
   :caption: Terminal 2 (GDB client)

   zephyr$ export ZEPHYR_SDK_INSTALL_DIR=/opt/zephyr-sdk-0.17.0
   zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb build/zephyr/zephyr.elf
   GNU gdb (Zephyr SDK 0.17.0) 12.1
   [...]
   Reading symbols from build/zephyr/zephyr.elf...
   (gdb)

Connect to the GDB server, set a breakpoint on the ``llext_load`` function,
continue until it is hit, then use ``finish`` to run until the function
returns:

.. code-block::
   :caption: Terminal 2 (GDB client)

   (gdb) target extended-remote :1234
   Remote debugging using :1234
   z_arm_reset () at zephyr/arch/arm/core/cortex_m/reset.S:124
   124         movs.n r0, #_EXC_IRQ_DEFAULT_PRIO
   (gdb) break llext_load
   Breakpoint 1 at 0x236c: file zephyr/subsys/llext/llext.c, line 168.
   (gdb) continue
   Continuing.

   Breakpoint 1, llext_load (ldr=ldr@entry=0x2000bef0 <ztest_thread_stack+3488>,
                             name=name@entry=0x9d98 "test_detached",
                             ext=ext@entry=0x2000abb8 <detached_llext>,
                             ldr_parm=ldr_parm@entry=0x2000bee8 <ztest_thread_stack+3480>)
                 at zephyr/subsys/llext/llext.c:168
   168             *ext = llext_by_name(name);
   (gdb) finish
   Run till exit from #0  llext_load ([...])
       at zephyr/subsys/llext/llext.c:168
   llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:481
   481             zassert_ok(res, "load should succeed");

The first terminal will have printed lots of debugging information related to
the extension loading. Find the section with the addresses:

.. code-block::
   :caption: Terminal 1 (build, QEMU emulator, GDB server)

   [...]
   D: Allocate and copy regions...
   [...]
   D: gdb add-symbol-file flags:
   D: -s .text 0x20000034
   D: -s .data 0x200000b4
   D: -s .bss 0x2000c2e0
   D: -s .rodata 0x200000b8
   D: -s .detach 0x200001d0
   D: Counting exported symbols...
   [...]
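
Copying the flags by hand is error-prone when there are many sections. The
following is a minimal sketch of how the command could be assembled from a
saved copy of the log; it is not part of Zephyr, and the helper name
``add_symbol_file_command`` is purely illustrative. It assumes the log lines
have the format shown above.

.. code-block:: python

   import re

   # Matches the tail of log lines such as "D: -s .text 0x20000034".
   FLAG_RE = re.compile(r"-s\s+(\S+)\s+(0x[0-9a-fA-F]+)\s*$")


   def add_symbol_file_command(log_text, debug_elf):
       """Build a GDB add-symbol-file command from LLEXT debug log text.

       If the log covers several extensions, the flags of the last one win.
       """
       flags = []
       collecting = False
       for line in log_text.splitlines():
           if "gdb add-symbol-file flags:" in line:
               flags = []  # start of a new flag list
               collecting = True
           elif collecting:
               match = FLAG_RE.search(line)
               if match is None:
                   collecting = False  # first non-flag line ends the list
               else:
                   flags.append("-s {} {}".format(*match.groups()))
       if not flags:
           raise ValueError("no add-symbol-file flags found in the log")
       return " ".join(["add-symbol-file", debug_elf] + flags)


   # Log excerpt copied from the session above.
   log = """\
   D: gdb add-symbol-file flags:
   D: -s .text 0x20000034
   D: -s .data 0x200000b4
   D: -s .bss 0x2000c2e0
   D: -s .rodata 0x200000b8
   D: -s .detach 0x200001d0
   D: Counting exported symbols...
   """
   print(add_symbol_file_command(log, "build/llext/detached_fn_ext_debug.elf"))

The printed line can be pasted directly at the ``(gdb)`` prompt.
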

Use these addresses to load the symbols into GDB:

.. code-block::
   :caption: Terminal 2 (GDB client)

   (gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf -s .text 0x20000034 -s .data 0x200000b4 -s .bss 0x2000c2e0 -s .rodata 0x200000b8 -s .detach 0x200001d0
   add symbol table from file "build/llext/detached_fn_ext_debug.elf" at
           .text_addr = 0x20000034
           .data_addr = 0x200000b4
           .bss_addr = 0x2000c2e0
           .rodata_addr = 0x200000b8
           .detach_addr = 0x200001d0
   (y or n) y
   Reading symbols from build/llext/detached_fn_ext_debug.elf...
   (gdb) break detached_entry
   Breakpoint 2 at 0x200001d0 (2 locations)
   (gdb) continue
   Continuing.

   Breakpoint 2, 0x200001d0 in test_detached_ext ()
   (gdb) backtrace
   #0  0x200001d0 in test_detached_ext ()
   #1  0x200000ac in test_detached_ext ()
   #2  0x00000706 in llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:496
   #3  0x00001a36 in run_test_functions (suite=0x92bc <z_ztest_test_node_llext>, data=0x0 <cbvprintf_package>, test=0x92d8 <z_ztest_unit_test.llext.test_detached>) at zephyr/subsys/testsuite/ztest/src/ztest.c:328
   #4  test_cb (a=0x92bc <z_ztest_test_node_llext>, b=0x92d8 <z_ztest_unit_test.llext.test_detached>, c=0x0 <cbvprintf_package>) at zephyr/subsys/testsuite/ztest/src/ztest.c:662
   #5  0x00000e96 in z_thread_entry (entry=0x1a05 <test_cb>, p1=0x92bc <z_ztest_test_node_llext>, p2=0x92d8 <z_ztest_unit_test.llext.test_detached>, p3=0x0 <cbvprintf_package>) at zephyr/lib/os/thread_entry.c:48
   #6  0x00000000 in ?? ()

The symbol associated with the breakpoint location and with the two innermost
stack frames (``#0`` and ``#1``) mistakenly references the ELF buffer in the
Zephyr application instead of the extension symbols. Note, however, that GDB
knows both symbols for each address:

.. code-block::
   :caption: Terminal 2 (GDB client)

   (gdb) info sym 0x200001d0
   test_detached_ext + 464 in section datas of zephyr/build/zephyr/zephyr.elf
   detached_entry in section .detach of zephyr/build/llext/detached_fn_ext_debug.elf
   (gdb) info sym 0x200000ac
   test_detached_ext + 172 in section datas of zephyr/build/zephyr/zephyr.elf
   test_entry + 8 in section .text of zephyr/build/llext/detached_fn_ext_debug.elf
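
The offsets that GDB prints next to ``test_detached_ext`` show why the two
symbol tables collide: subtracting each offset from its address gives the same
base, so both extension addresses lie inside that one array. A quick check of
the arithmetic, using only values from the session above (the variable names
are illustrative):

.. code-block:: python

   # (address, offset from test_detached_ext) pairs copied from "info sym".
   lookups = [(0x200001D0, 464), (0x200000AC, 172)]

   # Both lookups should resolve to the same start address of the array.
   bases = {addr - offset for addr, offset in lookups}
   print([hex(base) for base in sorted(bases)])  # prints ['0x20000000']
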

It is also impossible to inspect the variables in the extension or step through
code properly:

.. code-block::
   :caption: Terminal 2 (GDB client)

   (gdb) print bss_cnt
   No symbol "bss_cnt" in current context.
   (gdb) print data_cnt
   No symbol "data_cnt" in current context.
   (gdb) next
   Single stepping until exit from function test_detached_ext,
   which has no line number information.

   Breakpoint 2, 0x200001ea in test_detached_ext ()
   (gdb)

Discarding symbols
------------------

Discarding the Zephyr symbols and only focusing on the extension restores full
debugging functionality at the cost of losing the global context (note the
backtrace stops outside the extension):

.. code-block::
   :caption: Terminal 2 (GDB client)

   (gdb) symbol-file
   Discard symbol table from `zephyr/build/zephyr/zephyr.elf'? (y or n) y
   Error in re-setting breakpoint 1: No symbol table is loaded.  Use the "file" command.
   No symbol file now.
   (gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf -s .text 0x20000034 -s .data 0x200000b4 -s .bss 0x2000c2e0 -s .rodata 0x200000b8 -s .detach 0x200001d0
   add symbol table from file "build/llext/detached_fn_ext_debug.elf" at
           .text_addr = 0x20000034
           .data_addr = 0x200000b4
           .bss_addr = 0x2000c2e0
           .rodata_addr = 0x200000b8
           .detach_addr = 0x200001d0
   (y or n) y
   Reading symbols from build/llext/detached_fn_ext_debug.elf...
   (gdb) backtrace
   #0  detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:18
   #1  0x200000ac in test_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:26
   #2  0x00000706 in ?? ()
   Backtrace stopped: previous frame identical to this frame (corrupt stack?)
   (gdb) next
   19              zassert_true(data_cnt < 0);
   (gdb) print bss_cnt
   $1 = 1
   (gdb) print data_cnt
   $2 = -2
   (gdb)

Editing the ELF file
--------------------

In this alternative approach, the patches to the Zephyr ELF file must be
performed after building the Zephyr binary and starting the emulator on
Terminal 1, but before starting the GDB client on Terminal 2.

The above debugging session already identified ``test_detached_ext``, the char
array that holds the ELF file, as an offending symbol, so that will be removed
in a first pass. Performing the same steps multiple times, ``__data_start`` and
``__data_region_start`` can also be found to overlap the memory area of
interest.

The following commands will remove all of these from the Zephyr ELF file, then
start a debugging session on the modified file:

.. code-block::
   :caption: Terminal 2 (GDB client)

   zephyr$ export ZEPHYR_SDK_INSTALL_DIR=/opt/zephyr-sdk-0.17.0
   zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-objcopy -N test_detached_ext -N __data_start -N __data_region_start build/zephyr/zephyr.elf build/zephyr/zephyr-edit.elf
   zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb build/zephyr/zephyr-edit.elf
   GNU gdb (Zephyr SDK 0.17.0) 12.1
   [...]
   Reading symbols from build/zephyr/zephyr-edit.elf...
   (gdb)
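
The symbol names passed to ``-N`` above were collected by hand from repeated
``info sym`` lookups. The sketch below shows one way that bookkeeping could be
scripted. Both helper names are hypothetical, and the parsing assumes the
``info sym`` output format seen in this session (GDB 12.1). Each lookup only
reveals the symbol GDB currently prefers, so several passes are still needed.

.. code-block:: python

   import re

   # Matches lines such as
   # "test_detached_ext + 464 in section datas of zephyr/build/zephyr/zephyr.elf"
   INFO_SYM_RE = re.compile(r"^(\S+)(?: \+ \d+)? in section \S+ of (.+)$")


   def shadowing_symbols(info_sym_output, main_elf_suffix="zephyr/zephyr.elf"):
       """Return the names that "info sym" attributes to the main ELF file."""
       names = []
       for line in info_sym_output.splitlines():
           match = INFO_SYM_RE.match(line.strip())
           if match and match.group(2).endswith(main_elf_suffix):
               name = match.group(1)
               if name not in names:  # keep first-seen order, no duplicates
                   names.append(name)
       return names


   def objcopy_args(symbols, src_elf, dst_elf):
       """Build the objcopy argument list that strips the given symbols."""
       args = []
       for name in symbols:
           args += ["-N", name]
       return args + [src_elf, dst_elf]


   # Output copied from the "info sym 0x200001d0" lookup earlier in this page.
   output = (
       "test_detached_ext + 464 in section datas of"
       " zephyr/build/zephyr/zephyr.elf\n"
       "detached_entry in section .detach of"
       " zephyr/build/llext/detached_fn_ext_debug.elf\n"
   )
   found = shadowing_symbols(output)
   print(objcopy_args(found, "build/zephyr/zephyr.elf",
                      "build/zephyr/zephyr-edit.elf"))

When the lookups are run against an already edited file, pass its path suffix
as ``main_elf_suffix`` so that the remaining shadowing symbols are recognized.
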

The same steps used in the previous run can be performed again to attach to the
GDB server and load both the extension and its debug symbols. This time, however,
the result is rather different:

 * the ``break`` command includes line number information;

 * the output from ``backtrace`` contains functions from both the extension and
   the Zephyr application;

 * the local variables can be properly inspected.

.. code-block::
   :caption: Terminal 2 (GDB client)

   (gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf [...]
   [...]
   Reading symbols from build/llext/detached_fn_ext_debug.elf...
   (gdb) break detached_entry
   Breakpoint 2 at 0x200001d6: file zephyr/tests/subsys/llext/src/detached_fn_ext.c, line 17.
   (gdb) continue
   Continuing.

   Breakpoint 2, detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:17
   17              printk("bss %u @ %p\n", bss_cnt++, &bss_cnt);
   (gdb) backtrace
   #0  detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:17
   #1  0x200000ac in test_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:26
   #2  0x00000706 in llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:496
   #3  0x00001a36 in run_test_functions (suite=0x92bc <z_ztest_test_node_llext>, data=0x0 <cbvprintf_package>, test=0x92d8 <z_ztest_unit_test.llext.test_detached>) at zephyr/subsys/testsuite/ztest/src/ztest.c:328
   #4  test_cb (a=0x92bc <z_ztest_test_node_llext>, b=0x92d8 <z_ztest_unit_test.llext.test_detached>, c=0x0 <cbvprintf_package>) at zephyr/subsys/testsuite/ztest/src/ztest.c:662
   #5  0x00000e96 in z_thread_entry (entry=0x1a05 <test_cb>, p1=0x92bc <z_ztest_test_node_llext>, p2=0x92d8 <z_ztest_unit_test.llext.test_detached>, p3=0x0 <cbvprintf_package>) at zephyr/lib/os/thread_entry.c:48
   #6  0x00000000 in ?? ()
   (gdb) print bss_cnt
   $1 = 0
   (gdb) print data_cnt
   $2 = -3
   (gdb)