.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)

=========
Task List
=========

Tasks may have the following fields:

- ``Complexity``: Describes the required familiarity with Rust and / or the
  corresponding kernel APIs or subsystems. There are four different complexities,
  ``Beginner``, ``Intermediate``, ``Advanced`` and ``Expert``.
- ``Reference``: References to other tasks.
- ``Link``: Links to external resources.
- ``Contact``: The person that can be contacted for further information about
  the task.

A task might have an `[ABCD]` code after its name. This code can be used to grep
the source for `TODO` entries related to it.

Enablement (Rust)
=================

Tasks that are not directly related to nova-core, but are preconditions in terms
of required APIs.

FromPrimitive API [FPRI]
------------------------

Sometimes the need arises to convert a number to a value of an enum or a
structure.

A good example from nova-core would be the ``Chipset`` enum type, which defines
the value ``AD102``. When probing the GPU the value ``0x192`` can be read from a
certain register, indicating the chipset AD102. Hence, the enum value ``AD102``
should be derived from the number ``0x192``. Currently, nova-core uses a custom
implementation (``Chipset::from_u32``) for this.

Instead, it would be desirable to have something like the ``FromPrimitive``
trait [1] from the num crate.

Having this generalization also helps with implementing a generic macro that
automatically generates the corresponding mappings between a value and a number.

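As a rough illustration of the idea, the sketch below defines a local
``FromPrimitive``-like trait and a small declarative macro that generates both
the enum and its number-to-variant mapping. The trait, the macro name
``impl_from_primitive!`` and the single-variant ``Chipset`` enum are assumptions
made for this example; they are neither the num crate's API nor existing kernel
code.

.. code-block:: rust

	/// Local stand-in for a `FromPrimitive`-like trait (assumption).
	trait FromPrimitive: Sized {
	    /// Returns the value matching `n`, or `None` if `n` is unknown.
	    fn from_u32(n: u32) -> Option<Self>;
	}

	/// Hypothetical helper macro: declares the enum and derives the
	/// number-to-variant mapping from the same list of variants.
	macro_rules! impl_from_primitive {
	    ($ty:ident { $($variant:ident = $value:expr),* $(,)? }) => {
	        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
	        #[repr(u32)]
	        enum $ty {
	            $($variant = $value),*
	        }

	        impl FromPrimitive for $ty {
	            fn from_u32(n: u32) -> Option<Self> {
	                match n {
	                    $(x if x == $value => Some($ty::$variant),)*
	                    _ => None,
	                }
	            }
	        }
	    };
	}

	// Only the variant named in the text above is listed here.
	impl_from_primitive!(Chipset { AD102 = 0x192 });

With this in place, ``Chipset::from_u32(0x192)`` yields
``Some(Chipset::AD102)``, and any unknown number yields ``None``.
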
| Complexity: Beginner
| Link: https://docs.rs/num/latest/num/trait.FromPrimitive.html [1]

Conversion from byte slices for types implementing FromBytes [TRSM]
-------------------------------------------------------------------

We retrieve several structures from byte streams coming from the BIOS or loaded
firmware. At the moment, converting the byte slice into the proper type requires
an inelegant `unsafe` operation; this will go away once `FromBytes` implements
a proper `from_bytes` method.

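One possible shape for such a method, sketched outside the kernel: the
``FromBytes`` trait below is a local stand-in for the kernel trait, and the
method name ``from_bytes_copy`` as well as the ``RomHeader`` layout are
hypothetical. The sketch copies the value out of the slice, so it does not
depend on the slice's alignment; a reference-returning variant would need an
additional alignment check.

.. code-block:: rust

	use core::mem::size_of;
	use core::ptr;

	/// Local stand-in for the kernel's `FromBytes` marker trait (assumption).
	///
	/// # Safety
	///
	/// Implementers must guarantee that every bit pattern is a valid value.
	unsafe trait FromBytes: Sized + Copy {
	    /// Hypothetical helper: builds `Self` from a slice of exactly the
	    /// right length, or returns `None` otherwise.
	    fn from_bytes_copy(bytes: &[u8]) -> Option<Self> {
	        if bytes.len() != size_of::<Self>() {
	            return None;
	        }
	        // SAFETY: The slice holds exactly `size_of::<Self>()` readable
	        // bytes, `read_unaligned` has no alignment requirement, and the
	        // trait contract makes every bit pattern valid.
	        Some(unsafe { ptr::read_unaligned(bytes.as_ptr().cast::<Self>()) })
	    }
	}

	/// Made-up header layout, used only to exercise the helper.
	#[repr(C)]
	#[derive(Clone, Copy, Debug, PartialEq, Eq)]
	struct RomHeader {
	    signature: u16,
	    length: u16,
	    offset: u32,
	}

	// SAFETY: `RomHeader` consists of integers only and has no padding.
	unsafe impl FromBytes for RomHeader {}

The single ``unsafe`` block then lives in one audited place, and callers only
deal with an ``Option``.
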
| Complexity: Beginner

CoherentAllocation improvements [COHA]
--------------------------------------

`CoherentAllocation` needs a safe way to write into the allocation, and to
obtain slices within the allocation.

| Complexity: Beginner
| Contact: Abdiel Janulgue

Generic register abstraction [REGA]
-----------------------------------

Work out how register constants and structures can be automatically generated
through generalized macros.

Example:

.. code-block:: rust

	register!(BOOT0, 0x0, u32, pci::Bar<SIZE>, Fields [
	   MINOR_REVISION(3:0, RO),
	   MAJOR_REVISION(7:4, RO),
	   REVISION(7:0, RO), // Virtual register combining major and minor rev.
	])

This could expand to something like:

.. code-block:: rust

	const BOOT0_OFFSET: usize = 0x00000000;
	const BOOT0_MINOR_REVISION_SHIFT: u8 = 0;
	const BOOT0_MINOR_REVISION_MASK: u32 = 0x0000000f;
	const BOOT0_MAJOR_REVISION_SHIFT: u8 = 4;
	const BOOT0_MAJOR_REVISION_MASK: u32 = 0x000000f0;
	const BOOT0_REVISION_SHIFT: u8 = BOOT0_MINOR_REVISION_SHIFT;
	const BOOT0_REVISION_MASK: u32 = BOOT0_MINOR_REVISION_MASK | BOOT0_MAJOR_REVISION_MASK;

	struct Boot0(u32);

	impl Boot0 {
	   #[inline]
	   fn read(bar: &RevocableGuard<'_, pci::Bar<SIZE>>) -> Self {
	      Self(bar.readl(BOOT0_OFFSET))
	   }

	   #[inline]
	   fn minor_revision(&self) -> u32 {
	      (self.0 & BOOT0_MINOR_REVISION_MASK) >> BOOT0_MINOR_REVISION_SHIFT
	   }

	   #[inline]
	   fn major_revision(&self) -> u32 {
	      (self.0 & BOOT0_MAJOR_REVISION_MASK) >> BOOT0_MAJOR_REVISION_SHIFT
	   }

	   #[inline]
	   fn revision(&self) -> u32 {
	      (self.0 & BOOT0_REVISION_MASK) >> BOOT0_REVISION_SHIFT
	   }
	}

Usage:

.. code-block:: rust

	let bar = bar.try_access().ok_or(ENXIO)?;

	let boot0 = Boot0::read(&bar);
	pr_info!("Revision: {}\n", boot0.revision());

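The field accessors alone can be generated with a fairly small declarative
macro. The following is a minimal, standalone sketch under simplifying
assumptions: it handles ``u32`` registers only, leaves out the I/O part and the
access mode, and the macro name ``bitfield_register!`` is made up for this
example. It is not the in-tree macro.

.. code-block:: rust

	/// Hypothetical macro: generates a `u32` newtype with one getter per
	/// `name(hi:lo)` bit range. The mask is computed at build time, so a
	/// range with `hi > 31` fails to compile.
	macro_rules! bitfield_register {
	    ($name:ident, $offset:expr, [ $($field:ident($hi:literal : $lo:literal)),* $(,)? ]) => {
	        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
	        struct $name(u32);

	        #[allow(dead_code)]
	        impl $name {
	            /// Offset of the register within its I/O region.
	            const OFFSET: usize = $offset;

	            $(
	                #[inline]
	                fn $field(&self) -> u32 {
	                    const HI: u32 = $hi;
	                    const LO: u32 = $lo;
	                    // All bits from LO up to and including HI.
	                    const MASK: u32 = (u32::MAX >> (31 - HI)) & (u32::MAX << LO);
	                    (self.0 & MASK) >> LO
	                }
	            )*
	        }
	    };
	}

	bitfield_register!(Boot0, 0x0, [
	    minor_revision(3:0),
	    major_revision(7:4),
	    revision(7:0), // Virtual field combining major and minor rev.
	]);

For a raw value of ``0xa1``, ``Boot0(0xa1).major_revision()`` evaluates to
``0xa`` and ``minor_revision()`` to ``0x1``.
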
A work-in-progress implementation currently resides in
`drivers/gpu/nova-core/regs/macros.rs` and is used in nova-core. It would be
nice to improve it (possibly using proc macros) and move it to the `kernel`
crate so it can be used by other components as well.

Features desired before this happens:

* Relative registers with build-time base address validation,
* Arrays of registers with build-time index validation,
* Make I/O optional (for field values that are not registers),
* Support sizes other than `u32`,
* Allow visibility control for registers and individual fields,
* Use Rust slice syntax to express field ranges.

| Complexity: Advanced
| Contact: Alexandre Courbot

Numerical operations [NUMM]
---------------------------

Nova uses integer operations that are not part of the standard library (or not
implemented in an optimized way for the kernel). These include:

- Aligning up and down to a power of two,
- The "Find Last Set Bit" (`fls` function of the C part of the kernel)
  operation.

A `num` core kernel module is being designed to provide these operations.

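For reference, the operations listed above can be sketched in plain Rust as
follows. The function names and signatures are assumptions for illustration and
do not reflect the API of the planned `num` module; ``fls`` follows the C
convention of returning 0 for an input of 0 and a 1-based bit position
otherwise.

.. code-block:: rust

	/// Rounds `value` down to a multiple of `align`.
	/// Returns `None` if `align` is not a power of two.
	fn align_down(value: u64, align: u64) -> Option<u64> {
	    if !align.is_power_of_two() {
	        return None;
	    }
	    Some(value & !(align - 1))
	}

	/// Rounds `value` up to a multiple of `align`.
	/// Returns `None` if `align` is not a power of two or on overflow.
	fn align_up(value: u64, align: u64) -> Option<u64> {
	    if !align.is_power_of_two() {
	        return None;
	    }
	    value.checked_add(align - 1).map(|v| v & !(align - 1))
	}

	/// "Find Last Set Bit": 1-based index of the most significant set bit,
	/// or 0 if no bit is set.
	fn fls(value: u32) -> u32 {
	    u32::BITS - value.leading_zeros()
	}
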
| Complexity: Intermediate
| Contact: Alexandre Courbot

Delay / Sleep abstractions [DLAY]
---------------------------------

Rust abstractions for the kernel's delay() and sleep() functions.

FUJITA Tomonori plans to work on abstractions for read_poll_timeout_atomic()
(and friends) [1].

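To give an idea of the polling pattern such helpers implement, here is a
userspace sketch, assuming ``std::thread::sleep`` and ``std::time::Instant`` in
place of the kernel's delay and time primitives. The function signature and the
``TimedOut`` error type are made up for this example and are not the proposed
kernel API.

.. code-block:: rust

	use std::time::{Duration, Instant};

	/// Error returned when the condition was not met in time (hypothetical).
	#[derive(Debug, PartialEq, Eq)]
	struct TimedOut;

	/// Repeatedly calls `op` until `cond` accepts its result or `timeout`
	/// has elapsed, sleeping for `sleep` between two attempts.
	fn read_poll_timeout<T>(
	    mut op: impl FnMut() -> T,
	    mut cond: impl FnMut(&T) -> bool,
	    sleep: Duration,
	    timeout: Duration,
	) -> Result<T, TimedOut> {
	    let deadline = Instant::now() + timeout;

	    loop {
	        let val = op();
	        if cond(&val) {
	            return Ok(val);
	        }
	        // The deadline is checked only after a read, so one last read
	        // always happens once the timeout has expired.
	        if Instant::now() >= deadline {
	            return Err(TimedOut);
	        }
	        std::thread::sleep(sleep);
	    }
	}

An atomic variant would busy-wait with a delay instead of sleeping, since
sleeping is not allowed in atomic context.
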
| Complexity: Beginner
| Link: https://lore.kernel.org/netdev/20250228.080550.354359820929821928.fujita.tomonori@gmail.com/ [1]

IRQ abstractions
----------------

Rust abstractions for IRQ handling.

There is active ongoing work from Daniel Almeida [1] for the "core" abstractions
to request IRQs.

Besides optional review and testing work, the required ``pci::Device`` code
around those core abstractions needs to be worked out.

| Complexity: Intermediate
| Link: https://lore.kernel.org/lkml/20250122163932.46697-1-daniel.almeida@collabora.com/ [1]
| Contact: Daniel Almeida

Page abstraction for foreign pages
----------------------------------

Rust abstractions for pages not created by the Rust page abstraction without
direct ownership.

There is active ongoing work from Abdiel Janulgue [1] and Lina [2].

| Complexity: Advanced
| Link: https://lore.kernel.org/linux-mm/20241119112408.779243-1-abdiel.janulgue@gmail.com/ [1]
| Link: https://lore.kernel.org/rust-for-linux/20250202-rust-page-v1-0-e3170d7fe55e@asahilina.net/ [2]

Scatterlist / sg_table abstractions
-----------------------------------

Rust abstractions for scatterlist / sg_table.

There is preceding work from Abdiel Janulgue, which hasn't made it to the
mailing list yet.

| Complexity: Intermediate
| Contact: Abdiel Janulgue

PCI MISC APIs
-------------

Extend the existing PCI device / driver abstractions with SR-IOV, config space,
capability and MSI API abstractions.

| Complexity: Beginner

XArray bindings [XARR]
----------------------

We need bindings for `xa_alloc`/`xa_alloc_cyclic` in order to generate the
auxiliary device IDs.

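The cyclic allocation behaviour that makes `xa_alloc_cyclic` attractive for IDs
can be illustrated with a small userspace model. This is only a sketch of the
semantics: it uses a ``BTreeSet`` instead of an XArray, and the ``CyclicIds``
type and its methods are hypothetical, not a proposal for the binding's API.

.. code-block:: rust

	use std::collections::BTreeSet;

	/// Userspace model of cyclic ID allocation in the range `0..=max`.
	struct CyclicIds {
	    used: BTreeSet<u32>,
	    next: u32,
	    max: u32,
	}

	impl CyclicIds {
	    fn new(max: u32) -> Self {
	        Self { used: BTreeSet::new(), next: 0, max }
	    }

	    /// Returns the first free ID at or after the last allocation point,
	    /// wrapping around at `max`, or `None` if every ID is in use.
	    fn alloc(&mut self) -> Option<u32> {
	        let span = u64::from(self.max) + 1;

	        for i in 0..span {
	            let id = ((u64::from(self.next) + i) % span) as u32;
	            if self.used.insert(id) {
	                self.next = ((u64::from(id) + 1) % span) as u32;
	                return Some(id);
	            }
	        }
	        None
	    }

	    /// Releases `id`; returns whether it was allocated.
	    fn free(&mut self, id: u32) -> bool {
	        self.used.remove(&id)
	    }
	}

Because the search resumes after the most recently allocated ID, a freed ID is
not handed out again immediately, which helps avoid confusing a new device with
one that was just removed.
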
| Complexity: Intermediate

Debugfs abstractions
--------------------

Rust abstraction for debugfs APIs.

| Reference: Export GSP log buffers
| Complexity: Intermediate

GPU (general)
=============

Parse firmware headers
----------------------

Parse ELF headers from the firmware files loaded from the filesystem.

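As a starting point, a minimal sketch of reading the section header table
location from an ELF header is shown below. It assumes a 64-bit little-endian
ELF image, uses the field offsets from the generic ELF64 header layout, and
ignores everything else in the header; the ``Elf64Header`` type and the function
name are made up for this example.

.. code-block:: rust

	/// Subset of the ELF64 file header needed to locate the sections.
	#[derive(Debug, PartialEq, Eq)]
	struct Elf64Header {
	    /// File offset of the section header table (`e_shoff`).
	    shoff: u64,
	    /// Size of one section header entry (`e_shentsize`).
	    shentsize: u16,
	    /// Number of section header entries (`e_shnum`).
	    shnum: u16,
	    /// Index of the section name string table (`e_shstrndx`).
	    shstrndx: u16,
	}

	/// Parses the header of a 64-bit little-endian ELF image.
	fn parse_elf64_header(data: &[u8]) -> Option<Elf64Header> {
	    // The ELF64 file header is 64 bytes long.
	    let hdr = data.get(..64)?;

	    // Magic, then class (2 = 64-bit) and data encoding (1 = LE).
	    if !hdr.starts_with(b"\x7fELF") || hdr[4] != 2 || hdr[5] != 1 {
	        return None;
	    }

	    let u16_at = |off: usize| u16::from_le_bytes([hdr[off], hdr[off + 1]]);
	    let mut shoff = [0u8; 8];
	    shoff.copy_from_slice(&hdr[0x28..0x30]);

	    Some(Elf64Header {
	        shoff: u64::from_le_bytes(shoff),
	        shentsize: u16_at(0x3a),
	        shnum: u16_at(0x3c),
	        shstrndx: u16_at(0x3e),
	    })
	}
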
| Reference: ELF utils
| Complexity: Beginner
| Contact: Abdiel Janulgue

Build radix3 page table
-----------------------

Build the radix3 page table to map the firmware.

| Complexity: Intermediate
| Contact: Abdiel Janulgue

Initial Devinit support
-----------------------

Implement BIOS Device Initialization, i.e. memory sizing, waiting, PLL
configuration.

| Contact: Dave Airlie
| Complexity: Beginner

MMU / PT management
-------------------

Work out the architecture for MMU / page table management.

We need to consider that nova-drm will need rather fine-grained control,
especially in terms of locking, in order to be able to implement asynchronous
Vulkan queues.

While generally sharing the corresponding code is desirable, it needs to be
evaluated how (and if at all) sharing the corresponding code is expedient.

| Complexity: Expert

VRAM memory allocator
---------------------

Investigate options for a VRAM memory allocator.

Some possible options:

- Rust abstractions for

  - RB tree (interval tree) / drm_mm
  - maple_tree

- native Rust collections

| Complexity: Advanced

Instance Memory
---------------

Implement support for instmem (bar2) used to store page tables.

| Complexity: Intermediate
| Contact: Dave Airlie

GPU System Processor (GSP)
==========================

Export GSP log buffers
----------------------

Recent patches from Timur Tabi [1] added support to expose GSP-RM log buffers
(even after failure to probe the driver) through debugfs.

This is also an interesting feature for nova-core, especially in the early days.

| Link: https://lore.kernel.org/nouveau/20241030202952.694055-2-ttabi@nvidia.com/ [1]
| Reference: Debugfs abstractions
| Complexity: Intermediate

GSP firmware abstraction
------------------------

The GSP-RM firmware API is unstable and may incompatibly change from version to
version, in terms of data structures and semantics.

This problem is one of the big motivations for using Rust for nova-core, since
it turns out that Rust's procedural macro feature provides a rather elegant way
to address this issue:

1. generate Rust structures from the C headers in a separate namespace per version
2. build abstraction structures (within a generic namespace) that implement the
   firmware interfaces; annotate the differences in implementation with version
   identifiers
3. use a procedural macro to generate the actual per version implementation out
   of this abstraction
4. instantiate the correct version type at runtime (we can be sure that all
   versions have the same interface because it is defined by a common trait)

There is a PoC implementation of this pattern, in the context of the nova-core
PoC driver.

This task aims to refine the feature and ideally generalize it, so that it is
usable by other drivers as well.

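The overall shape of the pattern, with the per-version implementations written
out by hand instead of being generated by a procedural macro, might look like
the sketch below. The version names, message layouts, trait and function names
are all invented for illustration and do not correspond to real GSP-RM
structures or to the PoC code.

.. code-block:: rust

	/// Step 1: per-version structures, each in its own namespace. In the
	/// real scheme these would be generated from the C headers.
	mod v1 {
	    pub struct InitArgs {
	        pub flags: u32,
	    }
	}

	mod v2 {
	    // Hypothetical layout change: a size field was added in front.
	    pub struct InitArgs {
	        pub size: u32,
	        pub flags: u32,
	    }
	}

	/// Common interface shared by all firmware versions.
	trait GspAbi {
	    fn name(&self) -> &'static str;
	    /// Encodes the (made-up) init message for this firmware version.
	    fn init_message(&self, flags: u32) -> Vec<u8>;
	}

	/// Steps 2 and 3: one implementation per version. A procedural macro
	/// would generate these from a single annotated abstraction.
	struct AbiV1;
	struct AbiV2;

	impl GspAbi for AbiV1 {
	    fn name(&self) -> &'static str {
	        "v1"
	    }

	    fn init_message(&self, flags: u32) -> Vec<u8> {
	        let args = v1::InitArgs { flags };
	        args.flags.to_le_bytes().to_vec()
	    }
	}

	impl GspAbi for AbiV2 {
	    fn name(&self) -> &'static str {
	        "v2"
	    }

	    fn init_message(&self, flags: u32) -> Vec<u8> {
	        let args = v2::InitArgs { size: 8, flags };
	        let mut msg = args.size.to_le_bytes().to_vec();
	        msg.extend_from_slice(&args.flags.to_le_bytes());
	        msg
	    }
	}

	/// Step 4: pick the implementation matching the loaded firmware.
	fn select_abi(version: &str) -> Option<Box<dyn GspAbi>> {
	    match version {
	        "v1" => Some(Box::new(AbiV1)),
	        "v2" => Some(Box::new(AbiV2)),
	        _ => None,
	    }
	}

Callers only ever see ``dyn GspAbi``, so supporting a new firmware version does
not change any code outside the version-specific implementation.
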
| Complexity: Expert

GSP message queue
-----------------

Implement the low-level GSP message queue (command, status) for communication
between the kernel driver and GSP.

| Complexity: Advanced
| Contact: Dave Airlie

Bootstrap GSP
-------------

Call the boot firmware to boot the GSP processor; execute initial control
messages.

| Complexity: Intermediate
| Contact: Dave Airlie

Client / Device APIs
--------------------

Implement the GSP message interface for client / device allocation and the
corresponding client and device allocation APIs.

| Complexity: Intermediate
| Contact: Dave Airlie

Bar PDE handling
----------------

Synchronize page table handling for BARs between the kernel driver and GSP.

| Complexity: Beginner
| Contact: Dave Airlie

FIFO engine
-----------

Implement support for the FIFO engine, i.e. the corresponding GSP message
interface and provide an API for chid allocation and channel handling.

| Complexity: Advanced
| Contact: Dave Airlie

GR engine
---------

Implement support for the graphics engine, i.e. the corresponding GSP message
interface and provide an API for (golden) context creation and promotion.

| Complexity: Advanced
| Contact: Dave Airlie

CE engine
---------

Implement support for the copy engine, i.e. the corresponding GSP message
interface.

| Complexity: Intermediate
| Contact: Dave Airlie

VFN IRQ controller
------------------

Support for the VFN interrupt controller.

| Complexity: Intermediate
| Contact: Dave Airlie

External APIs
=============

nova-core base API
------------------

Work out the common pieces of the API to connect 2nd level drivers, i.e. vGPU
manager and nova-drm.

| Complexity: Advanced

vGPU manager API
----------------

Work out the API parts required by the vGPU manager, which are not covered by
the base API.

| Complexity: Advanced

nova-core C API
---------------

Implement a C wrapper for the APIs required by the vGPU manager driver.

| Complexity: Intermediate

Testing
=======

CI pipeline
-----------

Investigate options for continuous integration testing.

This can range from something as simple as running KUnit tests, through running
the (graphics) CTS, to booting up (multiple) guest VMs to test VFIO use cases.

It might also be worth considering the introduction of a new test suite sitting
directly on top of the uAPI for more targeted testing and debugging. There may
be options for collaboration / shared code with the Mesa project.

| Complexity: Advanced
