<!--
    (C) Copyright 2018 The Fuchsia Authors. All rights reserved.
    Use of this source code is governed by a BSD-style license that can be
    found in the LICENSE file.
-->

# Ethernet Devices

This document is part of the [Driver Development Kit tutorial](ddk-tutorial.md) documentation.

## Overview

This chapter looks into the details of ethernet drivers, using the Intel driver code
for specific examples.

Handling ethernet devices involves two distinct parts.
A "top half" driver handles the generic ethernet protocol, and is located in
`//zircon/system/dev/ethernet/ethernet/ethernet.c` (yes, three "ethernets" in a row),
and one or more "bottom half" drivers handle the actual devices, located one
directory higher in `//zircon/system/dev/ethernet/`**_devicename_**`/`.

Multiple Zircon IPC protocols are used for communication between modules.

> We'll just use the term "protocol" to refer to these.
> Granted, we *are* discussing an Ethernet driver, but since we won't be
> discussing any of the on-wire communications protocols supported by the driver,
> this usage shouldn't result in any confusion.

The top half provides a protocol interface that conforms to `ZX_PROTOCOL_ETHERNET_IMPL`.
The bottom half provides a protocol interface that conforms to whatever the
hardware is connected to (for example, this might be `ZX_PROTOCOL_PCI`, for
PCI-based ethernet cards, or `ZX_PROTOCOL_USB` for USB-based ethernet devices,
and so on).
We'll focus on the PCI version here.

The bottom half drivers all expose a `ZX_PROTOCOL_ETHERNET_IMPL` binding, which is how
the top half finds the bottom halves.

Effectively, the bottom half ethernet driver is responsible for managing the hardware
associated with the ethernet device, and presenting a consistent abstraction of that
hardware for use by the top half.
The top half manages the ethernet interface to the system.

![Figure: Relationship amongst layers in ethernet driver stack](ethernet-000-cropped.png)

# Intel PCI-based ethernet

The Intel ethernet driver can be found in `//zircon/system/dev/ethernet/intel-ethernet`,
and consists of the following files:

<dl>
<dt>`ethernet.c`
<dd>The device driver part of the code; handles interface to protocols.
<dt>`ie.c`
<dd>The Intel specific part of the code; knows about the hardware registers on the card.
<dt>`ie-hw.h`
<dd>Contains the manifest constants for all of the control registers.
<dt>`ie.h`
<dd>Common definitions (such as the device context block)
</dl>

This driver not only handles the `ethmac` protocol, but also:

*   finds its device on the PCI bus,
*   attaches to legacy or Message Signaled Interrupts (**MSI**),
*   maps I/O memory, and
*   creates a background IRQ handling thread.

## Binding

The file `ethernet.c` contains the binding information, implemented by the standard
binding macros introduced in the [Simple Drivers](simple.md) chapter:

```c
ZIRCON_DRIVER_BEGIN(intel_ethernet, intel_ethernet_driver_ops, "zircon", "0.1", 11)
    BI_ABORT_IF(NE, BIND_PROTOCOL, ZX_PROTOCOL_PCI),
    BI_ABORT_IF(NE, BIND_PCI_VID, 0x8086),
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x100E), // Qemu
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x15A3), // Broadwell
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x1570), // Skylake
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x1533), // I210 standalone
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x1539), // I211-AT
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x156f), // I219-LM (Dawson Canyon NUC)
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x15b7), // Skull Canyon NUC
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x15b8), // I219-V
    BI_MATCH_IF(EQ, BIND_PCI_DID, 0x15d8), // Kaby Lake NUC
ZIRCON_DRIVER_END(intel_ethernet)
```

This ends up binding to ethernet cards that are identified by vendor ID `0x8086` (Intel),
and have any of the listed device IDs (the `BIND_PCI_DID` lines indicate the allowed
hexadecimal device IDs).
It also requires the `ZX_PROTOCOL_PCI` protocol.

Note the sense of the logic here &mdash; the vendor ID is tested with a
"`BI_ABORT_IF(NE`" construct (meaning, "**ABORT IF** the values are **N**ot **E**qual"),
whereas the device IDs are tested with "`BI_MATCH_IF(EQ`" constructs (meaning "**MATCH
IF** the values are **EQ**ual").

Intuitively, you might think that the vendor ID could be tested with a "`BI_MATCH_IF(EQ`"
as well (looking for vendor `0x8086`), but this would cause two major problems.
First, evaluation stops as soon as a condition is true, so **any** device
with the Intel vendor ID would be considered a "match," regardless of its device ID.
Second, a device from a different vendor would fall through the vendor test to the
device ID tests, so another vendor's device that happened to use one of the listed
device IDs would also match.

> The individual tests are evaluated in sequence.
> The first one that's true terminates evaluation, and performs
> the given action (i.e., `ABORT` or `MATCH`).
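
The evaluation rule above can be modeled in a few lines of C. This is only a sketch: the types `bind_inst_t` and `pci_ids_t` and the function `bind_eval()` are hypothetical and are not part of the DDK, and the model assumes that a program which runs out of instructions yields "no match". It contrasts a program shaped like the Intel rules (`good_prog`) with the "intuitive" variant that matches on the vendor ID first (`bad_prog`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical, simplified model of a bind program; not the real DDK types.
typedef enum { ACT_ABORT, ACT_MATCH } bind_action_t;
typedef enum { COND_EQ, COND_NE } bind_cond_t;
typedef enum { KEY_VID, KEY_DID } bind_key_t;

typedef struct {
    bind_action_t action;
    bind_cond_t cond;
    bind_key_t key;
    uint32_t value;
} bind_inst_t;

typedef struct {
    uint32_t vid;
    uint32_t did;
} pci_ids_t;

// Evaluates instructions in order; the first true condition decides the result.
// Assumption of this model: running off the end of the program means "no match".
static bool bind_eval(const bind_inst_t* prog, size_t count, const pci_ids_t* dev) {
    assert(prog != NULL && dev != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t actual = (prog[i].key == KEY_VID) ? dev->vid : dev->did;
        bool equal = (actual == prog[i].value);
        bool cond_true = (prog[i].cond == COND_EQ) ? equal : !equal;
        if (cond_true) {
            return prog[i].action == ACT_MATCH;
        }
    }
    return false;
}

// Same shape as the Intel rules: abort on the wrong vendor, then match device IDs.
static const bind_inst_t good_prog[] = {
    {ACT_ABORT, COND_NE, KEY_VID, 0x8086},
    {ACT_MATCH, COND_EQ, KEY_DID, 0x100E},
    {ACT_MATCH, COND_EQ, KEY_DID, 0x1533},
};

// The "intuitive" but wrong variant: match on the vendor ID first.
static const bind_inst_t bad_prog[] = {
    {ACT_MATCH, COND_EQ, KEY_VID, 0x8086},
    {ACT_MATCH, COND_EQ, KEY_DID, 0x100E},
    {ACT_MATCH, COND_EQ, KEY_DID, 0x1533},
};
```

With `good_prog`, only an Intel device with a listed device ID matches. With `bad_prog`, an Intel device with an unlisted device ID matches, and so does a non-Intel device that happens to use device ID `0x100E`.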

## More about binding

From the command line, `dm drivers` will display this information.
Here's the relevant portion for the Intel ethernet driver:

```sh
$ dm drivers
<snip>
    Name    : intel_ethernet
    Driver  : /boot/driver/intel-ethernet.so
    Flags   : 0x00000000
    Binding : 11 instructions (88 bytes)
    [1/11]: if (Protocol != 0x70504349) return no-match;
    [2/11]: if (PCI.VID != 0x00008086) return no-match;
    [3/11]: if (PCI.DID == 0x0000100e) return match;
    [4/11]: if (PCI.DID == 0x000015a3) return match;
    [5/11]: if (PCI.DID == 0x00001570) return match;
    [6/11]: if (PCI.DID == 0x00001533) return match;
    [7/11]: if (PCI.DID == 0x00001539) return match;
    [8/11]: if (PCI.DID == 0x0000156f) return match;
    [9/11]: if (PCI.DID == 0x000015b7) return match;
    [10/11]: if (PCI.DID == 0x000015b8) return match;
    [11/11]: if (PCI.DID == 0x000015d8) return match;
```

The `Name` field indicates the name of the driver, given as the first argument to the
`ZIRCON_DRIVER_BEGIN` and `ZIRCON_DRIVER_END` macros.
The `Driver` field indicates the location of the shared object that contains the driver code.

> The `Flags` field appears to be unused here.

The last section, the binding instructions, corresponds to the `BI_ABORT_IF` and `BI_MATCH_IF`
macro directives.
Note that the first binding instruction compares the field `Protocol` against the hexadecimal
number `0x70504349` &mdash; that "number" is simply the ASCII encoding of the string "`pPCI`",
indicating the PCI protocol (you can see all of the encodings in
`//zircon/system/ulib/ddk/include/ddk/protodefs.h`).
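
To see how four characters become the value `0x70504349`, here is a small sketch. The helpers `fourcc()` and `fourcc_to_string()` are hypothetical names used only for illustration; they are not taken from `protodefs.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Packs four ASCII characters into a 32-bit value, with the first character
// in the most significant byte. Hypothetical helper, for illustration only.
static uint32_t fourcc(char a, char b, char c, char d) {
    return ((uint32_t)(uint8_t)a << 24) | ((uint32_t)(uint8_t)b << 16) |
           ((uint32_t)(uint8_t)c << 8) | (uint32_t)(uint8_t)d;
}

// Unpacks a 32-bit value back into a NUL-terminated four-character string.
// `out` must have room for 5 bytes.
static void fourcc_to_string(uint32_t value, char out[5]) {
    assert(out != NULL);
    out[0] = (char)((value >> 24) & 0xff);
    out[1] = (char)((value >> 16) & 0xff);
    out[2] = (char)((value >> 8) & 0xff);
    out[3] = (char)(value & 0xff);
    out[4] = '\0';
}
```

Here `'p'` is `0x70`, `'P'` is `0x50`, `'C'` is `0x43`, and `'I'` is `0x49`, so `fourcc('p', 'P', 'C', 'I')` produces `0x70504349`.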

From the `ZIRCON_DRIVER_BEGIN` macro, the `intel_ethernet_driver_ops`
structure contains the driver operations, in this case just the binding function
**eth_bind()**.

Let's turn our attention to the binding function itself.

## PCI interface

The first part of the binding function deals with the PCI interface.

The Intel ethernet driver is a PCI bus peripheral.
As such, it needs to first query the PCI configuration registers in order to discover
where the BIOS (or other startup program) has located the device in memory
address space, and what interrupt it was assigned.
Second, it needs to initialize the device for use (such as mapping the configuration
registers and attaching to the device's interrupt).
We broadly discussed this in the [Hardware Interfacing](hardware.md) chapter.

As usual, the binding function allocates and initializes a context block:

```c
static zx_status_t eth_bind(void* ctx, zx_device_t* dev) {
    ethernet_device_t* edev;
    if ((edev = calloc(1, sizeof(ethernet_device_t))) == NULL) {
        return ZX_ERR_NO_MEMORY;
    }
    mtx_init(&edev->lock, mtx_plain);
    mtx_init(&edev->eth.send_lock, mtx_plain);
```

This allocates a zeroed ethernet context block (`ethernet_device_t`).
Then we initialize two mutexes: `edev->lock`, which protects the device itself, and
`edev->eth.send_lock`, which protects the ethernet send buffers.

We'll examine the context block in more detail below.

### PCI protocol operations

The next step fetches the PCI protocol operations pointer (or fails if it can't):

```c
    if (device_get_protocol(dev, ZX_PROTOCOL_PCI, &edev->pci)) {
        printf("no pci protocol\n");
        goto fail;
    }
```

This populates `edev->pci` (of type `pci_protocol_t`) with pointers to functions that
provide PCI protocol services.
Of the many functions available, we use the following subset (listed in order of
use in the binding function):

Function            | Description
--------------------|------------------------------------------------------------------------------
`get_bti`           | Used to get the Bus Transaction Initiator (**[BTI](../objects/bus_transaction_initiator.md)**) for the device
`query_irq_mode`    | Returns the number of IRQs of the specified type (MSI or legacy) that are available
`set_irq_mode`      | Requests the specified IRQ mode to be used for the device
`map_interrupt`     | Creates an IRQ handle associated with the device's interrupt
`map_bar`           | Maps a Base Address Register (**BAR**) of the PCI device and returns a pointer to the mapped region
`enable_bus_master` | Enables / disables bus mastering for the device

> Note that the function names given in the table above are the member names within
> the `pci_protocol_t` structure; throughout the code we'll use the **pci_...()** accessor
> functions to call the protocol ops.

### Fetch the BTI

The first PCI function we call is
**pci_get_bti()**:

```c
    zx_status_t status = pci_get_bti(&edev->pci, 0, &edev->btih);
    if (status != ZX_OK) {
        goto fail;
    }
```

A [BTI](../objects/bus_transaction_initiator.md)
is used to represent the bus mastering / DMA capability of a device.
It can be used for granting memory access to a device.
The [BTI](../objects/bus_transaction_initiator.md)
handle is stored in `edev->btih` and is used later to initialize transfer buffers.
The [Hardware Interfacing](hardware.md) chapter talks more about this, in the DMA section.

### Discover and map interrupts

The interrupt is discovered and mapped next:

```c
    // Query whether we have MSI or Legacy interrupts.
    uint32_t irq_cnt = 0;
    if ((pci_query_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_MSI, &irq_cnt) == ZX_OK) &&
        (pci_set_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_MSI, 1) == ZX_OK)) {
        printf("eth: using MSI mode\n");
    } else if ((pci_query_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_LEGACY, &irq_cnt) == ZX_OK) &&
               (pci_set_irq_mode(&edev->pci, ZX_PCIE_IRQ_MODE_LEGACY, 1) == ZX_OK)) {
        printf("eth: using legacy irq mode\n");
    } else {
        printf("eth: failed to configure irqs\n");
        goto fail;
    }

    zx_status_t r = pci_map_interrupt(&edev->pci, 0, &edev->irqh);
    if (r != ZX_OK) {
        printf("eth: failed to map irq\n");
        goto fail;
    }
```

The **pci_query_irq_mode()**
function determines if the device supports any `MSI` or `LEGACY`
style interrupts, and returns the count (in `irq_cnt`).
We're expecting one interrupt, so we ignore the count and examine just the return status.
If the return status indicates one or more interrupts of that type exist, we set the device to
use that mode.

The **pci_map_interrupt()**
function is then used to bind the hardware interrupt to a handle, stored in `edev->irqh`.

We'll see this handle later, when we look at the interrupt service thread.

### Map PCI BAR

Next up, we map the PCI BAR:

```c
    // map iomem
    uint64_t sz;
    zx_handle_t h;
    void* io;
    r = pci_map_bar(&edev->pci, 0u, ZX_CACHE_POLICY_UNCACHED_DEVICE, &io, &sz, &h);
    if (r != ZX_OK) {
        // Report the status code; the handle is not valid on failure.
        printf("eth: cannot map io %d\n", r);
        goto fail;
    }
    edev->eth.iobase = (uintptr_t)io;
    edev->ioh = h;

    if ((r = pci_enable_bus_master(&edev->pci, true)) < 0) {
        printf("eth: cannot enable bus master %d\n", r);
        goto fail;
    }
```

The call to **pci_map_bar()** creates a handle to the first BAR
(the `0u` as the second argument
specifies the BAR ID number), which we store into the context block's `ioh` member.
(We also capture the virtual address into `edev->eth.iobase`.)
The call to **pci_enable_bus_master()** then allows the device to initiate DMA transfers.

### Ethernet setup and configuration

At this point, we have access to enough of the device that we can go and set it up:

```c
    if (eth_enable_phy(&edev->eth) != ZX_OK) {
        goto fail;
    }

    if (eth_reset_hw(&edev->eth)) {
        goto fail;
    }
```

The implementation of **eth_enable_phy()** and **eth_reset_hw()**
is in the `ie.c` file.

### DMA buffer setup and hardware configuration

With the device configured, we can now set up the DMA buffers.
Here we see the [BTI](../objects/bus_transaction_initiator.md)
handle, `edev->btih`, that we set up above, as the 2nd argument to
**io_buffer_init()**:

```c
    r = io_buffer_init(&edev->buffer, edev->btih, ETH_ALLOC, IO_BUFFER_RW | IO_BUFFER_CONTIG);
    if (r < 0) {
        printf("eth: cannot alloc io-buffer %d\n", r);
        goto fail;
    }

    eth_setup_buffers(&edev->eth, io_buffer_virt(&edev->buffer), io_buffer_phys(&edev->buffer));
    eth_init_hw(&edev->eth);
```

The **io_buffer_init()**
function allocates a zeroed buffer of `ETH_ALLOC` bytes backed by a
[VMO](../objects/vm_object.md), and uses the
[BTI](../objects/bus_transaction_initiator.md) handle to make that buffer available to
the device for DMA.
The buffer's virtual and physical addresses are then passed to **eth_setup_buffers()**.
The **eth_setup_buffers()** and **eth_init_hw()** functions are defined in the `ie.c` module.

### Final driver binding

The next part binds the device name ("`intel-ethernet`"), context block (`edev`,
allocated above), device operations (`device_ops`, which supports suspend, resume, and release),
and the additional optional protocol ops for ethernet (identified as `ZX_PROTOCOL_ETHERNET_IMPL`
and contained in `ethmac_ops`):

```c
    device_add_args_t args = {
        .version = DEVICE_ADD_ARGS_VERSION,
        .name = "intel-ethernet",
        .ctx = edev,
        .ops = &device_ops,
        .proto_id = ZX_PROTOCOL_ETHERNET_IMPL,
        .proto_ops = &ethmac_ops,
    };

    if (device_add(dev, &args, &edev->zxdev)) {
        goto fail;
    }
```

### Interrupt thread creation

Finally, the background Interrupt Handling Thread (**IHT**), **irq_thread()**, is created:

```c
    thrd_create_with_name(&edev->thread, irq_thread, edev, "eth-irq-thread");
    thrd_detach(edev->thread);

    printf("eth: intel-ethernet online\n");

    return ZX_OK;
```

As discussed in the [Hardware Interfacing](hardware.md) chapter,
the IHT handles asynchronous hardware events.
We'll look at the thread itself below.

### Failure handling

In case of failure, the `fail` label is the target of various `goto`s within the code, and is
responsible for cleanup of allocated resources as well as returning a failure code to the caller:

```c
fail:
    io_buffer_release(&edev->buffer);
    if (edev->btih) {
        zx_handle_close(edev->btih);
    }
    if (edev->ioh) {
        pci_enable_bus_master(&edev->pci, false);
        zx_handle_close(edev->irqh);
        zx_handle_close(edev->ioh);
    }
    free(edev);
    return ZX_ERR_NOT_SUPPORTED;
}
```

That concludes the discussion of the binding function.

## The context structure

At this point, we can circle back and take a look at the context structure:

```c
typedef struct ethernet_device {
    ethdev_t        eth;
    mtx_t           lock;
    eth_state       state;
    zx_device_t*    zxdev;
    pci_protocol_t  pci;
    zx_handle_t     ioh;
    zx_handle_t     irqh;
    thrd_t          thread;
    zx_handle_t     btih;
    io_buffer_t     buffer;
    bool            online;

    // callback interface to attached ethernet layer
    ethmac_ifc_t*   ifc;
    void*           cookie;
} ethernet_device_t;
```

It holds all of the context for the ethernet device.

## Ethernet protocol operations

Recall from the discussion around the binding function
**eth_bind()**
that we bound an `ethmac_protocol_ops_t` structure called
`ethmac_ops` to the driver.
This structure provides the following "bottom-half" ethernet driver protocol operations
for the Intel driver:

```c
static ethmac_protocol_ops_t ethmac_ops = {
    .query = eth_query,
    .stop = eth_stop,
    .start = eth_start,
    .queue_tx = eth_queue_tx,
    .set_param = eth_set_param,
//  .get_bti not supported
};
```

We examine each in turn below.

### Ethernet protocol: **query()**

The **query()** function takes three parameters:
a context block, an options specifier, and a pointer to
an `ethmac_info_t` where the information should be stored.

> Note that at the present time, there are no options defined; therefore, the driver
> should return `ZX_ERR_INVALID_ARGS` in case of a non-zero value.

The `ethmac_info_t` structure is defined as follows (reserved fields omitted for clarity):

```c
typedef struct ethmac_info {
    uint32_t    features;
    uint32_t    mtu;
    uint8_t     mac[ETH_MAC_SIZE];
} ethmac_info_t;
```

The `mtu` field contains the Maximum Transmission Unit (**MTU**) size that the driver
can support.
A common value is `1500`.

The `mac` field contains `ETH_MAC_SIZE` (6 bytes) worth of Media Access Control (**MAC**)
address in big-endian order (that is, for a MAC of `01:23:45:67:89:ab`, the value of
`mac[0]` is `0x01`).

Finally, the `features` field contains a bitmap of available features:

Feature                 | Meaning
------------------------|--------------------------------------------
`ETHMAC_FEATURE_WLAN`   | Device is a wireless network device
`ETHMAC_FEATURE_SYNTH`  | Device is a synthetic network device
`ETHMAC_FEATURE_DMA`    | Driver will be doing DMA to/from the VMO

The Intel driver's **eth_query()** is representative:

```c
static zx_status_t eth_query(void* ctx, uint32_t options, ethmac_info_t* info) {
    ethernet_device_t* edev = ctx;

    if (options) {
        return ZX_ERR_INVALID_ARGS;
    }

    memset(info, 0, sizeof(*info));
    ZX_DEBUG_ASSERT(ETH_TXBUF_SIZE >= ETH_MTU);
    info->mtu = ETH_MTU;
    memcpy(info->mac, edev->eth.mac, sizeof(edev->eth.mac));

    return ZX_OK;
}
```

It returns `ZX_ERR_INVALID_ARGS` if the `options` parameter is nonzero,
and otherwise fills in the `mtu` and `mac` members.
Because the structure is zeroed first, the `features` field is left at `0`.
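
As a sketch of how a caller might interpret the fields described in this section, the code below tests feature bits and formats a MAC address. Everything here is a stand-in: `info_t` mirrors only the three fields shown above, the `FEATURE_*` bit values are assumed for illustration (the protocol header is authoritative), and `has_feature()` and `mac_to_string()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_SIZE 6  // stand-in for ETH_MAC_SIZE

// Assumed bit values, for illustration only; check the protocol header.
#define FEATURE_WLAN  (1u << 0)
#define FEATURE_SYNTH (1u << 1)
#define FEATURE_DMA   (1u << 2)

// Hypothetical mirror of the ethmac_info_t fields discussed above.
typedef struct {
    uint32_t features;
    uint32_t mtu;
    uint8_t mac[MAC_SIZE];
} info_t;

// Returns true if the given feature bit is set in the bitmap.
static bool has_feature(const info_t* info, uint32_t flag) {
    assert(info != NULL);
    return (info->features & flag) != 0;
}

// Formats the MAC address in transmission order (mac[0] is the first octet).
// `out` needs room for 18 bytes: "xx:xx:xx:xx:xx:xx" plus the terminating NUL.
static int mac_to_string(const info_t* info, char* out, size_t out_len) {
    assert(info != NULL && out != NULL);
    const uint8_t* m = info->mac;
    return snprintf(out, out_len, "%02x:%02x:%02x:%02x:%02x:%02x",
                    m[0], m[1], m[2], m[3], m[4], m[5]);
}
```

For a structure whose `mac` array holds `{0x01, 0x23, 0x45, 0x67, 0x89, 0xab}`, `mac_to_string()` writes `01:23:45:67:89:ab`, which matches the byte order described above.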

### Ethernet protocol: **queue_tx()**

The **queue_tx()** function is responsible for taking the `ethmac_netbuf_t` network
buffer and transmitting it.

```c
static zx_status_t eth_queue_tx(void* ctx, uint32_t options, ethmac_netbuf_t* netbuf) {
    ethernet_device_t* edev = ctx;
    if (edev->state != ETH_RUNNING) {
        return ZX_ERR_BAD_STATE;
    }
    return eth_tx(&edev->eth, netbuf->data, netbuf->len);
}
```

The real work for the Intel ethernet driver is done in `ie.c`:

```c
zx_status_t eth_tx(ethdev_t* eth, const void* data, size_t len) {
    if (len > ETH_TXBUF_DSIZE) {
        printf("intel-eth: unsupported packet length %zu\n", len);
        return ZX_ERR_INVALID_ARGS;
    }

    zx_status_t status = ZX_OK;

    mtx_lock(&eth->send_lock);

    reap_tx_buffers(eth);

    // obtain buffer, copy into it, setup descriptor
    framebuf_t *frame = list_remove_head_type(&eth->free_frames, framebuf_t, node);
    if (frame == NULL) {
        status = ZX_ERR_NO_RESOURCES;
        goto out;
    }

    uint32_t n = eth->tx_wr_ptr;
    memcpy(frame->data, data, len);
    // Pad out short packets.
    if (len < 60) {
        memset(frame->data + len, 0, 60 - len);
        len = 60;
    }
    eth->txd[n].addr = frame->phys;
    eth->txd[n].info = IE_TXD_LEN(len) | IE_TXD_EOP | IE_TXD_IFCS | IE_TXD_RS;
    list_add_tail(&eth->busy_frames, &frame->node);

    // inform hw of buffer availability
    n = (n + 1) & (ETH_TXBUF_COUNT - 1);
    eth->tx_wr_ptr = n;
    writel(n, IE_TDT);

out:
    mtx_unlock(&eth->send_lock);
    return status;
}
```

This function performs buffer management and talks to the hardware.
After rejecting frames longer than `ETH_TXBUF_DSIZE`, it locks the send mutex
and then finds an available buffer.
This is done by calling **reap_tx_buffers()** to find available buffers,
and then calling the macro **list_remove_head_type()** to try to fetch
a buffer from the head of the free list.
If no buffer is available, an error status (`ZX_ERR_NO_RESOURCES`) is set
and the function returns.

Otherwise, the frame data is copied (short frames, less than 60 bytes, are padded
with zeros), and the transmit descriptor is filled in with the buffer's physical
address and length.

The hardware is kicked via the macro **writel()**, which writes the updated write
pointer to the `IE_TDT` register, telling the hardware which buffer is available
to be written to the ethernet.

At this point, the frame is queued at the chip level, and will be sent shortly.
(The timing depends on whether there are other frames queued before this one.)
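
Two details of **eth_tx()** are easy to miss: the short-frame padding, and the write pointer wrapping with a mask rather than a modulo. The sketch below isolates both. `RING_COUNT` is a hypothetical stand-in for `ETH_TXBUF_COUNT`, and the mask only works if that count is a power of two, which is an assumption here rather than something the text above states. The helper names are also invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_COUNT 32u     // hypothetical ring size; must be a power of two
#define MIN_FRAME_LEN 60u  // minimum frame length used by eth_tx()

// Advances a ring index, wrapping with a mask instead of a modulo.
static uint32_t ring_next(uint32_t n) {
    // The mask trick is only valid when RING_COUNT is a power of two.
    assert((RING_COUNT & (RING_COUNT - 1)) == 0);
    return (n + 1) & (RING_COUNT - 1);
}

// Copies `len` bytes into `dst` and zero-pads up to MIN_FRAME_LEN.
// `dst` must hold at least max(len, MIN_FRAME_LEN) bytes.
// Returns the final frame length.
static size_t copy_and_pad(uint8_t* dst, const void* src, size_t len) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    if (len < MIN_FRAME_LEN) {
        memset(dst + len, 0, MIN_FRAME_LEN - len);
        len = MIN_FRAME_LEN;
    }
    return len;
}
```

With a ring of 32 entries, `ring_next(31)` wraps back to `0`, and a 4-byte payload passed to `copy_and_pad()` comes back as a 60-byte frame whose trailing 56 bytes are zero.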

### Ethernet protocol: **set_param()**

The **set_param()** function sets a parameter based on the passed `param` and `value` arguments.
The Intel driver supports enabling or disabling promiscuous mode, and nothing else:

```c
static zx_status_t eth_set_param(void *ctx, uint32_t param, int32_t value, void* data) {
    ethernet_device_t* edev = ctx;
    zx_status_t status = ZX_OK;

    mtx_lock(&edev->lock);

    switch (param) {
    case ETHMAC_SETPARAM_PROMISC:
        if ((bool)value) {
            eth_start_promisc(&edev->eth);
        } else {
            eth_stop_promisc(&edev->eth);
        }
        status = ZX_OK;
        break;
    default:
        status = ZX_ERR_NOT_SUPPORTED;
    }
    mtx_unlock(&edev->lock);

    return status;
}
```

The following parameters are available:

Parameter                           | Meaning (additional data)
------------------------------------|-------------------------------------------------------------
`ETHMAC_SETPARAM_PROMISC`           | Controls promiscuous mode (bool)
`ETHMAC_SETPARAM_MULTICAST_PROMISC` | Controls multicast promiscuous mode (bool)
`ETHMAC_SETPARAM_MULTICAST_FILTER`  | Sets multicast filtering addresses (count + array)
`ETHMAC_SETPARAM_DUMP_REGS`         | Used for debug, dumps the registers (no additional data)

For multicast filtering, the `value` argument indicates the count of MAC addresses sequentially
presented via the `data` argument. For example, if `value` was `2`, then `data`
would point to two back-to-back MAC addresses (2 x 6 = 12 bytes total).

Note that if a parameter is not supported, the value `ZX_ERR_NOT_SUPPORTED` is returned.
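
The multicast filter layout described above (a count in `value`, with the addresses packed back to back in `data`) could be walked as in the following sketch. The function `filter_get_mac()` is hypothetical; the Intel driver shown here does not implement multicast filtering at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_SIZE 6  // stand-in for ETH_MAC_SIZE

// Copies the index-th MAC address out of a packed filter list.
// `count` plays the role of the `value` argument and `data` the role of the
// `data` argument. Returns 0 on success, or -1 if the index is out of range.
static int filter_get_mac(const void* data, int32_t count, int32_t index,
                          uint8_t out[MAC_SIZE]) {
    assert(out != NULL);
    if (data == NULL || index < 0 || index >= count) {
        return -1;
    }
    const uint8_t* bytes = data;
    // Entry i starts at byte offset i * MAC_SIZE.
    memcpy(out, bytes + (size_t)index * MAC_SIZE, MAC_SIZE);
    return 0;
}
```

With `count` equal to `2`, index `1` reads bytes 6 through 11 of `data`, and index `2` is rejected.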

### Ethernet protocol: **start()** and **stop()**

The two functions, **eth_start()** and **eth_stop()**, are used to start and stop
the ethernet device:

```c
static void eth_stop(void* ctx) {
    ethernet_device_t* edev = ctx;
    mtx_lock(&edev->lock);
    edev->ifc = NULL;
    mtx_unlock(&edev->lock);
}

static zx_status_t eth_start(void* ctx, ethmac_ifc_t* ifc, void* cookie) {
    ethernet_device_t* edev = ctx;
    zx_status_t status = ZX_OK;

    mtx_lock(&edev->lock);
    if (edev->ifc) {
        status = ZX_ERR_BAD_STATE;
    } else {
        edev->ifc = ifc;
        edev->cookie = cookie;
        edev->ifc->status(edev->cookie, edev->online ? ETHMAC_STATUS_ONLINE : 0);
    }
    mtx_unlock(&edev->lock);

    return status;
}
```

The Intel ethernet driver code shown above is typical; the `ifc` member of the context
block is used both as an indication of status (`NULL` if stopped) and, when running,
as a pointer to a valid interface block.
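
The start/stop bookkeeping can be reduced to a small model. In this sketch all of the `fake_*` names are hypothetical, the status codes are placeholders rather than Zircon values, and the mutex that the real driver holds is left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-ins; only the start/stop bookkeeping is modeled.
typedef struct {
    int last_status;  // records the most recent status notification
} fake_ifc_t;

typedef struct {
    fake_ifc_t* ifc;  // NULL while stopped
    int online;       // current link state
} fake_dev_t;

enum { FAKE_OK = 0, FAKE_ERR_BAD_STATE = -1 };

// Attaches an interface; fails if one is already attached.
static int fake_start(fake_dev_t* dev, fake_ifc_t* ifc) {
    assert(dev != NULL && ifc != NULL);
    if (dev->ifc != NULL) {
        return FAKE_ERR_BAD_STATE;  // already started
    }
    dev->ifc = ifc;
    ifc->last_status = dev->online;  // report the current link state right away
    return FAKE_OK;
}

// Detaches the interface, returning the device to the stopped state.
static void fake_stop(fake_dev_t* dev) {
    assert(dev != NULL);
    dev->ifc = NULL;
}
```

A second `fake_start()` without an intervening `fake_stop()` fails, which corresponds to the `ZX_ERR_BAD_STATE` path in **eth_start()**.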

### Ethernet protocol: **get_bti()**

The Intel ethernet driver doesn't support the optional **get_bti()** callout.

This callout is used to return a handle to the [BTI](../objects/bus_transaction_initiator.md).
If a driver doesn't support it, the driver can either leave the callout out of the
`ethmac_protocol_ops_t` structure (like the Intel ethernet driver does), or have it return
`ZX_HANDLE_INVALID`.

If supported, the handle is returned from the function.
Note that the ownership of the handle is *not* transferred; the ethernet driver still
owns the handle.
In particular, the caller must not close the handle.

## Receiving data

The IHT created by the binding function waits for interrupts from the ethernet hardware.
When one arrives, it calls **eth_handle_irq()** to find out what happened.

The portion of the thread in `ethernet.c` is as follows:

```c
static int irq_thread(void* arg) {
    ethernet_device_t* edev = arg;
    for (;;) {
        zx_status_t r;
        r = zx_interrupt_wait(edev->irqh, NULL);
        if (r != ZX_OK) {
            printf("eth: irq wait failed? %d\n", r);
            break;
        }
        mtx_lock(&edev->lock);
        unsigned irq = eth_handle_irq(&edev->eth);
        if (irq & ETH_IRQ_RX) {
            void* data;
            size_t len;

            while (eth_rx(&edev->eth, &data, &len) == ZX_OK) {
                if (edev->ifc && (edev->state == ETH_RUNNING)) {
                    edev->ifc->recv(edev->cookie, data, len, 0);
                }
                eth_rx_ack(&edev->eth);
            }
        }
        if (irq & ETH_IRQ_LSC) {
            bool was_online = edev->online;
            bool online = eth_status_online(&edev->eth);
            zxlogf(TRACE, "intel-eth: ETH_IRQ_LSC fired: %d->%d\n", was_online, online);
            if (online != was_online) {
                edev->online = online;
                if (edev->ifc) {
                    edev->ifc->status(edev->cookie, online ? ETHMAC_STATUS_ONLINE : 0);
                }
            }
        }
        mtx_unlock(&edev->lock);
    }
    return 0;
}
```

The thread waits on an interrupt, and, when one occurs, calls **eth_handle_irq()**
to read the interrupt reason register (which also clears the interrupt
indication on the card).

Based on the value returned by **eth_handle_irq()**,
there are two major flows in the thread:

1.  the bit `ETH_IRQ_RX` is present &mdash; this indicates data has been
    received by the card,
2.  the bit `ETH_IRQ_LSC` is present &mdash; this indicates a Line Status
    Change (LSC) event has been detected by the card.

If data has been received, the following functions are called:

*   **eth_rx()** &mdash; obtains a pointer to the receive buffer containing the data
*   **eth_rx_ack()** &mdash; acknowledges receipt of the packet by writing to registers on the card

In the case of a line status change, **eth_status_online()** is called to determine
the current link state.

Note that further processing is done by the ethernet device protocol (available via `edev->ifc`):

*   **edev->ifc->recv()** &mdash; processes the received data
*   **edev->ifc->status()** &mdash; processes the status change

The **eth_rx()** function, in `ie.c`, checks whether the hardware has finished with the
current receive descriptor and, if so, returns a pointer to the data and its length:

```c
zx_status_t eth_rx(ethdev_t* eth, void** data, size_t* len) {
    uint32_t n = eth->rx_rd_ptr;
    uint64_t info = eth->rxd[n].info;

    if (!(info & IE_RXD_DONE)) {
        return ZX_ERR_SHOULD_WAIT;
    }

    // copy out packet
    zx_status_t r = IE_RXD_LEN(info);

    *data = eth->rxb + ETH_RXBUF_SIZE * n;
    *len = r;

    return ZX_OK;
}
```
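
The receive loop in the thread relies on **eth_rx()** reporting "not ready" once the ring is drained. The following self-contained model sketches that interaction. The ring size, buffer size, `RXD_DONE` bit position, and length mask are all made up for illustration and do not reflect the real descriptor layout in `ie-hw.h`; `rx_ack()` is likewise a guess at the general shape of an acknowledge step, since **eth_rx_ack()** isn't shown in this chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_COUNT 4u            // hypothetical ring size (power of two)
#define RX_BUF_SIZE 2048u      // hypothetical per-descriptor buffer size
#define RXD_DONE (1ull << 32)  // hypothetical "descriptor done" bit
#define RXD_LEN(info) ((size_t)((info) & 0xffffu))  // hypothetical length field

typedef struct {
    uint64_t info[RX_COUNT];               // one status word per descriptor
    uint8_t bufs[RX_COUNT * RX_BUF_SIZE];  // receive buffers, back to back
    uint32_t rd_ptr;                       // next descriptor to examine
} rx_ring_t;

// Returns 0 and sets *data and *len if the current descriptor is complete,
// or -1 if the (simulated) hardware has not finished with it yet.
static int rx_peek(rx_ring_t* ring, void** data, size_t* len) {
    assert(ring != NULL && ring->rd_ptr < RX_COUNT);
    uint64_t info = ring->info[ring->rd_ptr];
    if (!(info & RXD_DONE)) {
        return -1;
    }
    *data = ring->bufs + (size_t)RX_BUF_SIZE * ring->rd_ptr;
    *len = RXD_LEN(info);
    return 0;
}

// Hands the descriptor back and advances the read pointer, wrapping at the end.
static void rx_ack(rx_ring_t* ring) {
    assert(ring != NULL && ring->rd_ptr < RX_COUNT);
    ring->info[ring->rd_ptr] = 0;
    ring->rd_ptr = (ring->rd_ptr + 1) & (RX_COUNT - 1);
}
```

A caller loops on `rx_peek()` followed by `rx_ack()` until `rx_peek()` returns `-1`, which plays the role of the `while (eth_rx(...) == ZX_OK)` loop in **irq_thread()**.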