# NIC Tuning for HFT

IRQ affinity, RSS, RPS, and RFS are four interrelated Linux techniques for distributing network packet processing across multiple CPU cores. The goal is to avoid overloading a single core, increase throughput, and reduce latency.

***

## 1. NIC Interrupt Binding (IRQ Affinity)

**What it does:**\
Assigns the hardware interrupt (IRQ) of a network interface card (NIC) to specific CPU cores.

**How it works:**

* When a packet arrives, the NIC raises an interrupt.
* The CPU that handles that interrupt then processes the packet (or hands it off).
* By default, interrupts may be handled by any core, causing cache bouncing and imbalance.
* Binding the interrupt to a dedicated core (or set of cores) improves cache locality and makes performance more predictable.

**Configuration:**

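* IRQ affinity is set by writing a hex CPU bitmask to `/proc/irq/<irq_num>/smp_affinity` (or a CPU list to `smp_affinity_list`), or it is managed automatically by the `irqbalance` service. When pinning IRQs by hand, stop `irqbalance` or exclude those IRQs from it, since it may otherwise overwrite the manual settings.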
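
The bitmask format is easy to get wrong by hand, so here is a minimal sketch of how a list of CPU ids could be turned into the string written to `smp_affinity`. The helper name `cpus_to_affinity_mask` is hypothetical; the comma-separated 32-bit grouping follows the format the kernel uses for CPU masks.

```python
def cpus_to_affinity_mask(cpus):
    """Build the hex bitmask string for /proc/irq/<irq_num>/smp_affinity.

    Hypothetical helper for illustration: bit N of the mask selects CPU N.
    """
    cpus = set(cpus)
    if not cpus or min(cpus) < 0:
        raise ValueError("need at least one non-negative CPU id")
    bits = sum(1 << cpu for cpu in cpus)
    # CPU masks are written as comma-separated 32-bit hex groups,
    # most significant group first.
    groups = []
    while bits:
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
    # Leading zeros of the top group are optional, so drop them.
    return ",".join(reversed(groups)).lstrip("0")


if __name__ == "__main__":
    # Example: pin an IRQ to CPUs 0 and 1 (prints the value only).
    print(cpus_to_affinity_mask([0, 1]))
```

The resulting string would be written to the IRQ's `smp_affinity` file as root, for example with `echo` or a small script; this sketch deliberately only computes the value.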

***

## 2. RSS – Receive Side Scaling

**What it does:**\
A **hardware** feature of modern NICs that distributes incoming packets among multiple receive queues, each with its own interrupt.

**How it works:**

* The NIC uses a hash (typically over IP addresses and ports) to map each flow to a queue.
* Each queue can be bound to a different CPU core (via interrupt binding).
* Packets of the same flow always go to the same queue → avoids reordering and preserves per‑flow processing locality.
* Only flows are balanced, not individual packets.
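
The flow-to-queue mapping above can be sketched in a few lines. This is a toy model, not what any particular NIC does: real hardware typically uses a keyed Toeplitz hash, whereas this sketch substitutes CRC32 as a stand-in, and the 128-entry indirection table size is an assumption based on a common hardware layout.

```python
import socket
import struct
import zlib


def flow_hash(src_ip, dst_ip, src_port, dst_port):
    """Stand-in flow hash (CRC32 over the packed 4-tuple).

    Assumption: real NICs usually compute a keyed Toeplitz hash instead.
    """
    packed = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    return zlib.crc32(packed) & 0xFFFFFFFF


def build_indirection_table(num_queues, size=128):
    """Round-robin table: each entry names the queue for one hash bucket."""
    return [i % num_queues for i in range(size)]


def select_queue(hash_value, table):
    """Index the table with the low-order bits of the hash."""
    return table[hash_value % len(table)]


if __name__ == "__main__":
    table = build_indirection_table(num_queues=4)
    h = flow_hash("10.0.0.1", "10.0.0.2", 40000, 443)
    # Every packet of this flow hashes identically, so it always
    # lands on the same queue.
    print(select_queue(h, table))
```

Because the hash depends only on the flow's addresses and ports, per-flow ordering is preserved while different flows can land on different queues.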

**Benefits:**

* Parallel packet reception from the NIC directly.
* Low CPU overhead (hash done in hardware).

**Configuration:**

* The number of queues is set via ethtool: `ethtool -L eth0 combined <num_queues>` (`ethtool -l eth0` shows the current and maximum values).
* Queue‑to‑CPU mapping via interrupt affinity or `irqbalance`.
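
To bind each queue's interrupt, the IRQ number for each queue has to be found first, usually from `/proc/interrupts`. The sketch below parses text in that format. The sample content and the `eth0-TxRx-<n>` naming are illustrative assumptions; queue interrupt names vary by driver, and `find_queue_irqs` is a hypothetical helper.

```python
# Illustrative sample only; real output has one column per CPU and
# driver-specific interrupt names.
SAMPLE = """\
           CPU0       CPU1
 104:      12345          0   PCI-MSI 524288-edge      eth0-TxRx-0
 105:          0      67890   PCI-MSI 524289-edge      eth0-TxRx-1
 NMI:          0          0   Non-maskable interrupts
"""


def find_queue_irqs(interrupts_text, dev):
    """Map per-queue interrupt names of `dev` to their IRQ numbers."""
    irqs = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue  # header line or blank line
        irq = fields[0][:-1]
        name = fields[-1]
        # Keep numeric IRQs whose name looks like "<dev>-<queue>".
        if irq.isdigit() and name.startswith(dev + "-"):
            irqs[name] = int(irq)
    return irqs


if __name__ == "__main__":
    # On a real system, pass open("/proc/interrupts").read() instead.
    print(find_queue_irqs(SAMPLE, "eth0"))
```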

***

## 3. RPS – Receive Packet Steering

**What it does:**\
A **software** implementation of RSS, useful when the NIC has only one receive queue or when load needs to be spread further than the hardware queues allow.

**How it works:**

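* Runs in the kernel network stack, just above the driver, after the NIC has received the packet and the driver has passed it up.
* For each incoming packet, a flow hash is obtained (the hash supplied by the NIC if available, otherwise one computed in software, similar to RSS) to decide which CPU should process it.
* The packet is placed on the backlog queue of the target CPU, and an inter-processor interrupt (IPI) wakes that CPU so that its receive softirq processes the backlog.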
* Requires RPS to be enabled and CPU masks to be defined.
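
The CPU selection step can be sketched as follows. This is a simplified model under stated assumptions: the 32-bit hash is scaled onto the list of allowed CPUs with a multiply-and-shift, which mirrors, to my understanding, how recent kernels map a hash onto the RPS CPU list. The function names and the `backlogs` dictionary are hypothetical stand-ins for the per-CPU backlog queues.

```python
from collections import defaultdict, deque


def reciprocal_scale(hash_value, n):
    """Map a 32-bit hash uniformly onto the range [0, n)."""
    return ((hash_value & 0xFFFFFFFF) * n) >> 32


def rps_select_cpu(hash_value, rps_cpus):
    """Pick the target CPU for a flow hash, or None if RPS is disabled."""
    if not rps_cpus:
        return None  # empty mask: packet stays on the IRQ CPU
    return rps_cpus[reciprocal_scale(hash_value, len(rps_cpus))]


def steer(packet, hash_value, rps_cpus, backlogs, irq_cpu):
    """Enqueue the packet on the chosen CPU's backlog and return that CPU."""
    cpu = rps_select_cpu(hash_value, rps_cpus)
    if cpu is None:
        cpu = irq_cpu
    backlogs[cpu].append(packet)  # the real kernel would also send an IPI
    return cpu


if __name__ == "__main__":
    backlogs = defaultdict(deque)
    target = steer("pkt-1", 0x80000000, [2, 3, 4, 5], backlogs, irq_cpu=0)
    print(target, list(backlogs[target]))
```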

**Benefits:**

* Distributes receive processing among CPUs even with a single‑queue NIC.
* Can be used together with RSS to spread traffic from each hardware queue to multiple CPUs.

**Configuration:**

* `/sys/class/net/<eth0>/queues/rx-<n>/rps_cpus` – hex CPU bitmask of the CPUs that may process packets from this queue. A mask of `0` (the default) leaves RPS disabled for the queue.
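
As a sketch of how this setting might be applied to every receive queue of a device, the helper below builds the list of `(path, value)` writes without touching the system. The function names are hypothetical, and the same CPU set is assumed for all queues, which is only one possible policy.

```python
def cpu_mask(cpus):
    """Hex CPU mask as comma-separated 32-bit groups, high group first."""
    bits = sum(1 << c for c in set(cpus))
    groups = []
    while True:
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
        if not bits:
            break
    return ",".join(reversed(groups))


def rps_write_plan(dev, num_rx_queues, cpus):
    """Return (sysfs path, mask) pairs; an empty CPU list disables RPS."""
    mask = cpu_mask(cpus)
    return [
        (f"/sys/class/net/{dev}/queues/rx-{q}/rps_cpus", mask)
        for q in range(num_rx_queues)
    ]


def apply_plan(plan):
    """Write each value to its path. Requires root; not called here."""
    for path, value in plan:
        with open(path, "w") as f:
            f.write(value)


if __name__ == "__main__":
    for path, value in rps_write_plan("eth0", 2, [2, 3]):
        print(path, value)
```

Separating the plan from the write step makes it possible to review the masks before changing a live system.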

***

## 4. RFS – Receive Flow Steering

**What it does:**\
Extends RPS by steering packets **to the same CPU that is running the application consuming the flow**.

**How it works:**

* RPS alone sends flows to arbitrary CPUs based on hash.
* RFS records, in a kernel flow table, the CPU on which the application last read from (or wrote to) the socket for each flow.
* Incoming packets for that flow are directed to that CPU, increasing cache hit rates.
* Falls back to RPS if no flow information is available.
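
The lookup-with-fallback logic can be modelled with a small class. This is a deliberately simplified sketch: the class and method names are hypothetical, the table is far smaller than a real one, and the kernel's additional per-queue bookkeeping that prevents packet reordering when a flow moves between CPUs is omitted.

```python
class FlowSteering:
    """Toy model of RFS: prefer the CPU where the application reads the flow."""

    def __init__(self, rps_cpus, table_size=8):
        self.rps_cpus = rps_cpus
        self.table_size = table_size  # a power of two in the real kernel
        self.sock_flow_table = {}     # table slot -> CPU of the last reader

    def record_recv(self, flow_hash, cpu):
        """Called when the application reads the socket on `cpu`."""
        self.sock_flow_table[flow_hash % self.table_size] = cpu

    def select_cpu(self, flow_hash):
        """Return the application's CPU if known, else the plain RPS choice."""
        cpu = self.sock_flow_table.get(flow_hash % self.table_size)
        if cpu is not None:
            return cpu
        index = ((flow_hash & 0xFFFFFFFF) * len(self.rps_cpus)) >> 32
        return self.rps_cpus[index]


if __name__ == "__main__":
    fs = FlowSteering(rps_cpus=[0, 1, 2, 3])
    h = 0x80000000
    print(fs.select_cpu(h))   # no flow entry yet: falls back to RPS
    fs.record_recv(h, cpu=1)  # application reads the socket on CPU 1
    print(fs.select_cpu(h))   # now steered to the application's CPU
```

Because the table is indexed by a truncated hash, unrelated flows can share a slot; a larger table reduces such collisions.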

**Benefits:**

* Better CPU cache utilization → lower latency, higher throughput for CPU‑bound workloads.

**Configuration:**

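* `/proc/sys/net/core/rps_sock_flow_entries` (global): number of entries in the socket flow table.
* `/sys/class/net/<eth0>/queues/rx-<n>/rps_flow_cnt` (per queue): number of flow entries for this receive queue.
* Both values must be non-zero for RFS to take effect on a queue.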
* Also requires RPS to be enabled.
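
A common sizing rule, as I understand the kernel's scaling documentation, is to set the per-queue count to the global table size divided by the number of receive queues, with both values rounded up to a power of two. The sketch below applies that rule; the function names and the example flow counts are assumptions for illustration, and appropriate values depend on the workload.

```python
def round_up_pow2(n):
    """Smallest power of two that is >= n (minimum 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def rfs_sizing(expected_flows, num_rx_queues):
    """Suggest (rps_sock_flow_entries, rps_flow_cnt per queue)."""
    if num_rx_queues < 1:
        raise ValueError("need at least one receive queue")
    sock_flow_entries = round_up_pow2(expected_flows)
    # Ceiling division, then round up so each queue gets a power of two.
    per_queue = round_up_pow2(-(-sock_flow_entries // num_rx_queues))
    return sock_flow_entries, per_queue


if __name__ == "__main__":
    entries, per_queue = rfs_sizing(expected_flows=32768, num_rx_queues=16)
    print(entries, per_queue)
```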

***

## Summary Comparison

| Technique        | Scope              | Mechanism                                  | Dependency                       |
| ---------------- | ------------------ | ------------------------------------------ | -------------------------------- |
| **IRQ Affinity** | Hardware interrupt | Bind IRQ to CPU                            | NIC (any)                        |
| **RSS**          | Hardware           | NIC distributes flows to queues            | NIC must support multiple queues |
| **RPS**          | Software           | Kernel distributes packets after reception | Any NIC, works with single queue |
| **RFS**          | Software           | RPS + steer to application CPU             | RPS enabled, flow table          |

All four can be used together: RSS spreads packets across queues, each queue interrupt is bound to a CPU, RPS further spreads the workload from those queues, and RFS fine‑tunes steering to the application’s CPU.

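Below is an ASCII diagram that traces a single network packet from the wire to the application. It highlights **where** each of the four techniques (RSS, Interrupt Binding, RPS, RFS) intervenes and **what** each contributes. Steps 6 and 7 are drawn as separate boxes for clarity; in practice the kernel makes a single CPU-selection decision (flow table lookup first, RPS hash as fallback) before the packet is enqueued and the IPI is sent.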

```
        +---------------------------------------+
        |          1.  INCOMING PACKET          |
        |         (Ethernet frame)              |
        +------------------+--------------------+
                           |
                           v
        +---------------------------------------+
        |  2.  NIC HARDWARE (with RSS)          |
        |      +------------+------------+      |
        |      | Queue 0    | Queue 1    | ...  | <--- RSS: hash(5‑tuple)
        |      | (IRQ 104)  | (IRQ 105)  |      |      → choose queue
        |      +------------+------------+      |
        +------------------+--------------------+
                           | (DMA into memory)
                           v
        +---------------------------------------+
        |  3.  INTERRUPT CONTROLLER             |
        |      (delivers IRQ to CPU)            |
        +------------------+--------------------+
                           | (IRQ)
                           v
        +---------------------------------------+
        |  4.  CPU CORES                        |
        |      +------------+------------+      |
        |      | CPU 0      | CPU 1      | ...  | <--- INTERRUPT BINDING
        |      | handles    | handles    |      |      (smp_affinity)
        |      | IRQ 104    | IRQ 105    |      |
        |      +------------+------------+      |
        +------------------+--------------------+
                           |
                           v
        +---------------------------------------+
        |  5.  DRIVER / NAPI POLL               |
        |      (allocate skb, fetch packet)     |
        +------------------+--------------------+
                           |
                           v
        +---------------------------------------+
        |  6.  RPS (Receive Packet Steering)    |
        |      - compute hash again             |
        |      - enqueue to backlog of CPU X    | <--- RPS: spread load
        |      - raise IPI to CPU X             |      from single queue
        +------------------+--------------------+
                           |
                           v
        +---------------------------------------+
        |  7.  RFS (Receive Flow Steering)      |
        |      - consult flow table             |
        |      - override CPU X → CPU Y         | <--- RFS: follow application
        |        (where socket is read)         |      (better cache)
        +------------------+--------------------+
                           |
                           v
        +---------------------------------------+
        |  8.  TARGET CPU (softirq)             |
        |      - run backlog                    |
        |      - IP stack (IP, TCP/UDP)         |
        +------------------+--------------------+
                           |
                           v
        +---------------------------------------+
        |  9.  SOCKET / APPLICATION             |
        |      (running on same CPU as step 8)  |
        +---------------------------------------+
```

***

### How Each Technique Impacts the Flow

| Technique       | Location                   | Role                                                                                      |
| --------------- | -------------------------- | ----------------------------------------------------------------------------------------- |
| **RSS**         | NIC hardware               | Splits incoming flows into separate hardware queues. Enables parallel DMA + IRQs.         |
| **IRQ Binding** | Interrupt controller / CPU | Pins each queue’s IRQ to a specific core. Prevents interrupts from bouncing between CPUs. |
| **RPS**         | Kernel (stack RX path)     | Software‑based spreading: moves packets from the IRQ CPU to another CPU’s backlog.        |
| **RFS**         | Kernel (flow table)        | Refines RPS by steering packets to the **same CPU that is running the consuming app**.    |

**Together** they allow a modern system to:

* Sustain high packet rates from fast NICs (for example 100 GbE) with fewer drops, by receiving in parallel on several queues and cores (RSS + IRQ binding).
* Distribute software processing evenly (RPS).
* Keep data hot in the CPU cache (RFS).

> **Note:** RPS and RFS are only active when explicitly configured; otherwise packets are processed entirely on the CPU that handled the IRQ.
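
To check whether RPS is actually active on a queue, the mask read back from `rps_cpus` can be decoded into CPU ids. A minimal sketch, assuming the comma-separated 32-bit hex groups the kernel prints for CPU masks; `parse_cpu_mask` and `rps_active` are hypothetical helper names.

```python
def parse_cpu_mask(mask_text):
    """Decode a hex CPU mask such as '00000000,0000000c' into CPU ids."""
    # Groups after the first are full 8-digit fields, so removing the
    # commas yields one contiguous hex number.
    bits = int(mask_text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(bits.bit_length()) if (bits >> cpu) & 1]


def rps_active(mask_text):
    """RPS is active for a queue only if its mask selects at least one CPU."""
    return bool(parse_cpu_mask(mask_text))


if __name__ == "__main__":
    # On a real system the text would come from
    # /sys/class/net/<dev>/queues/rx-<n>/rps_cpus.
    print(parse_cpu_mask("0000000c"), rps_active("00000000,00000000"))
```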

