Linux Advanced Routing & Traffic Control HOWTO | ||
---|---|---|
Prev | Chapter 9. Queueing Disciplines for Bandwidth Management | Next |
To properly understand more complicated configurations it is necessary to explain a few concepts first. Because of the complexity and the relative youth of the subject, a lot of different words are used when people in fact mean the same thing.
The following is loosely based on draft-ietf-diffserv-model-06.txt, An Informal Management Model for Diffserv Routers. It can currently be found at http://www.ietf.org/internet-drafts/draft-ietf-diffserv-model-06.txt.
Read it for the strict definitions of the terms used.
An algorithm that manages the queue of a device, either incoming (ingress) or outgoing (egress).
The root qdisc is the qdisc attached to the device.
A qdisc with no configurable internal subdivisions.
A classful qdisc contains multiple classes. Some of these classes contains a further qdisc, which may again be classful, but need not be. According to the strict definition, pfifo_fast *is* classful, because it contains three bands which are, in fact, classes. However, from the user's configuration perspective, it is classless as the classes can't be touched with the tc tool.
A classful qdisc may have many classes, each of which is internal to the qdisc. A class, in turn, may have several classes added to it. So a class can have a qdisc as parent or an other class. A leaf class is a class with no child classes. This class has 1 qdisc attached to it. This qdisc is responsible to send the data from that class. When you create a class, a fifo qdisc is attached to it. When you add a child class, this qdisc is removed. For a leaf class, this fifo qdisc can be replaced with an other more suitable qdisc. You can even replace this fifo qdisc with a classful qdisc so you can add extra classes.
Each classful qdisc needs to determine to which class it needs to send a packet. This is done using the classifier.
Classification can be performed using filters. A filter contains a number of conditions which if matched, make the filter match.
A qdisc may, with the help of a classifier, decide that some packets need to go out earlier than others. This process is called Scheduling, and is performed for example by the pfifo_fast qdisc mentioned earlier. Scheduling is also called 'reordering', but this is confusing.
The process of delaying packets before they go out to make traffic confirm to a configured maximum rate. Shaping is performed on egress. Colloquially, dropping packets to slow traffic down is also often called Shaping.
Delaying or dropping packets in order to make traffic stay below a configured bandwidth. In Linux, policing can only drop a packet and not delay it - there is no 'ingress queue'.
A work-conserving qdisc always delivers a packet if one is available. In other words, it never delays a packet if the network adaptor is ready to send one (in the case of an egress qdisc).
Some queues, like for example the Token Bucket Filter, may need to hold on to a packet for a certain time in order to limit the bandwidth. This means that they sometimes refuse to pass a packet, even though they have one available.
Now that we have our terminology straight, let's see where all these things are.
Userspace programs ^ | +---------------+-----------------------------------------+ | Y | | -------> IP Stack | | | | | | | Y | | | Y | | ^ | | | | / ----------> Forwarding -> | | ^ / | | | |/ Y | | | | | | ^ Y /-qdisc1-\ | | | Egress /--qdisc2--\ | --->->Ingress Classifier ---qdisc3---- | -> | Qdisc \__qdisc4__/ | | \-qdiscN_/ | | | +----------------------------------------------------------+Thanks to Jamal Hadi Salim for this ASCII representation.
The big block represents the kernel. The leftmost arrow represents traffic entering your machine from the network. It is then fed to the Ingress Qdisc which may apply Filters to a packet, and decide to drop it. This is called 'Policing'.
This happens at a very early stage, before it has seen a lot of the kernel. It is therefore a very good place to drop traffic very early, without consuming a lot of CPU power.
If the packet is allowed to continue, it may be destined for a local application, in which case it enters the IP stack in order to be processed, and handed over to a userspace program. The packet may also be forwarded without entering an application, in which case it is destined for egress. Userspace programs may also deliver data, which is then examined and forwarded to the Egress Classifier.
There it is investigated and enqueued to any of a number of qdiscs. In the unconfigured default case, there is only one egress qdisc installed, the pfifo_fast, which always receives the packet. This is called 'enqueueing'.
The packet now sits in the qdisc, waiting for the kernel to ask for it for transmission over the network interface. This is called 'dequeueing'.
This picture also holds in case there is only one network adaptor - the arrows entering and leaving the kernel should not be taken too literally. Each network adaptor has both ingress and egress hooks.