[P4-dev] Mapping between egress port and egress pipeline

Andy Fingerhut andy.fingerhut at gmail.com
Tue Jul 18 10:36:53 EDT 2017


Here is my understanding of why it would be bad if a P4 program could
change the egress port in the egress control block, for a switch ASIC that
follows the switch architecture described in the P4_14 spec [1].  (For
P4_16, the Portable Switch Architecture (PSA) is planned to be similar in
many ways to the architecture in the P4_14 spec.)  Hopefully others can
jump in and add other reasons if I am missing anything significant.

It is usually desirable for a switch to be work conserving [2] on each of
its output ports.  That is, if the switch contains at least one packet that
is finished processing, and destined for output port X, then it should
already be transmitting a packet on output port X, or should start to do so
very soon.

If there were no egress processing at all, then packets would be
transmitted soon after they were scheduled from the queue for that output
port.  The hardware scheduler for output port X could monitor that link.
If the last packet it scheduled from the output port X queue was N bytes
long, and that output port transmits at 100 gigabits/second, for example,
the scheduler can easily calculate when it needs to choose another packet
to transmit out port X, leaving no idle time between packets.  If it
schedules the next packet too soon, then another buffer is needed before
the output port to hold the packet while it waits for port X to finish the
previous packet.  If it schedules the next packet too late, then port X
will go idle for a time, and the switch is not work conserving.
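
To make the timing calculation concrete, here is a minimal sketch in
Python.  The function names are mine, purely for illustration, and the
sketch assumes the only time that matters is the serialization time of the
packet bytes themselves.  It ignores per-packet wire overhead such as the
Ethernet preamble and inter-packet gap unless you pass it in as
overhead_bytes.

```python
def serialization_time_ns(packet_bytes, link_gbps, overhead_bytes=0):
    """Nanoseconds needed to transmit one packet on the link.

    1 gigabit/second is 1 bit per nanosecond, so dividing a bit count by
    the link rate in Gb/s gives a time in nanoseconds.
    overhead_bytes is a hypothetical knob for any per-packet wire overhead.
    """
    bits = (packet_bytes + overhead_bytes) * 8
    return bits / link_gbps


def next_dequeue_time_ns(last_start_ns, packet_bytes, link_gbps):
    """Ideal time to hand the next packet to port X.

    Earlier than this and the packet must wait in an extra buffer;
    later than this and the port goes idle.
    """
    return last_start_ns + serialization_time_ns(packet_bytes, link_gbps)


# A 1500-byte packet on a 100 Gb/s port occupies the link for 120 ns.
t_1500 = serialization_time_ns(1500, 100.0)

# If that packet started at t = 1000 ns, the next one should start at 1120 ns.
t_next = next_dequeue_time_ns(1000.0, 1500, 100.0)
```

A real scheduler does this in hardware, and also has to account for the
latency between dequeuing a packet and its first bit reaching the wire, but
the arithmetic is the same idea.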

All of the description above assumes that there are one or more queues
containing packets, all of which are known to be destined for output port
X, and that this choice of output port will not change after a packet has
been chosen from one of those queues.

If egress processing can change that output port selection, then there is
no way to make the system work conserving.  For example, the scheduler
might schedule packets that ingress processing specified will go to output
ports 1 through 10, but if egress processing changes them all to output
port 5, then all of those packets except the first to be transmitted need
to be buffered somewhere, and all ports except 5 will be idle until the
scheduler chooses a packet that goes to them.
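
The example above can be sketched as a toy model in Python.  This is only
an illustration under simplifying assumptions that are mine, not taken from
any real ASIC: time is divided into equal slots, every packet takes exactly
one slot to transmit, and the scheduler dequeues one packet per output port
per slot.  The names simulate_one_slot and egress_rewrite are hypothetical.

```python
from collections import Counter


def simulate_one_slot(scheduled_ports, egress_rewrite=None):
    """Return (busy_ports, waiting_packets) for one transmission slot.

    scheduled_ports: the output port each dequeued packet was scheduled for.
    egress_rewrite:  optional function mapping the scheduled port to the
                     port that egress processing finally selects.
    """
    if egress_rewrite is None:
        final_ports = list(scheduled_ports)
    else:
        final_ports = [egress_rewrite(p) for p in scheduled_ports]

    per_port = Counter(final_ports)
    # A port with at least one packet transmits exactly one of them.
    busy_ports = set(per_port)
    # Every additional packet for the same port has to be buffered.
    waiting_packets = sum(count - 1 for count in per_port.values())
    return busy_ports, waiting_packets


ports = list(range(1, 11))

# Egress processing leaves the port alone: all 10 ports busy, nothing waits.
busy_ok, waiting_ok = simulate_one_slot(ports)

# Egress processing redirects everything to port 5.
busy_bad, waiting_bad = simulate_one_slot(ports, egress_rewrite=lambda p: 5)
idle_bad = set(ports) - busy_bad
```

In the second case only port 5 is busy, 9 packets need a buffer that the
architecture did not plan for, and the other 9 ports sit idle even though
the scheduler did its job correctly.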

Could you have another set of queues and a big packet buffer after egress
processing?  Sure, I can imagine a switch ASIC designed like that.
However, even then, if some other processing can change the output port
after _that_ packet buffer's scheduler, then the switch cannot achieve work
conserving behavior.

[1] https://p4lang.github.io/p4-spec/
[2] https://en.wikipedia.org/wiki/Work-conserving_scheduler

On Tue, Jul 18, 2017 at 3:35 AM, Eric Ruan <ruanweizhang at gmail.com> wrote:

> Dear Antonin, Andy and all,
> I wonder whether the mapping between egress port and egress pipeline is
> 1:1 or many:1.
> To the best of my knowledge, the egress port is the output port of the
> switch. The egress port has to be set in the ingress pipeline before the
> packet is routed to the egress pipeline.
> If 1:1 is the case, then it makes sense that the egress port cannot be
> changed in the egress pipeline. But what is the design principle behind
> this? It seems like a waste of resources.
> If many:1 is the case, then why is it not allowed to change the egress
> port in the egress pipeline?
> Thanks in advance.
> Best,
> Eric