[P4-dev] queue manipulation

Vladimir Gurevich vladimir.gurevich at barefootnetworks.com
Mon Aug 28 01:03:38 EDT 2017


Hello Peng,

You are absolutely correct that hierarchical metering can easily be
achieved by combining the outputs of multiple non-hierarchical meters that
are defined in PSA. The additional benefit is that the depth of the
hierarchy and the actual algorithm that combines the output of two or more
meters to derive the final color can be fairly arbitrary.
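
To make the idea concrete, here is a minimal sketch written as a plain
Python model rather than P4. It assumes two independent color-blind
two-rate meters in the style of RFC 2698 and combines their outputs with
a "worst color wins" rule. The names TwoRateMeter, worst_color and
hierarchical_color, and the VLAN/COS split borrowed from Peng's example,
are illustrative assumptions and not part of PSA. The combining rule is
only one of many possible choices.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Color(IntEnum):
    GREEN = 0
    YELLOW = 1
    RED = 2


@dataclass
class TwoRateMeter:
    """Simplified color-blind two-rate three-color meter (after RFC 2698)."""
    cir: float  # committed information rate, bytes/second
    cbs: float  # committed burst size, bytes
    pir: float  # peak information rate, bytes/second
    pbs: float  # peak burst size, bytes
    tc: float = field(init=False)  # tokens in the committed bucket
    tp: float = field(init=False)  # tokens in the peak bucket
    last: float = field(init=False, default=0.0)  # time of previous packet

    def __post_init__(self):
        self.tc, self.tp = self.cbs, self.pbs  # both buckets start full

    def execute(self, now: float, size: int) -> Color:
        # Refill both buckets for the elapsed time, capped at burst size.
        elapsed = now - self.last
        self.last = now
        self.tc = min(self.cbs, self.tc + self.cir * elapsed)
        self.tp = min(self.pbs, self.tp + self.pir * elapsed)
        if self.tp < size:       # exceeds the peak rate
            return Color.RED
        self.tp -= size
        if self.tc < size:       # exceeds the committed rate only
            return Color.YELLOW
        self.tc -= size
        return Color.GREEN


def worst_color(*colors: Color) -> Color:
    """Assumed combining rule: the most severe color wins."""
    return max(colors)


def hierarchical_color(vlan_meter, cos_meter, now, size) -> Color:
    # Both meters are always updated, then their outputs are combined.
    return worst_color(vlan_meter.execute(now, size),
                       cos_meter.execute(now, size))
```

Note that in this sketch the first meter consumes tokens even when the
second one marks the packet Red. Whether that is acceptable depends on
the policy you want; a different combining function or update order is
just as easy to write.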

This, I think, demonstrates the benefit of having simple primitives and a
flexible way to combine them, as opposed to having complex constructs that
are difficult to repurpose or combine. And this brings me to your next
question/comment about having a counter for the number of packets/bytes
dropped by the meter.

First, it is important to understand that meters do not drop packets --
they only assign them one of 3 colors. What to do afterwards is decided by
the rest of your program. It can certainly decide to drop Red packets, but
it can also decide to redirect Yellow packets somewhere or multicast only
Green packets, etc. So, providing an integrated counter makes very little
sense. What makes sense is creating a table with regular counters, which
are already defined in PSA, and deciding which counter you want to
increment and when. For example, you can have a simple set of 3 counters
(altogether or per meter), one counting Red packets/bytes, another one
counting Yellow packets/bytes and the last one counting Green. So, again,
you combine simple primitives to achieve something interesting.
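
As a rough illustration, again a plain Python model rather than P4, the
per-color counting could look like the sketch below. ColorCounters and
process are hypothetical names, and "drop Red, forward the rest" is only
an example policy; the point is that counting and the drop decision live
in the program, not in the meter.

```python
from collections import defaultdict
from enum import Enum


class Color(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


class ColorCounters:
    """Packet and byte counters indexed by (meter index, color)."""

    def __init__(self):
        self.packets = defaultdict(int)
        self.bytes = defaultdict(int)

    def count(self, meter_index: int, color: Color, size: int) -> None:
        self.packets[(meter_index, color)] += 1
        self.bytes[(meter_index, color)] += size


def process(counters: ColorCounters, meter_index: int,
            color: Color, size: int) -> str:
    # Count first, then let the program (not the meter) decide the fate.
    counters.count(meter_index, color, size)
    # Example policy only: drop Red, forward everything else.
    return "drop" if color is Color.RED else "forward"
```

Using a single meter index for every packet gives the "altogether"
variant; using one index per meter gives the "per meter" variant.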

If you want to be able to export these counters via INT, then you will need
to replace counters with registers. The reason is that PSA does not allow
you to read counter values in the data plane -- you can only increment
them. Registers, on the other hand, can be read, modified and written in
the data plane.
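
The difference can be sketched with a small Python model (not P4). The
count, read and write method names loosely follow the PSA Counter and
Register externs; control_plane_read, count_and_export and the
"red_packets" metadata key are made up for illustration, and real INT
metadata fields are defined elsewhere.

```python
class Counter:
    """Model of a counter: the data plane can only increment it."""

    def __init__(self, size: int):
        self._values = [0] * size

    def count(self, index: int) -> None:
        self._values[index] += 1

    def control_plane_read(self, index: int) -> int:
        # Hypothetical stand-in for a control-plane API; there is
        # deliberately no data-plane read method on this class.
        return self._values[index]


class Register:
    """Model of a register: the data plane can read and write it."""

    def __init__(self, size: int):
        self._values = [0] * size

    def read(self, index: int) -> int:
        return self._values[index]

    def write(self, index: int, value: int) -> None:
        self._values[index] = value


def count_and_export(reg: Register, index: int,
                     int_metadata: dict, key: str) -> None:
    # Read-modify-write in the data plane, then copy the new value
    # into the (hypothetical) per-packet telemetry metadata.
    value = reg.read(index) + 1
    reg.write(index, value)
    int_metadata[key] = value
```

For example, calling count_and_export(reg, 2, md, "red_packets") for each
Red packet keeps a running count in the register and makes the current
value available to whatever builds the telemetry header.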

Happy hacking,
Vladimir

On Sun, Aug 27, 2017 at 7:42 PM, Peng He <peng.he.2000 at gmail.com> wrote:

> I like PIFO, and hope there will be a (physical) implementation of it
> sometime soon.
>
> An example of hierarchical meter (not queuing) might be like this:
> first-layer of metering, say one based on VLAN ID for an example, then
> after the first layer metering, 2nd layer metering applies (could be one or
> more meter instances), for instance metering based on COS value in the VLAN
> tag.
>
> In the PSA arch doc (thanks for the reference links), meter is an element
> fully based on RFC 2698. It should be feasible for P4 program to achieve
> hierarchical meter with the RFC 2698 based meter element.
>
> I see the value of having information related to metering, e.g., # of
> packets/bytes accepted by the meter, # of packets/bytes dropped by the
> meter, etc. But I am not sure whether INT or something else should include
> this info. Is there a list or definition of what info INT should include
> (for troubleshooting purposes, for monitoring purposes, what else?)?
>
> BTW, how does the PSA arch take care of OAM and operational data
> collection? extern (counter)? CPU? control plane?
>
>
> Regards,
> Peng
>
>
>
> On Wed, Aug 23, 2017 at 7:15 PM, Andy Fingerhut <andy.fingerhut at gmail.com>
> wrote:
>
>> P4 doesn't have loops, pointers, malloc(), or free(), among other things
>> that more general-purpose programming languages have.  OK,
>> technically it does have loops, but the kind it has (in a header parser)
>> doesn't help you in implementing a collection of FIFO queues in P4, nor in
>> implementing a packet scheduler that selects from and removes the first
>> packet from one or more FIFO queues.
>>
>> Sure, a particular P4-programmable implementation might support 4
>> class-of-service queues per output port, with weighted-fair-queuing
>> scheduling and configurable weights between those 4 class-of-service
>> queues.  That doesn't necessarily make it possible to implement the PIFO
>> algorithm on such a device.
>>
>> Having 4 class-of-service queues per output port, and the ability to
>> configure one of them as strictly higher priority than the other 3 queues,
>> and having some algorithm that can be written in P4 to select some packets
>> to go to the strict high priority queue vs. some default priority queue,
>> could be done.
>>
>> Andy
>>
>> On Wed, Aug 23, 2017 at 3:39 PM, <hemant at mnkcg.com> wrote:
>>
>>> Programmable packet schedulers for a switch are starting to emerge –
>>> this is what one would need.  See the URL below which includes a research
>>> paper and also source code.
>>>
>>>
>>>
>>> http://web.mit.edu/pifo/
>>>
>>>
>>>
>>> I don’t see any reason why the PIFO algorithm cannot be written in P4.
>>>
>>>
>>>
>>> Hemant
>>>
>>>
>>>
>>> *From:* Wallas Fróes [mailto:wallasync at gmail.com]
>>> *Sent:* Wednesday, August 23, 2017 4:38 PM
>>> *To:* hemant at mnkcg.com
>>> *Cc:* Andy Fingerhut <andy.fingerhut at gmail.com>; Peng He <
>>> peng.he.2000 at gmail.com>; p4-dev lists.p4.org <p4-dev at lists.p4.org>
>>>
>>> *Subject:* Re: [P4-dev] queue manipulation
>>>
>>>
>>>
>>> Hello,
>>>
>>>
>>>
>>> I was busy these past weeks.
>>>
>>> What I would like to know is whether the P4 language has something
>>> that supports packet queuing.
>>>
>>>
>>>
>>> E.g.: I have packet traffic arriving at the switch and going out
>>> port 1. The output queue of port 1 is long. Then a packet with
>>> priority arrives for port 1.
>>>
>>> I would like to know how to move that packet ahead of the others.
>>>
>>> Is there a way to create two logical queues for the same port (output
>>> port 1), or something that inserts the packet at the head of the queue?
>>>
>>>
>>>
>>> Is that possible using just P4, or is it necessary to program inside
>>> the switch?
>>>
>>>
>>>
>>> 2017-08-22 19:53 GMT-03:00 <hemant at mnkcg.com>:
>>>
>>> Hierarchical  queuing framework forms trees such as HQF on Cisco routers
>>> for queueing. See https://www.youtube.com/watch?v=gnqDBiVNyrY
>>>
>>>
>>>
>>> Also, see, http://dpdk.org/doc/guides/prog_guide/qos_framework.html
>>>
>>>
>>>
>>> Let’s see what Peng has to say about what he had in mind.
>>>
>>>
>>>
>>> Hemant
>>>
>>>
>>>
>>> *From:* P4-dev [mailto:p4-dev-bounces at lists.p4.org] *On Behalf Of *Andy
>>> Fingerhut
>>> *Sent:* Tuesday, August 22, 2017 6:34 PM
>>> *To:* Peng He <peng.he.2000 at gmail.com>
>>> *Cc:* p4-dev lists.p4.org <p4-dev at lists.p4.org>
>>> *Subject:* Re: [P4-dev] queue manipulation
>>>
>>>
>>>
>>> There are 3-color 2-rate meters (also known as policers) in PSA.  They
>>> can be used in egress processing, ingress processing, both, or neither, as
>>> you choose in your P4 program.
>>>
>>>
>>>
>>> If by hierarchical policers you mean something other than a "single
>>> 3-color 2-rate policer", e.g. somehow daisy-chaining multiple of those
>>> together so they can be updated in some way dependent upon each other, then
>>> the only way that I am aware of is if it can be implemented by a sequential
>>> update of multiple independent 3-color 2-rate policers.
>>>
>>>
>>>
>>> I have not heard of any proposals for INT to include information related
>>> to metering.  Do you have any links to proposals involving that?
>>>
>>>
>>>
>>> Andy
>>>
>>>
>>>
>>> On Sun, Aug 20, 2017 at 5:32 PM, Peng He <peng.he.2000 at gmail.com> wrote:
>>>
>>> Thanks Andy for the comprehensive response! I can see that a readable
>>> egress queue length would be useful for INT. Could you also talk about
>>> ingress metering in PSA? I mean a three-color two-rate policer; will
>>> hierarchical metering be considered in PSA? I suppose INT would also need
>>> info about metering, e.g., accepted packets/bytes, dropped packets/bytes?
>>> BTW, is there a draft that describes PSA in a bit more detail at this
>>> time? Thanks again.
>>>
>>> Regards,
>>>
>>> Peng
>>>
>>>
>>>
>>> On Wed, Aug 16, 2017 at 1:33 PM, Andy Fingerhut <
>>> andy.fingerhut at gmail.com> wrote:
>>>
>>> A key fact about P4 right now is that different implementations of P4
>>> can have different capabilities from each other.  Neither the P4_14 nor
>>> P4_16 language specs contain anything I can think of that would directly
>>> enable you to implement the things you ask about.  That doesn't stop
>>> _particular P4 implementations_ from providing additional mechanisms to do
>>> so, but of course that can limit the portability of P4 programs wanting to
>>> implement such features.
>>>
>>>
>>>
>>> The Portable Switch Architecture (PSA) spec is in draft form right now,
>>> and I can speak to its current contents, and what I expect it might include
>>> when it reaches version 1.0.
>>>
>>>
>>>
>>> 1. can queue depth be manipulated through P4?
>>>
>>>
>>>
>>> There is a proposal to include as part of the PSA the ability to make
>>> visible to your P4 egress control block the depth that a queue had at the
>>> time a packet was enqueued at its end, here:
>>> https://github.com/p4lang/p4-spec/issues/335
>>> The title of that issue, and some of the
>>> comments there, suggest making it readable from the ingress control block
>>> before enqueuing it, but that seems to be a less favored approach due to
>>> implementation difficulties in some ASIC implementations.
>>>
>>>
>>>
>>> 2. can the queue congestion avoidance algorithms or associated
>>> parameters per egress queue be manipulated?
>>>
>>>
>>>
>>> "queue congestion avoidance algorithms" covers a lot of possibilities,
>>> so it is difficult to answer that question without knowing more about what
>>> visibility to the system state you need for a particular algorithm.  If you
>>> have one in mind that can be implemented without knowing about current
>>> queue depths, i.e. it can be implemented using the Register extern to read
>>> state, modify it in your P4 program, and write it back, e.g. elephant
>>> traps, Bloom filters, etc. etc., then most likely yes.
>>>
>>>
>>>
>>> 3. you mentioned control plane APIs exist for configuring scheduling
>>> policies, could you elaborate on that a bit more? I don't think P4 can
>>> describe scheduling policies at this moment? Or treat the scheduler as an
>>> 'extern' until P4 can have programmable packet scheduling?
>>>
>>>
>>>
>>> If we split hairs a bit, I didn't actually say they exist :-)  Whether a
>>> P4 implementation has multiple class of service queues, with scheduling
>>> parameters that can be configured via a control plane API, is not part of
>>> the PSA, and I would guess it will not be, except to specify how the
>>> ingress P4 code could specify one of several class-of-service queues for
>>> the selected output port (but allowing an implementation to only have 1
>>> queue per output port).  The P4-API effort might choose to define APIs for
>>> configuring parameters for a scheduler, but I am not sure whether that is
>>> part of that group's plans.
>>>
>>>
>>>
>>> Note that even if a P4 implementation contained some non-P4-programmable
>>> scheduling algorithms, e.g. weighted fair queueing with run-time
>>> configurable weight values for each queue, that can be tremendously
>>> useful.  P4_16 definitely takes the approach that these are in externs.
>>>
>>>
>>>
>>> Andy
>>>
>>>
>>>
>>> On Wed, Aug 16, 2017 at 9:43 AM, Peng He <peng.he.2000 at gmail.com> wrote:
>>>
>>> Sorry to jump in here. I have similar questions about queues:
>>>
>>> 1. can queue depth be manipulated through P4?
>>>
>>> 2. can the queue congestion avoidance algorithms or associated
>>> parameters per egress queue be manipulated?
>>>
>>> 3. you mentioned control plane APIs exist for configuring scheduling
>>> policies, could you elaborate on that a bit more? I don't think P4 can
>>> describe scheduling policies at this moment? Or treat the scheduler as an
>>> 'extern' until P4 can have programmable packet scheduling?
>>>
>>> Regards,
>>>
>>> Peng
>>>
>>>
>>>
>>> On Wed, Aug 16, 2017 at 10:24 AM, Andy Fingerhut <
>>> andy.fingerhut at gmail.com> wrote:
>>>
>>> In P4 source code, you typically do not manipulate queues.  You select
>>> which queue a packet should be enqueued upon in P4 source code, e.g. by
>>> assigning a value to a metadata field that selects an output port for the
>>> packet, or multiple output ports via a multicast group.  In the most common
>>> case those queues will be FIFO.  I guess if that is what you meant by
>>> manipulating queues, then that is how it is typically done.
>>>
>>>
>>>
>>> Can you say a little more about what kinds of queue manipulation you are
>>> interested in here?
>>>
>>>
>>>
>>> For example, are you asking about what kind of control plane APIs exist
>>> for configuring scheduling policies between multiple queues leading to
>>> the same output port?
>>>
>>>
>>>
>>> Something else?
>>>
>>>
>>>
>>> Andy
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Aug 15, 2017 at 10:04 PM, Wallas Fróes <wallasync at gmail.com>
>>> wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> I have a question.
>>>
>>> I would like to know how to manipulate queues in P4.
>>>
>>>
>>>
>>> --
>>>
>>> Att,
>>>
>>>
>>>
>>> Wallas Fróes
>>>
>>> Information Security in Computer Networks
>>>
>>> Computer Engineer
>>>
>>> Cel: (77) 8841-0762
>>>
>>>
>>>
>>> _______________________________________________
>>> P4-dev mailing list
>>> P4-dev at lists.p4.org
>>> http://lists.p4.org/mailman/listinfo/p4-dev_lists.p4.org
>>>
>>
>>
>
>



-- 
*Vladimir Gurevich*

*Barefoot Networks*
*Technical Lead, Customer Engineering*
Email: vag at barefootnetworks.com
Phone: (408) 833-4505