[P4-dev] on the cloning primitive in the simple switch

Andy Fingerhut andy.fingerhut at gmail.com
Mon Jun 24 20:00:01 EDT 2019


i) Given that it really shouldn't even be allowed to perform an I2E clone
operation in the egress code, whatever happens when you try it, I cannot
think of any reason why the implementation should change, _except_ if it is
changed to disallow the operation completely.

ii) I tried out my own simple example, and also see that an I2E cloned
packet goes through the parser, then to egress processing, but it does not
do ingress processing (as it should not).  It appears from the
simple_switch implementation code that going through the parser is an
intentional design choice when implementing simple_switch.

One potential danger there is if your parser code depends on a value like
standard_metadata.ingress_port (which might be useful if you expect
different ports to have different sequences of packet headers), it appears
that the I2E cloned packet has standard_metadata.ingress_port cleared to 0
before it goes through the parser, which could cause it to parse the packet
differently than it was parsed when it first arrived.  Most P4 parser code
is written so its results are independent of the value of
standard_metadata.ingress_port, so in most cases it would not make a
difference.  I have filed an issue on p4lang/behavioral-model to ask about
possibly enhancing this, though, or at least documenting it:
https://github.com/p4lang/behavioral-model/issues/795
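
To make the hazard concrete, here is a toy Python model of it. This is not
bmv2 code. The rule that port 1 carries a 4-byte custom tag in front of
Ethernet, and the names parse, TAGGED_PORTS, TAG_LEN and ETH_LEN, are all
invented for illustration. The only thing taken from the behavior described
above is that the I2E clone is parsed again with ingress_port equal to 0.

```python
# Toy model of a port-dependent parser; not bmv2 code.
# Assumption (hypothetical): packets arriving on port 1 carry a
# 4-byte custom tag in front of the Ethernet header.
TAGGED_PORTS = {1}
TAG_LEN = 4
ETH_LEN = 14


def parse(packet: bytes, ingress_port: int) -> dict:
    """Return the headers extracted for a given ingress_port value."""
    headers = {}
    offset = 0
    if ingress_port in TAGGED_PORTS:
        headers["tag"] = packet[offset:offset + TAG_LEN]
        offset += TAG_LEN
    headers["ethernet"] = packet[offset:offset + ETH_LEN]
    return headers


tag = b"\xaa\xbb\xcc\xdd"
eth = bytes(range(ETH_LEN))
packet = tag + eth

# First pass: the packet arrives on port 1 and is parsed as tag + Ethernet.
original = parse(packet, ingress_port=1)

# I2E clone: parsed again, but with ingress_port cleared to 0, so the
# tag bytes are misread as the start of the Ethernet header.
clone = parse(packet, ingress_port=0)

print(original["ethernet"] == eth)  # True
print(clone["ethernet"] == eth)     # False
```

A parser that never branches on ingress_port would produce the same headers
on both passes, which is why most programs are unaffected.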

iii) I would expect that whether the timestamps of the cloned packet are
higher or lower than those of the original depends completely upon the
current queue depths for the queues on which the original and cloned
packets are enqueued (which could be the same queue, but in general can be
different).  I would strongly recommend that you assume those can happen in
either order, and not write P4 code that only works if one of them always
happens before the other.
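
As a rough illustration of why the order can go either way, here is a toy
Python model of two FIFO queues. It is not bmv2 code: the service time, the
enqueue times, and the function names are made-up assumptions, and real
queueing in simple_switch is more involved.

```python
# Toy model: egress timestamp = enqueue time + time spent waiting behind
# packets already in the queue. Not bmv2 code; all numbers are made up.
SERVICE_TIME_US = 10  # hypothetical time to drain one queued packet


def egress_timestamp(enq_timestamp_us: int, queue_depth: int) -> int:
    """Time at which a packet leaves a FIFO queue holding queue_depth packets."""
    return enq_timestamp_us + queue_depth * SERVICE_TIME_US


def clone_is_first(original_depth: int, clone_depth: int) -> bool:
    """True if the clone leaves its queue before the original leaves its own."""
    # Assumption: both copies are enqueued at nearly the same time, the
    # original one microsecond earlier in this sketch.
    original = egress_timestamp(100, original_depth)
    clone = egress_timestamp(101, clone_depth)
    return clone < original


print(clone_is_first(original_depth=5, clone_depth=0))  # True
print(clone_is_first(original_depth=0, clone_depth=5))  # False
```

The same pair of packets comes out in opposite orders depending only on
which queue happens to be deeper.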

Andy



On Mon, Jun 24, 2019 at 7:47 AM Salvatore Signorello <
ssignorello at ciencias.ulisboa.pt> wrote:

> thank you for your reply, Andy, here follow some updates on the previous
> points after re-running the same test program over the latest bmv2-ss
> software:
>
>
> i) it is still possible to call clone.I2E in the egress and get a cloned
> packet going through the egress pipeline (without the parser stage seen in
> (ii), see below). The *packet_length* value of the cloned packet is not
> correct this time, meaning some other metadata values may not be set
> correctly either.
>
>
> ii) when clone.I2E is called, the cloned packet is first parsed again and
> then enters the egress pipeline:
>
> *"[14:58:32.551] [bmv2] [D] [thread 31185] [0.1] [cxt 0] Parser 'parser':
> end*
>
> *[14:58:32.551] [bmv2] [D] [thread 31187] [0.1] [cxt 0] Pipeline 'egress':
> start"*
>
>
> iii) now the switch seems to update almost all the metadata values as
> expected. That is, the *packet_length* value is the same for both the
> original and cloned packets. Both *ingress_global_timestamp* and
> *egress_global_timestamp* are set for the original packet, but only the
> latter is set for the cloned packet. Yet, the cloned packet has lower
> *enq_timestamp* and *egress_global_timestamp* values than the original
> packet.
>
>
> Considering the nature of bmv2-ss, I am not sure it would be necessary to
> open an issue for (iii). Then, is the behavior seen in (ii) the intended
> one?
>
> Yet, I see there might be more logical flaws in the execution of programs
> on bmv2-ss with the issue in (i). You may find the simple P4 program used
> for testing attached. A very simple packet has been created and injected
> through scapy for testing (pkt = Ether()/IP()/TCP()/' '). The sswitch_CLI
> commands are attached too.
>
>
> Salvatore
>
>
> On 24/06/19 03:00, Andy Fingerhut wrote:
>
> i) While it is true that the intent of the clone operations is that I2E is
> only invoked during ingress processing, and E2E is only invoked during
> egress processing, I would not be surprised at all if the latest p4c and
> behavioral-model code give no error if you try to invoke them in the wrong
> places.  There are several kinds of specific-to-the-v1model-architecture
> consistency checks that are now made by the v1model-specific parts of p4c,
> but likely this is not one of them, at least not yet.
>
> ii) If a cloned packet is going through ingress first, that sounds like a
> bug to me.  I do not recall seeing that behavior in my testing of clone
> operations with p4c and behavioral-model.  If you have a test case
> demonstrating this (i.e. full P4 program, command used to compile it, any
> table entries you added or other control plane configuration done, and a
> test packet to send in), I could take a look.  I would recommend testing
> your code with the latest versions of p4c and behavioral-model, too, just
> in case something has been fixed there since whatever version you have been
> using, but I do not recall any changes made in that area for the last year
> or so.
>
> iii) I would not be surprised at all if timestamp values were 0 for cloned
> packets, given the current behavioral-model code.  You could try filing an
> issue on the p4lang/behavioral-model project and we can find out whether
> that is considered a bug worth fixing, or the expected behavior.  It seems
> to me certainly more useful if ingress_timestamp and egress_timestamp were
> always equal to the last time that a packet began ingress/egress processing.
>
> Similarly for packet_length in a cloned packet.  I would not be surprised
> if for several types of packets (maybe some others besides cloned packets,
> even), that packet_length may be 0 given the current behavioral-model
> implementation.
>
> I would guess that the issues above are independent of the bug that
> sometimes does not preserve metadata with recirculated, resubmitted, and
> cloned packets.
>
> iv)  I have not yet tested configuring a clone session of simple_switch to
> be a multicast group id, but I believe Antonin Bas has implemented support
> for that.  I do not know if there is a simple_switch_CLI command to enable
> configuring that, or not.  Perhaps it can be configured via the P4Runtime
> API in the simple_switch_grpc process, but might not yet be implemented
> from the Thrift API and/or simple_switch_CLI.
>
> Andy
>
>
>
> On Sun, Jun 23, 2019 at 1:14 PM Salvatore Signorello <
> ssignorello at ciencias.ulisboa.pt> wrote:
>
>> Dear all,
>>
>> I would need some clarifications about the clone primitive behaviors,
>> IngressToEgress (I2E) and EgressToEgress (E2E), implemented by the bmv2's
>> simple switch.
>>
>> Caveat: I have been testing the clone primitive in an old tutorial's
>> virtual machine, available here
>> <https://drive.google.com/uc?id=1f22-DYlUV33DsR88_MeMb4s7-1NX_ams&export=download>.
>> So, the compiler tool-chain and the target itself I am using are not the
>> very latest versions.
>>
>>
>> My questions follow:
>>
>> i) scope of the primitive: I thought I2E and E2E were meant to be only
>> invoked respectively in the ingress pipeline and in the egress one.
>> However, for example, I can invoke clone.I2E in the egress and get a clone
>> of the packet injected back at the beginning of the egress pipeline. First,
>> is this last behavior intended? Then, should there not be some sort of
>> restriction on the blocks of your program where these primitives can be
>> invoked?
>>
>>
>> ii) I2E behavior: when clone.I2E is called within the ingress pipeline,
>> the cloned packet goes again through the ingress pipeline (according to the
>> switch's log) and then through the egress, while the original packet enters
>> the egress pipeline too. Is this the expected behavior? One more minor
>> thing, according to the respective *enq_timestamp* std metadata values,
>> the cloned packet enters the traffic manager first.
>>
>>
>> iii) standard metadata values:  I am a bit surprised by the metadata
>> *ingress_global_timestamp*/*egress_global_timestamp* and *packet_length*
>> values in cloned packets. The former are always zero, whereas I would have
>> expected to see the timestamp of either the original packet at ingress or
>> of the cloned one at egress. The latter holds a value which does not
>> correspond to the actual packet size. Are both behaviors intended?
>>
>> Somewhere in [2]:
>>
>> *"Then you may find that the metadata field values are not preserved. The
>> current implementation for preserving such metadata field values is very
>> fragile, and effectively it sometimes works by accident, not by design."*
>>
>> does it relate to my last question? Which kind of metadata (standard or
>> user-defined) does that warning refer to?
>>
>>
>> iv) options for cloning multiple packets: it looks like a session id can
>> only be associated with a single port (or with a multicast group as
>> mentioned in one related thread on github [1], though not tested by me
>> yet). In fact, a new assignment through the mirroring_add command
>> overwrites the previously assigned port. That would mean that to clone a
>> packet multiple times, every time a cloned instance of the packet going
>> through the program should re-call the clone primitive. Am I correct? That
>> would imply that to generate and customize N (with N greater than two)
>> copies of a packet, the multicast "could be a much better fit". I know this
>> last statement sounds a bit obscure. Indeed, I am still trying to figure
>> out which could be the implications of one choice over the other. So, any
>> hint on this would be very much appreciated.
>>
>>
>> I have used a very simple program to test the above behaviors. In my
>> program, I call an action invoking the clone primitive in the different
>> control blocks, ingress and egress. Then, I create and use a single
>> clone session id, connected to a switch output port at run-time through
>> the mirroring_add command.
>>
>>
>> Finally, I am already aware of the two following threads on github,
>> however, please feel free to refer to them (or to specific parts therein)
>> if you realize that I might be missing something from there.
>>
>> [1] https://github.com/p4lang/behavioral-model/issues/667
>>
>> [2]
>> https://github.com/jafingerhut/p4-guide/tree/master/v1model-special-ops
>>
>>
>>
>> Thank you in advance for your help and sorry for this a-bit-verbose
>> message,
>>
>> regards,
>>
>> Salvatore