[P4-dev] Is metadata considered in deparsing?
huynh.tu.dang at usi.ch
Tue Jun 21 03:22:07 EDT 2016
Hi Javier, Antonin,
Taking VXLAN tunneling as an example: a packet from a VM belonging to a VXLAN segment is parsed, and then in the processing pipeline the outer Ethernet, IP and UDP headers are added to the packet. While the outer headers didn't exist in the parse graph, they do exist in the deparse graph. The deparser needs a signal to switch between the original and outer Ethernet headers. In my implementation, the metadata header isn’t added to the packet; it is used as the signal to direct the parser instead.
On Jun 21, 2016, at 3:58 AM, Javier Blazquez <jblazquez at riotgames.com> wrote:
Makes sense, thanks for the explanation.
I agree that a more formalized way to define deparsers would be good to have in a future spec.
On Monday, June 20, 2016, Antonin Bas <antonin at barefootnetworks.com> wrote:
In my opinion, the example in the spec is not well chosen. If you have A -> B and B -> A, then you cannot obtain a topological sort, because you end up with a loop in the parse graph (a cyclic graph). The current compiler will throw an error and reject your program.
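To make the cyclic case concrete, here is a minimal Python sketch (not the compiler's actual code). It assumes the parse graph is given as a list of (from, to) edges; `deparse_order` is a hypothetical helper name, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever sort the compiler really uses.

```python
from graphlib import TopologicalSorter, CycleError

def deparse_order(parse_edges):
    """Return one topological ordering of the parse graph, or None if it is cyclic.

    parse_edges is a list of (src, dst) pairs meaning "src is parsed before dst".
    This is an illustrative helper, not part of any P4 toolchain.
    """
    ts = TopologicalSorter()
    for src, dst in parse_edges:
        ts.add(dst, src)  # dst depends on src, so src is emitted first
    try:
        return list(ts.static_order())
    except CycleError:
        # A -> B together with B -> A leaves no valid ordering.
        return None

cyclic = deparse_order([("A", "B"), ("B", "A")])   # no ordering exists
acyclic = deparse_order([("A", "B")])              # A before B
```

In this sketch a `None` result corresponds to the situation where the compiler rejects the program.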
The deparsing order is determined entirely at compile time by running a topological sort. Calls to add_header etc. have no impact on the header ordering. If multiple orderings are valid, one is chosen "at random", which I would agree is far from optimal (I am not even sure the compiler gives a warning).
For example, let's assume you have 2 possible parse paths only:
hdrA -> hdrB
hdrA -> hdrC
The parse graph in this case is very simple. Yet we have two distinct valid topological orderings: hdrA -> hdrB -> hdrC and hdrA -> hdrC -> hdrB.
Imagine a packet comes in which takes the first path. At the beginning of the pipeline, hdrA and hdrB are therefore valid, while hdrC is not. If the packet goes to the deparser like this (without any processing), there is no real ambiguity and the outgoing packet will be identical to the incoming packet.
However, if during the match-action pipeline I call add_header(hdrC) and make hdrC valid, then the deparser will either produce hdrA/hdrB/hdrC or hdrA/hdrC/hdrB. In some cases, you know that only hdrA/hdrB/hdrC makes sense from a networking / protocol point of view. You can therefore add a dummy transition in path1:
hdrA -> hdrB -> hdrC
thus making the topological sorting unique.
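The hdrA/hdrB/hdrC example can be checked with a small brute-force sketch in Python. It assumes the parse graph is a list of edges and simply enumerates every permutation that respects them; `all_orderings` is a hypothetical name, and brute force is only reasonable for a handful of headers.

```python
from itertools import permutations

def all_orderings(headers, edges):
    """Enumerate every ordering of headers that respects all parse-graph edges."""
    result = []
    for perm in permutations(headers):
        pos = {h: i for i, h in enumerate(perm)}
        # Keep the permutation only if each edge's source precedes its target.
        if all(pos[a] < pos[b] for a, b in edges):
            result.append(perm)
    return result

headers = ["hdrA", "hdrB", "hdrC"]

# The two parse paths from the example: hdrA -> hdrB and hdrA -> hdrC.
two_paths = [("hdrA", "hdrB"), ("hdrA", "hdrC")]

# The same graph plus the dummy transition hdrB -> hdrC.
with_dummy = two_paths + [("hdrB", "hdrC")]

ambiguous = all_orderings(headers, two_paths)   # two valid orderings
unique = all_orderings(headers, with_dummy)     # exactly one ordering
```

With only the two original paths, both hdrA/hdrB/hdrC and hdrA/hdrC/hdrB come out as valid; adding the dummy edge leaves hdrA/hdrB/hdrC as the only candidate.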
Thanks Antonin. I have a few questions about the topological sort you mentioned.
As you say, if a packet may have header A followed by header B _or_ header B followed by header A then the static parser graph does indeed have multiple topological sortings. However, when an actual packet flows through the pipeline this packet will have only one of the two orderings, hence it should be possible to reconstruct those headers in the right order.
I assume the issue is that the P4 compiler currently builds its deparser from a static view of the parse graph, and it therefore can't modify its behavior at runtime depending on what the input packet looked like? If the deparser knew which `extract` calls were made in which order then it could reconstruct the packet correctly in those situations.
What about calls to `add_header`/`push`/etc? Those calls don't appear in parsers, so how does the deparser find those headers when it reconstructs the output packet? I assume it visits every single header in the P4 program and checks its valid bit, and if so writes out the header. If that's the case, which ordering does it choose for those headers that have been pushed or added in compound actions? Do they all go after the regular headers that were extracted in a parser?
As Javier says, the deparser behavior is not rigorously specified in P4 v1.1, which hopefully should change in the next version of the language. As I explained in this previous email to the mailing list (http://lists.p4.org/pipermail/p4-dev_lists.p4.org/2016-May/000319.html), the P4 compiler uses a topological sorting on the parse graph to determine the deparsing behavior. The deparser then follows this ordering and emits headers for which the validity bit is set.
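As a rough model of that behavior, the following Python sketch treats the deparser as a fixed, compile-time header order plus a per-packet map of validity bits. The `deparse` function and the dictionary representation are assumptions made for illustration only; they are not taken from the compiler.

```python
def deparse(order, valid):
    """Emit headers in the fixed compile-time order, skipping invalid ones."""
    return [h for h in order if valid.get(h, False)]

# Ordering chosen at compile time (one of possibly several valid ones).
order = ["hdrA", "hdrB", "hdrC"]

# Packet that took the hdrA -> hdrB parse path.
valid = {"hdrA": True, "hdrB": True, "hdrC": False}
unchanged = deparse(order, valid)

# Models add_header(hdrC) in the match-action pipeline: only the valid bit
# changes; where hdrC lands is decided by the fixed ordering.
valid["hdrC"] = True
after_add = deparse(order, valid)

# Had the compiler picked the other valid ordering, the output would differ.
other_choice = deparse(["hdrA", "hdrC", "hdrB"], valid)
```

The point of the sketch is that the runtime state only selects which headers are emitted, never their relative order.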
This can make things tricky / ambiguous because a topological sorting is not always unique. Furthermore, in some cases, you may wish to emit packets (e.g. CPU packets) which the parser may not need to know about. In this case, you may have no choice but to add dummy transitions to your parse graph. See the copy_to_cpu program in the tutorials repository for an example: https://github.com/p4lang/tutorials/blob/master/examples/copy_to_cpu/p4src/copy_to_cpu.p4#L45
In your case, you have no choice but to make appropriate calls to add_header / remove_header to mark headers as valid / invalid. Maybe future P4 versions will let us program the deparser in a more structured way.
I think of deparsing as simply the reversal of the parsing process, that is, going through whatever `extract` operations have been executed and copying those headers back from the Parsed Representation into the packet for transmission.
In that scenario the use of metadata to switch among parsers shouldn't change things because all that matters is what `extract` operations have been performed on actual headers and in which order. Remember that metadata is by definition not part of the actual packet on the wire so whatever values your metadata headers have shouldn't matter when it comes to assembling the packet for transmission.
However, an interesting side question that comes to mind is regarding the `push`, `pop`, `add_header` and `remove_header` primitive actions. These are not used during the parsing phase; they're used as part of compound actions in the ingress or egress pipeline, but they affect the packet that will be assembled for transmission.
So deparsing isn't _just_ the reversal of the parsing process, but also the addition/removal of headers as per those primitives above? In which case, does the order of headers in the transmitted packet follow the order in which push/add_header/etc have executed for a given packet? And regarding header stacks, I assume they're always serialized as a single block (i.e. all valid headers in the stack following each other)? The 1.1 spec doesn't seem to specify the order in which headers will be serialized to the output packet.
I wonder whether metadata is considered in the deparsing phase. I set the value of a metadata field in an action and use this value to switch among parsers, but it seems the deparser skips checking the value of the metadata, or the value is reset before deparsing.
P4-dev mailing list