[P4-dev] Minor Issues with the P4 Compiler

Peter Newman (petenewm) petenewm at cisco.com
Wed Jun 24 17:44:54 EDT 2015



I was thinking of something like this. However, the array is only two deep — inner and outer — and I only change the index when I hit a tunneling header, VXLAN for example. That should avoid the problem you describe.
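
A minimal Python sketch of the two-deep idea described above, not real parser code. It assumes the packet is given as a flat list of header names, and that VXLAN is the only tunneling header; `TUNNEL_HEADERS`, `MAX_DEPTH`, and `parse_two_deep` are hypothetical names used only for illustration. The level index changes only when a tunneling header is seen, so every header of one encapsulation level lands in the same slot, and the depth limit bounds the recursion.

```python
TUNNEL_HEADERS = {"vxlan"}  # assumption: the only header that starts an inner packet
MAX_DEPTH = 2               # outer (0) and inner (1)

def parse_two_deep(headers):
    """Place each header name into the slot for its encapsulation level."""
    slots = [set() for _ in range(MAX_DEPTH)]
    level = 0
    for name in headers:
        slots[level].add(name)
        if name in TUNNEL_HEADERS:
            if level + 1 >= MAX_DEPTH:
                break  # depth limit reached: stop instead of recursing again
            level += 1  # the index changes only at a tunneling header
    return slots

packet = ["eth", "ipv4", "udp", "vxlan", "eth", "ipv4", "tcp"]
outer, inner = parse_two_deep(packet)
print(sorted(outer))
print(sorted(inner))
```

Because the level is tracked explicitly, an inner UDP header can never be confused with an outer one.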

I really, really want to avoid doubling the size of my TCAM in all of the pipeline stages of the parser, which is what I assume will happen if I follow the p4.org data-center example and define both inner and outer versions for every parse state.

Given I don’t have a real target to work with I admit that some of the implications have undoubtedly escaped me.



On Jun 24, 2015, at 2:25 PM, Leo Alterman <leo at barefootnetworks.com> wrote:

Peter, I'm guessing you're using arrays of header instances and "extract(arr[next]);" calls while parsing, so as not to overwrite outer headers when you recurse on inner headers?

If so, this approach is a little dangerous in the following sense:
Let's say I have N-element arrays of ethernet frames, ipv4 headers, TCP headers, and UDP headers.
This works fine for a packet that's TCP all the way down:

eth -> ipv4 -> tcp --(recurse)--> eth -> ipv4 -> tcp --(recurse)--> ...

In each array, the data at index i corresponds to encapsulation level i in the packet.
But let's say you mix TCP and UDP:

eth -> ipv4 -> tcp --(recurse)--> eth -> ipv4 -> udp --(recurse)--> ...

Now, the UDP header at the first layer of encapsulation will be at array index 0, while the other headers at this level of encapsulation will be at index 1. The above packet will be indistinguishable from:

eth -> ipv4 -> udp --(recurse)--> eth -> ipv4 -> tcp --(recurse)--> ...

Some targets, depending on how they do packet reassembly, may not even be able to compile a parse graph that allows the above.

If you're not using arrays, I misunderstood your post and you can probably ignore the above :) .
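
The misalignment described above can be reproduced with a small Python sketch. This is a toy model, not P4 or compiler code: it assumes each header type has its own stack and that every extract goes to that stack's next free index, as with "extract(arr[next]);". The function names are hypothetical.

```python
from collections import defaultdict

def stack_indices(layers):
    """For each extracted header, record (encapsulation level, type, stack index)."""
    next_index = defaultdict(int)  # one independent 'next' counter per header type
    placement = []
    for level, headers in enumerate(layers):
        for name in headers:
            placement.append((level, name, next_index[name]))
            next_index[name] += 1
    return placement

def occupancy(layers):
    """What later stages can observe: which stack slots are filled."""
    return sorted((name, index) for _, name, index in stack_indices(layers))

tcp_then_udp = [["eth", "ipv4", "tcp"], ["eth", "ipv4", "udp"]]
udp_then_tcp = [["eth", "ipv4", "udp"], ["eth", "ipv4", "tcp"]]

for entry in stack_indices(tcp_then_udp):
    print(entry)
print(occupancy(tcp_then_udp) == occupancy(udp_then_tcp))
```

In the first packet the inner UDP header lands at index 0 while the inner eth and ipv4 headers land at index 1, and the filled slots are identical for both packets.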


On Wed, Jun 24, 2015 at 12:58 PM, Antonin Bas <antonin at barefootnetworks.com> wrote:
Hi Peter,

I am sending your email to the whole mailing list, as some people may have a better answer than I do.

However in this specific case (tunnelling) I don't think you can get away with using the same header and the same parse state. If you look at our switch.p4 example on p4lang/p4factory, you will notice that we sometimes define inner and outer instances for headers (https://github.com/p4lang/p4factory/blob/master/targets/switch/p4src/includes/parser.p4#L399). We also use different parse states for inner instances and outer instances. For example we have a parse_ethernet state and a parse_outer_ethernet state.
We need 2 different header instances because we can have 2 different instances in the incoming packet and in the outgoing packet. We also need 2 different parse states because extracting to a header instance is done in a static fashion.

Thanks for pointing out the p4_expression irregularity. I will try to look into it when I have time.

On Wed, Jun 24, 2015 at 12:47 PM, Peter Newman (petenewm) <petenewm at cisco.com> wrote:

In a data center switch pretty much everything you can do in the outer packet can also be done in the inner packet. So when I get to the VXLAN header of the outer packet I just go straight back to the beginning and start all over again with parsing the inner Ethernet packet. One does need to maintain some state to prevent infinite recursion, but that shouldn’t be a problem.

I’m up to about 1000 lines of P4 already just on the parser. I’d probably have to get someone’s permission to start sharing source. I’ll look into that.

I wrote some simple tree walking code to print out the type of each object in the expression tree once the front-end had finished. I found ints and strings. Searching for the strings in the p4_header_instance.fields got me what I needed.
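
A tree walk of this kind might look like the following Python sketch. The `Expr` class is a hypothetical stand-in for an expression node with `left`, `op`, and `right` attributes; it is not the p4-hlir class, and the example expression is made up for illustration.

```python
class Expr:
    """Hypothetical stand-in for a binary expression node (left, op, right)."""
    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right

def leaf_types(node, found=None):
    """Collect (type name, value) for every leaf, left to right."""
    if found is None:
        found = []
    if isinstance(node, Expr):
        leaf_types(node.left, found)
        leaf_types(node.right, found)
    else:
        found.append((type(node).__name__, node))
    return found

# Made-up expression: (ipv4.ihl * 4) - 20, with the field given as a string
expr = Expr(Expr("ipv4.ihl", "*", 4), "-", 20)
for type_name, value in leaf_types(expr):
    print(type_name, value)
```

A leaf that shows up as `str` rather than as a field object is the case reported here.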

I guess I really need to update my installation. I’m working with the earliest public version. I’ll go find a local git expert — but if it’s working for me why fix it right now...

Thanks for your help.


On Jun 24, 2015, at 12:21 PM, Antonin Bas <antonin at barefootnetworks.com> wrote:

Hi Peter,

Thanks for your email. Please see comments inline:

On Tue, Jun 23, 2015 at 11:53 AM, Peter Newman (petenewm) <petenewm at cisco.com> wrote:
I am working on a parser in P4. I noticed the following minor issues with the compiler:

Compiler crashes if you have a cycle in the parse graph. It enters an infinite recursion at line 448 in p4_tables.py:

xconds = exclusive_conditions.Solver(hlir)

You can prevent this behavior if you set the optimize flag to false in the call to build(). I didn’t look at what optimization it is trying to do but it doesn’t seem to affect my parser.

I did notice the comment a few lines later: "# I am being lazy, and this is all tentative anyway", so I guess this feature is still under active development. It might be better to check for a cycle before executing the optimization.
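
One way to check for a cycle up front is a depth-first search over the parse graph. The Python sketch below assumes the graph is available as a plain dict mapping each parse state name to its successor names; that representation, the state names, and `has_cycle` are assumptions for illustration, not the HLIR data structures.

```python
def has_cycle(graph, start):
    """Return True if a cycle is reachable from 'start' (recursive DFS)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(state):
        color[state] = GREY
        for successor in graph.get(state, ()):
            seen = color.get(successor, WHITE)
            if seen == GREY:
                return True  # back edge to a state on the current path
            if seen == WHITE and visit(successor):
                return True
        color[state] = BLACK
        return False

    return visit(start)

# Hypothetical parse graphs: the first loops back to Ethernet after VXLAN.
tunnel_graph = {
    "start": ["parse_ethernet"],
    "parse_ethernet": ["parse_ipv4"],
    "parse_ipv4": ["parse_udp", "parse_tcp"],
    "parse_udp": ["parse_vxlan"],
    "parse_vxlan": ["parse_ethernet"],
}
acyclic_graph = {
    "start": ["parse_ethernet"],
    "parse_ethernet": ["parse_ipv4"],
    "parse_ipv4": ["parse_udp", "parse_tcp"],
}
print(has_cycle(tunnel_graph, "start"))
print(has_cycle(acyclic_graph, "start"))
```

A check like this could run before the optimization and skip it, or report an error, when a cycle is found.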

What kind of cycle do you have in your P4 program? This MPLS parser (https://github.com/p4lang/p4factory/blob/master/targets/switch/p4src/includes/parser.p4#L147) has a cycle (you can extract up to 3 MPLS headers) and is handled correctly by the compiler. More generally, it is okay to have a cycle so long as you are extracting to a tag stack. I can't think of a use case where this would not be the case, which is why I would be interested in seeing your P4 program.

The HLIR spec defines a p4_expression to contain references to objects of type: None, int, p4_header_instance or p4_field. The compiler actually gives type str instead of p4_header_instance or p4_field. The string contains the name of the field.

At which stage do you observe strings? The resolve_names function (https://github.com/p4lang/p4-hlir/blob/master/p4_hlir/hlir/p4_expressions.py#L64) should have been called at some point and taken care of this.

In a variable length packet header the compiler checks the max_length field but interprets this field as being specified in bits whereas the HLIR spec defines it to be specified in bytes.
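
To show why the unit matters, here is a small Python sketch with IPv4 numbers. The comparison below is purely illustrative and is not the compiler's actual check; `max_length_bits` is a hypothetical helper.

```python
def max_length_bits(max_length_bytes):
    """Convert a max_length given in bytes to bits."""
    return max_length_bytes * 8

ipv4_max_length = 60     # an IPv4 header is at most 60 bytes long
fixed_fields_bits = 160  # the 20 bytes of fixed IPv4 fields

as_bytes = max_length_bits(ipv4_max_length)  # value read as bytes: 480 bits
as_bits = ipv4_max_length                    # same value misread as 60 bits

print(fixed_fields_bits <= as_bytes)  # fixed fields fit within the maximum
print(fixed_fields_bits <= as_bits)   # fixed fields appear not to fit
```

Read as bits, a perfectly valid max_length of 60 looks smaller than the fixed part of the header.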

Wasn't this fixed by this commit: https://github.com/p4lang/p4-hlir/commit/72f20169a25619e6cc318254b952886a26e25972 ?

Just thought I’d document these observations before I forget them.


P4-dev mailing list
P4-dev at p4.org
Listinfo - http://mail.p4.org/mailman/listinfo/p4-dev_p4.org
Archives - http://mail.p4.org/pipermail/p4-dev_p4.org/



