[P4-dev] P4 program implementation cost

Christian Esteve Rothenberg chesteve at dca.fee.unicamp.br
Fri Aug 19 12:27:05 EDT 2016

Here is some related work. Although it targets x86 platforms in the context
of NFV, it could help you model P4 implementation cost and performance.
Note, however, that P4-defined, hardware-based table lookups may have O(1)
cost, among other fundamental differences.


On Thu, Aug 18, 2016 at 9:11 PM, Vladimir Gurevich <
vladimir.gurevich at barefootnetworks.com> wrote:

> Hello Itzik,
> Most of what you are asking for can be achieved only with vendor-specific
> tools. I am not sure what your particular interest is, but I wanted to
> point you to some free tools that can give you theoretical answers.
> The free tool p4-graphs
> <https://github.com/p4lang/p4-hlir/blob/master/bin/p4-graphs> (available
> in the p4-hlir repo on the P4.org GitHub) computes the minimum number of
> stages required to implement a given pipeline. It does this by analyzing
> the table dependencies and then sorting the tables topologically to find
> the maximum depth of the dependency graph.
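
The stage computation described above can be sketched in a few lines of
Python. This is only an illustration, not the p4-graphs implementation: the
function name, the table names, and the input format are hypothetical, and
it assumes that every dependency forces the dependent table into a later
stage, which the real tool's dependency analysis may not.

```python
from collections import defaultdict, deque

def min_stages(tables, deps):
    """Return the length, in tables, of the longest dependency chain.

    tables: iterable of table names (hypothetical, for illustration)
    deps:   iterable of (before, after) pairs, meaning `after` depends on
            `before`; every name used here must also appear in `tables`
    """
    successors = defaultdict(list)
    indegree = {t: 0 for t in tables}
    for before, after in deps:
        successors[before].append(after)
        indegree[after] += 1

    # Kahn's algorithm: visit the tables in topological order.
    ready = deque(t for t, d in indegree.items() if d == 0)
    stage = {t: 1 for t in ready}  # tables without dependencies: stage 1
    visited = 0
    while ready:
        t = ready.popleft()
        visited += 1
        for s in successors[t]:
            # Assumption: a dependent table sits at least one stage later.
            stage[s] = max(stage.get(s, 1), stage[t] + 1)
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if visited != len(indegree):
        raise ValueError("dependency graph has a cycle")
    return max(stage.values(), default=0)
```

For example, with tables `port_map`, `l2`, `l3`, `acl` and dependencies
`port_map -> l2`, `port_map -> l3`, `l3 -> acl`, the longest chain is
`port_map -> l3 -> acl`, so the sketch reports three stages.
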
> In real hardware the number of stages might be higher, but assuming that
> you have hardware with unlimited stage capacity it gives the right answer.
> You can assume that the latency increases linearly with the number of
> stages, or you can look further at the dependencies and assume different
> values for inter-stage latency depending on the dependency type.
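
A rough sketch of the second latency model (a different inter-stage latency
per dependency type) follows. Everything specific in it is an assumption
made for illustration: the dependency type names, the cycle counts in
`LATENCY`, and the function name are invented, and real values depend on the
target hardware. The dependency graph is assumed to be acyclic.

```python
from functools import lru_cache

# Hypothetical latencies (in cycles) added by each dependency type.
# These numbers are placeholders, not measurements of any real device.
LATENCY = {"match": 12, "action": 3, "successor": 1}

def critical_path_latency(tables, deps, table_cost=1):
    """Estimate pipeline latency as the longest weighted dependency chain.

    tables:     list of table names (hypothetical)
    deps:       list of (before, after, dep_type) triples
    table_cost: cycles for one lookup plus action execution
    """
    preds = {t: [] for t in tables}
    for before, after, dep_type in deps:
        preds[after].append((before, LATENCY[dep_type]))

    @lru_cache(maxsize=None)
    def finish(t):
        # A table starts once every predecessor has finished and the
        # latency implied by the dependency type has elapsed.
        start = max((finish(p) + lat for p, lat in preds[t]), default=0)
        return start + table_cost

    return max((finish(t) for t in tables), default=0)
```

If every dependency type is given the same latency, this reduces to the
simpler model in which latency grows linearly with the number of stages.
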
> While the tool doesn't currently calculate the theoretical number of bits
> required to represent the tables, this can be done fairly easily by
> traversing the P4 HLIR (High-Level Intermediate Representation). For
> example, for each table you can add up the number of bits required for the
> match key (1x for exact match, 2x for ternary, etc.), the action data, and
> log2(n) bits to select an action (one out of n). Then multiply by the
> table size.
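
The per-table estimate above can be written down directly. This is a
minimal sketch that takes the table description as plain arguments instead
of traversing the HLIR; the function name and parameters are hypothetical,
only the exact and ternary factors named above are included, and log2(n) is
rounded up to a whole number of bits.

```python
# Multiplier applied to the key width for each match type, following the
# rule of thumb above (1x for exact match, 2x for ternary).
KEY_FACTOR = {"exact": 1, "ternary": 2}

def table_bits(key_fields, action_data_bits, num_actions, size):
    """Theoretical minimum number of bits needed to store one table.

    key_fields:       list of (width_in_bits, match_type) pairs
    action_data_bits: bits of action data stored per entry
    num_actions:      number of actions the table can select from (n)
    size:             number of entries in the table
    """
    key_bits = sum(width * KEY_FACTOR[mtype] for width, mtype in key_fields)
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, with no
    # floating-point rounding involved.
    action_id_bits = (num_actions - 1).bit_length()
    return (key_bits + action_data_bits + action_id_bits) * size
```

For instance, a 1024-entry table with a 32-bit exact field and a 16-bit
ternary field, 48 bits of action data, and 3 possible actions needs
(32 + 2*16 + 48 + 2) * 1024 = 116736 bits under this model.
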
> Again, in real hardware the actual number of bits used per table might
> differ significantly, depending on allocation quanta, packing, etc. But
> you can create a free tool that gives you the theoretical minimum (and
> even contribute it to p4.org!). You can read more about the P4 HLIR here
> <https://github.com/p4lang/p4-hlir/blob/master/HLIRSpec.pdf>.
> By the way, in P4 we typically assume that a table lookup (and action
> execution) takes one unit of time (one cycle).
> Happy hacking,
> Vladimir
> On Thu, Aug 18, 2016 at 12:05 AM, iashken at cs.technion.ac.il <
> iashken at cs.technion.ac.il> wrote:
>> Hi,
>> I want to compare the "implementation cost" of two different data-path
>> algorithms. I can write a P4 program for each of them. But then, how can I
>> compare their cost or efficiency in terms of parameters like: # of cycles
>> per packet, packet latency, resource usage, etc.? I don't mind if the
>> comparison applies only to a specific ASIC/switch.
>> Is there any known reference / article / paper you can recommend?
>> I would appreciate any response.
>> Thanks,
>> Itzik
>> _______________________________________________
>> P4-dev mailing list
>> P4-dev at lists.p4.org
>> http://lists.p4.org/mailman/listinfo/p4-dev_lists.p4.org
> --
> *Vladimir Gurevich*
> *Barefoot Networks*
> *Technical Lead, Customer Engineering*
> Email: vag at barefootnetworks.com
> Phone: (408) 833-4505