[P4-dev] P4 program implementation cost

Vladimir Gurevich vladimir.gurevich at barefootnetworks.com
Thu Aug 18 20:11:24 EDT 2016

Hello Itzik,

Most of what you are asking for can be achieved only with vendor-specific
tools. I am not sure what your particular interest is, but I wanted to
point you to some free tools that can give you theoretical answers.

The free tool p4-graphs
<https://github.com/p4lang/p4-hlir/blob/master/bin/p4-graphs> (available in
the p4-hlir repo on the P4.org GitHub) computes the minimum number of
stages required to implement a given pipeline. It does this by analyzing
the dependencies and then sorting the tables topologically to find the
maximum depth of the graph.
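
To make the idea concrete, here is a minimal sketch of that computation. It
is not the p4-graphs code and does not use the HLIR API: the dependency
graph is a plain dictionary, the table names are made up, and it assumes
that every dependency pushes the dependent table into a later stage.

```python
from collections import deque

def min_stages(deps):
    """Return the depth of the table dependency graph.

    deps maps each table name to the list of tables it depends on.
    Assumption: any dependency forces the dependent table into a later stage.
    """
    tables = set(deps)
    for preds in deps.values():
        tables.update(preds)

    indegree = {t: 0 for t in tables}
    successors = {t: [] for t in tables}
    for table, preds in deps.items():
        for pred in preds:
            successors[pred].append(table)
            indegree[table] += 1

    # Kahn's algorithm: visit tables in topological order, tracking the
    # earliest stage each table can occupy.
    stage = {t: 1 for t in tables}
    ready = deque(t for t in tables if indegree[t] == 0)
    visited = 0
    while ready:
        table = ready.popleft()
        visited += 1
        for succ in successors[table]:
            stage[succ] = max(stage[succ], stage[table] + 1)
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)

    if visited != len(tables):
        raise ValueError("dependency graph has a cycle")
    return max(stage.values(), default=0)

# Hypothetical pipeline: rewrite needs both nexthop and acl.
example = {
    "ipv4_lpm": [],
    "nexthop": ["ipv4_lpm"],
    "acl": [],
    "rewrite": ["nexthop", "acl"],
}
print(min_stages(example))
```

The longest chain here is ipv4_lpm, nexthop, rewrite, so the sketch reports
three stages even though there are four tables.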

In real hardware the number of stages might be higher, but assuming that
you have hardware with unlimited stage capacity it gives the right answer.
You can assume that the latency increases linearly with the number of
stages, or you can look further at the dependencies and assume different
values for inter-stage latency, depending on the dependency type.
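
Both latency models can be sketched in a few lines. The dependency type
names and the cycle counts below are placeholders I picked for
illustration, not figures from any real device, and the second function
assumes the dependency graph is acyclic.

```python
from functools import lru_cache

# Hypothetical inter-stage delays in cycles, keyed by dependency type.
DELAY = {"match": 12, "action": 3, "successor": 1}

def linear_latency(num_stages, cycles_per_stage=1):
    """Simplest model: latency grows linearly with the number of stages."""
    return num_stages * cycles_per_stage

def critical_path_latency(edges, table_cost=1, delay=DELAY):
    """Finer model: weight each dependency by its type.

    edges is a list of (src, dst, dep_type) tuples. The result is the
    finish time of the slowest table, i.e. the weighted critical path.
    Assumption: the graph has no cycles.
    """
    preds = {}
    nodes = set()
    for src, dst, dep_type in edges:
        preds.setdefault(dst, []).append((src, dep_type))
        nodes.update((src, dst))

    @lru_cache(maxsize=None)
    def finish(table):
        # A table starts once every predecessor has finished and the
        # delay for that dependency type has elapsed.
        start = max(
            (finish(src) + delay[dep] for src, dep in preds.get(table, [])),
            default=0,
        )
        return start + table_cost

    return max((finish(t) for t in nodes), default=0)

edges = [
    ("a", "b", "match"),
    ("b", "c", "action"),
    ("a", "c", "successor"),
]
print(linear_latency(3, cycles_per_stage=5))
print(critical_path_latency(edges))
```

With these placeholder numbers the critical path is a, b, c: table a
finishes at cycle 1, b at 14, and c at 18.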

While the tool doesn't currently calculate the theoretical number of bits
required to represent the tables, it can be done fairly easily by
traversing P4 HLIR (High-Level Intermediate Representation). For example,
for each table you can calculate the number of bits required for the match
key (1x for exact match, 2x for ternary, etc.), action data plus log2(n)
bits to represent an action (one out of n). Then, multiply by the table
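
As a rough sketch of that calculation, without touching the HLIR API: the
field widths, action data sizes, and entry count below are invented, the
sketch takes the widest action's data as the per-entry action data, and it
assumes the per-entry total is multiplied by the number of table entries.
Only exact and ternary keys are covered, since those are the multipliers
given above.

```python
# Key-width multipliers from the text; other match types are not covered.
KEY_MULTIPLIER = {"exact": 1, "ternary": 2}

def entry_bits(key_fields, action_data_bits):
    """Theoretical bits for one table entry.

    key_fields: list of (width_in_bits, match_kind) tuples.
    action_data_bits: one value per action, the bits of data it carries.
    Assumption: an entry reserves room for the widest action's data.
    """
    key = sum(width * KEY_MULTIPLIER[kind] for width, kind in key_fields)
    data = max(action_data_bits, default=0)
    n = len(action_data_bits)
    # ceil(log2(n)) bits to select one of n actions, in integer arithmetic.
    selector = max(n - 1, 0).bit_length()
    return key + data + selector

def table_bits(key_fields, action_data_bits, entries):
    """Assumption: total is the per-entry size times the number of entries."""
    return entry_bits(key_fields, action_data_bits) * entries

# Hypothetical table: a 32-bit exact field and a 16-bit ternary field,
# three actions carrying 0, 9 and 48 bits of data, 1024 entries.
key_fields = [(32, "exact"), (16, "ternary")]
actions = [0, 9, 48]
print(entry_bits(key_fields, actions))
print(table_bits(key_fields, actions, 1024))
```

Here the key costs 32 + 2 * 16 = 64 bits, the action data 48 bits, and the
action selector 2 bits, so 114 bits per entry and 116736 bits for the table.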

Again, in real hardware the actual number of bits used per table might
differ significantly, depending on allocation quanta, packing, etc. But,
you can create a free tool that will give you the theoretical minimum (and
even contribute it to p4.org!) You can read more about P4 HLIR here

By the way, in P4 we typically assume that a table lookup (and action
execution) takes one unit of time (one cycle).

Happy hacking,

On Thu, Aug 18, 2016 at 12:05 AM, iashken at cs.technion.ac.il <
iashken at cs.technion.ac.il> wrote:

> Hi,
> I want to compare the "implementation cost" of two different data-path
> algorithms. I can write a P4 program for each of them. But then, how can I
> compare their cost or efficiency  in terms of parameters like: # of cycles
> per packet, packet latency , resource usage, etc? I don't mind if the
> comparison will be related to a specific asic/switch only.
> Is there any known reference / article / paper you can recommend on?
> I will appreciate any response here.
> Thanks,
> Itzik

*Vladimir Gurevich*

*Barefoot Networks*
*Technical Lead, Customer Engineering*
Email: vag at barefootnetworks.com
Phone: (408) 833-4505