[P4-design] Maybe too much text describing issues around P4 table and parser value set sizes

Vladimir Gurevich vag at barefootnetworks.com
Sun Mar 25 22:31:35 EDT 2018


Hello,

I think we need to re-state the basic goals of our activities. Specifically

   1. Are we trying to define a super-formal language that will allow one
   to create not only super-portable programs, but even super-portable APIs
   that control those programs to the extent that you can swap targets
   (ranging from high-speed ASICs to fully SW implementations and anything
   in-between) as easily as one swaps one pair of gloves for another *or*
   2. Are we trying to define a practical language that would allow people
   to easily create data plane programs on target device of their choice and
   generate a corresponding control plane API?

I think that goal #2 is quite achievable, but the goal #1 is,
unfortunately, not. Moreover, trying to get closer to goal #1 will probably
take us further away from being able to define a practical, usable language
that gives data plane designers the ability to get things done.

I totally agree that it is impossible to formally define what "size" is.
The question is: should that preclude us from moving forward and having
this in the language? Even in Euclidian geometry there are terms that are
not defined, e.g a point and a straight line. That's ok. Having "size" and,
perhaps, other similar attributes in the language makes programs generally
more readable and understandable. In fact, it even makes them more
target-independent and more suitable for optimization, because it would not
require people to use the externs, requiring a very specific implementation
(as you suggested).

I can certainly see the value of implementation externs, especially if they
would allow us to use fewer annotations, but unfortunately they will not,
because any real implementation(e.g. for hash-tables) will require you to
specify anywhere from one to dozens of parameters per table, most of of
them optional -- something externs are not very good at (not to mention
that it would be nice if they were named).

I urge everyone to keep P4 a practical language and sometimes that means
that we might need to have things there that we might not be able to
formally define. That's OK.

Thanks,
Vladiimir


On Sun, Mar 25, 2018 at 5:55 PM, Andy Fingerhut (jafinger) <
jafinger at cisco.com> wrote:

> If ‘size’ were an integer that meant, ‘the compiler will implement this
> table with something that can in some cases install up to N entries, but
> how likely being able to reach N entries is implementation- and
> table-specific’, then the value of N could be used as the “raw capacity” of
> a hash table, for example.
>
>
>
> This might be what Vladimir means when he says “up to N”.  It seems a bit
> odd to me for an implementation to use, say, a size of 16384 for a hash
> table if then takes that value and uses it as the raw theoretical maximum
> capacity of a hash table, where in practice that could mean that adding a
> set of 12,000 keys causes the 12,000th one to fail to add because of too
> many collisions with earlier added entries.  75% to 95% typical usable
> capacity from a medium to large hash table is about the best I’ve seen
> (where the actual fraction depends on the raw size, how many hash
> functions, overflow TCAM size used, if any, etc.).
>
>
>
> I think it would be reasonable to define ‘size’ to take an integer type
> value in the language spec, and leave the meaning of how that value is used
> up to the implementation, with just a general guideline of “it is
> recommended if you can make a table / parser value set that guarantees that
> many exact, lpm, and/or ternary entries can be added, but an implementation
> is allowed to use implementations that can fail to add entries at lower
> number of entries than the given size’.
>
>
>
> Andy
>
>
>
> *From: *Mihai Budiu <mbudiu at vmware.com>
> *Date: *Sunday, March 25, 2018 at 5:17 PM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>, Vladimir Gurevich
> <vag at barefootnetworks.com>
> *Cc: *p4-design <p4-design at lists.p4.org>
> *Subject: *RE: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> The conclusion is that it is impossible to give a simple description of
> the meaning of “size”.
>
>
>
> There are two possibilities going forward:
>
>    - We specify in the spec that the tables and value_sets can have a
>    size integer parameter, but the meaning is target-dependent
>    - We do not standardize these at all. Tables can still have a size
>    parameter (it is not mandated or forbidden by the spec, as it is today),
>    but each target can choose a different way to specify sizes. An example is
>    given below. For value_sets we have to use an annotation.
>
>
>
> For tables I will submit again the example in ebpf_model, which is perfect
> for this particular target:
>
>
>
> /*
>
> Each table must have an implementation property which is either an
> array_table
>
> or a hash_table.
>
> */
>
>
>
> /**
>
> Implementation property for tables indicating that tables must be
> implemented
>
> using EBPF array map.
>
> */
>
> extern array_table {
>
>     /// @param size: maximum number of entries in table
>
>     array_table(bit<32> size);
>
> }
>
>
>
> /**
>
> Implementation property for tables indicating that tables must be
> implemented
>
> using EBPF hash map.
>
> */
>
> extern hash_table {
>
>     /// @param size: maximum number of entries in table
>
>     hash_table(bit<32> size);
>
> }
>
>
>
> table t {
>
>    keys = …
>
>    implementation = hash_table(32);
>
> }
>
>
>
> The benefit of this approach is that an architecture can define whatever
> parameters it deems necessary for describing the implementation. If
> Vladimir likes max_size, min_size, average_size and other parameters, he
> can define an extern which has all these as constructor arguments for his
> architecture.
>
>
>
> The downside is that this is certainly target-specific, and thus
> non-portable.
>
>
>
> But it looks from Andy’s description that the portability of making size a
> standard feature is illusory.
>
>
>
> Mihai
>
>
>
> *From:* P4-design [mailto:p4-design-bounces at lists.p4.org] *On Behalf Of *Andy
> Fingerhut (jafinger)
> *Sent:* Saturday, March 24, 2018 5:10 PM
> *To:* Vladimir Gurevich <vag at barefootnetworks.com>
> *Cc:* p4-design <p4-design at lists.p4.org>
> *Subject:* Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> If by “up to N” you mean “up to N with high probability”, or “up to N for
> most practical sets of keys you probably want to install”, yes that is what
> I was trying to convey in the article for hash tables.  I wasn’t trying to
> say that ATCAMs are any worse than hash tables in that regard (although
> some variations I have worked with _*are*_), just that they are no better
> in them, either.
>
>
>
> Andy
>
>
>
> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
> *Date: *Saturday, March 24, 2018 at 4:36 PM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
> *Cc: *p4-design <p4-design at lists.p4.org>
> *Subject: *Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> Hi Andy,
>
>
>
> I totally agree that the practical ATCAM capacity from the control plane
> point of view (e.g. how many routes you can put into a ATCAM with N entries
> divided into M partitions) is difficult to predict, but "up to N" statement
> can still be holding some water :) I think that this is an inherent ATCAM
> property, much like the fact that you can't generally guarantee any
> specific occupancy for a hashed table (as you clearly outlined in the
> article).
>
>
>
> Thanks,
>
> Vladmir
>
>
>
> On Sat, Mar 24, 2018 at 10:40 AM, Andy Fingerhut (jafinger) <
> jafinger at cisco.com> wrote:
>
> Thanks for the comments, Vladimir.  I agree with just about every word,
> and would be happy to incorporate your comments into it, perhaps in another
> week or so after I get some other things done.
>
>
>
> One quick response here.  The article does not say that algorithmic TCAMs
> are not “fast”.  It says: ‘All techniques I am aware of here either cannot
> go "fast" as defined above, or have capacity that is dependent upon the
> contents of the table entries.’.  If you know of an algorithmic TCAM method
> that is both fast _and_ has capacity independent of the contents of the
> table entries, then I really want to hear about it.  I’ve spent way too
> much of my life dealing with the non-deterministic capacity of algorithmic
> TCAMs, for Cisco platforms where they wish they could claim the capacity
> was a fixed number N.
>
>
>
> Thanks,
>
> Andy
>
>
>
> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
> *Date: *Saturday, March 24, 2018 at 9:06 AM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
> *Cc: *p4-design <p4-design at lists.p4.org>
> *Subject: *Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> Hi Andy,
>
>
>
> Great stuff, worthy of a book chapter or an encyclopedia article.
>
>
>
> If anything, I'd just say that some of the assumptions are a little "rosy"
> meaning that they cover the "classic" behaviors, but do not account for a
> variety of idiosyncratic behaviors that are often encountered in real life.
> Call them bugs, features or both, but they do have an effect on table
> capacity. One big topic you sort of omitted is the question of how to count
> default actions: are they entries or not? do they need to be included in
> total entry count or not (i.e. should the "natural" size be 2^N, 2^N-1 or
> 2^N+1)? Even in "good" cases, such as TCAMs or indexed tables, I am sure
> that different implementations will answer this question differently. In
> other cases control plane might need to reserve additional entries for a
> variety of technical/efficiency reasons (faster entry moves in TCAMs,
> atomic updates, garbage collection (in action profiles), etc.).
>
>
>
> Therefore, people usually advertise their table capacities as being "up to
> N entries", with that "up to" being the key, that allows customers to
> reasonably judge the device capacity, while freeing the vendors from
> frivolous lawsuits :)
>
>
>
> I know, there are different opinions in this area (mostly guided by (a)
> the n-to-1 case you described and (b) the fact that SW implementations
> might not need size indication at all), but coming from the ASIC
> background, I firmly believe that the basic "up to" size attribute is vital
> for both tables and value_sets and is preferred to an annotation, because
> it expresses a basic intent of the data plane program designer that the
> compiler cannot reasonably guess (as opposed to many technical allocation
> parameters (e.g. the number of ways in a hash table) that the compilers can
> determine automatically, often even better than humans). P4_14 even has
> min_size and max_size attributes in addition to size and I think that was
> not a bad idea (although I am not sure if they were implemented anywhere).
>
>
>
> Last, but not least, I am not quite sure what to make of the comment at
> the end of the article that assert that ATCAMs are not "fast". I think
> everything depends on the implementation, but if you can implement a
> multi-way hash table with O(1) lookup performance, there is no reason why
> you can't implement an ATCAM with the same performance either.
>
>
>
> Happy hacking,
>
> Vladimir
>
>
>
>
>
>
>
> On Fri, Mar 23, 2018 at 9:49 PM, Andy Fingerhut (jafinger) <
> jafinger at cisco.com> wrote:
>
> In the last P4 language design WG meeting, I volunteered to write up some
> notes on P4 table sizes, and what “size” might mean.
>
>
>
> I have written up a discussion of the issues around P4 table and parser
> value set sizes here.  It is more a tutorial style directed at people who
> might be unfamiliar with issues around high speed implementations of
> P4-style matching, and what their consequences are for specifying table
> sizes, especially ones where the actual capacity of a table is not really
> predictable.
>
>
>
> https://github.com/jafingerhut/p4-guide/blob/master/docs/p4-table-and-
> parser-value-set-sizes.md
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_jafingerhut_p4-2Dguide_blob_master_docs_p4-2Dtable-2Dand-2Dparser-2Dvalue-2Dset-2Dsizes.md&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=tGW6TKXajnoXSyy1S1P4DHGPe8sj54GGvw-b21n7aWg&m=igHNDkIsLYagXgGCPpIvkY56e6q_4mGo3TN4g8WOQYk&s=Mm_IWg6WCXnjrb1Vh5lONc_7WE2uO3BgmENj5HnM_BQ&e=>
>
>
>
> Corrections/questions/comments welcome.
>
>
>
> Andy
>
>
>
>
> _______________________________________________
> P4-design mailing list
> P4-design at lists.p4.org
> http://lists.p4.org/mailman/listinfo/p4-design_lists.p4.org
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.p4.org_mailman_listinfo_p4-2Ddesign-5Flists.p4.org&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=tGW6TKXajnoXSyy1S1P4DHGPe8sj54GGvw-b21n7aWg&m=igHNDkIsLYagXgGCPpIvkY56e6q_4mGo3TN4g8WOQYk&s=Avv9VPJx5NI3SRUQy9aN7lk_rDLuHqX0_IwMBoDIA2E&e=>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.p4.org/pipermail/p4-design_lists.p4.org/attachments/20180325/cf8fbe91/attachment-0001.html>


More information about the P4-design mailing list