[P4-design] Maybe too much text describing issues around P4 table and parser value set sizes

Vladimir Gurevich vag at barefootnetworks.com
Sun Mar 25 22:37:39 EDT 2018


Hi Andy,

I think that your article is a great explanation and with "up to N" clause
I think it provides a totally usable definition that is understandable by
most people. I was responding to Mihai's suggestion that because we can't
have a "formal" definition of "size" it's better to leave it alone.

Thanks,
Vladimir

On Sun, Mar 25, 2018 at 7:34 PM, Andy Fingerhut (jafinger) <
jafinger at cisco.com> wrote:

> Vladimir, do you have a specific recommendation on what you think the
> language spec should say about the ‘size’ table attribute?  That is what I
> was going for in part of my most recent message, and I don’t think it was
> trying to tie the hands of implementers much at all.
>
>
>
> Andy
>
>
>
> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
> *Date: *Sunday, March 25, 2018 at 7:31 PM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
> *Cc: *Mihai Budiu <mbudiu at vmware.com>, p4-design <p4-design at lists.p4.org>
>
> *Subject: *Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> Hello,
>
>
>
> I think we need to re-state the basic goals of our activities. Specifically
>
>    1. Are we trying to define a super-formal language that will allow one
>    to create not only super-portable programs, but even super-portable APIs
>    that control those programs to the extent that you can swap targets
>    (ranging from high-speed ASICs to fully SW implementations and anything
>    in-between) as easily as one swaps one pair of gloves for another *or*
>    2. Are we trying to define a practical language that would allow
>    people to easily create data plane programs on target device of their
>    choice and generate a corresponding control plane API?
>
> I think that goal #2 is quite achievable, but the goal #1 is,
> unfortunately, not. Moreover, trying to get closer to goal #1 will probably
> take us further away from being able to define a practical, usable language
> that gives data plane designers the ability to get things done.
>
>
>
> I totally agree that it is impossible to formally define what "size" is.
> The question is: should that preclude us from moving forward and having
> this in the language? Even in Euclidian geometry there are terms that are
> not defined, e.g a point and a straight line. That's ok. Having "size" and,
> perhaps, other similar attributes in the language makes programs generally
> more readable and understandable. In fact, it even makes them more
> target-independent and more suitable for optimization, because it would not
> require people to use the externs, requiring a very specific implementation
> (as you suggested).
>
>
>
> I can certainly see the value of implementation externs, especially if
> they would allow us to use fewer annotations, but unfortunately they will
> not, because any real implementation(e.g. for hash-tables) will require you
> to specify anywhere from one to dozens of parameters per table, most of of
> them optional -- something externs are not very good at (not to mention
> that it would be nice if they were named).
>
>
>
> I urge everyone to keep P4 a practical language and sometimes that means
> that we might need to have things there that we might not be able to
> formally define. That's OK.
>
>
>
> Thanks,
> Vladiimir
>
>
>
>
>
> On Sun, Mar 25, 2018 at 5:55 PM, Andy Fingerhut (jafinger) <
> jafinger at cisco.com> wrote:
>
> If ‘size’ were an integer that meant, ‘the compiler will implement this
> table with something that can in some cases install up to N entries, but
> how likely being able to reach N entries is implementation- and
> table-specific’, then the value of N could be used as the “raw capacity” of
> a hash table, for example.
>
>
>
> This might be what Vladimir means when he says “up to N”.  It seems a bit
> odd to me for an implementation to use, say, a size of 16384 for a hash
> table if then takes that value and uses it as the raw theoretical maximum
> capacity of a hash table, where in practice that could mean that adding a
> set of 12,000 keys causes the 12,000th one to fail to add because of too
> many collisions with earlier added entries.  75% to 95% typical usable
> capacity from a medium to large hash table is about the best I’ve seen
> (where the actual fraction depends on the raw size, how many hash
> functions, overflow TCAM size used, if any, etc.).
>
>
>
> I think it would be reasonable to define ‘size’ to take an integer type
> value in the language spec, and leave the meaning of how that value is used
> up to the implementation, with just a general guideline of “it is
> recommended if you can make a table / parser value set that guarantees that
> many exact, lpm, and/or ternary entries can be added, but an implementation
> is allowed to use implementations that can fail to add entries at lower
> number of entries than the given size’.
>
>
>
> Andy
>
>
>
> *From: *Mihai Budiu <mbudiu at vmware.com>
> *Date: *Sunday, March 25, 2018 at 5:17 PM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>, Vladimir Gurevich
> <vag at barefootnetworks.com>
> *Cc: *p4-design <p4-design at lists.p4.org>
> *Subject: *RE: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> The conclusion is that it is impossible to give a simple description of
> the meaning of “size”.
>
>
>
> There are two possibilities going forward:
>
>    - We specify in the spec that the tables and value_sets can have a
>    size integer parameter, but the meaning is target-dependent
>    - We do not standardize these at all. Tables can still have a size
>    parameter (it is not mandated or forbidden by the spec, as it is today),
>    but each target can choose a different way to specify sizes. An example is
>    given below. For value_sets we have to use an annotation.
>
>
>
> For tables I will submit again the example in ebpf_model, which is perfect
> for this particular target:
>
>
>
> /*
>
> Each table must have an implementation property which is either an
> array_table
>
> or a hash_table.
>
> */
>
>
>
> /**
>
> Implementation property for tables indicating that tables must be
> implemented
>
> using EBPF array map.
>
> */
>
> extern array_table {
>
>     /// @param size: maximum number of entries in table
>
>     array_table(bit<32> size);
>
> }
>
>
>
> /**
>
> Implementation property for tables indicating that tables must be
> implemented
>
> using EBPF hash map.
>
> */
>
> extern hash_table {
>
>     /// @param size: maximum number of entries in table
>
>     hash_table(bit<32> size);
>
> }
>
>
>
> table t {
>
>    keys = …
>
>    implementation = hash_table(32);
>
> }
>
>
>
> The benefit of this approach is that an architecture can define whatever
> parameters it deems necessary for describing the implementation. If
> Vladimir likes max_size, min_size, average_size and other parameters, he
> can define an extern which has all these as constructor arguments for his
> architecture.
>
>
>
> The downside is that this is certainly target-specific, and thus
> non-portable.
>
>
>
> But it looks from Andy’s description that the portability of making size a
> standard feature is illusory.
>
>
>
> Mihai
>
>
>
> *From:* P4-design [mailto:p4-design-bounces at lists.p4.org] *On Behalf Of *Andy
> Fingerhut (jafinger)
> *Sent:* Saturday, March 24, 2018 5:10 PM
> *To:* Vladimir Gurevich <vag at barefootnetworks.com>
> *Cc:* p4-design <p4-design at lists.p4.org>
> *Subject:* Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> If by “up to N” you mean “up to N with high probability”, or “up to N for
> most practical sets of keys you probably want to install”, yes that is what
> I was trying to convey in the article for hash tables.  I wasn’t trying to
> say that ATCAMs are any worse than hash tables in that regard (although
> some variations I have worked with _*are*_), just that they are no better
> in them, either.
>
>
>
> Andy
>
>
>
> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
> *Date: *Saturday, March 24, 2018 at 4:36 PM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
> *Cc: *p4-design <p4-design at lists.p4.org>
> *Subject: *Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> Hi Andy,
>
>
>
> I totally agree that the practical ATCAM capacity from the control plane
> point of view (e.g. how many routes you can put into a ATCAM with N entries
> divided into M partitions) is difficult to predict, but "up to N" statement
> can still be holding some water :) I think that this is an inherent ATCAM
> property, much like the fact that you can't generally guarantee any
> specific occupancy for a hashed table (as you clearly outlined in the
> article).
>
>
>
> Thanks,
>
> Vladmir
>
>
>
> On Sat, Mar 24, 2018 at 10:40 AM, Andy Fingerhut (jafinger) <
> jafinger at cisco.com> wrote:
>
> Thanks for the comments, Vladimir.  I agree with just about every word,
> and would be happy to incorporate your comments into it, perhaps in another
> week or so after I get some other things done.
>
>
>
> One quick response here.  The article does not say that algorithmic TCAMs
> are not “fast”.  It says: ‘All techniques I am aware of here either cannot
> go "fast" as defined above, or have capacity that is dependent upon the
> contents of the table entries.’.  If you know of an algorithmic TCAM method
> that is both fast _and_ has capacity independent of the contents of the
> table entries, then I really want to hear about it.  I’ve spent way too
> much of my life dealing with the non-deterministic capacity of algorithmic
> TCAMs, for Cisco platforms where they wish they could claim the capacity
> was a fixed number N.
>
>
>
> Thanks,
>
> Andy
>
>
>
> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
> *Date: *Saturday, March 24, 2018 at 9:06 AM
> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
> *Cc: *p4-design <p4-design at lists.p4.org>
> *Subject: *Re: [P4-design] Maybe too much text describing issues around
> P4 table and parser value set sizes
>
>
>
> Hi Andy,
>
>
>
> Great stuff, worthy of a book chapter or an encyclopedia article.
>
>
>
> If anything, I'd just say that some of the assumptions are a little "rosy"
> meaning that they cover the "classic" behaviors, but do not account for a
> variety of idiosyncratic behaviors that are often encountered in real life.
> Call them bugs, features or both, but they do have an effect on table
> capacity. One big topic you sort of omitted is the question of how to count
> default actions: are they entries or not? do they need to be included in
> total entry count or not (i.e. should the "natural" size be 2^N, 2^N-1 or
> 2^N+1)? Even in "good" cases, such as TCAMs or indexed tables, I am sure
> that different implementations will answer this question differently. In
> other cases control plane might need to reserve additional entries for a
> variety of technical/efficiency reasons (faster entry moves in TCAMs,
> atomic updates, garbage collection (in action profiles), etc.).
>
>
>
> Therefore, people usually advertise their table capacities as being "up to
> N entries", with that "up to" being the key, that allows customers to
> reasonably judge the device capacity, while freeing the vendors from
> frivolous lawsuits :)
>
>
>
> I know, there are different opinions in this area (mostly guided by (a)
> the n-to-1 case you described and (b) the fact that SW implementations
> might not need size indication at all), but coming from the ASIC
> background, I firmly believe that the basic "up to" size attribute is vital
> for both tables and value_sets and is preferred to an annotation, because
> it expresses a basic intent of the data plane program designer that the
> compiler cannot reasonably guess (as opposed to many technical allocation
> parameters (e.g. the number of ways in a hash table) that the compilers can
> determine automatically, often even better than humans). P4_14 even has
> min_size and max_size attributes in addition to size and I think that was
> not a bad idea (although I am not sure if they were implemented anywhere).
>
>
>
> Last, but not least, I am not quite sure what to make of the comment at
> the end of the article that assert that ATCAMs are not "fast". I think
> everything depends on the implementation, but if you can implement a
> multi-way hash table with O(1) lookup performance, there is no reason why
> you can't implement an ATCAM with the same performance either.
>
>
>
> Happy hacking,
>
> Vladimir
>
>
>
>
>
>
>
> On Fri, Mar 23, 2018 at 9:49 PM, Andy Fingerhut (jafinger) <
> jafinger at cisco.com> wrote:
>
> In the last P4 language design WG meeting, I volunteered to write up some
> notes on P4 table sizes, and what “size” might mean.
>
>
>
> I have written up a discussion of the issues around P4 table and parser
> value set sizes here.  It is more a tutorial style directed at people who
> might be unfamiliar with issues around high speed implementations of
> P4-style matching, and what their consequences are for specifying table
> sizes, especially ones where the actual capacity of a table is not really
> predictable.
>
>
>
> https://github.com/jafingerhut/p4-guide/blob/master/docs/p4-table-and-
> parser-value-set-sizes.md
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_jafingerhut_p4-2Dguide_blob_master_docs_p4-2Dtable-2Dand-2Dparser-2Dvalue-2Dset-2Dsizes.md&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=tGW6TKXajnoXSyy1S1P4DHGPe8sj54GGvw-b21n7aWg&m=igHNDkIsLYagXgGCPpIvkY56e6q_4mGo3TN4g8WOQYk&s=Mm_IWg6WCXnjrb1Vh5lONc_7WE2uO3BgmENj5HnM_BQ&e=>
>
>
>
> Corrections/questions/comments welcome.
>
>
>
> Andy
>
>
>
>
> _______________________________________________
> P4-design mailing list
> P4-design at lists.p4.org
> http://lists.p4.org/mailman/listinfo/p4-design_lists.p4.org
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.p4.org_mailman_listinfo_p4-2Ddesign-5Flists.p4.org&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=tGW6TKXajnoXSyy1S1P4DHGPe8sj54GGvw-b21n7aWg&m=igHNDkIsLYagXgGCPpIvkY56e6q_4mGo3TN4g8WOQYk&s=Avv9VPJx5NI3SRUQy9aN7lk_rDLuHqX0_IwMBoDIA2E&e=>
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.p4.org/pipermail/p4-design_lists.p4.org/attachments/20180325/71c10d23/attachment-0001.html>


More information about the P4-design mailing list