[P4-design] Maybe too much text describing issues around P4 table and parser value set sizes

Nate Foster jnfoster at cs.cornell.edu
Mon Mar 26 10:07:45 EDT 2018


I'm wading into this conversation late. How does an upper bound on
resources help the programmer? I can understand the utility of a
programming abstraction that said "this table can hold at least N
elements." But I'm struggling to understand how one would effectively use
"up to N." Maybe it's just a compiler hint that is not actually useful to
the programmer?

-N

On Sun, Mar 25, 2018 at 10:55 PM, Vladimir Gurevich <
vag at barefootnetworks.com> wrote:

> Hi Andy,
>
> Yes, I think this recommendation is reasonable, although I'd probably
> reword it a little bit, because the word "guarantees" worries me a little.
> How about something like that:
>
> “If a table or parser value set has a size attribute specified for it with
> value N, it is recommended that a compiler should choose a data plane
> implementation that can store (hold)  N exact, lpm, and/or ternary
> entries.  It does not guarantee though that an arbitrary set of N entries
> can always be inserted in such a table".
>
> Note that deliberate difference between being able to hold N entries and
> being able to insert arbitrary N entries
>
> Thanks,
> Vladimir
>
> On Sun, Mar 25, 2018 at 7:47 PM, Andy Fingerhut (jafinger) <
> jafinger at cisco.com> wrote:
>
>> When you say the phrase “up to N”, do you mean the same thing as the
>> following sentences, or do you mean something different?
>>
>>
>>
>> “If a table or parser value set has a size attribute specified for it
>> with value N, it is recommended that if a compiler can choose a data plane
>> implementation that guarantees that an arbitrary set of N exact, lpm,
>> and/or ternary entries can be added, at a reasonable cost, it should do
>> so.  However, a compiler is allowed to choose data plane implementation
>> that can fail to add a new entry even if significantly fewer than N entries
>> are currently installed.”
>>
>>
>>
>> In other words, nothing is guaranteed, but there is a specific kind of
>> recommendation given, and simultaneously a warning to people using the
>> ‘size’ attribute that they are not guaranteed they can add N entries to
>> such an object, unless they find those guarantees outside of the language
>> spec (e.g. in their P4 vendor’s custom attributes/compiler options).
>>
>>
>>
>> Andy
>>
>>
>>
>>
>>
>> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
>> *Date: *Sunday, March 25, 2018 at 7:37 PM
>>
>> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
>> *Cc: *Mihai Budiu <mbudiu at vmware.com>, p4-design <p4-design at lists.p4.org>
>> *Subject: *Re: [P4-design] Maybe too much text describing issues around
>> P4 table and parser value set sizes
>>
>>
>>
>> Hi Andy,
>>
>>
>>
>> I think that your article is a great explanation and with "up to N"
>> clause I think it provides a totally usable definition that is
>> understandable by most people. I was responding to Mihai's suggestion that
>> because we can't have a "formal" definition of "size" it's better to leave
>> it alone.
>>
>>
>>
>> Thanks,
>>
>> Vladimir
>>
>>
>>
>> On Sun, Mar 25, 2018 at 7:34 PM, Andy Fingerhut (jafinger) <
>> jafinger at cisco.com> wrote:
>>
>> Vladimir, do you have a specific recommendation on what you think the
>> language spec should say about the ‘size’ table attribute?  That is what I
>> was going for in part of my most recent message, and I don’t think it was
>> trying to tie the hands of implementers much at all.
>>
>>
>>
>> Andy
>>
>>
>>
>> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
>> *Date: *Sunday, March 25, 2018 at 7:31 PM
>> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
>> *Cc: *Mihai Budiu <mbudiu at vmware.com>, p4-design <p4-design at lists.p4.org>
>>
>>
>> *Subject: *Re: [P4-design] Maybe too much text describing issues around
>> P4 table and parser value set sizes
>>
>>
>>
>> Hello,
>>
>>
>>
>> I think we need to re-state the basic goals of our activities.
>> Specifically
>>
>>    1. Are we trying to define a super-formal language that will allow
>>    one to create not only super-portable programs, but even super-portable
>>    APIs that control those programs to the extent that you can swap targets
>>    (ranging from high-speed ASICs to fully SW implementations and anything
>>    in-between) as easily as one swaps one pair of gloves for another *or*
>>
>>    2. Are we trying to define a practical language that would allow
>>    people to easily create data plane programs on target device of their
>>    choice and generate a corresponding control plane API?
>>
>> I think that goal #2 is quite achievable, but the goal #1 is,
>> unfortunately, not. Moreover, trying to get closer to goal #1 will probably
>> take us further away from being able to define a practical, usable language
>> that gives data plane designers the ability to get things done.
>>
>>
>>
>> I totally agree that it is impossible to formally define what "size" is.
>> The question is: should that preclude us from moving forward and having
>> this in the language? Even in Euclidian geometry there are terms that are
>> not defined, e.g a point and a straight line. That's ok. Having "size" and,
>> perhaps, other similar attributes in the language makes programs generally
>> more readable and understandable. In fact, it even makes them more
>> target-independent and more suitable for optimization, because it would not
>> require people to use the externs, requiring a very specific implementation
>> (as you suggested).
>>
>>
>>
>> I can certainly see the value of implementation externs, especially if
>> they would allow us to use fewer annotations, but unfortunately they will
>> not, because any real implementation(e.g. for hash-tables) will require you
>> to specify anywhere from one to dozens of parameters per table, most of of
>> them optional -- something externs are not very good at (not to mention
>> that it would be nice if they were named).
>>
>>
>>
>> I urge everyone to keep P4 a practical language and sometimes that means
>> that we might need to have things there that we might not be able to
>> formally define. That's OK.
>>
>>
>>
>> Thanks,
>> Vladiimir
>>
>>
>>
>>
>>
>> On Sun, Mar 25, 2018 at 5:55 PM, Andy Fingerhut (jafinger) <
>> jafinger at cisco.com> wrote:
>>
>> If ‘size’ were an integer that meant, ‘the compiler will implement this
>> table with something that can in some cases install up to N entries, but
>> how likely being able to reach N entries is implementation- and
>> table-specific’, then the value of N could be used as the “raw capacity” of
>> a hash table, for example.
>>
>>
>>
>> This might be what Vladimir means when he says “up to N”.  It seems a bit
>> odd to me for an implementation to use, say, a size of 16384 for a hash
>> table if then takes that value and uses it as the raw theoretical maximum
>> capacity of a hash table, where in practice that could mean that adding a
>> set of 12,000 keys causes the 12,000th one to fail to add because of too
>> many collisions with earlier added entries.  75% to 95% typical usable
>> capacity from a medium to large hash table is about the best I’ve seen
>> (where the actual fraction depends on the raw size, how many hash
>> functions, overflow TCAM size used, if any, etc.).
>>
>>
>>
>> I think it would be reasonable to define ‘size’ to take an integer type
>> value in the language spec, and leave the meaning of how that value is used
>> up to the implementation, with just a general guideline of “it is
>> recommended if you can make a table / parser value set that guarantees that
>> many exact, lpm, and/or ternary entries can be added, but an implementation
>> is allowed to use implementations that can fail to add entries at lower
>> number of entries than the given size’.
>>
>>
>>
>> Andy
>>
>>
>>
>> *From: *Mihai Budiu <mbudiu at vmware.com>
>> *Date: *Sunday, March 25, 2018 at 5:17 PM
>> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>, Vladimir
>> Gurevich <vag at barefootnetworks.com>
>> *Cc: *p4-design <p4-design at lists.p4.org>
>> *Subject: *RE: [P4-design] Maybe too much text describing issues around
>> P4 table and parser value set sizes
>>
>>
>>
>> The conclusion is that it is impossible to give a simple description of
>> the meaning of “size”.
>>
>>
>>
>> There are two possibilities going forward:
>>
>>    - We specify in the spec that the tables and value_sets can have a
>>    size integer parameter, but the meaning is target-dependent
>>    - We do not standardize these at all. Tables can still have a size
>>    parameter (it is not mandated or forbidden by the spec, as it is today),
>>    but each target can choose a different way to specify sizes. An example is
>>    given below. For value_sets we have to use an annotation.
>>
>>
>>
>> For tables I will submit again the example in ebpf_model, which is
>> perfect for this particular target:
>>
>>
>>
>> /*
>>
>> Each table must have an implementation property which is either an
>> array_table
>>
>> or a hash_table.
>>
>> */
>>
>>
>>
>> /**
>>
>> Implementation property for tables indicating that tables must be
>> implemented
>>
>> using EBPF array map.
>>
>> */
>>
>> extern array_table {
>>
>>     /// @param size: maximum number of entries in table
>>
>>     array_table(bit<32> size);
>>
>> }
>>
>>
>>
>> /**
>>
>> Implementation property for tables indicating that tables must be
>> implemented
>>
>> using EBPF hash map.
>>
>> */
>>
>> extern hash_table {
>>
>>     /// @param size: maximum number of entries in table
>>
>>     hash_table(bit<32> size);
>>
>> }
>>
>>
>>
>> table t {
>>
>>    keys = …
>>
>>    implementation = hash_table(32);
>>
>> }
>>
>>
>>
>> The benefit of this approach is that an architecture can define whatever
>> parameters it deems necessary for describing the implementation. If
>> Vladimir likes max_size, min_size, average_size and other parameters, he
>> can define an extern which has all these as constructor arguments for his
>> architecture.
>>
>>
>>
>> The downside is that this is certainly target-specific, and thus
>> non-portable.
>>
>>
>>
>> But it looks from Andy’s description that the portability of making size
>> a standard feature is illusory.
>>
>>
>>
>> Mihai
>>
>>
>>
>> *From:* P4-design [mailto:p4-design-bounces at lists.p4.org] *On Behalf Of *Andy
>> Fingerhut (jafinger)
>> *Sent:* Saturday, March 24, 2018 5:10 PM
>> *To:* Vladimir Gurevich <vag at barefootnetworks.com>
>> *Cc:* p4-design <p4-design at lists.p4.org>
>> *Subject:* Re: [P4-design] Maybe too much text describing issues around
>> P4 table and parser value set sizes
>>
>>
>>
>> If by “up to N” you mean “up to N with high probability”, or “up to N for
>> most practical sets of keys you probably want to install”, yes that is what
>> I was trying to convey in the article for hash tables.  I wasn’t trying to
>> say that ATCAMs are any worse than hash tables in that regard (although
>> some variations I have worked with _*are*_), just that they are no
>> better in them, either.
>>
>>
>>
>> Andy
>>
>>
>>
>> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
>> *Date: *Saturday, March 24, 2018 at 4:36 PM
>> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
>> *Cc: *p4-design <p4-design at lists.p4.org>
>> *Subject: *Re: [P4-design] Maybe too much text describing issues around
>> P4 table and parser value set sizes
>>
>>
>>
>> Hi Andy,
>>
>>
>>
>> I totally agree that the practical ATCAM capacity from the control plane
>> point of view (e.g. how many routes you can put into a ATCAM with N entries
>> divided into M partitions) is difficult to predict, but "up to N" statement
>> can still be holding some water :) I think that this is an inherent ATCAM
>> property, much like the fact that you can't generally guarantee any
>> specific occupancy for a hashed table (as you clearly outlined in the
>> article).
>>
>>
>>
>> Thanks,
>>
>> Vladmir
>>
>>
>>
>> On Sat, Mar 24, 2018 at 10:40 AM, Andy Fingerhut (jafinger) <
>> jafinger at cisco.com> wrote:
>>
>> Thanks for the comments, Vladimir.  I agree with just about every word,
>> and would be happy to incorporate your comments into it, perhaps in another
>> week or so after I get some other things done.
>>
>>
>>
>> One quick response here.  The article does not say that algorithmic TCAMs
>> are not “fast”.  It says: ‘All techniques I am aware of here either cannot
>> go "fast" as defined above, or have capacity that is dependent upon the
>> contents of the table entries.’.  If you know of an algorithmic TCAM method
>> that is both fast _and_ has capacity independent of the contents of the
>> table entries, then I really want to hear about it.  I’ve spent way too
>> much of my life dealing with the non-deterministic capacity of algorithmic
>> TCAMs, for Cisco platforms where they wish they could claim the capacity
>> was a fixed number N.
>>
>>
>>
>> Thanks,
>>
>> Andy
>>
>>
>>
>> *From: *Vladimir Gurevich <vag at barefootnetworks.com>
>> *Date: *Saturday, March 24, 2018 at 9:06 AM
>> *To: *"Andy Fingerhut (jafinger)" <jafinger at cisco.com>
>> *Cc: *p4-design <p4-design at lists.p4.org>
>> *Subject: *Re: [P4-design] Maybe too much text describing issues around
>> P4 table and parser value set sizes
>>
>>
>>
>> Hi Andy,
>>
>>
>>
>> Great stuff, worthy of a book chapter or an encyclopedia article.
>>
>>
>>
>> If anything, I'd just say that some of the assumptions are a little
>> "rosy" meaning that they cover the "classic" behaviors, but do not account
>> for a variety of idiosyncratic behaviors that are often encountered in real
>> life. Call them bugs, features or both, but they do have an effect on table
>> capacity. One big topic you sort of omitted is the question of how to count
>> default actions: are they entries or not? do they need to be included in
>> total entry count or not (i.e. should the "natural" size be 2^N, 2^N-1 or
>> 2^N+1)? Even in "good" cases, such as TCAMs or indexed tables, I am sure
>> that different implementations will answer this question differently. In
>> other cases control plane might need to reserve additional entries for a
>> variety of technical/efficiency reasons (faster entry moves in TCAMs,
>> atomic updates, garbage collection (in action profiles), etc.).
>>
>>
>>
>> Therefore, people usually advertise their table capacities as being "up
>> to N entries", with that "up to" being the key, that allows customers to
>> reasonably judge the device capacity, while freeing the vendors from
>> frivolous lawsuits :)
>>
>>
>>
>> I know, there are different opinions in this area (mostly guided by (a)
>> the n-to-1 case you described and (b) the fact that SW implementations
>> might not need size indication at all), but coming from the ASIC
>> background, I firmly believe that the basic "up to" size attribute is vital
>> for both tables and value_sets and is preferred to an annotation, because
>> it expresses a basic intent of the data plane program designer that the
>> compiler cannot reasonably guess (as opposed to many technical allocation
>> parameters (e.g. the number of ways in a hash table) that the compilers can
>> determine automatically, often even better than humans). P4_14 even has
>> min_size and max_size attributes in addition to size and I think that was
>> not a bad idea (although I am not sure if they were implemented anywhere).
>>
>>
>>
>> Last, but not least, I am not quite sure what to make of the comment at
>> the end of the article that assert that ATCAMs are not "fast". I think
>> everything depends on the implementation, but if you can implement a
>> multi-way hash table with O(1) lookup performance, there is no reason why
>> you can't implement an ATCAM with the same performance either.
>>
>>
>>
>> Happy hacking,
>>
>> Vladimir
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Mar 23, 2018 at 9:49 PM, Andy Fingerhut (jafinger) <
>> jafinger at cisco.com> wrote:
>>
>> In the last P4 language design WG meeting, I volunteered to write up some
>> notes on P4 table sizes, and what “size” might mean.
>>
>>
>>
>> I have written up a discussion of the issues around P4 table and parser
>> value set sizes here.  It is more a tutorial style directed at people who
>> might be unfamiliar with issues around high speed implementations of
>> P4-style matching, and what their consequences are for specifying table
>> sizes, especially ones where the actual capacity of a table is not really
>> predictable.
>>
>>
>>
>> https://github.com/jafingerhut/p4-guide/blob/master/docs/p4-
>> table-and-parser-value-set-sizes.md
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_jafingerhut_p4-2Dguide_blob_master_docs_p4-2Dtable-2Dand-2Dparser-2Dvalue-2Dset-2Dsizes.md&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=tGW6TKXajnoXSyy1S1P4DHGPe8sj54GGvw-b21n7aWg&m=igHNDkIsLYagXgGCPpIvkY56e6q_4mGo3TN4g8WOQYk&s=Mm_IWg6WCXnjrb1Vh5lONc_7WE2uO3BgmENj5HnM_BQ&e=>
>>
>>
>>
>> Corrections/questions/comments welcome.
>>
>>
>>
>> Andy
>>
>>
>>
>>
>> _______________________________________________
>> P4-design mailing list
>> P4-design at lists.p4.org
>> http://lists.p4.org/mailman/listinfo/p4-design_lists.p4.org
>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.p4.org_mailman_listinfo_p4-2Ddesign-5Flists.p4.org&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=tGW6TKXajnoXSyy1S1P4DHGPe8sj54GGvw-b21n7aWg&m=igHNDkIsLYagXgGCPpIvkY56e6q_4mGo3TN4g8WOQYk&s=Avv9VPJx5NI3SRUQy9aN7lk_rDLuHqX0_IwMBoDIA2E&e=>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
> _______________________________________________
> P4-design mailing list
> P4-design at lists.p4.org
> http://lists.p4.org/mailman/listinfo/p4-design_lists.p4.org
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.p4.org/pipermail/p4-design_lists.p4.org/attachments/20180326/ec75d4a2/attachment-0001.html>


More information about the P4-design mailing list