[P4-dev] General question on P4

Michael Borokhovich michaelbor at gmail.com
Sun Apr 9 20:59:41 EDT 2017

I agree with Andy. NPUs are for the programmable data plane.

On Sat, Apr 8, 2017 at 5:44 PM, Andy Fingerhut <andy.fingerhut at gmail.com>
wrote:

> Except for one thing, everything you said sounds good to me.
> The one exception is that I haven't heard the term NPU used to include the
> part of the system that runs the control plane code.  In my experience
> those are called supervisor CPUs, control plane CPUs, or route processor
> CPUs, but whichever of those names you use, they tend to be
> off-the-shelf PowerPC, MIPS, Intel, or Arm CPUs, the same kind you would
> find in servers or embedded devices.
> NPU I have usually heard to mean some kind of highly programmable device
> that is tailored for packet forwarding, similar to the categories of ASICs
> you mention for packet forwarding.  Often they are even more flexible in
> what they can do than even the programmable ASICs, but at a significant
> cost in lower packet rate per chip than other networking ASICs.  For
> example, they can run code that processes every byte of the payload,
> whereas even the programmable networking ASICs tend to focus on
> flexibility in packet headers.
> Andy
> On Sat, Apr 8, 2017 at 1:38 PM, Rommel Bajamundi <rommel at bajamundi.com>
> wrote:
>> Ok... I would like to understand this better.  It is my understanding:
>> NPU = router processor, executes control plane (CP), a type of integrated
>> circuit (IC)
>> ASIC = chipset with some prebuilt pipeline to do stuff to offload the NPU,
>> e.g. a Bit Torrent algorithm built into the IC, so the math is done on the
>> ASIC instead of on compute and/or GPU resources.  Or, instead of asking the
>> NPU where to forward packets, define a FIB that can be part of the packet
>> parsing pipeline to do a lookup and forward.
>> Fixed pipeline ASIC = pre-built parsing and decision tree to accelerate
>> DP forwarding
>> Flexible pipeline ASIC = some ability to modify the flow through the
>> decision tree, but not necessarily the ability to change which headers are
>> parsed, with some offloaded functionality that balances speed and features
>> Programmable ASIC = highest flexibility, flexible parser
>> P4 = higher layer programming language built to define the data plane
>> parser, still needs some NPU for programmatic decisions
>> It is my understanding that Tofino gives you the flexibility to do stuff
>> on the DP, e.g. send some log event based on an ingress packet, or parse 10
>> bytes after offset 0x08 and do a lookup, or parse a header like GTP and
>> include it in your ECMP hash.
>> Benefits:
>> - Portable code, assuming the P4-capable ASIC supports the same P4 version
>> - Flexible parser: mobile guys can parse GTP now, or VXLAN over IPv6 can
>> be done without waiting for vendor support.
>> I would liken this to modern day switching using Broadcom or Mellanox
>> with Intel as the NPU.  Intel does all the calculating of the RIB and
>> determining of the FIB, and puts those into tables that the ASIC deals with.
>> I am not a programmer or HW engineer, but this is how I understand it.
>> It was my assumption that the value of Tofino is its ability to have a
>> long lifecycle due to the programmability, to address protocols unsupported
>> by current vendors, and to address newly created protocols.  One would only
>> need chassis upgrades if pps runs out.
>> Rommel
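The two data-plane examples in the message above (taking a 10-byte key at offset 0x08 for a table lookup, and folding a GTP field into the ECMP hash) can be modelled roughly in Python. This is only a behavioural sketch, not P4 and not any vendor's API. It assumes "10 bytes after offset 0x08" means the 10 bytes starting at byte offset 0x08, that the GTP field of interest is the 32-bit tunnel endpoint identifier (TEID), and that a CRC32 stands in for whatever hash a real device uses. All function and variable names are hypothetical.

```python
import zlib

KEY_OFFSET = 0x08   # assumed: key starts at byte offset 0x08
KEY_LENGTH = 10     # 10-byte lookup key, as in the message above


def extract_key(packet: bytes) -> bytes:
    """Return the 10-byte lookup key starting at offset 0x08."""
    if len(packet) < KEY_OFFSET + KEY_LENGTH:
        raise ValueError("packet too short to contain the lookup key")
    return packet[KEY_OFFSET:KEY_OFFSET + KEY_LENGTH]


def lookup(table: dict, packet: bytes, default: str = "drop") -> str:
    """Exact-match table lookup on the extracted key (hypothetical table)."""
    return table.get(extract_key(packet), default)


def ecmp_next_hop(flow_fields: tuple, teid: int, next_hops: list) -> str:
    """Pick a next hop from a hash over the flow fields plus the GTP TEID.

    CRC32 is only a stand-in; real devices use their own hash functions.
    """
    material = repr(flow_fields).encode() + teid.to_bytes(4, "big")
    return next_hops[zlib.crc32(material) % len(next_hops)]


# Usage with made-up data
packet = bytes(range(32))
table = {extract_key(packet): "port1"}
action = lookup(table, packet)
hop = ecmp_next_hop(("10.0.0.1", "10.0.0.2"), 0x1234, ["a", "b", "c"])
```

Including the TEID in the hash input means different tunnels between the same pair of endpoints can be spread across different next hops, which is the point of parsing GTP in the data plane.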
>> On Apr 8, 2017, at 1:48 PM, Michael Borokhovich <michaelbor at gmail.com>
>> wrote:
>> Hi Andy,
>> Thank you for the insight. If this is the performance difference, then of
>> course the advantage of a P4 ASIC (e.g., Tofino) is obvious. I see that
>> EZchip NP5 supports 300 million packets per second, but I didn't find a
>> similar spec for Tofino. Also, this comparison should be done for
>> comparable programs, since each additional piece of functionality
>> (parsing/modifying an additional header field or doing an additional table
>> search) affects the pps metric.
>> But again, if Tofino indeed achieves ~10 times more pps than, e.g., EZchip
>> NP5 for the same program, then I clearly see the benefit and the novelty.
>> Michael.
>> On Fri, Apr 7, 2017 at 5:52 PM, Andy Fingerhut <andy.fingerhut at gmail.com>
>> wrote:
>>> In case it isn't obvious, max packet rate that you can achieve in an
>>> ASIC turns into a significant difference in cost when buying the equipment
>>> and paying the power bill for a network.
>>> Suppose you have a choice of a programmable ASIC that goes at 2 billion
>>> packets per second, and an NPU that goes up to 200 million packets per
>>> second, and they both cost roughly the same amount and consume the same
>>> power.
>>> You have some part of a data center connecting a bunch of hosts together
>>> where you decide that kind of programmability is important.  You do some
>>> calculations to determine those hosts need 200 billion packets per second
>>> of forwarding capacity between them.
>>> Do you want to buy and provide power for 200/2 = 100 fast programmable
>>> ASICs, or 200/0.2 = 1,000 programmable NPUs?
>>> Andy
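The chip-count arithmetic in the message above can be written out as a small Python sketch. The function name `chips_needed` is hypothetical, and the rates are the illustrative figures from the message, not measured specifications of any product. Like the original example, it ignores network topology and simply divides the required capacity by the per-chip rate.

```python
def chips_needed(required_pps: int, per_chip_pps: int) -> int:
    """Smallest whole number of chips whose combined rate meets the requirement."""
    if per_chip_pps <= 0:
        raise ValueError("per_chip_pps must be positive")
    # Integer ceiling division, so a partial chip rounds up to a whole one.
    return (required_pps + per_chip_pps - 1) // per_chip_pps


REQUIRED_PPS = 200_000_000_000  # 200 billion packets per second
ASIC_PPS = 2_000_000_000        # programmable ASIC: 2 billion packets per second
NPU_PPS = 200_000_000           # NPU: 200 million packets per second

asic_count = chips_needed(REQUIRED_PPS, ASIC_PPS)
npu_count = chips_needed(REQUIRED_PPS, NPU_PPS)
print(asic_count, npu_count)
```

With equal per-chip cost and power, as the message assumes, the ratio of the two counts is also the ratio of equipment cost and power draw.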
>>> On Fri, Apr 7, 2017 at 2:37 PM, Andy Fingerhut <andy.fingerhut at gmail.com>
>>> wrote:
>>>> I don't have experience with all NPUs, but many I have seen top out on
>>>> the order of hundreds of millions of packets per second with current
>>>> technology.
>>>> With the same current technology, it is possible to design fixed
>>>> function ASICs, and programmable ASICs like Barefoot's Tofino, that achieve
>>>> billions of packets per second.
>>>> The main difference that I am aware of is that many NPUs are based on
>>>> parallel arrays of 32-bit or 64-bit processor cores, and each core requires
>>>> many cycles for things like constructing table search keys and performing
>>>> side effects on the 'packet vector' (per-packet state maintained only while
>>>> that packet is forwarded).  If you want to go at billions of packets
>>>> per second, the only way I know to get there is to have fixed or
>>>> configurable hardware that can do those things in 1 or 2 clock cycles per
>>>> packet.
>>>> You can write a compiler that compiles a P4 program to run on an NPU as
>>>> described above, and it will achieve portability of the P4 program, but it
>>>> won't make that NPU able to go at billions of packets per second.  It is
>>>> limited in performance by its hardware architecture.
>>>> There are proprietary methods for programming some ASICs that can go at
>>>> billions of packets per second, but all that I know of are lower level than
>>>> P4 and non-portable.
>>>> Andy
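The cycles-per-packet argument in the message above can be illustrated with a rough throughput model. It assumes the peak packet rate is simply clock rate times the number of parallel units divided by cycles spent per packet, and ignores memory bandwidth, scheduling, and everything else. All numbers below are invented for illustration and are not specifications of Tofino or of any NPU; `max_packet_rate` is a hypothetical helper.

```python
def max_packet_rate(clock_hz: int, parallel_units: int, cycles_per_packet: int) -> float:
    """Idealized peak packets per second for a device.

    parallel_units: processor cores (NPU) or pipelines (ASIC).
    cycles_per_packet: cycles one unit spends on each packet.
    """
    if cycles_per_packet <= 0:
        raise ValueError("cycles_per_packet must be positive")
    return clock_hz * parallel_units / cycles_per_packet


# Hypothetical NPU: 64 cores at 1 GHz, each spending 320 cycles per packet.
npu_rate = max_packet_rate(1_000_000_000, 64, 320)

# Hypothetical pipelined ASIC: 2 pipelines at 1 GHz, 1 cycle per packet each.
asic_rate = max_packet_rate(1_000_000_000, 2, 1)

print(f"NPU:  {npu_rate:.0f} packets/s")
print(f"ASIC: {asic_rate:.0f} packets/s")
```

In this model, compiling the same P4 program for the NPU changes how the program is written but not `cycles_per_packet`, which is why portability alone does not close the gap.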
>>>> On Thu, Apr 6, 2017 at 6:37 PM, Michael Borokhovich
>>>> <michaelbor at gmail.com> wrote:
>>>>> Hi Remy,
>>>>> I'm not confusing hardware with the language... What I mean is that P4
>>>>> plus an ASIC that supports it claims to give us a programmable data plane,
>>>>> and this is claimed to be the innovation. But that is exactly the purpose
>>>>> of NPUs - to give us a programmable data plane - and NPUs have been around
>>>>> for a very long time. So maybe I'm missing the point of the innovation that
>>>>> P4 plus a supporting ASIC gives. As Nate said, and I agree, one big
>>>>> advantage is portability and the other is the ability to do verification.
>>>>> So, P4 brings a kind of open standard for programmable ASICs, analogous
>>>>> to a programming language (e.g., C) for regular CPUs, whereas each NPU
>>>>> currently has its own language and programming style.
>>>>> What do you think?
>>>>> Thanks,
>>>>> Michael.
>>>>> On Thu, Apr 6, 2017 at 2:07 PM, Remy Chang <remy at barefootnetworks.com>
>>>>> wrote:
>>>>>> Hi Michael,
>>>>>> It seems you're conflating hardware with language.  An NPU, a
>>>>>> programmable ASIC, a general-purpose CPU, and even a GPU can all
>>>>>> potentially execute P4 code.
>>>>>> Regards,
>>>>>> Remy
>>>>>> On Apr 6, 2017 10:57, "Michael Borokhovich" <michaelbor at gmail.com>
>>>>>> wrote:
>>>>>> Thanks for the reply Nate!
>>>>>> So, to summarize, the benefits of the P4 approach are: portability and
>>>>>> performance. Other than that you probably can achieve the same (if not
>>>>>> better) flexibility/programmability with an NPU. Is this correct?
>>>>>> On Thu, Apr 6, 2017 at 1:01 AM, Nate Foster <jnfoster at cs.cornell.edu>
>>>>>> wrote:
>>>>>>> Your question seems to be more about the relative merits of various
>>>>>>> architectures than the P4 language. But yes an ASIC is generally more
>>>>>>> efficient than an NPU, at least at scale.
>>>>>>> Beyond efficiency there are other benefits to expressing a data
>>>>>>> plane algorithm in an open framework like P4. For example, a P4 program
>>>>>>> should be relatively easy to port to a different target. The same is
>>>>>>> unlikely to be true for C programs written against closed SDKs.
>>>>>>> -N
>>>>>>> On Wed, Apr 5, 2017 at 6:59 PM, Michael Borokhovich
>>>>>>> <michaelbor at gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> P4 allows for a configurable data plane, e.g., we can easily support
>>>>>>>> new custom protocols. However, the same functionality may be achieved by
>>>>>>>> using a network processor, e.g., EZchip (the one I had experience with).
>>>>>>>> As I understand it, the advantages of a programmable ASIC/FPGA that
>>>>>>>> supports P4 are better performance and a lower price than a network
>>>>>>>> processor?
>>>>>>>> What do you think?
>>>>>>>> Thanks!
>>>>>>>> Michael.
>>>>>>>> _______________________________________________
>>>>>>>> P4-dev mailing list
>>>>>>>> P4-dev at lists.p4.org
>>>>>>>> http://lists.p4.org/mailman/listinfo/p4-dev_lists.p4.org