[P4-dev] General question on P4

Rommel Bajamundi rommel at bajamundi.com
Sun Apr 9 22:37:36 EDT 2017

I appreciate the clarification from both of you.  I have been on the
switch/router vendor side, and I have always referred to our chips as custom
ASICs rather than NPUs.

On Apr 9, 2017 8:00 PM, "Michael Borokhovich" <michaelbor at gmail.com> wrote:

> I agree with Andy. NPUs are for the programmable data plane.
> On Sat, Apr 8, 2017 at 5:44 PM, Andy Fingerhut <andy.fingerhut at gmail.com>
> wrote:
>> Except for one thing, everything you said sounds good to me.
>> The one exception is that I haven't heard the term NPU used to include
>> the part of the system that runs the control plane code.  In my experience
>> those are called supervisor CPUs, control plane CPUs, or route processor
>> CPUs, but no matter which of those names you use, they tend to be
>> off-the-shelf PowerPC, MIPS, Intel, or Arm CPUs, the same kind you would
>> find in servers or embedded devices.
>> I have usually heard NPU used to mean a highly programmable device
>> that is tailored for packet forwarding, similar to the categories of ASICs
>> you mention for packet forwarding.  NPUs are often even more flexible than
>> the programmable ASICs, but at a significant cost: a lower packet rate per
>> chip than other networking ASICs.  For example, an NPU can run code that
>> processes every byte of the payload, whereas even the programmable
>> networking ASICs tend to focus on flexibility in packet headers.
>> Andy
>> On Sat, Apr 8, 2017 at 1:38 PM, Rommel Bajamundi <rommel at bajamundi.com>
>> wrote:
>>> OK... I would like to understand this better.  My understanding is:
>>> NPU = route processor that executes the control plane (CP); a type of
>>> integrated circuit (IC)
>>> ASIC = chipset with some prebuilt pipeline that offloads work from the
>>> NPU, e.g. a BitTorrent algorithm built into the IC so that the math runs
>>> on the ASIC instead of on compute and/or GPU resources.  Or, instead of
>>> asking the NPU where to forward packets, define a FIB that is part of the
>>> packet parsing pipeline to do a lookup and forward.
>>> Fixed pipeline ASIC = prebuilt parsing and decision tree to accelerate
>>> DP forwarding
>>> Flexible pipeline ASIC = some ability to modify the flow through the
>>> decision tree, but not necessarily the ability to change which headers are
>>> parsed; some offloaded functionality that balances speed and features
>>> Programmable ASIC = highest flexibility, flexible parser
>>> P4 = higher-level programming language built to define the data plane
>>> parser; still needs some NPU for programmatic decisions
>>> It is my understanding that Tofino gives you the flexibility to do
>>> things in the DP, e.g. send a log event based on an ingress packet, parse
>>> 10 bytes after offset 0x08 and do a lookup, or parse a header like GTP and
>>> use it as part of your ECMP hash.
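A minimal sketch of the kind of processing described above, written in Python
rather than P4 and purely for illustration. It pulls a fixed-width field out
of a packet at a byte offset, reads the TEID from a GTP-U header, and folds
that TEID into an ECMP hash. The function names, the CRC32 hash, and the
sample values are assumptions made for this sketch; they do not describe
Tofino's behavior or any vendor's API.

```python
import struct
import zlib


def extract_field(packet: bytes, offset: int, length: int) -> bytes:
    """Return `length` bytes starting at `offset`, e.g. 10 bytes after 0x08."""
    if offset + length > len(packet):
        raise ValueError("packet too short for requested field")
    return packet[offset:offset + length]


def gtpu_teid(gtp_header: bytes) -> int:
    """Read the 32-bit tunnel endpoint ID (TEID) from a GTP-U header."""
    if len(gtp_header) < 8:
        raise ValueError("GTP-U header needs at least 8 bytes")
    # Mandatory header: flags (1 byte), message type (1), length (2), TEID (4).
    _flags, _msg_type, _length, teid = struct.unpack("!BBHI", gtp_header[:8])
    return teid


def ecmp_pick(src_ip: bytes, dst_ip: bytes, teid: int, next_hops: list) -> str:
    """Choose a next hop by hashing the outer addresses together with the TEID."""
    key = src_ip + dst_ip + struct.pack("!I", teid)
    return next_hops[zlib.crc32(key) % len(next_hops)]


# Hypothetical sample: a G-PDU header (message type 0xFF) carrying TEID 0x1234.
header = bytes([0x30, 0xFF, 0x00, 0x10, 0x00, 0x00, 0x12, 0x34])
teid = gtpu_teid(header)
hop = ecmp_pick(bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]), teid, ["port1", "port2"])
print(hex(teid), hop)
```

Because the TEID is part of the hash key, two tunnels between the same pair of
outer addresses can land on different next hops, which is the point of parsing
GTP before hashing.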
>>> Benefits:
>>> - Portable code, assuming the P4-capable ASICs support the same P4 version
>>> - Flexible parser: mobile operators can parse GTP now, and VXLAN over
>>> IPv6 can be implemented without waiting for vendor support.
>>> I would liken this to modern-day switching using a Broadcom or Mellanox
>>> ASIC with an Intel CPU as the NPU.  The Intel CPU calculates the RIB,
>>> determines the FIB, and puts those entries into tables that the ASIC uses.
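The split described above, where the CPU computes routes and the forwarding
chip only does lookups, can be sketched in Python as follows. This is an
illustration under assumptions: the prefixes, the port names, and the linear
longest-prefix-match scan are invented for the example, and forwarding ASICs
typically use dedicated hardware tables rather than a sorted list.

```python
import ipaddress


def build_fib(rib: dict) -> list:
    """Control-plane step: turn {prefix: next hop} into a longest-first list."""
    fib = [(ipaddress.ip_network(prefix), hop) for prefix, hop in rib.items()]
    # Sort so the most specific (longest) prefix is checked first.
    fib.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return fib


def lookup(fib: list, dst: str):
    """Data-plane step: return the next hop of the longest matching prefix."""
    addr = ipaddress.ip_address(dst)
    for network, hop in fib:
        if addr in network:
            return hop
    return None


# Hypothetical routes chosen by the control plane.
fib = build_fib({
    "0.0.0.0/0": "uplink",
    "10.0.0.0/8": "port1",
    "10.1.0.0/16": "port2",
})
print(lookup(fib, "10.1.2.3"))  # port2
print(lookup(fib, "10.2.0.1"))  # port1
print(lookup(fib, "8.8.8.8"))   # uplink
```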
>>> I am not a programmer or HW engineer, but this is how I understand it.
>>> It was my assumption that the value of Tofino is a long lifecycle due to
>>> its programmability, plus the ability to handle protocols that current
>>> vendors do not support and new protocols as they are created.  One would
>>> only need chassis upgrades if pps runs out.
>>> Rommel
>>> On Apr 8, 2017, at 1:48 PM, Michael Borokhovich <michaelbor at gmail.com>
>>> wrote:
>>> Hi Andy,
>>> Thank you for the insight. If this is the performance difference, then
>>> of course the advantage of a P4 ASIC (e.g., Tofino) is obvious. I see that
>>> EZchip NP5 supports 300 million packets per second, but I didn't find a
>>> similar spec for Tofino. Also, this comparison should be done for
>>> comparable programs, since each additional piece of functionality
>>> (parsing/modifying an additional header field or doing an additional table
>>> search) affects the pps metric.
>>> But again, if Tofino indeed achieves ~10 times more pps than, e.g.,
>>> EZchip NP5 for the same program, then I clearly see the benefit and the
>>> novelty.
>>> Michael.
>>> On Fri, Apr 7, 2017 at 5:52 PM, Andy Fingerhut <andy.fingerhut at gmail.com> wrote:
>>>> In case it isn't obvious, the max packet rate that you can achieve in an
>>>> ASIC turns into a significant difference in cost when buying the equipment
>>>> and paying the power bill for a network.
>>>> Suppose you have a choice of a programmable ASIC that goes at 2 billion
>>>> packets per second, and an NPU that goes up to 200 million packets per
>>>> second, and they both cost roughly the same amount and consume the same
>>>> power.
>>>> You have some part of a data center connecting a bunch of hosts
>>>> together where you decide that kind of programmability is important.  You
>>>> do some calculations to determine those hosts need 200 billion packets per
>>>> second of forwarding capacity between them.
>>>> Do you want to buy and provide power for 200/2 = 100 fast programmable
>>>> ASICs, or 200/.2 = 1,000 programmable NPUs?
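The arithmetic above can be written out as a short Python sketch. The numbers
are the ones from the example (200 billion pps required, 2 billion pps per
programmable ASIC, 200 million pps per NPU). The function name is made up for
illustration, and the model ignores topology, so it only counts raw forwarding
capacity.

```python
def chips_needed(required_pps: int, per_chip_pps: int) -> int:
    """Smallest number of chips whose combined packet rate meets the demand."""
    # Integer ceiling division: round up when the demand is not an exact multiple.
    return -(-required_pps // per_chip_pps)


required = 200_000_000_000   # 200 billion packets per second
asic_rate = 2_000_000_000    # 2 billion pps per programmable ASIC
npu_rate = 200_000_000       # 200 million pps per NPU

print(chips_needed(required, asic_rate))  # 100
print(chips_needed(required, npu_rate))   # 1000
```

With equal cost and power per chip, as the example assumes, the 10x gap in
packet rate becomes a 10x gap in purchase cost and power bill.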
>>>> Andy
>>>> On Fri, Apr 7, 2017 at 2:37 PM, Andy Fingerhut <
>>>> andy.fingerhut at gmail.com> wrote:
>>>>> I don't have experience with all NPUs, but many I have seen top out on
>>>>> the order of hundreds of millions of packets per second with current
>>>>> technology.
>>>>> With the same current technology, it is possible to design fixed
>>>>> function ASICs, and programmable ASICs like Barefoot's Tofino, that achieve
>>>>> billions of packets per second.
>>>>> The main difference that I am aware of is that many NPUs are based on
>>>>> parallel arrays of 32-bit or 64-bit processor cores, and each core requires
>>>>> many cycles for things like constructing table search keys and performing
>>>>> side effects on the 'packet vector' (state maintained while forwarding the
>>>>> packet about that packet only).  If you want to go at billions of packets
>>>>> per second, the only way I know to get there is to have fixed or
>>>>> configurable hardware that can do those things in 1 or 2 clock cycles per
>>>>> packet.
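A rough back-of-the-envelope model of the difference described above, as a
Python sketch. The core count, clock rates, cycles per packet, and pipeline
count are hypothetical numbers picked only to show the shape of the
calculation; they are not specifications of any real NPU or ASIC.

```python
def npu_pps(cores: int, clock_hz: int, cycles_per_packet: int) -> float:
    """Packets per second when each core spends many cycles on one packet."""
    return cores * clock_hz / cycles_per_packet


def pipeline_pps(pipelines: int, clock_hz: int, clocks_per_packet: int) -> float:
    """Packets per second when each pipeline accepts a packet every 1 or 2 clocks."""
    return pipelines * clock_hz / clocks_per_packet


# Hypothetical NPU: 256 cores at 1 GHz, 1,000 cycles of work per packet.
npu = npu_pps(cores=256, clock_hz=1_000_000_000, cycles_per_packet=1_000)
# Hypothetical ASIC: 4 pipelines at 1 GHz, one packet per clock per pipeline.
asic = pipeline_pps(pipelines=4, clock_hz=1_000_000_000, clocks_per_packet=1)

print(f"NPU:  {npu / 1e6:.0f} Mpps")   # NPU:  256 Mpps
print(f"ASIC: {asic / 1e9:.0f} Gpps")  # ASIC: 4 Gpps
```

In this model, adding cores raises the NPU rate only linearly, while the
per-packet cycle count sits in the denominator, which is why a design that
handles a packet in 1 or 2 clocks reaches a rate the core array cannot.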
>>>>> You can write a compiler that compiles a P4 program to run on an NPU
>>>>> as described above, and it will achieve portability of the P4 program, but
>>>>> it won't make that NPU able to go at billions of packets per second.  It is
>>>>> limited in performance by its hardware architecture.
>>>>> There are proprietary methods for programming some ASICs that can go
>>>>> at billions of packets per second, but all that I know of are lower level
>>>>> than P4 and non-portable.
>>>>> Andy
>>>>> On Thu, Apr 6, 2017 at 6:37 PM, Michael Borokhovich <
>>>>> michaelbor at gmail.com> wrote:
>>>>>> Hi Remy,
>>>>>> I'm not confusing hardware with the language... What I mean is that
>>>>>> P4 plus an ASIC that supports it claims to give us a programmable data
>>>>>> plane, and this is claimed to be the innovation. But that is exactly the
>>>>>> purpose of NPUs: to give us a programmable data plane, and NPUs have been
>>>>>> around for a very long time. So maybe I'm missing the point of the
>>>>>> innovation that P4 plus an ASIC that supports it brings. As Nate said, and
>>>>>> I agree, one big advantage is portability, and the other is the ability
>>>>>> to do verification.
>>>>>> So, P4 brings a kind of open standard for programmable ASICs, which
>>>>>> is analogous to a programming language (e.g., C) for regular CPUs, while
>>>>>> each NPU currently has its own language and programming style.
>>>>>> What do you think?
>>>>>> Thanks,
>>>>>> Michael.
>>>>>> On Thu, Apr 6, 2017 at 2:07 PM, Remy Chang <remy at barefootnetworks.com> wrote:
>>>>>>> Hi Michael,
>>>>>>> It seems you're conflating hardware with language.  NPUs,
>>>>>>> programmable ASICs, general-purpose CPUs, and even GPUs can all
>>>>>>> potentially execute P4 code.
>>>>>>> Regards,
>>>>>>> Remy
>>>>>>> On Apr 6, 2017 10:57, "Michael Borokhovich" <michaelbor at gmail.com>
>>>>>>> wrote:
>>>>>>> Thanks for the reply Nate!
>>>>>>> So, to summarize, the benefits of the P4 approach are portability and
>>>>>>> performance. Other than that, you can probably achieve the same (if not
>>>>>>> better) flexibility/programmability with an NPU. Is this correct?
>>>>>>> On Thu, Apr 6, 2017 at 1:01 AM, Nate Foster <jnfoster at cs.cornell.edu> wrote:
>>>>>>>> Your question seems to be more about the relative merits of various
>>>>>>>> architectures than about the P4 language. But yes, an ASIC is generally
>>>>>>>> more efficient than an NPU, at least at scale.
>>>>>>>> Beyond efficiency, there are other benefits to expressing a data
>>>>>>>> plane algorithm in an open framework like P4. For example, a P4 program
>>>>>>>> should be relatively easy to port to a different target. The same is
>>>>>>>> unlikely to be true for C programs written against closed SDKs.
>>>>>>>> -N
>>>>>>>> On Wed, Apr 5, 2017 at 6:59 PM, Michael Borokhovich <
>>>>>>>> michaelbor at gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>> P4 allows for a configurable data plane, e.g., we can easily support
>>>>>>>>> new custom protocols. However, the same functionality may be achieved by
>>>>>>>>> using a network processor, e.g., EZchip (the one I had experience with).
>>>>>>>>> As I understand it, the advantages of a programmable ASIC/FPGA that
>>>>>>>>> supports P4 are better performance and a lower price than a network
>>>>>>>>> processor?
>>>>>>>>> What do you think?
>>>>>>>>> Thanks!
>>>>>>>>> Michael.
>>>>>>>>> _______________________________________________
>>>>>>>>> P4-dev mailing list
>>>>>>>>> P4-dev at lists.p4.org
>>>>>>>>> http://lists.p4.org/mailman/listinfo/p4-dev_lists.p4.org