new language features for "5G" and "edge" use cases

GP
Gergely Pongracz
Tue, Jun 1, 2021 12:56 PM

Hi,

At the P4 WS we presented a use case that is, I admit, not very unique these days (implementing 5G network functions with P4). While implementing these NFs we came across a few limitations in P4 and started to wonder whether new features solving these issues could become part of P4 in the future.

There are a few slides on these: https://opennetworking.org/wp-content/uploads/2021/05/Gergely-Pongracz-Slides.pdf

Basically there are the following 3 cases:

  1. Short-term (~1 RTT) buffering: this would be handy for segmentation-reassembly and retransmit loop use cases. Basically, "buffer" and "remove" actions would be needed, preferably with timer support.
  2. More generic buffering and time-based events: for programmable traffic management, packet scheduling or keepalive messages we could use a more generic method. The problem is possibly very similar to the previous one at the API level, but in these cases there are probably fewer limitations on the buffer size, which could make this a bit more tricky from a hardware development perspective.
  3. Modular pipelines: this would mean that we could specify multiple pipelines and a "master" pipeline that would call the underlying ones as subroutines. Perhaps more than 2 layers could be allowed, e.g. a "master" could also be re-used as a module by a higher-layer pipeline. This could probably be solved entirely at the compiler level.
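
To make the first case concrete, here is a minimal sketch in Python (not P4) of what "buffer" and "remove" actions with timer support could mean. The class name `ShortTermBuffer`, the `expire` method and the explicit `now` argument are hypothetical choices for illustration only; none of this is an existing P4 construct or a proposed syntax.

```python
import heapq


class ShortTermBuffer:
    """Hypothetical model of a ~1 RTT packet buffer with per-packet timers."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity   # maximum number of buffered packets
        self.timeout = timeout     # seconds a packet may wait, roughly one RTT
        self._packets = {}         # key -> (deadline, packet)
        self._timers = []          # min-heap of (deadline, key)

    def buffer(self, key, packet, now):
        """Store a packet and start its timer; return False if the buffer is full."""
        if key not in self._packets and len(self._packets) >= self.capacity:
            return False
        deadline = now + self.timeout
        self._packets[key] = (deadline, packet)
        heapq.heappush(self._timers, (deadline, key))
        return True

    def remove(self, key):
        """Drop a packet, e.g. once it is acknowledged; return it, or None if absent."""
        entry = self._packets.pop(key, None)
        return None if entry is None else entry[1]

    def expire(self, now):
        """Return (key, packet) pairs whose timer has fired, e.g. to retransmit."""
        fired = []
        while self._timers and self._timers[0][0] <= now:
            deadline, key = heapq.heappop(self._timers)
            entry = self._packets.get(key)
            # Ignore stale timers of packets that were removed or re-buffered.
            if entry is not None and entry[0] == deadline:
                del self._packets[key]
                fired.append((key, entry[1]))
        return fired
```

In a retransmit loop, `buffer` would be called when a segment is sent, `remove` when its acknowledgement arrives, and `expire` on each timer tick. The second case differs mainly in how large `capacity` may need to be.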

We also described a few less important cases (e.g. re-using tables, registers, etc.), but we could easily find workarounds to overcome those limitations, so let's focus on the ones above.

I'm quite new to this community, although we've been using P4 for a while now. So I don't really know what the best way is to start discussing these issues and, if you find them useful, how to start working on (some of) them.
Thanks for any hints and help.
BR,

Gergely

NF
Nate Foster
Tue, Jun 1, 2021 1:51 PM

Hi Gergely,

The best way to get involved is to attend the monthly P4 LDWG meeting. For
bringing proposals, we have a process described in the README.md of the
p4-spec repository (https://github.com/p4lang/p4-spec). Basically we are
happy to entertain a high-level proposal, resulting in a thumbs up or
thumbs down. For detailed proposals, we expect to see a number of things
fully worked out: specification language changes, prototype implementation,
and example programs. Suffice it to say, this is a lot of work, so it's best
to either be certain you want to pursue it and are able to follow through,
or convince others to help you out.

Responding to these topics:

  1. Does this need a language change or could these buffers be modeled with
    an extern? If so, then no language change is needed, just an architecture
    that supports buffering.
  2. Is the idea the same as Ibanez et al.'s notion of events (
    https://dl.acm.org/doi/10.1145/3365609.3365848)? If not, what language
    changes would you propose?
  3. Simple forms of modularity can be accomplished by treating controls as
    composable units. Note that they support constructors with
    statically-determined parameters. Otherwise, have you looked at Lyra and
    MicroP4 from SIGCOMM '20? (Minor: we have shifted to inclusive terminology,
    so I recommend using the name "primary" and not the one you used.)
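
As a rough illustration of point 3, the following Python sketch models controls as composable units that are configured once through constructor parameters and then applied per packet. The classes `Control`, `DecrementTtl`, `SetPort` and `Pipeline` and the packet fields are hypothetical; the sketch only mirrors the idea and is not P4 syntax.

```python
class Control:
    """Hypothetical stand-in for a control: configured once, applied per packet."""

    def apply(self, pkt):
        raise NotImplementedError


class DecrementTtl(Control):
    def __init__(self, min_ttl):
        # Constructor parameter fixed when the pipeline is built.
        self.min_ttl = min_ttl

    def apply(self, pkt):
        if pkt["ttl"] <= self.min_ttl:
            pkt["drop"] = True
        else:
            pkt["ttl"] -= 1


class SetPort(Control):
    def __init__(self, port):
        self.port = port

    def apply(self, pkt):
        if not pkt.get("drop"):
            pkt["egress_port"] = self.port


class Pipeline(Control):
    """A pipeline is itself a control, so it can be nested in a larger one."""

    def __init__(self, *stages):
        self.stages = stages

    def apply(self, pkt):
        for stage in self.stages:
            stage.apply(pkt)


# A "primary" pipeline calling an inner pipeline as a subroutine.
inner = Pipeline(DecrementTtl(min_ttl=1))
primary = Pipeline(inner, SetPort(port=7))
```

Because `Pipeline` is a `Control`, more than two layers compose the same way; whether a hardware compiler can flatten such nesting efficiently is a separate question.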

-N


P4-design mailing list -- p4-design@lists.p4.org
To unsubscribe send an email to p4-design-leave@lists.p4.org

NF
Nate Foster
Tue, Jun 1, 2021 1:53 PM

ps. When exploring a language design, it's extremely helpful to look at
concrete examples (and not just abstract features).

If you have code for some 5G use cases worked out, even in hypothetical
extensions of P4, those would be great to see.

-N


MB
Mihai Budiu
Tue, Jun 1, 2021 5:35 PM

BTW: the next design meeting is taking place Monday June 7 at 1:30PM Pacific Time.

You can join the meeting here: https://VMware.zoom.us/j/5366328548?pwd=dVhmMlZJQnl2czkvdENUVnIrdjBKdz09
I realize this may not be a good time for you; in that case we can organize a special session at a different time where you can present your proposals.

Mihai


GP
Gergely Pongracz
Fri, Jun 4, 2021 9:13 AM

Hi Nate,

Sorry for the long delay. I uploaded our example code here: https://github.com/P4ELTE/use_cases/tree/master/p4-16/bst
The main code is cpf_ran.p4. In theory it supports TNA, PSA and V1; we used Tofino’s compiler and t4p4s (basically p4c with a DPDK backend) for compiling and running.

Buffering would be executed in the RANDownlink() control block. For now, we add a special header and send the packet to the service IP of the BaaS (buffer-as-a-service), which currently runs as a Kubernetes service. We could just as well clone the packet, sending the original directly to the downlink path and the copy to the buffer. Instead, the packet is sent to the buffer, and on successful buffering the BaaS returns it; this way we know that the original packet is buffered and the timeout counter has started.
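
The round trip can be sketched roughly as follows in Python. This is not the code in cpf_ran.p4: the `Packet` and `BufferService` classes, the `seq` and `buffered` header fields, and the capacity limit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    payload: bytes
    buffer_hdr: Optional[dict] = None  # hypothetical "special header"


class BufferService:
    """Stand-in for the BaaS; field names and capacity limit are assumptions."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity
        self.timeout = timeout
        self.stored = {}  # seq -> (payload, deadline)

    def handle(self, pkt, now):
        """Buffer the packet and return it on success, or None on failure."""
        if len(self.stored) >= self.capacity:
            return None
        seq = pkt.buffer_hdr["seq"]
        # Storing the deadline models starting the timeout counter.
        self.stored[seq] = (pkt.payload, now + self.timeout)
        return Packet(pkt.payload, {"seq": seq, "buffered": True})


def ran_downlink(pkt, seq, baas, now):
    """Tag the packet, detour via the buffer, forward only what comes back."""
    tagged = Packet(pkt.payload, {"seq": seq, "buffered": False})
    returned = baas.handle(tagged, now)
    if returned is None or not returned.buffer_hdr["buffered"]:
        return None
    # Strip the special header before continuing on the downlink path.
    return Packet(returned.payload)
```

The clone-based alternative would forward the original immediately and send only a copy to `handle`, at the cost of not knowing whether buffering succeeded before the packet leaves.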

Regarding your questions:

  1. You are right, maybe it could be solved by an extern, similar to how we solve it now with a non-P4 component. On the other hand, I don’t particularly like having too many architectures around, as that kills one of the main advantages of P4 (to my knowledge), which is portability. So I’d rather go for a language change here; for me the only reason not to do that would be if the task were impossible to support on some hardware targets. You know the language much better, but I’d say buffering a few packets could be similar to having a few more registers, so buffering itself doesn’t seem a huge issue to me. Running timers and assigning events to them, on the other hand, might be a bigger change, as there could potentially be a large number of parallel timers. Of course there are good data structures for that, but are they hardware friendly enough? Ibanez’s presentation suggests it can be done fairly simply on an FPGA.
  2. Judging by the presentation, I think the proposed solution, especially if all the primitives proposed on slide 6 were implemented, is a superset of what we’d need (I’d say enqueue, dequeue and timer expiration would be enough for us). So if Ibanez’s proposal were part of the language, we wouldn’t need more (at least for now).
  3. Yes, if you have a look at the code you’ll see that we already use control blocks to modularize it. With Tofino this is sometimes not straightforward, as the compiler tends to use more stages than it does with fewer control blocks (this issue was also mentioned in the uP4 talk). As I understood it, Lyra is a higher-layer solution for portability across multiple DSLs, so I guess it would be handy if portability remained an issue even in the long term. I think Lyra’s composition part could deal with composing multiple modules / programs on a single switch. I guess you referred to this feature, but I don’t think we’d need a Lyra-like engine in the long run.
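
To show how the three events mentioned in point 2 might be used together, here is a small Python sketch of event-driven processing. The `EventPipeline` class, the event names as strings and the handler signatures are hypothetical; the sketch does not reproduce the primitives or syntax of Ibanez et al.'s proposal.

```python
from collections import defaultdict


class EventPipeline:
    """Hypothetical dispatcher: handlers run when a named event is emitted."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, event):
        """Decorator that registers a handler for the given event name."""
        def register(fn):
            self.handlers[event].append(fn)
            return fn
        return register

    def emit(self, event, **meta):
        for fn in self.handlers[event]:
            fn(**meta)


pipeline = EventPipeline()
queue_depth = defaultdict(int)  # bytes currently queued, per flow
keepalives = []                 # times at which a keepalive was generated


@pipeline.on("enqueue")
def count_in(flow, length):
    queue_depth[flow] += length


@pipeline.on("dequeue")
def count_out(flow, length):
    queue_depth[flow] -= length


@pipeline.on("timer")
def send_keepalive(now):
    keepalives.append(now)
```

Here the enqueue and dequeue handlers share state to track per-flow occupancy, and the timer handler produces keepalives without any packet arriving, which is the part that plain packet-triggered processing cannot express.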

So my only remaining question: are the proposals from Ibanez & co. already being considered by one of the working groups, e.g. the LDWG? If yes, I’ll go through the details, as that is quite likely a good solution for us too.
Thanks!

Gergely

From: Nate Foster jnfoster@cs.cornell.edu
Sent: Tuesday, June 1, 2021 3:51 PM
To: Gergely Pongracz Gergely.Pongracz@ericsson.com
Cc: p4-design p4-design@lists.p4.org
Subject: Re: [P4-design] new language features for "5G" and "edge" use cases

Hi Gergely,

The best way to get involved is to attend the monthly P4 LDWG meeting. For bringing proposals, we have a process discussed on the README.md for the p4-spec repository (https://github.com/p4lang/p4-spec). Basically we are happy to entertain a high-level proposal, resulting in a thumbs up or thumbs down. For detailed proposals, we expect to see a number of things fully worked out: specification language changes, prototype implementation, and example programs. Suffice to say, this is a lot of work, so it's great to either be certain you want to pursue it and are able to follow through, or you can convince others to help you out.

Responding to these topics:

  1. Does this need a language change or could these buffers be modeled with an extern? If so, then no language change is needed, just an architecture that supports buffering.
  2. Is the idea the same as Ibanez et al.'s notion of events (https://dl.acm.org/doi/10.1145/3365609.3365848)? If not, what language changes would you propose?
  3. Simple forms of modularity can be accomplished by treating controls as composable units. Note that they support constructors with statically-determined parameters. Otherwise, have you looked at Lyra and MicroP4 from SIGCOMM '20? (Minor: we have shifted to inclusive technology, so I recommend using the name "primary" and not the one you used.)

-N


H
hemant@mnkcg.com
Fri, Jun 4, 2021 4:34 PM

Ibanez’s proposal was implemented using the Xilinx proprietary P4toFPGA SDNet toolchain.  Just as multiple timers are easy to support on an FPGA, Linux user space can support them easily as well.  Both DPDK and VPP (FD.io) run in Linux user space.  The p4c bmv2 backend also runs in Linux user space.  The PNA (Portable NIC Architecture) has also asked for timers/events.  I am not sure any P4 language changes are needed to adopt Ibanez’s proposal – the proposal uses enqueue/dequeue blocks and metadata to support event-driven processing.  I think a new architecture model for event-driven processing could be added to the P4 spec.  Thereafter, p4c changes could be implemented for the uBPF or bmv2 backend.  The uBPF backend is simpler to modify.
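To make the event-driven idea concrete, here is a small Python sketch of handlers triggered by enqueue, dequeue and timer events, each receiving event metadata and updating shared state. The names (`EventPipeline`, `raise_event`, the `pkt_len` field) are hypothetical and chosen only for this illustration; this is not the API of the Ibanez proposal or of any P4 architecture.

```python
from collections import defaultdict


class EventPipeline:
    """Toy event dispatcher: handlers run per event type and share state."""

    def __init__(self):
        self.handlers = defaultdict(list)                 # event type -> handlers
        self.state = {"queue_depth": 0, "timer_fires": 0}

    def on(self, event_type):
        """Decorator that registers a handler for one event type."""
        def register(fn):
            self.handlers[event_type].append(fn)
            return fn
        return register

    def raise_event(self, event_type, meta):
        """Run every handler registered for this event, passing the event metadata."""
        for fn in self.handlers[event_type]:
            fn(self.state, meta)


pipe = EventPipeline()


@pipe.on("enqueue")
def on_enqueue(state, meta):
    state["queue_depth"] += meta["pkt_len"]   # bytes entering the buffer


@pipe.on("dequeue")
def on_dequeue(state, meta):
    state["queue_depth"] -= meta["pkt_len"]   # bytes leaving the buffer


@pipe.on("timer")
def on_timer(state, meta):
    state["timer_fires"] += 1                 # periodic work, e.g. a keepalive


pipe.raise_event("enqueue", {"pkt_len": 1500})
pipe.raise_event("enqueue", {"pkt_len": 64})
pipe.raise_event("dequeue", {"pkt_len": 1500})
pipe.raise_event("timer", {})
```

In this model nothing about the language of the handlers changes; only the set of events that can trigger them grows, which is in the spirit of handling this as an architecture model rather than a language change.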

Since I work with p4c, I can try to help with the p4c changes.  On Monday, June 7th, we have a P4 LDWG meeting where we could discuss what I have mentioned above.

Thanks and best wishes,

Hemant


H
hemant@mnkcg.com
Fri, Jun 4, 2021 7:30 PM

If one looks at Gergely’s slides (https://opennetworking.org/wp-content/uploads/2021/05/Gergely-Pongracz-Slides.pdf), he has also asked for conditional emit in the deparser.  p4c has already added an if condition to the P4 parser this year.  We should discuss the use of conditionals in the deparser, at least for certain low-speed architectures.

Thanks,

Hemant


TL
thomas.luinaud@gmail.com
Fri, Jun 4, 2021 8:47 PM

Hi Hemant,

I am also interested in conditional emit inside the deparser. Currently, I am working on ways to add p4c passes that simplify the deparsing graph by looking at header validity. Those passes use conditional statements to achieve the simplification.
I would be glad to help contribute to this topic.

Thanks,

Thomas

H
hemant@mnkcg.com
Sun, Jun 6, 2021 8:05 PM

In case anyone doesn’t have the Ibanez paper, it is attached.

Hemant

From: Hemant Singh via P4-design p4-design@lists.p4.org
Sent: Friday, June 04, 2021 12:34 PM
To: Gergely.Pongracz@ericsson.com; jnfoster@cs.cornell.edu
Cc: p4-design@lists.p4.org
Subject: [P4-design] Re: new language features for "5G" and "edge" use cases

Ibanez’s proposal was implemented using Xilinx proprietary P4toFPGA SDNet tools chain.  Just like multiple timers are easy to support on FPGA, the Linux user space can support them easily as well.  Both DPDK and VPP (Fd.io) run in Linux user space.  The p4c bmv2 backend also runs in Linux user space.  The PNA (Portable NIC Architecture) has also asked for timers/events.  I am not sure if any p4 lang changes to adopt Ibanez’s proposal – the proposal uses enqueue/deque blocks and metadata to supports event driven processing.  I think a new architecture model for event driven processing could be added to the p4 spec. Thereafter, implement p4c changes for the p4c UBPF or bmv2 backend.  The UBPF backend is simpler to modify.

Since I work with p4c, I can try to help with p4c changes.  On Monday, June 7th we have a P4 LDWG meeting we could discuss what I have mentioned above.

Thanks and best wishes,

Hemant

From: Gergely Pongracz via P4-design <p4-design@lists.p4.org mailto:p4-design@lists.p4.org >
Sent: Friday, June 04, 2021 5:14 AM
To: Nate Foster <jnfoster@cs.cornell.edu mailto:jnfoster@cs.cornell.edu >
Cc: p4-design <p4-design@lists.p4.org mailto:p4-design@lists.p4.org >
Subject: [P4-design] Re: new language features for "5G" and "edge" use cases

Hi Nate,

Sorry for the long delay. I uploaded our example code here: https://github.com/P4ELTE/use_cases/tree/master/p4-16/bst

The main code is cpf_ran.p4. In theory it supports both TNA, PSA and V1 – we used Tofino’s compiler and t4p4s (basically p4c with DPDK backend) for compiling and running.

Buffering would be executed in the RANDownlink() control block, now we add a special header and send out the packet towards the service IP of the BaaS (buffer-as-a-service) which runs as a Kubernetes service for now. We could clone the packet just as well and send it directly to the downlink path while sending the copy to the buffer, but now it is sent to the buffer and on successful buffering the BaaS service returns the packet – this way we know that the original packet is buffered and timeout counter is started.

Regarding to your questions:

  1. You are right, maybe it could be solved by an extern, similar to how we solve it now with a non-P4 component. On the other hand, I don’t particularly like having too many architectures around, as that kills one of the main advantages of P4 (to my knowledge), which is portability. So I’d rather go for a language change here – for me the only reason not to do that would be if the task were impossible to support on some hardware targets. You know the language much better, but I’d say buffering a few packets could be similar to having a few more registers, so buffering itself doesn’t seem a huge issue to me. Running timers and assigning events to them, on the other hand, might be a bigger change, as there could be a large number of parallel timers – of course there are good data structures for that, but are they hardware friendly enough? Ibanez’s presentation suggests it can be done fairly simply on FPGA.
  2. According to the presentation, I think the proposed solution – especially if all the primitives proposed on slide 6 were implemented – is a superset of what we’d need (I’d say enqueue, dequeue and timer expiration would be enough for us). So if Ibanez’s proposal became part of the language, we wouldn’t need more (at least for now).
  3. Yes, if you have a look at the code you’ll see that we already use control blocks to modularize it. With Tofino this is sometimes not straightforward, as the compiler tends to use more stages than it does with fewer control blocks (this issue was also mentioned in the uP4 talk). As I understood it, Lyra is a higher-layer solution for portability across multiple DSLs, so I guess it would be handy if portability remained an issue even in the long term. I think Lyra’s composition part could deal with composing multiple modules / programs on a single switch – I guess you referred to this feature, but I don’t think we’d need a Lyra-like engine in the long run.
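
On the data-structure question in point 1: one well-known candidate for running many parallel timers is the hashed timing wheel, where starting, cancelling and expiring a timer each touch a single slot. The sketch below is a minimal Python illustration under that assumption. It is not taken from Ibanez’s proposal or from the repository, and `TimingWheel` and its methods are hypothetical names.

```python
from typing import Dict, List, Set


class TimingWheel:
    """Single-level hashed timing wheel.

    Starting and cancelling a timer are O(1); each tick only inspects the
    timers hashed into the current slot.
    """

    def __init__(self, slots: int) -> None:
        self.slots = slots
        self.now = 0  # current tick
        self.wheel: List[Set[int]] = [set() for _ in range(slots)]
        self.expiry: Dict[int, int] = {}  # timer id -> absolute expiry tick

    def start(self, timer_id: int, delay: int) -> None:
        if delay < 1:
            raise ValueError("delay must be at least one tick")
        self.cancel(timer_id)  # restarting replaces any existing timer
        tick = self.now + delay
        self.expiry[timer_id] = tick
        self.wheel[tick % self.slots].add(timer_id)

    def cancel(self, timer_id: int) -> bool:
        tick = self.expiry.pop(timer_id, None)
        if tick is None:
            return False
        self.wheel[tick % self.slots].discard(timer_id)
        return True

    def tick(self) -> List[int]:
        # Advance time by one tick and return the ids of timers that expire.
        # Timers with a delay longer than one rotation stay in their slot
        # until their absolute expiry tick matches.
        self.now += 1
        slot = self.wheel[self.now % self.slots]
        fired = sorted(t for t in slot if self.expiry[t] == self.now)
        for t in fired:
            slot.discard(t)
            del self.expiry[t]
        return fired


wheel = TimingWheel(slots=4)
wheel.start(1, delay=2)   # expires on tick 2
wheel.start(2, delay=6)   # longer than one rotation, expires on tick 6
wheel.start(3, delay=1)
wheel.cancel(3)           # e.g. the buffered packet was acknowledged
fired = [wheel.tick() for _ in range(6)]
```

The fixed slot array and constant per-operation work are what make this structure a plausible fit for hardware; whether a given target can actually support it is exactly the open question raised above.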

So my only remaining question: are the proposals from Ibanez & co. already being considered by any of the working groups, e.g. LDWG? If so, I’ll go through the details, as that is quite likely a good solution for us too.

Thanks!

Gergely

NF
Nate Foster
Sun, Jun 6, 2021 8:24 PM

Let’s not get in the habit of redistributing other people’s materials on
this list.

Better to link to the ACM Digital Library, ACM SIGCOMM sponsored open-access
versions, or the author’s copy on their webpages — none of these run afoul
of copyright law.

-N

On Sun, Jun 6, 2021 at 4:04 PM hemant@mnkcg.com wrote:

In case anyone doesn’t have the Ibanez paper, it is attached.

Hemant

From: Hemant Singh via P4-design p4-design@lists.p4.org
Sent: Friday, June 04, 2021 12:34 PM
To: Gergely.Pongracz@ericsson.com; jnfoster@cs.cornell.edu
Cc: p4-design@lists.p4.org
Subject: [P4-design] Re: new language features for "5G" and "edge" use
cases

Ibanez’s proposal was implemented using Xilinx’s proprietary P4-to-FPGA
SDNet toolchain.  Just as multiple timers are easy to support on an FPGA,
Linux user space can support them easily as well.  Both DPDK and VPP
(Fd.io) run in Linux user space.  The p4c bmv2 backend also runs in Linux
user space.  The PNA (Portable NIC Architecture) has also asked for
timers/events.  I am not sure whether any P4 language changes are needed to
adopt Ibanez’s proposal – the proposal uses enqueue/dequeue blocks and
metadata to support event-driven processing.  I think a new architecture
model for event-driven processing could be added to the P4 spec.  After
that, p4c changes could be implemented for the p4c UBPF or bmv2 backend.
The UBPF backend is simpler to modify.

Since I work with p4c, I can try to help with p4c changes.  On Monday,
June 7th we have a P4 LDWG meeting where we could discuss what I have
mentioned above.

Thanks and best wishes,

Hemant

