OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] virtio spec change proposal


On Wed, Oct 10, 2018 at 06:37:57AM -0400, Yuri Benditovich wrote:
> 
> 
> âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
> 
>     From: "Michael S. Tsirkin" <mst@redhat.com>
>     To: "Yuri Benditovich" <ybendito@redhat.com>
>     Cc: virtio-comment@lists.oasis-open.org
>     Sent: Wednesday, October 10, 2018 4:54:07 AM
>     Subject: Re: [virtio-comment] virtio spec change proposal
> 
>     On Tue, Oct 09, 2018 at 06:26:53PM -0400, Yuri Benditovich wrote:
>     >
>     >
>     >
>     âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
>     >
>     >     From: "Michael S. Tsirkin" <mst@redhat.com>
>     >     To: "Yuri Benditovich" <ybendito@redhat.com>
>     >     Cc: virtio-comment@lists.oasis-open.org
>     >     Sent: Tuesday, October 9, 2018 6:08:00 PM
>     >     Subject: Re: [virtio-comment] virtio spec change proposal
>     >
>     >     On Tue, Oct 02, 2018 at 09:43:25AM -0400, Yuri Benditovich wrote:
>     >     > Hello all,
>     >     >
>     >     > We'd like to propose an extension of VIRTIO specification 1.0
>     >     > Please let us know if there are some objections.
>     >
>     >     I like the proposal. Some comments below, hope this helps with
>     >     preparing the spec patch.
>     >
>     >     > In case there is no objections we will prepare and send respective
>     patch.
>     >     >
>     >     > Feature: RSC (receive side coalescing) compatibility with Windows
>     HCK
>     >     (hardware
>     >     > compatibility kit) requiremenets.
>     >     >
>     >     > Description:
>     >     >
>     >     > 1. Current state:
>     >     > VIRTIO device supports RSC, i.e. able to coalesce TCP/UDP v4 and v6
>     >     fragments
>     >     > and build complete packet from them.
>     >     > This feature is controlled by VIRTIO_NET_F_GUEST_TSO4,
>     >     VIRTIO_NET_F_GUEST_TSO6,
>     >     > VIRTIO_NET_F_GUEST_UFO feature bits.
>     >     > When it is enabled and received packet was built from fragments,
>     the
>     >     device
>     >     > indicates that by setting field 'gso_type' field to one of
>     >     > VIRTIO_NET_HDR_GSO_TCPV4, VIRTIO_NET_HDR_GSO_TCPV6 or
>     >     VIRTIO_NET_HDR_GSO_UDP.
>     >     > Functionally this is good enough, but such an implementation is not
>     >     compatible
>     >     > with Windows certification requirements, resulting failures in
>     Windows
>     >     > certification tests.
>     >     >
>     >     > 2. Expectations of Windows HCK testing software.
>     >     > Rules for coalescing are described in
>     >     > https://docs.microsoft.com/en-us/windows-hardware/drivers/network/
>     >     > rules-for-coalescing-tcp-ip-packets
>     >     > Guidelines for indication of coalesced segments
>     >     > https://docs.microsoft.com/en-us/windows-hardware/drivers/network/
>     >     > indicating-coalesced-segments
>     >     > These guidelines are applicable to the network driver, but the
>     network
>     >     driver
>     >     > shall be able to extract
>     >     > the required information from the received packet. The information
>     >     required for
>     >     > the driver defined in
>     >     > https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/
>     content/
>     >     ndis/
>     >     > ns-ndis-_ndis_rsc_nbl_info
>     >     > and includes following per-packet information:
>     >     > - number of coalesced segments that were combined in the packet
>     >     > - number of duplicated ACKs that were encountered while received
>     packet
>     >     was
>     >     > combined
>     >     >
>     >     > Current implementation of RSC does not provide such information in
>     the
>     >     packet.
>     >     >
>     >     > 3. Existing workaround
>     >     > There is a workaround in the field for the problem described above.
>     >     > It uses feature bits 41 and 42 for control of TCP v4 and TCP v6 RSC
>     and
>     >     > provides required information in the virtio header.
>     >     > For that it extends existing structure of virtio header from
>     regular 12
>     >     bytes
>     >     > to 16 bytes in case one or both of these feature bits are
>     acknowledged.
>     >     > The disadvantage of this solution is that such an extension of
>     packet
>     >     header
>     >     > may complicate further extensions of the specification.
>     >
>     >     So do we need to reserve bits 41 and 42 for now, as well?
>     >     Can be a separate patch.
>     >
>     > Yes, this would be good to reserve them (if this change
>     > is acceptable) or define them for the same RSC info
>     > purpose with extended header (as the production driver expects
>     >  from 2016). If the change will be accepted, we'll change
>     > the driver and this will take several years until we
>     > consider only obsolete drivers use it).
>     >
>     >
>     >
>     >     > 4. Proposal of specification change.
>     >     >
>     >     > a) Taking in account the workaround (3) declare feature bits 41 and
>     42 as
>     >     > reserved.
>     >     >
>     >     > b) Define feature bits 43 for extended RSC information in coalesced
>     >     packets
>     >     > (VIRTIO_NET_F_GUEST_RSC)
>     >
>     >     Maybe VIRTIO_NET_F_GUEST_RSC_INFO? This does not control whether to
>     do RSC.
>     >     Or maybe VIRTIO_NET_F_GUEST_RSC_EXTENDED? This allows coalescing
>     >     of duplicate acks and variable sized packets ...
>     >
>     >
>     > VIRTIO_NET_F_GUEST_RSC_INFO seems
> 
>     My point is it's not just info, it also controls whether card
>     will coalesce acks etc, right?
> 
> 
> OK, let's call it VIRTIO_NET_F_GUEST_RSC_EXTENDED.
> 
> 
> 
> 
>     >
>     >     > c) Define additional flag bit 2 (value = 4) that the device can set
>     in
>     >     'flags'
>     >     > field of existing virtio header with meaning of
>     >     VIRTIO_NET_HDR_F_RSC_INFO. The
>     >     > device sets this bit only in coalesced packets and only if feature
>     bits
>     >     43 was
>     >     > acknowledged.
>     >
>     >     And not valid for driver, right?
>     >
>     > Can you please clarify, what is not valid for driver?
> 
>     I mean for when driver inits the header for the outgoing packets.
> 
>     > I mean is case the feature (43) is reported by host/device
>     > and acked by the driver (for example, the driver of the day
>     > never acks this feature being unaware of it). But current
>     > drivers never look at mask 0x04 in flags.
> 
>     I mean VIRTIO_NET_HDR_F_RSC_INFO is never set by driver.
> 
> Of course the driver must not use this flag on TX and
> the device must ignore it on TX.
> I would not like to state 'if the driver sets this flag
> on TX the behaviour is unpredictable'.


We can state driver MUST NOT set it.

> 
> 
> 
>     >
>     >
>     >     > If VIRTIO_NET_HDR_F_RSC_INFO is set in the packet, the device:
>     >     >
>     >     > c.1) populates 16-bit value of 'number of coalesced segments' in
>     >     'csum_start'
>     >     > field of virtio header
>     >     >
>     >     > c.2) populates 16-bits value of 'number of duplicated ACKs' in
>     >     'csum_offset'
>     >     > field of virtio header
>     >
>     >     Maybe its a good time to document behaviour when the bit
>     >     is not set:  
>     >     - duplicate acks not coalesced (number = 0)
>     >     - only full MTU segments coalesced (number = round_up(len / mtu))
>     >
> 
> No problem.
> 
> 
> 
>     >
>     > If duplicate acks even coalesced this is not reported and
>     > we can't guess what exactly happens.
> 
>     I don't think current devices coalesce them.
>     We never specified the actual semantics but I think
>     it was also the assumption that it matches Linux GSO and
>     IIUC that implies behaviour that does not discard information.
> 
>     I think if you can't keep count you should only coalesce
>     MTU sized packets. This ensures that
>     - acks are not coalesced
>     - packets can be re-fragmented without loss
> 
>     and this in turn allows things like bridging within guest
>     so keep the feature enabled.
> 
> 
> 
> 
>     >     > d) The device should populate proper checksum in coalesced packets
>     >     > (VIRTIO_NET_HDR_F_DATA_VALID should be set). If due to some reason
>     the
>     >     checksum
>     >     > is not populated, VIRTIO_NET_HDR_F_DATA_VALID must be not set. In
>     case
>     >     > VIRTIO_NET_HDR_F_DATA_VALID is not set and
>     VIRTIO_NET_HDR_F_RSC_INFO is
>     >     set in
>     >     > the packet, the driver shall compute packet checksum without rely
>     on
>     >     checksum
>     >     > information in 'csum_start' and 'csum_offset'.
>     >
>     >     More specifically, please formulate exactly what should device do.
>     >
>     > As in existing spec, the device should populate the checksum
>     > and report it is valid. But not must, IMO.
> 
>     Confused.
>     This is an incoming packet with VIRTIO_NET_HDR_F_RSC_INFO.
>     Where do you want device to populate the checksum?
> 
> 
> Checksum (as always) shall be populated in the reported coalesced
> TCP packet (IP and TCP checksum) or part of it.
> Somebody shall do that anyway - or the device before
> completing RX buffer to the guest or driver if there
> is no indication that the checksum is good.

Well we can't very well have driver just overwrite the checksum without
anyone validating it, right? That would defeat it's purpose, won't it?
VIRTIO_NET_HDR_F_DATA_VALID means basically "checksum is wrong but
ignore that, data is fine". Without that, I guess there is the
assumption that device validated the checksums and dropped packets with
an invalid checksum so driver can overwrite it?


> https://docs.microsoft.com/en-us/windows-hardware/drivers/network/
> rules-for-coalescing-tcp-ip-packets
>
> Now it is less confusing?

That seems to describe an interface between driver and OS.
Not sure how it's relevant.
Could you summarize what you are trying to say here please?


> 
>     >
>     >
>     >     > e) new feature VIRTIO_NET_F_GUEST_RSC depends on
>     VIRTIO_NET_F_CTRL_VQ,
>     >     > VIRTIO_NET_F_CTRL_GUEST_OFFLOADS and (VIRTIO_NET_F_GUEST_TSO4 or
>     >     > VIRTIO_NET_F_GUEST_TSO6).

Wrt dependency, how does it interfact with dynamic guest offloads?
I guess we need to allow it anyway, and then have it be ignored
without respective offloads?


>     >     >
>     >     > Thanks,
>     >     > Yuri Benditovich,
>     >     > Red Hat, Inc.
>     >     >
>     >
>     >     This publicly archived list offers a means to provide input to the
>     >     OASIS Virtual I/O Device (VIRTIO) TC.
>     >
>     >     In order to verify user consent to the Feedback License terms and
>     >     to minimize spam in the list archive, subscription is required
>     >     before posting.
>     >
>     >     Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>     >     Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>     >     List help: virtio-comment-help@lists.oasis-open.org
>     >     List archive: https://lists.oasis-open.org/archives/virtio-comment/
>     >     Feedback License: https://www.oasis-open.org/who/ipr/
>     feedback_license.pdf
>     >     List Guidelines: https://www.oasis-open.org/policies-guidelines/
>     >     mailing-lists
>     >     Committee: https://www.oasis-open.org/committees/virtio/
>     >     Join OASIS: https://www.oasis-open.org/join/
>     >
>     >
>     >
> 
> 


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]