virtio-comment message



Subject: Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE


On Sun, Mar 20, 2022 at 04:17:48PM +0100, Christian Schoenebeck wrote:
> > On Sunday, 20 March 2022 14:55:59 CET Michael S. Tsirkin wrote:
> > On Sun, Mar 20, 2022 at 02:32:23PM +0100, Christian Schoenebeck wrote:
> > > > On Sunday, 20 March 2022 13:31:51 CET Michael S. Tsirkin wrote:
> > > > On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck wrote:
> > > > > > On Saturday, 19 March 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > > > > > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck
> > > > > > <qemu_oss@crudebyte.com>
> > > > > > 
> > > > > > wrote:
> > > > > > > This new feature flag allows to decouple the maximum amount of
> > > > > > > descriptors in indirect descriptor tables from the Queue Size.
> > > > > > 
> > > > > > if we are extending these limits, I suggest reusing the feature flag to
> > > > > > also add a limit on total s/g list size. making it separate from queue size
> > > > > > was requested a while ago.
> > > > > 
> > > > > What do you mean with "total s/g list size"? The maximum bulk data size
> > > > > per message?
> > > > > Sum of both in and out s/g lists' bulk data or only for one of them?
> > > > > Or maximum size of exactly only one memory segment?
> > > > 
> > > > I don't really know what "bulk data size" means. Suggest we use
> > > > terminology from the spec. A buffer includes a group of direct and/or
> > > > indirect descriptors, in turn indirect descriptors point to direct
> > > > descriptors.
> > > 
> > > I already described why I think it makes sense not to call it "buffer" in
> > > this particular context. So I am against changing this to "buffer".
> > 
> > Well, the spec just defines what buffers are.
> > If you are using a different term then you need to define it in the
> > spec.
> 
> Sorry, but that sounds like nitpicking to me. From split-ring.tex:
> 
> "When the driver wants to send a buffer to the device, it fills in
> a *slot* in the descriptor table (or chains several together)"
> So a "slot in the descriptor table" does not need further specification, but 
> the term "vring slot" does?

One of the reasons is that it's an old text from pre-standard days.
It's therefore informal. We are trying to do better and unfortunately
this means you might need to clean up some old text to add your
feature.

Packed ring has the concept of "elements"; maybe fixing up
split to use that terminology too can work.


Another reason I am not excited about "vring slots" is that
it invites abbreviation - it's easy to just say "slots" and introduce
confusion.
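For readers following along, the split-ring descriptor format that all of these terms ("buffer", "chain", "slot") ultimately describe looks like this in C. The struct and flag names follow the spec; `chain_length` is only an illustrative helper, not spec or QEMU code:

```c
#include <stdint.h>

/* Descriptor flags as defined by the virtio spec (split ring). */
#define VIRTQ_DESC_F_NEXT     1  /* chains to the descriptor in 'next' */
#define VIRTQ_DESC_F_WRITE    2  /* device writes into this segment */
#define VIRTQ_DESC_F_INDIRECT 4  /* 'addr' points to an indirect table */

struct virtq_desc {
    uint64_t addr;   /* guest-physical address of the segment */
    uint32_t len;    /* length of the segment in bytes */
    uint16_t flags;
    uint16_t next;   /* index of the next descriptor if F_NEXT is set */
};

/* Illustrative helper: walk a chain starting at 'head' and count how
 * many descriptors make up one buffer in the given table. */
static unsigned chain_length(const struct virtq_desc *table, uint16_t head)
{
    unsigned n = 1;
    uint16_t i = head;
    while (table[i].flags & VIRTQ_DESC_F_NEXT) {
        i = table[i].next;
        n++;
    }
    return n;
}
```

An indirect descriptor is simply one of these whose `addr` points at another array of `struct virtq_desc`, which is exactly the table whose size this thread is about.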



> 
> > > If other people
> > > support your position then I'll change it though of course.
> > > 
> > > About "bulk data size": "Bulk data" is the user data actually being used
> > > on application/driver level, i.e. above the virtio level, and "bulk data
> > > size" is the size of that data. See the ASCII illustration here:
> > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > 
> > > The terminology "bulk data" is already used in the spec, BTW.
> > 
> > It does not refer to anything specific though, just generally
> > to vqs for passing lots of data as opposed to config space used
> > to pass small amount of data.
> 
> Which is telling pretty much precise enough what it is, at least IMO.

Well, saying generally there's a "bulk of data" just means there's a lot.
If you are talking about its size then I guess you somehow
distinguish this data from some other data?

Please look at the Virtqueues chapter and see if any existing terms
fit your usage. It's hard enough for people to learn the spec without
us changing terminology each release.

> > > > What has been requested for a while is the ability to limit, per VQ,
> > > > the # of direct descriptors in a buffer.
> > > 
> > > So in cases where indirect descriptors are *not* used. This series is
> > > about indirect descriptors only though.
> > 
> > No, in all cases.
> 
> ?

What users in the field asked for is the ability to increase the VQ size to
hold more than 1024 buffers. However, QEMU cannot support requests with more
than 1024 elements. Thus the wish to limit buffer size below the VQ size.
Whether there's an indirect element in the chain does not matter
for this use-case.


> > > > Since IIUC what you want to do is allow more descriptors
> > > > than VQ size, then one way to achieve that is just to have
> > > > a per VQ limit on descriptor size and have that limit > VQ size.
> > > 
> > > Sorry, I can't follow you. What do you mean with "descriptor size"? For me
> > > a descriptor has a predefined constant struct size. You mean the size of one
> > > memory segment referenced by one indirect descriptor? And why would it be
> > > better than what this series suggests?
> > 
> > My bad. What I meant is a "per VQ limit on # of direct descriptors
> > per buffer".
> 
> Which is out of the scope of what this series was about.

Maybe the scope is too limited then.


> > > > Another thing related is that people wanted to block indirect
> > > > descriptors for some VQs. Not yet sure how to combine that
> > > > with this proposal, worth thinking about.
> > > 
> > > This series already allows both, increasing *and* decreasing the number of
> > > indirect descriptors per VQ.
> > 
> > I don't see how you block indirect descriptors for a queue with this.
> > Did I miss it?
> 
> By setting the proposed "Queue Indirect Size" value to zero?

I guess ... it is better to call this out explicitly in the spec.


> > Also I think you missed the fact that
> > a direct descriptor can point to an indirect one, the
> > result is that max # of descriptors in a buffer is then:
> > 
> > queue size - 1 + indirect table size
> > 
> > I don't see how your proposal limits the # of descriptors
> > below queue size since guest is never forced to use
> > indirect.
> 
> No, I didn't miss that. The suggested changes were about the amount of 
> *indirect* descriptors, not about the amount of *direct* descriptors. The 
> amount of direct descriptors was still limited to the "Queue Size".

Confused.
The amount of indirect descriptors?
Not the size of the table each one points to? But your commit log says
	descriptors in indirect descriptor tables
and the descriptors in an indirect descriptor table are not themselves
indirect descriptors.

At this point I don't understand the motivation for the change.
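For concreteness, the worst case raised later in this exchange - a direct chain of queue size - 1 descriptors ending in one indirect descriptor - can be made explicit with a little arithmetic. This is purely illustrative, not spec or QEMU code:

```c
/* Worst-case number of descriptors reachable from one buffer when a
 * direct chain of (queue_size - 1) descriptors ends in one indirect
 * descriptor pointing at a table of 'indirect_size' descriptors.
 * Illustrative arithmetic only; the spec text is authoritative. */
static unsigned max_descs_per_buffer(unsigned queue_size,
                                     unsigned indirect_size)
{
    return queue_size - 1 + indirect_size;
}
```

So with a queue size of 128 and an indirect table size of 1024, a single buffer can still reference 1151 descriptors - which is why a per-buffer limit decoupled from (and possibly below) the queue size keeps coming up.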


> I am aware that QEMU currently has a limit per "buffer" which adds the amount 
> of direct descriptors *and* the amount of indirect ones together for that 
> limit (which I also mentioned in the GitHub issue summary BTW). That is a 
> device specific implementation detail of QEMU though, and it would not stop 
> QEMU from handling this correctly by reducing the negotiated "Queue Indirect 
> Size" value appropriately.

Except that people want a deeper queue; I can probably find & forward links
to mailing list discussions.

> > > > > And are you suggesting this should become part of this series already?
> > > > 
> > > > yes since it's touching mostly same areas in the spec.
> > > 
> > > :/ Please note that I sent the first draft on this issue already in
> > > November last year, and have not seen any response from your side so
> > > far. I actually assumed we were already at a point where it was just
> > > about precise wording et al., not restarting to redesign everything
> > > again from scratch now.
> > Sorry about that.
> 
> I'm pretty sure you are.

Yes, I wish we could move faster.  If you are asking for advice about
avoiding such situations in the future, it would be to iterate
faster. It's only on v3 since November, and you did get comments on the
previous revisions.

> > > > > > > The new term "Queue Indirect Size" is introduced for this purpose,
> > > > > > > which is a transport specific configuration whose negotiation is
> > > > > > > further specified for each transport with subsequent patches.
> > > > > > > 
> > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > > > ---
> > > > > > > 
> > > > > > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > > > > > >  packed-ring.tex |  2 +-
> > > > > > >  split-ring.tex  |  8 ++++++--
> > > > > > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > index c6f116c..685525d 100644
> > > > > > > --- a/content.tex
> > > > > > > +++ b/content.tex
> > > > > > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
> > > > > > > 
> > > > > > >  \begin{description}
> > > > > > >  \item[0 to 23, and 50 to 127] Feature bits for the specific
> > > > > > >  device
> > > > > > >  type
> > > > > > > 
> > > > > > > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > > > > > > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > > > > > > 
> > > > > > >    feature negotiation mechanisms
> > > > > > > 
> > > > > > > -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
> > > > > > > +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
> > > > > > > 
> > > > > > >  \end{description}
> > > > > > >  
> > > > > > >  \begin{note}
> > > > > > > 
> > > > > > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > > > > > > 
> > > > > > >  present either a value of 0 or a power of 2 in \field{queue_size}.
> > > > > > > 
> > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST provide the
> > > > > > > +Queue Indirect Size supported by device, which is a transport specific
> > > > > > > +configuration. It MUST allow the driver to set a lower value.
> > > > > > > +
> > > > > > > 
> > > > > > >  \drivernormative{\paragraph}{Common configuration structure
> > > > > > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device
> > > > > > > Layout / Common configuration structure layout}
> > > > > > > 
> > > > > > >  The driver MUST NOT write to \field{device_feature}, \field{num_queues},
> > > > > > > \field{config_generation}, \field{queue_notify_off} or
> > > > > > > \field{queue_notify_data}.
> > > > > > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> > > > > > > 
> > > > > > >    that the driver can reset a queue individually.
> > > > > > >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
> > > > > > > 
> > > > > > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> > > > > > > +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> > > > > > > +  descriptor tables, is independent from the Queue Size.
> > > > > > > +
> > > > > > > +  Without this feature, the Queue Size limits the length of the descriptor
> > > > > > > +  chain, including indirect descriptor tables as in \ref{sec:Basic Facilities of
> > > > > > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
> > > > > > > +  Descriptors}, i.e. both the maximum amount of slots in the vring and the
> > > > > > > +  actual bulk data size transmitted per vring slot.
> > > > > > 
> > > > > > spec does not call these slots elsewhere.
> > > > > 
> > > > > Yes, I intentionally used "vring slot" instead of "buffer" as I find
> > > > > the
> > > > > latter too vague in this context. A "buffer" can be a memory segment,
> > > > > a
> > > > > set of memory segments and what not. "vring slot" OTOH makes it clear
> > > > > that it is about exactly one, atomic pointer (hence with fixed size)
> > > > > in a
> > > > > Ring Buffer, as depicted in the ASCII illustration here:
> > > > > 
> > > > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > 
> > > > > The maximum amount of vring slots is therefore the maximum amount of
> > > > > messages that can be emplaced into a Ring Buffer, independent of any
> > > > > "bulk data buffer size".
> > > > > 
> > > > > > +
> > > > > > 
> > > > > > > +  With this feature enabled, the Queue Size only limits the maximum amount
> > > > > > > +  of slots in the vring, but does not limit the actual bulk data size
> > > > > > > +  being transmitted when indirect descriptors are used. Decoupling these
> > > > > > > +  two configuration parameters this way not only allows much larger bulk data
> > > > > > > +  being transferred per vring slot, but also avoids complicated synchronization
> > > > > > > +  mechanisms if the device only supports a very small amount of vring slots. Due
> > > > > > > +  to the 16-bit size of a descriptor's "next" field there is still an absolute
> > > > > > > +  limit of $2^{16}$ descriptors per indirect descriptor table. However the
> > > > > > > +  actual maximum amount supported by either device or driver might be less,
> > > > > > > +  and therefore the bus specific Queue Indirect Size value MUST additionally
> > > > > > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to subsequently
> > > > > > > +  negotiate the actual amount of maximum indirect descriptors supported
> > > > > > > +  by both sides.
> > > > > > 
> > > > > > still not sure what exactly the value is, e.g. in a buffer including
> > > > > > indirect and direct descriptors.
> > > > > > 
> > > > > > +
> > > > > > 
> > > > > > >  \end{description}
> > > > > > >  
> > > > > > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved
> > > > > > >  Feature
> > > > > > >  Bits}
> > > > > > > 
> > > > > > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > > > > > index a9e6c16..e26d112 100644
> > > > > > > --- a/packed-ring.tex
> > > > > > > +++ b/packed-ring.tex
> > > > > > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > > > > > 
> > > > > > >  The device limits the number of descriptors in a list through a
> > > > > > >  transport-specific and/or device-specific value. If not limited,
> > > > > > >  the maximum number of descriptors in a list is the virt queue
> > > > > > > 
> > > > > > > -size.
> > > > > > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been negotiated.
> > > > > > > 
> > > > > > >  \subsection{Next Flag: Descriptor Chaining}
> > > > > > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > > > > > > 
> > > > > > > diff --git a/split-ring.tex b/split-ring.tex
> > > > > > > index de94038..eaa90c3 100644
> > > > > > > --- a/split-ring.tex
> > > > > > > +++ b/split-ring.tex
> > > > > > > @@ -268,8 +268,12 @@ \subsubsection{Indirect
> > > > > > > Descriptors}\label{sec:Basic
> > > > > > > Facilities of a Virtio Devi
> > > > > > > 
> > > > > > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie.
> > > > > > >  only one table per descriptor).
> > > > > > > 
> > > > > > > -A driver MUST NOT create a descriptor chain longer than the Queue Size of
> > > > > > > -the device.
> > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
> > > > > > > +NOT create a descriptor chain longer than the Queue Size of the device.
> > > > > > > +
> > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > > > > > > +descriptors per indirect descriptor table MUST NOT exceed the negotiated
> > > > > > > +Queue Indirect Size.
> > > > > > 
> > > > > > it is not negotiated is it?
> > > > > 
> > > > > What makes you think it is not negotiated?
> > > 
> > > Also see my previous question here ^
> > 
> > Sorry, what I mean is that you don't define what negotiation
> > involves. I think you mean this:
> > 
> > 	The driver SHOULD write to \field{queue_indirect_size} if its maximum
> > 	number of descriptors per vring slot is lower than that reported by the
> > 	device.
> > 
> > but driver can just read the value and that's it - and then the value
> > that is set by device applies, right?
> > 
> > If you are going to use terms such as negotiated you need to define what
> > they mean. In this case I would just say something like
> > "the value of Queue Indirect Size".
> 
> Which makes me wonder why you just didn't say that in the first place? And I 
> don't agree that it wasn't defined, because I actually think I did:
> 
> +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> +  descriptor tables, is independent from the Queue Size."
> 
> Or is that definition of the new term "Queue Indirect Size" not clear enough 
> to you?

Maybe ... but I don't think this will jump out at the reader. I feel
we abuse the word "negotiate" to the point where the reader only has a
vague idea what it means. That's why I struggled to give concise
comments.  Consider:

+  and therefore the transport specific Queue Indirect Size value MUST
+  additionally be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
+  subsequently negotiate the actual amount of maximum indirect descriptors
+  supported by both sides.

sounds like a tongue-twister to me. Can't we find a term different from
"negotiate"?


It's already used for features; please find something else for this.
And why is a term even necessary?  How is this different from other
writable fields?  Consider an example of similar functionality
from the spec:

\item[\field{queue_size}]
        Queue Size.  On reset, specifies the maximum queue size supported by
        the device. This can be modified by the driver to reduce memory requirements.
        A 0 means the queue is unavailable.


Hope this helps.
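The \field{queue_size} pattern quoted above - device resets the field to its maximum, driver optionally writes a smaller value - can be sketched as follows. `common_cfg` here is a plain-memory stand-in for the real MMIO structure, and `driver_setup_queue` is a hypothetical helper, purely for illustration:

```c
#include <stdint.h>

/* Stand-in for the relevant part of the PCI common configuration
 * structure; the real thing is device MMIO, this only illustrates
 * the read-reduce-write pattern described above. */
struct common_cfg {
    uint16_t queue_size;  /* on reset, the device's supported maximum */
};

/* Driver side: clamp the advertised maximum to what it can handle,
 * e.g. to reduce memory requirements. */
static void driver_setup_queue(struct common_cfg *cfg, uint16_t driver_max)
{
    if (cfg->queue_size > driver_max)
        cfg->queue_size = driver_max;
}
```

A "Queue Indirect Size" field defined the same way would need no new "negotiation" vocabulary: the device presents its maximum, and the driver may write a lower value before enabling the queue.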



> > > > > > >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and
> > > > > > >  VIRTQ_DESC_F_NEXT
> > > > > > >  in \field{flags}.
> > > > > > > 
> > > > > > > --
> > > > > > > 2.30.2
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > This publicly archived list offers a means to provide input to the
> > > > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > > > 
> > > > > > > In order to verify user consent to the Feedback License terms and
> > > > > > > to minimize spam in the list archive, subscription is required
> > > > > > > before posting.
> > > > > > > 
> > > > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > > > List archive:
> > > > > > > https://lists.oasis-open.org/archives/virtio-comment/
> > > > > > > Feedback License:
> > > > > > > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > > > List Guidelines:
> > > > > > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > > > Join OASIS: https://www.oasis-open.org/join/