Subject: Re: [virtio-comment] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ


On Tue, May 16, 2023 at 06:45:58PM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, May 16, 2023 12:33 AM
> 
> [..]
> > > > > A better way is to have an eventq of depth = num_vfs, like many
> > > > > other virtio devices have it.
> > > > >
> > > > > An eventq can hold per VF interrupt entry including the isr value
> > > > > that you suggest above.
> > > > >
> > > > > Something like,
> > > > >
> > > > > union eventq_entry {
> > > > > 	u8 raw_data[16];
> > > > > 	struct intx_entry {
> > > > > 		u8 event_opcode;
> > > > > 		u8 group_type;
> > > > > 		u8 reserved[5];
> > > > > 		le64 group_identifier;
> > > > > 		u8 isr_status;
> > > > > 	} intx;
> > > > > };
> > > > >
> > > > > This eventq resides on the owner parent PF.
> > > > > isr_status is clear-on-read, like today.
> > > >
> > > > This is what I wrote, no?
> > > > lore.kernel.org/all/20230507050146-mutt-send-email-mst%40kernel.org/t.mbox.gz
> > > >
> > > > 	how about a special command that is used when the device would
> > > > 	normally send INT#x? It can also return ISR to reduce latency.
> > > >
> > > In response to your above suggestion of an AQ command, I suggested the
> > > eventq that contains the isr_status, which reduces latency as you suggest.
> > 
> > I don't see why we need to keep adding queues though.
> > Just use one of the admin queues.
> > 
> The admin queue is in cmd->rsp mode.
> The purpose is different here: a device wants to asynchronously notify the interrupt for each VF,
> like a nic rq.
> Filling up the AQ with these commands, for VFs that may or may not generate an interrupt, is sub-optimal.

I don't get the difference, sorry. What you do with an rq
is exactly fill it up with buffers; here you fill the adminq with
commands.
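To make the analogy concrete, here is a minimal sketch (the opaque vq
handle and the queue_post() helper are assumptions, not anything in the
proposal) showing that prefilling an adminq with commands is mechanically
the same loop as refilling a nic rq with empty buffers:

#include <stddef.h>
#include <stdint.h>

struct vq;  /* opaque queue handle; stand-in for an rq or an adminq */

/* Hypothetical descriptor-post helper; a real driver would add the
 * buffer to the ring and kick the device. */
static int queue_post(struct vq *vq, void *buf, size_t len)
{
	(void)vq; (void)buf; (void)len;
	return 0;
}

/* Post one device-writable 16-byte entry per VF -- the same loop a
 * nic driver runs to refill its rq with empty receive buffers. */
static int prefill(struct vq *vq, uint8_t (*entries)[16], int num_vfs)
{
	for (int i = 0; i < num_vfs; i++) {
		int rc = queue_post(vq, entries[i], sizeof(entries[i]));
		if (rc)
			return rc;
	}
	return 0;
}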



> > > An eventq is a variation of it, where the device can keep reporting
> > > events without doing the extra mapping and without too many commands.
> > 
> > I don't get the difference then. The format you showed seems very close to
> > an admin command. What is the difference? How do you avoid the need to add
> > a command per VF using INTx#?
> > 
> The main difference is that it behaves like a nic rq.
> The driver posts empty descriptors, and the device reports the consumed entries.
> A specific entry is not attached to any specific VF.
> The eventq_entry structure is written by the device for the VF that fired the interrupt.

If this is what you want, we already have commands that do not refer to
a specific VF, specifically VIRTIO_ADMIN_CMD_LIST_QUERY.
But I think this is misguided; see below.


> > How do you avoid the need to add a command per VF using INTx#?
> An empty buffer is posted, pointing to an eventq_entry.
> Total number of buffers posted = number_of_vfs in intx mode.
> So an entry is not attached to a specific VF; entries are in a pool (like a nic rq).
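A sketch of the hypervisor-side completion path for that pooled model
(queue_get_completed(), queue_post() and inject_guest_intx() are
hypothetical helpers; the entry layout is the one quoted earlier): any
completed entry may belong to any VF, the device names the VF via
group_identifier, and the buffer goes straight back into the pool.

#include <stdint.h>

typedef uint64_t le64;   /* assume a little-endian host for brevity */

union eventq_entry {
	uint8_t raw_data[16];
	struct intx_entry {
		uint8_t event_opcode;
		uint8_t group_type;
		uint8_t reserved[5];
		le64 group_identifier;
		uint8_t isr_status;
	} __attribute__((packed)) intx;
};

struct vq;
union eventq_entry *queue_get_completed(struct vq *vq);  /* hypothetical */
void queue_post(struct vq *vq, union eventq_entry *e);   /* hypothetical */
void inject_guest_intx(le64 vf_id, uint8_t isr);         /* hypothetical */

static void eventq_poll(struct vq *vq)
{
	union eventq_entry *e;

	while ((e = queue_get_completed(vq)) != NULL) {
		/* group_identifier names whichever VF fired INT#x */
		inject_guest_intx(e->intx.group_identifier,
				  e->intx.isr_status);
		/* entries are pooled, not per-VF: recycle immediately */
		queue_post(vq, e);
	}
}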

There is a very simple problem with your approach:
you need to report the VF's read of ISR, since that
clears the interrupt line.
This command has to be per VF.
As I tried to explain before, I proposed combining the two:
the same command clears ISR for a VF and, if ISR is not 0,
returns immediately. If it is 0, the buffer is queued and
used later for the interrupt. I prefer the approach
where the buffer is reused by the same VF,
relying on out-of-order use.
But yes, if you want in-order use, this does not work.
Maybe write the VF that triggered the interrupt in the
command-specific data?
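A device-side sketch of that combined command (all names here are
hypothetical): the command reads and clears the VF's ISR; a non-zero ISR
completes the command immediately, while a zero ISR parks the buffer
until the device would have asserted INT#x, which is also why
completions become inherently out of order.

#include <stdint.h>

struct admin_cmd;                     /* in-flight command handle */

struct vf_state {
	uint8_t isr_status;           /* clear-on-read ISR */
	struct admin_cmd *parked_cmd; /* buffer waiting for INT#x */
};

void complete_cmd(struct admin_cmd *cmd, uint8_t isr);  /* hypothetical */

/* Combined "read-and-clear ISR" command. */
static void handle_isr_cmd(struct vf_state *vf, struct admin_cmd *cmd)
{
	uint8_t isr = vf->isr_status;
	vf->isr_status = 0;             /* clear-on-read semantics */

	if (isr)
		complete_cmd(cmd, isr); /* events pending: return now */
	else
		vf->parked_cmd = cmd;   /* park until INT#x would fire */
}

/* Called when the device would otherwise have sent INT#x for this VF. */
static void vf_raise_intx(struct vf_state *vf, uint8_t isr)
{
	if (vf->parked_cmd) {
		complete_cmd(vf->parked_cmd, isr);
		vf->parked_cmd = NULL;
	} else {
		vf->isr_status |= isr;  /* latch until the next command */
	}
}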



> > > Intel CPUs have 256 vectors per core (per vcpu), so that is really a lot.
> > > One needs to connect a lot more devices to the cpu to run out of them.
> > > So yes, I would like to try letting the command fail.
> > 
> > On the order of 128 functions, then, for a 1-vcpu VM. You were previously
> > talking about tens of 1000s of functions as justification for avoiding
> > config space.
> > 
> Will try.
> 
> > 
> > > > Do specific customers even use guests with msi-x disabled? Maybe no.
> > > > Does anyone use virtio with msi-x disabled? Most likely yes.
> > > I just feel that INTx emulation is an extremely rare/narrow case for some
> > > applications that may never find use on hw-based devices.
> > 
> > If we use a dedicated command for this, I guess devices can just avoid
> > implementing the command if they do not feel like it?
> > 
> I didn't follow. The guest device?
> Do you mean the guest driver?

Device. Clear the bit in the VIRTIO_ADMIN_CMD_LIST_QUERY result and
then the hypervisor knows it can't support INTx.
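For instance (only VIRTIO_ADMIN_CMD_LIST_QUERY itself exists today; the
INTx event opcode number below is a made-up placeholder), the hypervisor
would test the returned opcode bitmap and fail guest INTx configuration
up front when the bit is clear:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcode for the INTx event command; not an assigned
 * number in the spec. */
#define VIRTIO_ADMIN_CMD_INTX_EVENT 64

/* bitmap: result of VIRTIO_ADMIN_CMD_LIST_QUERY, one bit per
 * supported admin command opcode. */
static bool device_supports_intx(const uint64_t *bitmap)
{
	unsigned int op = VIRTIO_ADMIN_CMD_INTX_EVENT;

	return (bitmap[op / 64] >> (op % 64)) & 1;
}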

> > > A driver will easily be able to fail the call on INTx configuration,
> > > failing the guest.
> > 
> > There's no configuration - INTx is the default - and no way to fail
> > gracefully for legacy. That is one of the things we should fix; at least
> > the hypervisor should be able to detect failures.
> For sure, the hypervisor can detect a failure of the INTx configuration done by the guest and fail the call for that config.
> What we are discussing is not legacy-specific; it applies to 1.x as well, sadly.

Problem is, enabling MSI-X changes the layout for legacy.
This is really my concern: we need to decide whether the
reserved msi vector hack is used or the eventq will work, because the
reserved msi vector hack changes the legacy layout.
Fundamentally, that only matters because you insist on
reusing the same command for common and device config space
access, with only the offset differing. Split them out as Jason proposed
and the issue goes away.
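A sketch of what that split might look like (the opcode names and
numbers are illustrative only; nothing here is assigned): separate
commands for the legacy common config region and the legacy
device-specific config region, so the offset is always relative to a
single region and the MSI-X-dependent layout shift stops mattering.

#include <stdint.h>

/* Illustrative opcodes only -- not assigned numbers. */
#define VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE 0x40
#define VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ  0x41
#define VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE    0x42
#define VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ     0x43

/* With the regions split, the offset is relative to the selected
 * region, so enabling MSI-X no longer changes what it points at. */
struct virtio_admin_cmd_legacy_cfg_data {
	uint8_t offset;      /* offset within the chosen region */
	uint8_t registers[]; /* data to write / space to read into */
};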

> So, we don't need to mix INTx support with the register emulation proposal; it is better handled in a follow-up proposal.
> I also have some patches to improve driver resiliency so that it can work without INTx when the guest is not involved.
> 
> And INTx can still work from the system and device point of view, because INTx generation does not carry a function id in it.
> Existing PCIe INTx can work if the PCI PF device can trigger INTx for the VFs.
> So the eventq scheme is still optional and orthogonal to legacy.
> 
> I would like to discuss it separately, outside of the legacy proposal.

Hmm, this sounds convincing. I think if we split device and common
config then we can defer the INTx issue.

-- 
MST


