OASIS Mailing List Archives

virtio-comment message



Subject: Re: [virtio-comment] Re: [PATCH V2 6/6] virtio-pci: implement dirty page tracking


On Thu, Nov 09, 2023 at 06:29:59PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 11/9/2023 1:18 AM, Michael S. Tsirkin wrote:
> > On Wed, Nov 08, 2023 at 05:29:00PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 11/7/2023 7:13 PM, Michael S. Tsirkin wrote:
> > > > On Mon, Nov 06, 2023 at 04:03:42AM +0000, Parav Pandit wrote:
> > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Sent: Sunday, November 5, 2023 9:42 PM
> > > > > > 
> > > > > > On Fri, Nov 03, 2023 at 03:47:34PM +0000, Parav Pandit wrote:
> > > > > > > > > [1]
> > > > > > > > > https://lists.oasis-open.org/archives/virtio-comment/202310/msg004
> > > > > > > > > 75.h
> > > > > > > > > tml
> > > > > > > > you still need to explain why this does not work for pass-through.
> > > > > > > > It does not work for the following reasons.
> > > > > > > > 1. Because all the fields that are put on the member device are not in direct
> > > > > > > > control of the hypervisor.
> > > > > > > > The device is directly controlled by the guest, including the device status, and
> > > > > > > > when it resets the device all the things stored in the device are lost.
> > > > > > 
> > > > > > I think the idea is that when this gateway is in the device then device reset has
> > > > > > to trap. At a high level, ok. But then what?
> > > > > > Is a full scan of all memory required until device reset is complete?
> > > > > > Drivers currently tend to busy poll the reset register, if this takes very long we
> > > > > > might start seeing soft lockup messages. What is the idea then? Maybe for this
> > > > > > we need a separate weaker reset that does not touch this capability?
> > > > > > 
> > > > > You meant the gateway is not in the device, right?
> > > > > 
> > > > > I likely didn't understand. I don't see a relation to timing.
> > > > > 
> > > > > > When the device reset is not trapped by the hypervisor, most things do not work; it requires trapping other things too, like cvq, device registers and more.
> > > > > > It may be fine for those use cases, but it does not fulfill the requirement of passthrough mode of hw.
> > > > I wish we'd just stop using the term, it just confuses everyone.
> > > > 
> > > > I feel the point worth making is that currently, all this job is done
> > > > by hypervisors. And they manage fine! vdpa really truly does not need
> > > > the SUSPEND bit because it knows about devices and it
> > > > can just use whatever it wants in any vendor specific way it wants.
> > > So true, this is exactly what Intel implements in some productions.
> > > > Where all this migration work comes in handy is if we say that
> > > > we want our devices to all just do what the
> > > > spec says. No vendor specific tricks. And I find it exciting that
> > > > there are people who want to work on this instead of
> > > > each vendor wasting man hours on their own almost the same but
> > > > slightly different driver.
> > > I agree
> > > > I personally think this patch is not great for the trap use-case either.
> > > > Why? For example if device is somewhat slow then it will take it
> > > > hundreds of milliseconds to synchronize the whole guest memory, and
> > > > blocking reset means blocking e.g. guest boot.  I was wrong about soft
> > > > lockup btw - linux does msleep which I think means no soft lockups. But boot is
> > > > blocked and modules are not loaded.
> > > I am not sure SUSPEND can block RESET. I think reset can take effect
> > > immediately, because once the device is reset, whether it was
> > > suspended does not matter.
> > No, because if you don't suspend, the device will keep changing memory.
> > You need to:
> > 1. suspend
> > 2. get all dirty memory synced
> > 3. reset
> > 
> > 
> > Reset earlier will corrupt guest memory.
> IMHO, it may be fine to lose the dirty pages during reset,
> because without an interrupt, the driver won't process the
> dirty pages; they are still considered unused (even if they are not all-zero pages)
> by the CPU, so nothing is corrupted.
> 
> And if the driver resets the device, it will reinitialize the device
> and reconfigure the virtqueue, including the ring buffer.

It's too late to invent new consistency semantics for virtio.

-- 
MST
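
The ordering quoted above (suspend, sync dirty memory, then reset) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDevice`, `trapped_reset`, and `naive_reset` are hypothetical names invented for illustration, not anything from the virtio specification or from the patch under discussion, and the model assumes the dirty log lives in the device and is discarded by reset.

```python
class ToyDevice:
    """Hypothetical model of a device that DMA-writes guest pages."""

    def __init__(self):
        self.suspended = False
        self.dirty = set()  # device-side dirty log (assumed lost on reset)

    def dma_write(self, page):
        # A suspended device does not touch guest memory.
        if self.suspended:
            return False
        self.dirty.add(page)
        return True

    def suspend(self):
        self.suspended = True

    def reset(self):
        # Reset discards all device state, including the dirty log.
        self.dirty.clear()
        self.suspended = False


def trapped_reset(dev, synced_pages):
    """Hypervisor-side handling of a guest reset, in the quoted order."""
    dev.suspend()                   # 1. suspend: no further memory writes
    synced_pages.update(dev.dirty)  # 2. get all dirty memory synced
    dev.reset()                     # 3. reset


def naive_reset(dev, synced_pages):
    """Reset without suspend/sync: dirty pages are never reported."""
    dev.reset()
```

In this model, a device that has written pages 3 and 7 reports both through `trapped_reset`, while `naive_reset` leaves the hypervisor's set empty, which is the loss of dirty-page information that the two sides of the thread disagree about.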


