OASIS Mailing List Archives

virtio-comment message



Subject: Re: [virtio-comment] RE: [PATCH V2 2/6] virtio: introduce SUSPEND bit in device status


On Tue, Nov 07, 2023 at 05:24:44PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 11/7/2023 4:33 PM, Michael S. Tsirkin wrote:
> > On Tue, Nov 07, 2023 at 04:21:13PM +0800, Zhu, Lingshan wrote:
> > > > > This can work, right?
> > > > Unfortunately no, as non atomic bitmap cannot reside in the host memory,
> > > as explained before, PCI and CPU supports atomic read/write. Please refer to
> > > PCI spec and CPU ISA.
> > I don't see how atomic read or write does anything useful here but maybe.
> Because the device writes the bitmap and the driver "reads and clears"
> the bitmap, so the ops need to be atomic, or they can race.
> > You need to explain how you are using atomics in your proposal then.
> Not sure we should say much about how atomics work; as explained above,
> the operations should be atomic to avoid race conditions or losing
> information.
> Like:
> 
> 1) Driver Read
> 2) Device Write
> 3) Driver Clear
> 
> Here we lost the bitmap information.

That's an unusual use of the term "race condition". But yes, you need
to spell out how the driver and device interact.
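
For illustration only, the interleaving described in the quoted text (a
read, then a device write, then a clear) can be replayed deterministically.
This is a minimal sketch, not anything from the proposal: it models both
sides as CPU code using C11 atomics, and the names dirty_bitmap,
device_mark_dirty, replay_split_read_then_clear and replay_exchange are
hypothetical. It says nothing about which atomic operations a PCI device
can actually perform on host memory.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical one-word dirty bitmap: the device sets bits,
 * the driver reads and clears them. */
static _Atomic uint64_t dirty_bitmap;

/* Device side (modelled on the CPU): mark one page dirty. */
void device_mark_dirty(unsigned page)
{
    atomic_fetch_or(&dirty_bitmap, UINT64_C(1) << page);
}

/* Driver reads and clears in two separate steps. Each step is atomic
 * on its own, but a device write landing between them is wiped unseen.
 * Returns every dirty bit the driver ever observed. */
uint64_t replay_split_read_then_clear(void)
{
    atomic_store(&dirty_bitmap, 0);

    device_mark_dirty(0);
    uint64_t seen = atomic_load(&dirty_bitmap); /* 1) read: sees page 0   */
    device_mark_dirty(1);                       /* 2) device write        */
    atomic_store(&dirty_bitmap, 0);             /* 3) clear: page 1 lost  */

    seen |= atomic_load(&dirty_bitmap);         /* next pass finds nothing */
    return seen;
}

/* Driver reads and clears in one atomic exchange. The device write
 * lands either before or after the exchange, so it is seen either
 * in this pass or in the next one. */
uint64_t replay_exchange(void)
{
    atomic_store(&dirty_bitmap, 0);

    device_mark_dirty(0);
    uint64_t seen = atomic_exchange(&dirty_bitmap, 0); /* read and clear  */
    device_mark_dirty(1);                              /* device write    */

    seen |= atomic_exchange(&dirty_bitmap, 0);         /* next pass       */
    return seen;
}
```

In this replay, replay_split_read_then_clear() reports only page 0, while
replay_exchange() reports both pages 0 and 1. The point is that individually
atomic reads and writes are not enough; the read and the clear have to be a
single read-modify-write operation.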

> > 
> > 
> > > > And whatever is in the device gets reset on device reset and/or FLR. So the dirty map detail is lost.
> > > > Similarly the device context is also lost on these two events triggered by guest.
> > > As we explained before, on reset the device should clear everything.
> > Then migration will corrupt memory. Not great.
> I think on reset the device should clear everything, therefore the driver
> should clear the legacy data as well; I don't see how that corrupts memory.

If you write data to memory, the CPU will observe it. If you then
migrate the CPU but not the memory, then CPU and memory state are
inconsistent. I am surprised I need to say that; maybe I misunderstand
the question.

> > 
> > 
> > 
> > > > > > > As you can see, the dirty page tracking facility has a PASID for
> > > > > > > isolation. But still, the question is, we should better use platform
> > > > > > > dirty page tracking
> > > > > > > 
> > > > > > Nothing to do with PASID, as PASID is owned by the guest.
> > > > > It looks like you don't know how PASID works.
> > > > > The host can set up PASID to isolate some facilities, right?
> > > > There are a few limitations with PASID.
> > > > a. Not all platforms have PASID and
> > > As we have explained many times, this is a basic facility,
> > > and the implementation is transport-specific.
> > > 
> > > We gave an example of a PCI implementation, and PCI supports PASID, right?
> > Yes it's a limitation but maybe one we can live with
> > for this feature.  It does mean that we might need solutions
> > for systems without this support. virtio use is not limited
> > to servers or high end systems.
> PASID was introduced years ago, and I know some vendors have implemented
> onboard IOMMUs that can also do isolation.

Introduced yes but when was it actually implemented? Do you know?

> 
> And this is a basic facility, the implementation is transport specific.

That's why if no one wants to support systems without PASID
this is, maybe, ok. But we know there are people who want this.

> > 
> > 
> > > > b. As I explained above, PASID does not always work, as PASID only bifurcates DMA, not the device _functionality_.
> > > With a PASID, a cap can be considered to be placed in another logical
> > > address space, which is not accessible to the guest.
> > > > c. PASID has to be available to the guest as-is, as present on the device
> > > The host hypervisor sets the PASID, transparently to the guest.
> > Lingshan, whenever people ask you a ton of questions in response to
> > your spec proposal, the response should not be to simply
> > answer on the mailing list and then repost without a lot of changes,
> > since spec readers will likely have questions exactly like these
> > and we cannot make them go and read this flame war.
> Well, I should say, I have repeated the same answers too many times.

Don't. Amend the spec proposal instead so readers don't have these
questions.

> > And frankly, most of this TC stopped following this thread a while ago,
> > it seems to be going nowhere.
> I still believe we should release the best-quality
> spec we can.
> > The response should be to add the explanation in the spec.
> > Look at Parav's live migration proposals with "theory of operation" chapters
> > for an example of how this can be done.
> I am not sure we should describe how PCI works in the virtio spec.
> But I can add "pasid for isolation"
> 
> These facilities are not only used for live migration;
> they can also be used for debugging, e.g. suspend and then read the vq state.

Maybe. Then you need to document what this state is.


> I can add more explanation in the cover letter


