OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status


On Fri, Aug 18, 2023 at 05:55:42PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/18/2023 12:04 AM, Stefan Hajnoczi wrote:
> > On Thu, Aug 17, 2023 at 05:15:16PM +0200, Eugenio Perez Martin wrote:
> > > On Tue, Aug 15, 2023 at 2:29âPM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > > On Tue, Aug 15, 2023 at 06:31:23PM +0800, Zhu, Lingshan wrote:
> > > > > 
> > > > > On 8/14/2023 10:30 PM, Stefan Hajnoczi wrote:
> > > > > > On Tue, Aug 15, 2023 at 03:29:00AM +0800, Zhu Lingshan wrote:
> > > > > > > This patch introudces a new status bit in the device status: SUSPEND.
> > > > > > > 
> > > > > > > This SUSPEND bit can be used by the driver to suspend a device,
> > > > > > > in order to stablize the device states and virtqueue states.
> > > > > > > 
> > > > > > > Its main use case is live migration.
> > > > > > > 
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > Signed-off-by: Eugenio PÃrez <eperezma@redhat.com>
> > > > > > There is an character encoding issue in Eugenio's surname.
> > > > > Oh, I copied his SOB form his email, I will copy from git log to fix this,
> > > > > thanks for point out it.
> > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > ---
> > > > > > >    content.tex | 18 ++++++++++++++++++
> > > > > > >    1 file changed, 18 insertions(+)
> > > > > > This patch hints at the asynchronous nature of the SUSPEND bit (the
> > > > > > driver must re-read the Device Status Field) but doesn't explain the
> > > > > > rationale or any limits.
> > > > > > 
> > > > > > For example, is there a timeout or should the driver re-read the Device
> > > > > > Status Field forever?
> > > > > It depends on the driver, normally we expect this operation can be done
> > > > > successfully
> > > > > like how the driver/device handles FEATURES_OK.
> > > > > 
> > > > > Once failed due to:
> > > > > 1) driver timeout, the driver can reset the device
> > > > > 2) device failure, the device can set NEEDS_RESET.
> > > > I mention this because SUSPEND involves quiescing the device so that no
> > > > requests are in flight and that can take an unbounded amount of time on
> > > > a virtio-blk, virtio-scsi, or virtiofs device. If the driver is busy
> > > > waiting for the device to report the SUSPEND bit, then that could take a
> > > > long time/forever.
> > > > 
> > > > Imagine a virtiofs PCI device implemented in hardware that forwards I/O
> > > > to a distributed storage system. If the distributed storage system has a
> > > > request in flight then SUSPEND needs to wait for it to complete. The
> > > > device has no control over how long that will take, but if it does not
> > > > then corruption could occur later on.
> > > > 
> > > > There are two issues with long SUSPEND times:
> > > > 1. Busy wait CPU consumption. Since there is no interrupt that signals
> > > >     when the bit is set, the best a driver can do is to back off
> > > >     gradually and use timers to avoid hogging the CPU.
> > > > 2. Synchronous blocking. If the call stack that led the driver to set
> > > >     SUSPEND is blocked until the device reports the SUSPEND bit, then
> > > >     other parts of the system could experience blocking. For example, the
> > > >     VMM might be blocked in a vhost ioctl() call, which makes the guest
> > > >     unresponsive.
> > > > 
> > > I think all of this already happens with ring reset or even a plain
> > > device reset, doesn't it?
> > Yes, but keep in mind that the driver has typically already drained
> > requests when it invokes reset. If SUSPEND is used transparently during
> > live migration then there really will be requests in flight because the
> > guest driver is unaware.
> I agree, and as discussed before, I think in this series we should say:
> the device MUST wait untilall descriptors that being processed to finish and
> mark them as used.

Sounds good.

> Then in the following series, we should implement in-flight IO tracker.
> > 
> > > In my opinion the best thing the device can do here is to fail the
> > > request after a certain time, the same way it would fail if the
> > > backend distributed storage system gets disconnected or latency gets
> > > out of bounds.
> > In order to prevent corruption there needs to be a fence in addition to
> > a timeout. In other words, the storage backend needs to guarantee that
> > any requests sent before the fence will be ignored if they are still
> > encountered.
> Please correct me if I misunderstand anything.
> 
> I am not sure a remote target is aware of SUSPEND, but the device can
> fail SUSPEND by setting NEEDS_RESET for sure. IN this series,
> maybe it is best to flush and wait for the IO requests.

Yes, it is necessary to wait for pending I/O.

I was pointing out that timeouts implemented inside the storage
initiator (client) are unsafe. The storage target (server) needs to
participate in stopping in-flight I/O one way or another (timeout,
cancellation, fence, etc).

Stefan

Attachment: signature.asc
Description: PGP signature



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]