[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status
On Fri, Aug 18, 2023 at 05:55:42PM +0800, Zhu, Lingshan wrote: > > > On 8/18/2023 12:04 AM, Stefan Hajnoczi wrote: > > On Thu, Aug 17, 2023 at 05:15:16PM +0200, Eugenio Perez Martin wrote: > > > On Tue, Aug 15, 2023 at 2:29âPM Stefan Hajnoczi <stefanha@redhat.com> wrote: > > > > On Tue, Aug 15, 2023 at 06:31:23PM +0800, Zhu, Lingshan wrote: > > > > > > > > > > On 8/14/2023 10:30 PM, Stefan Hajnoczi wrote: > > > > > > On Tue, Aug 15, 2023 at 03:29:00AM +0800, Zhu Lingshan wrote: > > > > > > > This patch introudces a new status bit in the device status: SUSPEND. > > > > > > > > > > > > > > This SUSPEND bit can be used by the driver to suspend a device, > > > > > > > in order to stablize the device states and virtqueue states. > > > > > > > > > > > > > > Its main use case is live migration. > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > > > > Signed-off-by: Eugenio PÃrez <eperezma@redhat.com> > > > > > > There is an character encoding issue in Eugenio's surname. > > > > > Oh, I copied his SOB form his email, I will copy from git log to fix this, > > > > > thanks for point out it. > > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > > > > > > > --- > > > > > > > content.tex | 18 ++++++++++++++++++ > > > > > > > 1 file changed, 18 insertions(+) > > > > > > This patch hints at the asynchronous nature of the SUSPEND bit (the > > > > > > driver must re-read the Device Status Field) but doesn't explain the > > > > > > rationale or any limits. > > > > > > > > > > > > For example, is there a timeout or should the driver re-read the Device > > > > > > Status Field forever? > > > > > It depends on the driver, normally we expect this operation can be done > > > > > successfully > > > > > like how the driver/device handles FEATURES_OK. > > > > > > > > > > Once failed due to: > > > > > 1) driver timeout, the driver can reset the device > > > > > 2) device failure, the device can set NEEDS_RESET. > > > > I mention this because SUSPEND involves quiescing the device so that no > > > > requests are in flight and that can take an unbounded amount of time on > > > > a virtio-blk, virtio-scsi, or virtiofs device. If the driver is busy > > > > waiting for the device to report the SUSPEND bit, then that could take a > > > > long time/forever. > > > > > > > > Imagine a virtiofs PCI device implemented in hardware that forwards I/O > > > > to a distributed storage system. If the distributed storage system has a > > > > request in flight then SUSPEND needs to wait for it to complete. The > > > > device has no control over how long that will take, but if it does not > > > > then corruption could occur later on. > > > > > > > > There are two issues with long SUSPEND times: > > > > 1. Busy wait CPU consumption. Since there is no interrupt that signals > > > > when the bit is set, the best a driver can do is to back off > > > > gradually and use timers to avoid hogging the CPU. > > > > 2. Synchronous blocking. If the call stack that led the driver to set > > > > SUSPEND is blocked until the device reports the SUSPEND bit, then > > > > other parts of the system could experience blocking. For example, the > > > > VMM might be blocked in a vhost ioctl() call, which makes the guest > > > > unresponsive. > > > > > > > I think all of this already happens with ring reset or even a plain > > > device reset, doesn't it? > > Yes, but keep in mind that the driver has typically already drained > > requests when it invokes reset. If SUSPEND is used transparently during > > live migration then there really will be requests in flight because the > > guest driver is unaware. > I agree, and as discussed before, I think in this series we should say: > the device MUST wait untilall descriptors that being processed to finish and > mark them as used. Sounds good. > Then in the following series, we should implement in-flight IO tracker. > > > > > In my opinion the best thing the device can do here is to fail the > > > request after a certain time, the same way it would fail if the > > > backend distributed storage system gets disconnected or latency gets > > > out of bounds. > > In order to prevent corruption there needs to be a fence in addition to > > a timeout. In other words, the storage backend needs to guarantee that > > any requests sent before the fence will be ignored if they are still > > encountered. > Please correct me if I misunderstand anything. > > I am not sure a remote target is aware of SUSPEND, but the device can > fail SUSPEND by setting NEEDS_RESET for sure. IN this series, > maybe it is best to flush and wait for the IO requests. Yes, it is necessary to wait for pending I/O. I was pointing out that timeouts implemented inside the storage initiator (client) are unsafe. The storage target (server) needs to participate in stopping in-flight I/O one way or another (timeout, cancellation, fence, etc). Stefan
Attachment:
signature.asc
Description: PGP signature
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]