virtio-comment message

Subject: Re: [virtio-comment] Re: [PATCH v1 3/8] device-context: Define the device context fields for device migration




On 10/25/2023 4:33 PM, Michael S. Tsirkin wrote:
On Tue, Oct 24, 2023 at 06:27:04PM +0800, Zhu, Lingshan wrote:

On 10/23/2023 7:32 PM, Michael S. Tsirkin wrote:

     On Mon, Oct 23, 2023 at 06:03:10PM +0800, Zhu, Lingshan wrote:

         Config space, MMIO, and registers have worked for years; what is wrong with them?

     Nothing as such. They don't seem to be appropriate for all use-cases
     where people want to utilize virtio. I think a new transport
     will be needed to address these.

A new transport for new types of devices, for sure, like transport vq for SIOV.

I agree admin vq or admin cmds are useful in some use cases, but that is
another story and should be decided case by case.

For now, let's not talk about all use cases, just the current task: live
migration.

So IMHO, I still think we should use config space registers to control the live
migration process.


No, because it forces integrating the migration process with the device driver.
Which is ok for some use-cases but not all of them.  Find some other
control plane for this.

What interesting migration process? It needs to work with the host driver anyway.


                 Config space is control path, DMA is data-path, let's better not mix them,
                 we never expect to use config space to transfer data.

                 So we need DMA to transfer data. For example, I take advantage of device DMA
                 to log dirty pages. This also applies to in-flight descriptors.

             As long as you do, I personally see little benefit in retrieving parts of
             the state with memory mapped accesses.

         Registers only control, and I personally believe a single register is much
         better than processing admin commands: more light-weight, more reliable,
         and it has worked for years.

     Yea. It would be, if we could do everything through that register.
     But we can't really. Migration has too much data to pass around
     for that to be reasonable.

Data is not transferred by registers; they only control.

We transfer data by DMA: the device DMA-writes dirty page information (a bitmap)
to a host-isolated memory region.


If you do that then I don't see any reason not to use admin
commands for that - either through a vq or a simpler
interface.

I mean device DMA writing a bitmap to report dirty pages.
Why do you want admin cmds doing that?
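
To make the dirty page bitmap under discussion concrete, here is a minimal sketch of how such a bitmap could be filled and read back. It is only an illustration: the thread does not define a bitmap layout, so the 4 KiB page size, the one-bit-per-page encoding, and the names mark_dirty and dirty_pages are all assumptions.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, one bit per guest page


def mark_dirty(bitmap, guest_addr, length):
    """Set the bit of every page touched by a write of `length` bytes.

    Hypothetical helper: stands in for what a device would do when it
    DMA-writes into guest memory while dirty logging is enabled.
    """
    if length <= 0:
        return
    first = guest_addr >> PAGE_SHIFT
    last = (guest_addr + length - 1) >> PAGE_SHIFT
    for pfn in range(first, last + 1):
        bitmap[pfn // 8] |= 1 << (pfn % 8)


def dirty_pages(bitmap):
    """Return the page frame numbers whose bits are set (hypervisor side)."""
    return [
        byte_idx * 8 + bit
        for byte_idx, byte in enumerate(bitmap)
        for bit in range(8)
        if byte & (1 << bit)
    ]


bitmap = bytearray(4)             # tracks 32 pages in this toy example
mark_dirty(bitmap, 0x1FF0, 0x20)  # a write that straddles pages 1 and 2
```

Whichever control interface is chosen (a register or an admin command), it would only need to tell the device where this bitmap lives and when to start or stop logging; the bitmap itself is written by DMA.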


         Config space interfaces are fundamental for virtio-pci.


     They are in fact fundamental to virtio. Multiple transports to
     use config space are also fundamental.

I agree. So I also agree to build an admin vq live migration solution on top of our
basic facilities, as Jason once proposed.

I'm not sure it's even a vq. I suggest a minimal interface to send
admin commands. Could be used by migration, as transport, and more.

So you still need to explain why admin cmds are better than registers for
this live migration task.

Or are you saying admin cmds are better than config space? I am not sure I
agree with that statement.



                 And we are implementing virtio live migration, not only for PCI.

                 So both Jason and I keep repeating: we are implementing basic facilities,
                 and the implementation is transport specific.

             But the register based facilities you proposed are extremely limited and
             seem to only work for migration. For example, it seems mostly useless for
             debugging because retrieving state is rather complex and would
             interfere with normal working of the device.

         If you want to prove that the register control interfaces are far more
         limited than admin vq or admin cmds, you are also proving that config
         space registers are far more limited than admin vq.

     Yes. Migration needs ability to pass large amounts of data around, and
     is too complex a functionality to work reliably without ability to
     report errors.

What errors? Errors during device DMA?
Missing some dirty pages? If the device can detect such errors, it can recover
by itself; otherwise, how can the driver fix this?

Not just pages, there's a lot of internal device state.

You fix for example by reporting that state does not work
for a current device, and guest can be restarted on migration
source.

If the re-read fails, the status check has failed, so either recover or fail the migration.
If the migration fails, for example on a timeout, then resume.

If the guest restarts, the hypervisor is aware of this process, and of the device reset as well.
What is wrong with this process?


For the control path, virtio has used re-read for many years and it works well.

Let's not even get started with how live migration currently "works
well".  I happen to be familiar with it intimately.  We tried to
maintain migration compatibility as best we could and we tend to break it
every second release.

I mean re-read to check, like re-reading the device status to make sure the device is
suspended, the same way virtio handles FEATURES_OK. This "re-read" works well.
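
For reference, the FEATURES_OK re-read pattern can be sketched roughly as below. This is a toy model, not spec text: FakeDevice and negotiate are hypothetical names, and only the status bit values and the "set FEATURES_OK, then re-read the status" step come from the virtio specification. Checking a suspend bit the same way would depend on the proposal in this series and is not modelled here.

```python
# Device status bits as defined by the virtio specification.
ACKNOWLEDGE = 1
DRIVER = 2
DRIVER_OK = 4
FEATURES_OK = 8


class FakeDevice:
    """Hypothetical stand-in for a device's status register and feature bits."""

    def __init__(self, supported_features):
        self.supported = supported_features
        self.driver_features = 0
        self.status = 0

    def write_status(self, value):
        # A device that cannot accept the driver's feature subset
        # leaves FEATURES_OK clear instead of raising an error.
        if value & FEATURES_OK and (self.driver_features & ~self.supported):
            value &= ~FEATURES_OK
        self.status = value

    def read_status(self):
        return self.status


def negotiate(dev, wanted_features):
    """Write the features, set FEATURES_OK, then re-read status to confirm."""
    dev.write_status(ACKNOWLEDGE | DRIVER)
    dev.driver_features = wanted_features
    dev.write_status(dev.read_status() | FEATURES_OK)
    # The re-read is the only feedback channel: the bit either stuck or it did not.
    return bool(dev.read_status() & FEATURES_OK)
```

Note that the re-read yields a single yes/no answer; whether that is enough error reporting for migration is exactly the point being argued in this thread.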



I believe we have gone through this issue before.


         So the question is still here: do you want to replace the current virtio-pci
         common cfg with admin vq or admin cmds?

     I think we need to add a new transport that will use admin commands.
     Which one to use would be up to a specific device.

For a new device type like SIOV, yes, we need a new transport: transport vq.

Let's focus on this live migration feature. If new features in the future
require admin vq, let's discuss them when they are proposed.




         And debug what? If you want to introduce more functionality, we should
         discuss it case by case.

         If it is debugging vq state, that is as easy as reading queue_size. I don't
         see the limitations, as queue_size has worked for years.

     No one reads queue_size. In fact for years we didn't have any debugging
     functionality and we are fine. If we are adding it, it really needs to
     be accessible when driver and device are wedged.

OK, I don't object to implementing new device debugging features.

But let's focus on the current live migration task.




         I still believe our goal is to do our best, with our capabilities, to build
         the best virtio spec we can. Not other goals.

         Thanks
         Zhu Lingshan



                 We have proposed to build admin vq based on our register solution; this can
                 even help to resolve the nested issue.

                 But I see the proposal has been rejected.

                 I still believe the goal is to build the best spec, not one that "just can work" with
                 limitations.






This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/



