OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] Re: [PATCH v1 3/8] device-context: Define the device context fields for device migration


On Wed, Nov 01, 2023 at 06:47:57AM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, November 1, 2023 12:07 PM
> > 
> > On Wed, Nov 01, 2023 at 05:42:56AM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Wednesday, November 1, 2023 11:01 AM
> > > >
> > > > On Wed, Nov 01, 2023 at 02:54:47AM +0000, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Sent: Tuesday, October 31, 2023 3:44 PM
> > > > > >
> > > > > > On Tue, Oct 31, 2023 at 05:42:29PM +0800, Zhu, Lingshan wrote:
> > > > > > > > Your answer is not relevant to this discussion at all.
> > > > > > > > Why?
> > > > > > > > Because we were discussing the schemes where registers are not
> > used.
> > > > > > > > One example of that was IMS. It does not matter MSI or MSIX.
> > > > > > > > As explained in Intel's commit message, the key to focus for
> > > > > > > > IMS is "queue
> > > > > > memory" not some hw register like MSI or MSI-X.
> > > > > > > you know the device always need to know a address and the data
> > > > > > > to send a MSI, right?
> > > > > >
> > > > > > So if virtio is to use IMS then we'll need to add interfaces to
> > > > > > program IMS, I think. As part of that patch - it's reasonable to
> > > > > > assume - we will also need to add a way to retrieve IMS so it
> > > > > > can be
> > > > migrated.
> > > > > >
> > > > > > However, what this example demonstrates is that the approach
> > > > > > taken by this proposal to migrate control path structures -
> > > > > > namely, by defining a structure used just for migration - means
> > > > > > that we will need to come up with a migration interface each time.
> > > > > > And that is unfortunate.
> > > > > >
> > > > > When the device supports a new feature it has supported new
> > functionality.
> > > > > Hence the live migration side also got updated.
> > > > > However, the live migration driver does not have to understand
> > > > > what is inside
> > > > the control path structures.
> > > > > It is just byte stream.
> > > > > Only if the hypervisor live migration drive involved in emulating,
> > > > > it will parse
> > > > and that is fine as like other control structures.
> > > >
> > > > The point is that any new field needs to be added in two places now
> > > > and that is not great at all.
> > > >
> > > Most control structs are well defined. So only its type field is added to
> > migrating driver side.
> > > This is very low overhead field and handled in generic way for all device types
> > and for all common types.
> > 
> > Weird, not what I see.  E.g. you seem to have a structure duplicating queue
> > fields. Each new field will have to be added there in addition to the transport.
> > 
> Didn't follow.
> Which structure is duplicated?
> PCI does not even have a structure for q configuration.

So this:

+struct virtio_dev_ctx_pci_vq_cfg {
+        le16 vq_index;
+        le16 queue_size;
+        le16 queue_msix_vector;
+        le64 queue_descÍ
+        le64 queue_driverÍ
+        le64 queue_deviceÍ
+};


duplicates a bunch of fields from this:

struct virtio_pci_common_cfg {
        /* About the whole device. */
        __le32 device_feature_select;   /* read-write */
        __le32 device_feature;          /* read-only */
        __le32 guest_feature_select;    /* read-write */
        __le32 guest_feature;           /* read-write */
        __le16 msix_config;             /* read-write */
        __le16 num_queues;              /* read-only */
        __u8 device_status;             /* read-write */
        __u8 config_generation;         /* read-only */

        /* About a specific virtqueue. */
        __le16 queue_select;            /* read-write */
        __le16 queue_size;              /* read-write, power of 2. */
        __le16 queue_msix_vector;       /* read-write */
        __le16 queue_enable;            /* read-write */
        __le16 queue_notify_off;        /* read-only */
        __le32 queue_desc_lo;           /* read-write */
        __le32 queue_desc_hi;           /* read-write */
        __le32 queue_avail_lo;          /* read-write */
        __le32 queue_avail_hi;          /* read-write */
        __le32 queue_used_lo;           /* read-write */
        __le32 queue_used_hi;           /* read-write */
};


Except it's incomplete and I suspect that's actually a bug.


Here's an idea: have a record per field. Use transport offsets
as tags.




> > > > We need a stronger compatiblity story here I think.
> > > >
> > > > One way to show how it's designed to work would be to split the
> > > > patches. For example, add queue notify data and queue reset separately.
> > > I didn't follow the suggestion. Can you explain splitting patches and its relation
> > to the structure?
> Did I miss your response? Or it is in below msix?

exactly.

> > >
> > > >
> > > > Another is to add MSIX table migration option for when MSIX table is
> > > > passed through to guest.
> > > Yes, this will be added in future when there is actual hypervisor for it.
> > 
> > You are tying the architecture to an extremely implementation specific detail.
> >
> Which part?
>  
> Not really. can you please which software will use MSI-X table migration?

Like, all and any? All hypervisors migrate the msi-x table.

> > Hypervisors *already* have migrate the MSIX table. Just in a hypervisor specific
> > way. 
> MSI-X table is in the PCI BAR memory.

Exactly. And that means hypervisor should not read it from the device
directly - e.g. with an encrypted device it won't be able to.

> > queue vector is an index into this table. So the index is migrated through
> > the device but the table itself has to be trapped and emulated by hypervisor?
> Do you have a hypervisor and a platform that has not done MSI-X table emulation for which you are asking to add?
> I don't know any.
> I am asking to add it when there is a _real_ user of it. Why it cannot be added when the _real_ user arrive?

Real meaning actual hardware and software implementing it?  By this
definition there's no real user for migration in the spec at all - all
of it can be done by device specific means. What's your point? By now we
have a reasonably good idea what hypervisors need to make migration
portable. Let's either put all of it in the spec or not bother at all.

In other words, we need a bright line. I suggest a simple one: memory is
migrated by device, config space by hypervisor.  If not, suggest another
one - but it needs a reasonable rule based on a hardware not whatever
software found expedient to use.


-- 
MST



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]