OASIS Mailing List Archives

 



virtio-comment message



Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state




On 9/26/2023 2:03 PM, Parav Pandit wrote:

From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Tuesday, September 26, 2023 11:07 AM
1. cover letter is missing the problem statement and use case
I am replying only to this section of comments. This does not mean I agree with
your other statements; instead, I agree with Jason's replies to you.

In my cover letter:
"The main usecase of these new facilities is Live Migration."

:)
A two-word phrase does not explain the use case, or why it asks for mediating a native virtio device.
This solution works with fundamental virtualization: trap and emulate, just like other virtio config space
fields.

Do you know how device status or vq_enable works? I suggest reading the QEMU code.

And yet you call it basic facilities.
Anyway, the mismatch between the email response and the cover letter is evident.
No, I don't. They are basic facilities, as you can see; this is loud and clear.

Did you miss it?
No.
It misses the detail that I described in the theory of operation in [1].

[1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00061.html
Details are in the following patch in this series

2. Why are both queue suspend and device suspend introduced? Only one
should be there. The design description is missing.
There is no queue suspend; they are device suspend and vq state accessors.
Please read the patch if you want to comment.
3. Even though it is claimed to be some random basic facility, the cover letter clearly
states the main use case is "live migration".
It is not random; they are precisely defined virtio basic facilities.
4. Patch 4 is not needed at all. When device is suspended, it is _suspended_. It
does not do any bifurcation.
The device should not accept vq reset and the driver should reset vqs. Please
read the previous discussions with MST, and please don't ignore the conclusions.
5. The suspend bit of patch 2 alone is not enough to cover P2P. One needs both suspend
and freeze, which are covered in series [1].
We have discussed this many times: P2P is outside the virtio spec. Do you want to
mediate every PCI state/functionality?
6. Finally, the whole description of 1 to 4 needs to be split out as device
operations, so that both passthrough and mediation can utilize it, using admin
cmd or otherwise.
Do you see any reason this solution cannot be used for passthrough and
mediation?
Right. The proposed solution does not meet the following requirements addressed in [1].

[1] https://lore.kernel.org/virtio-comment/20230906081637.32185-1-lingshan.zhu@intel.com/T/#m7efbaadbc73f033c2793d9eb1eb0afa210aae4be

[1] I replied to Jason in previous email.
I will repeat here. They are covered in [1].

1. Missing P2P support
As I asked before, please don't ignore this; please answer:
"We have discussed this many times: P2P is outside the virtio spec.
Do you want to mediate every PCI state/functionality?"
2. Missing dirty page tracking
This will be included in V2. As we have repeated many times,
we want this series to be small and focused; that is why dirty page tracking
and in-flight descriptors are not here, but they will be in V2.
3. Incremental device context framework for short downtime
Do you observe significant downtime in QEMU/vhost?

Why do you think this series can introduce more downtime?

Do you know this series can re-use QEMU/vhost?

Have you really read QEMU live migration code?

Jason has already suggested that you read it.
3.a Ability to do inflight descriptor tracking
As I have told you many times, in V2.
4. Ability to do the work for multiple member devices in parallel.
As I have told you many times, they are per-device facilities, for example
per-VF; that means each VF is migrated by its own facilities.

Is that clear enough for you?


Or does features_OK work for passthrough or mediation? Any difference?
It does not work. Passthrough devices are not trapped by the hypervisor.
Really? Features_ok does not work for passthrough? Seriously?

Since Zhu said that dirty tracking and inflight descriptors will be done, I
presume he will propose doing them over the admin queue or command interface.
And since all can run over the admin commands, the plumbing done in 1 to 4
can be done using admin commands.
No
Such a negative assertion does not help.
Explain the why, as I did above.
OK, we can repeat:

Again! They are self-contained basic facilities; they had better not depend on others like the admin vq.

And please refer to previous discussions, where Jason and I pointed out that the admin vq is not a qualified solution for live migration because of: 1) nested, 2) bare-metal LM, 3) QoS, 4) security.

We don't want to repeat the discussions; it looks like an endless circle with no direction.

So far we have not established that creating yet another DMA interface is
better than the queue interface.
So ...
To me, both methods will start looking more converged over admin
commands and queues.
I don't think so. Again, we are introducing basic facilities, and these facilities
don't depend on or rely on the admin vq.
If so, remove the words "live migration" from the cover letter.
Reliance on admin commands (again, not the vq; be careful what you constantly claim).
Whether or not something relies on the admin queue or admin commands does not determine whether it is a basic facility.
Why are admin commands a must? These facilities are self-contained, right?

Admin commands and queues are already in the basic facilities section today.
So claiming that using admin commands makes something a non-basic facility is not correct.
Still, why do you think admin commands are a must? It is clear that this proposal can work without the admin vq,
and even better!

Passthrough will use them over the owner device.
Mediation somehow needs to do it over the member device.
Mediation will not use any device suspend command because it needs to
keep bisecting everything.
Please read the QEMU vhost live migration solution.
Can you please share the pointer to it?

I am familiar with [2], and it does not require a device suspend flow, as things are bisected.

[2] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#introduction
If so, I believe you may find that this solution can work perfectly with vhost, right?


