[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
On 9/26/2023 11:40 AM, Parav Pandit wrote:
I only reply to this section of comments, this does not mean I agree with you on your other statements, Instead I agree withFrom: Jason Wang <jasowang@redhat.com> Sent: Tuesday, September 26, 2023 8:16 AM On Mon, Sep 25, 2023 at 6:41âPM Parav Pandit <parav@nvidia.com> wrote:From: Jason Wang <jasowang@redhat.com> Sent: Friday, September 22, 2023 8:38 AMDevice context has no overlap.I can give you one example, e.g debugging.Almost every feature needs debugging. :) So I am omitting it for time being.Well, I don't think so. We have a lot of handy tools for that (ethtool -d?).Sure add something specific for debug and explicitly mention that it is for debug like -d. Every feature and functionality needs debug, not specifically device context. So add infra for debug. Device migration series is not the vehicle to piggy back on.1) define virtqueue state, inflight descriptors in the section of basic facility but not under the admin commandsIt will be part of the device context such a way that so that one can only readthe vq state only instead of full device context.This will work.I'm not sure what it looks like, but I think they are well decoupled in this series. E.g driver can choose to just read e.g last_avail_idx and report to ethtool or watchdog.Once its done it will be visible how it looks like. The key is it needs to cover BOTH use cases.As I replied in other thread, I see several problems: 1) layer violation, PCI specific state were mixed into the basic facilitiesAfter agreeing to see merged donctext, now you are hinting that you donât agree to merge two. I disagree if you are leaning towards that direction. I hope my deduction from your above comment is incorrect. There is no violation. PCI specific device context will be captured in PCI specific section. Device type contexts will be captured in those device type sections. TLVs will cover many of the device context information.2) I don't see a good definition on "device context" 3) TLV works fine for queue but not registersPlease definition of device context in [1]. [1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00064.htmlWhat needs to be done first is to describe what device context means and what it contains. Not the actual data structure since it may vary.Sure, it is already defined in device migration theory of operation section in [2]. I will try to take it out and put in device management section, so that device migration can refer to it and Some other basic facility also can refer to it (which must need to explain a use case beyond silly debug point). [2] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00062.html3) define transport specific interfaces or admin commands to access themAs you also envisioned, it is done using admin commands to access it.That's fine, but it should allow other ways.For passthrough admin commands fits the design. Sure, you need to draft it how to do other ways.. > >Does this work? It seems you refused such steps in the past.I didnât. There must be some confusion in many emails we both exchanged, becausealready posted v0 has it split such as device context.For dirty page tracking I couldnât find a solid use case without devicemigration, so I asked which you already replied above. First, you never explain why such coupling gives us any benefit.It is not coupled. Device migration uses this facility. So it is matter of text organization in the spec. Not the design.Second, I've given you sufficient examples but you tend to ignore them. Why not go through Qemu codes then you will see the answer.You gave example of debug, and profiling. You didnât explain the use case of how to actually connect and how to profile etc.Actually, I would like to leave 2) as it's very complicated which might not converge easily.I will split current "device migration" section to two. 1. device management 2. device migration Device management covers device mode, device context and dirty pagetracking. I don't see any connection between "management" and "device context".You have better name than device management? May be device operation. May be it is just better to have the device context just the way it is in [1] under basic facility.Device migration refers to device management section. We can omit the dirty page tracking commands for a moment and first closeon the device mode and device context as first step.Since it is already part of v0 and it is needed, I will keep it in subsequent v1 butmoved to device management section. Could you please answer what's wrong with the first 4 patches in this series?1. cover letter is missing the problem statement and use case
Jason on his replies to you. In my cover letter: "The main usecase of these new facilities is Live Migration." Did you miss it?
there are no queue suspend, they are device suspend and vq state accessors. please read the patch if you want to comment.2. why queue suspend and both device suspend are introduced, only one should be there. The design description is missing.
3. Even though it claims under some random basic facility, cover letter clearly states the main use case is "live migration".
it is not random, they are precisely defined virtio basic facilities.
The device should not accept vq reset and the driver should reset vqs, please read previous discussions with MST4. Patch 4 is not needed at all. When device is suspended, it is _suspended_. It does not do any bifurcation.
and please don't ignore the conclusions
we have discussed this for many times, P2P is out of virtio spec, do you want to mediate every PCI state/functionality?5. only suspend bit of patch2 is not enough to cover P2P. One needs suspend and freeze both covered in series [1].
Do you see any reasons this solution can not be used for passthrough and mediation?6. Finally the whole description of 1 to 4 need to be split in the device operation, so that both passthrough and medication can utilize it using admin cmd and otherwise.
Or does features_OK work for passthrough or mediation? Any difference?
Since Zhu, told that dirty tracking and inflight descriptors will be done, I presume he will propose to do over admin q or command interface. And since all can run over the admin commands, the plumbing done in 1 to 4 can be made using admin commands.
No
I don't think so, again, we are introducing basic facilities and these facilities don'tUntil now we could not establish creating yet another DMA interface that is better than q interface. So ... To me both the methods will start looking more converged to me over admin command and queues.
depend on or rely on admin vq.
Passthrough will use them over owner device. Mediation somehow need to do over member device. Mediation will not use any device suspend command because it needs to keep bisecting everything.
please read QEMU vhost live migration solution
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]