Subject: Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
On Mon, Apr 03, 2023 at 06:00:13PM -0400, Parav Pandit wrote:
>
>
> On 4/3/2023 5:04 PM, Michael S. Tsirkin wrote:
> > On Mon, Apr 03, 2023 at 08:25:02PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Monday, April 3, 2023 2:02 PM
> > >
> > > > > Because vqs involve DMA operations.
> > > > > It is left to the device implementation to do it, but a generic wisdom
> > > > > is not implement such slow work in the data path engines.
> > > > > So such register access vqs can/may be through firmware.
> > > > > Hence it can involve a lot higher latency.
> > > >
> > > > Then that wisdom is wrong? tens of microseconds is not workable even for
> > > > ethtool operations, you are killing boot time.
> > > >
> > > Huh.
> > > What ethtool latencies have you experienced? Number?
> >
> > I know an order of tens of eth calls happens during boot.
> > If as you said each takes tens of ms then we are talking close to a second.
> > That is measurable.
> I said it can take, doesn't have to be always same for all the commands.
> Better to work with real numbers. :)
>
> Let me take an example to walk through.
>
> If a cvq or aq command takes 0.5msec, total of 100 such commands will take
> 50msec.
>
> Once a while if two of commands say take 5msec, will result in 50 -> 60
> msec. Not too bad.

then it seems it should not be a problem to tunnel config over AQ then?

>
> > OK then. Then if it is a dead end then it looks weird to add a whole new
> > config space as memory mapped.
> >
> I am aligned with you to not add any new register as memory mapped for 1.x.
> Or access through device own's tvq is fine if such q can be initialized
> before during device reset (init) phase.
>
> I explained that legacy registers are sub-set of existing 1.x.
> They should not consume extra memory.
>
> Lets walk through the merits and negatives of both to conclude.
>
> > > > Let me try again.
>
> > If hardware vendors do not want to bear the costs of registers then they
> > will not implement devices with registers, and then the whole thing will
> > become yet another legacy thing we need to support. If legacy emulation
> > without IO is useful, then can we not find a way to do it that will
> > survive the test of time?
> legacy_register_transport_vq for VF can be a option, but not for PF
> emulation.

OK. Do we really care? Are you guys selling lots of high end cards
without SRIOV that it matters?

> More below.
>
> >
> > > Again, I want to emphasize that register read/write over tvq has merits with trade-off.
> > > And so the mmr has merits with trade-off too.
> > >
> > > Better to list them and proceed forward.
> > >
> > > Method-1: VF's register read/write via PF based transport VQ
> > > Pros:
> > > a. Light weight registers implementation in device for new memory region window
> >
> > Is that all? I mentioned more.
> >
> b. device reset is more optimal with transport VQ
> c. a hypervisor may want to check (but not necessary) register content
> d. Some unknown guest VM driver which modifies mac address and still expect
> atomicity can benefit if hypervisor wants to do extra checks

It's not hard to be more specific. Old Linux kernels are like this,
this was fixed with:

commit 7e58d5aea8abb993983a3f3088fd4a3f06180a1c
Author: Amos Kong <akong@redhat.com>
Date:   Mon Jan 21 01:17:23 2013 +0000

    Currently we write MAC address to pci config space byte by byte,
    this means that we have an intermediate step where mac is wrong.
    This patch introduced a new control command to set MAC address,
    it's atomic.

about 10 years ago.

> > > Cons:
> > > a. Higher DMA read/write latency
> > > b. Device requires synchronization between non legacy memory mapped registers and legacy regs access via tvq
> >
> > Same as a separate memory bar really. Just don't do it. Either access
> > legacy or non legacy.
> >
> It is really not same to treat them equally as tvq encapsulation is
> different, and hw wouldn't prefer to treat them equally like regular memory
> writes.

I think you misunderstand what I said. You listed a problem:
the same device can be accessed through both a modern and a legacy
interface.
I said that it is not a problem at all, there is no reason to use both.

> Transitional device exposed by hypervisor contains both legacy I/O bar and
> also the memory mapped registers. So a guest vm can access both.

But it must not, and some devices break if you do.

> > > c. Can only work with the VF. Cannot work for thin hypervisor, which can map transitional PF to bare metal OS
> > > (also listed in cover letter)
> >
> > Is that a significant limitation? Why?
> It is a functional limitation for the PF, as PF has no parent.
> and PF can also utilize memory BAR.

Yes it's a limitation, I just don't see why we care.

> >
> > > Method-2: VF's register read/write via MMR (current proposal)
> > > Pros:
> > > a. Device utilizes the same legacy and non-legacy registers.
> >
> > > b. an order of magnitude lower latency due to avoidance of DMA on register accesses
> > > (Important but not critical)
> >
> > And no cons? Even if you could not see them yourself did I fail to express myself to such
> > an extent?
> >
> Method-1 pros covered the advantage of it over method-2, but yes worth to
> list here for completeness.
>
> Cons:
> requires creating new memory region window in the device for configuration
> access

Parav please take a look at the discussion so far and collect more cons
that were mentioned for the proposal, I definitely listed some
and I don't really want to repeat myself. I expect a proposal to be
balanced, not a sales pitch.

> > > > > No. Interrupt latency is in usec range.
> > > > > The major latency contributors in msec range can arise from the device side.
> > > >
> > > > So you are saying there are devices out there already with this MMR hack
> > > > baked in, and in hardware not firmware, so it works reasonably?
> > > It is better to not assert a solution a "hack",
> >
> > Sorry if that sounded offensive. a hack is not necessarily a bad thing.
> > It's a quick solution to a very local problem, though.
> >
> It is a solution because device can do at near to zero extra memory for
> existing registers.
> Anyways, we have better technical details to resolve. :)
> Lets focus on it.
>
> > Yes motivation is one of the things I'm trying to work out here.
> > It does however not help that it's an 11 patch strong patchset
> > adding 500 lines of text for what is supposedly a small change.
> >
> Many of the patches are rework and incorrect to attribute to the specific
> feature.
>
> Like others it could have been one giant patch... but we see value in
> smaller patches..
>
> Using tvq is even bigger change than this.

The main thing is that there's no new ID so the PF device itself
will stay usable with existing drivers.

> So we shouldn't be afraid of
> making transitional device actually work using it with larger spec patch.
>
> > > Regarding tvq, I have some idea on how to improve the register read/writes so that its optimal for devices to implement.
> >
> > Sounds useful, and maybe if tvq addresses legacy need then focus on
> > that?
> >
>
> tvq specific for legacy register access make sense.
> Some generic tvq is abstract and dont see any relation here.
>
> So better to name it as legacy_reg_transport_vq (lrt_vq).

Again this assumes tvq will be rewritten on top of AQ.
I guess legacy can then become a new type of AQ command?
And maybe you want a memory mapped register for AQ commands?
I know Jason really wanted that.

> How about having below format?
>
> /* Format of 16B descriptors for lrt_vq
>  * lrt_vq = legacy register transport vq.
>  */
> struct legacy_reg_req_vf {
> 	union {
> 		struct {
> 			le32 reg_wr_data;
> 			le32 reserved;
> 		} write;
> 		struct {
> 			le64 reg_read_addr;
> 		};
> 	};
> 	le8 rd_wr : 1; /* rd=0, wr=1 */
> 	le8 reg_byte_offset : 7;
> 	le8 req_tag; /* unique request tag on this vq */
> 	le16 vf_num;
>
> 	le16 flags; /* new flag below */
> 	le16 next;
> };
>
> #define VIRTQ_DESC_F_Q_DEFINED 8
> /* Content of the VQ descriptor other than flags field is VQ
>  * specific and defined by the VQ type.
>  */

Any way to allow accesses of arbitrary length?

-- 
MST
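
For reference, a minimal C sketch of how the proposed 16-byte lrt_vq
descriptor could be encoded on the wire. It is an illustration only, not
part of the proposal. Assumptions: rd_wr occupies bit 0 of byte 8 and
reg_byte_offset bits 1..7 (the usual little-endian bitfield allocation,
but the proposal does not say), all multi-byte fields are stored
little-endian, and the helper names (put_le16, lrt_encode_write, ...)
are hypothetical. Fields are written byte by byte rather than through C
bitfields, whose layout is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LRT_DESC_SIZE          16 /* "16B descriptors" from the proposal */
#define VIRTQ_DESC_F_Q_DEFINED 8  /* flag value from the proposal */

/* Little-endian stores that do not depend on host byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)((v >> (8 * i)) & 0xff);
}

static void put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)((v >> (8 * i)) & 0xff);
}

/* Bytes 8..15: fields shared by read and write requests. */
static void lrt_fill_tail(uint8_t *d, int is_write, uint8_t reg_off,
			  uint8_t req_tag, uint16_t vf_num)
{
	assert(reg_off < 128); /* reg_byte_offset is only 7 bits wide */
	/* ASSUMPTION: rd_wr in bit 0, reg_byte_offset in bits 1..7. */
	d[8] = (uint8_t)((is_write ? 1u : 0u) | ((unsigned)reg_off << 1));
	d[9] = req_tag;
	put_le16(d + 10, vf_num);
	put_le16(d + 12, VIRTQ_DESC_F_Q_DEFINED); /* flags */
	put_le16(d + 14, 0);                      /* next: unused here */
}

/* Write request: bytes 0..3 = reg_wr_data, bytes 4..7 = reserved (zero). */
static void lrt_encode_write(uint8_t d[LRT_DESC_SIZE], uint8_t reg_off,
			     uint32_t data, uint8_t req_tag, uint16_t vf_num)
{
	memset(d, 0, LRT_DESC_SIZE);
	put_le32(d, data);
	lrt_fill_tail(d, 1, reg_off, req_tag, vf_num);
}

/* Read request: bytes 0..7 = reg_read_addr, where the device puts the result. */
static void lrt_encode_read(uint8_t d[LRT_DESC_SIZE], uint8_t reg_off,
			    uint64_t read_addr, uint8_t req_tag,
			    uint16_t vf_num)
{
	memset(d, 0, LRT_DESC_SIZE);
	put_le64(d, read_addr);
	lrt_fill_tail(d, 0, reg_off, req_tag, vf_num);
}
```

Note that the layout as proposed carries no length field and
reg_wr_data is 32 bits wide, which is the limitation the question about
arbitrary-length accesses points at.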