
Subject: Re: [virtio-comment] [RFC PATCH v2] virtio-rtc: Add device specification.


On Mon, Aug 21, 2023 at 7:42, Parav Pandit <parav@nvidia.com> wrote:
>> From: Peter Hilber <peter.hilber@opensynergy.com>
>> Sent: Friday, August 4, 2023 9:38 PM
>>
>> On 31.07.23 05:15, Parav Pandit wrote:
>> >
>> >> From: Peter Hilber <peter.hilber@opensynergy.com>
>> >> Sent: Thursday, July 27, 2023 3:51 PM
>> >
>> >
>> >>>> In the spec, clock ids are 64 bit wide to have unique ids when
>> >>>> supporting clock
>> >>> It is unclear the namespace of uniqueness.
>> >>> Do you mean unique id across multiple virtio devices?
>> >>
>> >> The idea was not to have a unique id across multiple virtio devices.
>> >> Rather, the idea was to simplify lifetime management for clocks in a
>> >> single virtio device (if such lifetimes would be introduced in the
>> >> future) by not having to worry about race conditions due to clock id reuse.
>> >>
>> > If the id is not unique, 16 bits are likely more than enough.
>> > There should not be any race condition with respect to clock id, as it is per
>> device.
>> >
>>
>> I think 64 bit is better since the cost is negligible, and e.g. the device then does
>> not have to care how it issues ids while the driver is suspended.
>
> Can you please explain how 64-bit is better when the number of clocks per device is not so large?
> Even with 256 or 64K clocks, the device won't care how the ids are issued.

A 64-bit width should address the potential future case where two clocks get
hot-plugged and hot-unplugged at a high rate. If the device hands out a new
id each time a clock is re-plugged, and the driver suspends and resumes
during this, an 8-bit id could wrap around, resulting in the driver
interpreting ids wrongly.

>> > Both modes are in use, hence we want to support both #1 and #2 without
>> duplicating the interfaces.
>> >
>>
>> What I wanted to refer to in my reply are the Linux kernel /dev/ptp* interfaces,
>> which are separate from the network device interfaces.
>>
> Sure. It can be.
> When the netdevice has RTC feature support, it will create /dev/ptp* and /sys/class/net/netdev devices.
>
> All the plumbing that is done is for dual use.
> So the command etc. definitions are for dual use, as I explained before.
> Those can be used for the virtio-net device when it has an in-built rtc.
>
> When there is only an rtc device with its own device type, the same functionality and commands are there.
> The driver will create only a /dev/ptp* device.
>
> From a SW point of view, it is just a matter of layering.
> Already done by more than 13 industry devices.

I think there was a misunderstanding. I was trying to say: When it is
better to show distinct devices to userspace, would it not also be better
to show distinct devices at the Virtio level?

>> >> The extra effort to implement another simple virtio device (rather
>> >> than putting everything in a single device) seems to be small when
>> >> compared to the complexity of creating an extensible specification
>> >> which should work both standalone and as part of an already existing, but
>> still growing, device.
>> > It may sound like a small effort with only a focused view on an RTC-only device.
>> > But both types are needed.
>> > From an actual implementation and use case point of view, yet another PCI
>> device at the scale of virtual machines is not desired, especially when its usage is
>> tightly coupled with the network device.
>> >
>> >
>>
>> If the issue is PCI transport related resource consumption, the PCI transport
>> could be extended to support multi-function devices, in a way which is agnostic
>> to the device types involved.
>>
> All things needed by rtc are already present on the net device. So there is no point in improving the PCI transport etc.
> PCI transport resource improvement goes beyond just the PCI transport: APIC, IOMMU, device registers and more.
> This is equally an overkill.

I understand it would be more effort. But to me it looks better as a
generic solution, which avoids imposing restrictions on the RTC device.

> And defining and duplicating the exact same commands in net is equally sub-optimal, especially when we have known of the need for the last several months.
>
> This is the right junction to reuse functionality on two device types without duplication.

In my opinion it would still be preferable to have one RTC device, and have
the RTC device cooperate with the net device where needed.

>> >> E.g., the
>> >> RTC device should also get an alarm, with an associated eventq, in the
>> future.
>> > Sure, it is ok. it will be optional when it is RTC device.
>> >
>> >> Also, I wonder if the driver implementation should then work both as
>> >> a standalone Virtio driver and as part of the net driver.
>> >>
>> > The driver layer will be fine. The Linux stack has evolved over the last 3 years to
>> support multi-functionality devices.
>> > An RTC/PTP driver will be a standalone entity that works in an underlying layered
>> way to issue read cmd, config, etc.
>> >
>>
>> As per my above reply, I think it would be better to tackle the problem at the
>> PCI transport level, then there would be no impact on individual device drivers.
>>
> There is no impact from a better definition usable on two devices.

My main concerns about sharing the spec with the net device are:

- the RTC device would act as a guinea pig, since no other devices share
  device-specific requirements so far

- a more complex spec impacting two devices

- strong coupling between net and RTC driver implementation (or code
  duplication)

- using custom interfaces (between net and RTC driver), rather than generic
  interfaces

- RTC could not use device configuration

- unrecognized issues, or future generic changes to the Virtio standard,
  may inhibit future development of the RTC device or the net device

- unclear responsibility and timeline for net device related spec and
  proof-of-concept

>> >> Maybe the above use case could also be addressed by a standalone RTC
>> >> device as follows:
>> >>
>> >> 1) The net device exports a UUID object to which the RTC device can
>> >> refer to (so the drivers can link the net device to the RTC clock).
>> >>
>> >> 2) If needed, the RTC device aligns the lifetime of the clock to the net device.
>> >> Then, clock lifetimes are no longer static, and can be aligned with
>> >> the net device. Through an eventq (to be added), the RTC device
>> >> announces clock creation/destruction, which can be synchronized to the net
>> device state.
>> >>
>> > All of the infrastructure, cross-device synchronization, and its device-side
>> management will come with equal complexity.
>> > And we will end up with a sub-optimal way of using multiple PCI devices.
>> >
>>
>> As per my above reply, I think it would be better to tackle the problem at the
>> PCI transport level...
>>
>> > Net already has cvq required by RTC. So cvq commands can be common.
>> > Readq is additional, we need to define common between the two and way to
>> discover it.
>> >
>>
>>  I would still try to avoid mixing net and RTC details. RTC
>> should keep a dedicated cvq.
> Can you please explain why a dedicated cvq is needed on the net device?

I think the concern about redundant queues could be addressed by merging
the RTC readq and controlq. At the moment separating these queues would not
yield performance benefits, so I think I will just merge the readq and
controlq RTC queues in the next spec iteration.

As to why the net and RTC queues should be kept separate: with a dedicated
queue, the RTC driver's message processing can remain decoupled from the net
driver. Also, the net cvq format does not align well with RTC messages.
RTC messages have detailed status codes, while net has only VIRTIO_NET_OK
and VIRTIO_NET_ERR.

>> Then the RTC driver could use generic Virtio
>> functionality to send/receive rather than using an interface which is specific to
>> the net driver cvq.
>>
> Please explain why?
> We have dedicated vq work ongoing for flow filters, due to the high-rate-of-requests requirement handled by hw.
>
> So I would like to understand if we can extend a similar dedicated cvq. Given that most of the queries done via cvq for rtc are not time sensitive,
> the net cvq seems enough.

This should have been addressed by merging the RTC readq and controlq, as
per my above reply.

>> >> Actually, the cross-timestamping support can be specific to the
>> >> combination of clock and hardware counter. This is explained in more
>> >> detail in the context of VIRTIO_RTC_M_READ_CROSS. I will mention this
>> here as well.
>> >>
>> > Hardware counter is a misleading term here. It is really a platform (system)
>> specific identifier.
>> >
>>
>> Maybe with the currently defined hardware counters there is a one-to-one
>> correspondence to platforms, but that need not be the case. E.g. it might be
>> more convenient for a device on an Intel x86 platform to read the Always
>> Running Timer (ART) rather than the TSC derived from the ART, and then the
>> ART could become another HW counter in the spec for x86.
>>
> Maybe I am wrong, but it looks like this is platform feature where driver can device to read ART or ARM timer without any device knowledge?

Not sure what is meant by "driver can device".

What do you mean by "platform feature"? As I argued in my previous reply,
the ART could be used instead of the TSC, and then would need to be
distinguished from the TSC. Also, there is no guarantee that all platforms
will always have the same single relevant counter. E.g. on Arm there is the
physical counter view and the virtual counter view (though distinguishing
those might not be so important). So device knowledge is required for
reliable operation.

To me there is also no advantage in narrowing down the hardware counter
terminology to "platform", when it might in principle be other counters,
maybe from peripheral devices.

> Can you please explain, if the device does not have the knowledge of platform/system timer, can it still achieve the cross timestamping at driver level?

In a simple scenario, if everything works out, the driver might just guess
the right hardware counter. But it should be possible to detect error
conditions, and to choose among multiple hardware counters should the
device support them.

Rationale: For cross-timestamping, the device sends the driver a hardware
counter reading. The driver needs to know how to interpret the hardware
counter reading. If the driver assumes an incompatible hardware counter
(e.g. TSC instead of ART, or HPET instead of TSC), the cross-timestamp
interpretation would be nonsense.

Of course, in general indicating the type of hardware counter might not be
sufficient, e.g. when a heterogeneous multicore platform might not have all
counters synchronized. In general, the device would need to have some extra
implementation-specific knowledge about which hardware counter view the
driver has.

>> As discussed above, additional counters may be used in the future.
>>
> Do you mean additional system counter?

This was meant to refer to the ART, which would need to be distinguished
from the TSC.

>> >>> If so, I guess no need of a feature bit as it is not serving any
>> >>> meaningful
>> >> purpose.
>> >>
>> >> The only purpose is to potentially save the device some
>> >> initialization work or resources in case the feature is not negotiated.
>> >>
>> > Ok. In that case we should move it to per clock enablement instead of feature
>> bit.
>> >
>>
>> As per my above replies, I think it should be kept clock and hardware counter
>> specific. The feature bit could be removed if it seems not helpful.
>>
> Ok. lets remove it.

OK.

> I apologize for my slow response. I was not well when you replied.
> After that, I lost track of your email in stacks of email until you sent the newer version.

No worries. I was also on leave and I am unfortunately also slow to
respond.

> Let's bring rtc functionality into the two devices. We are almost there; we just need to refine the definitions.

I would still prefer to keep the RTC spec separated from the net spec. But
in my understanding the issues which were raised about sharing the RTC spec
with the net spec have been addressed (when taking this reply into account,
too).

I would prefer to first try to upstream the RTC spec without altering the
net spec, while trying to avoid conflicts with the net spec. Maybe most of
the RTC spec could then, as a second step, also be applied to the net
device. The responsibility for this second step and the related proof of
concept work would need to be clarified. To me it still seems better to not
share the spec and address resource problems at the transport level.

Let me go into some detail regarding the above. My main concerns about
sharing the RTC spec with the net device are (as per an above reply):

- the RTC device would act as a guinea pig, since no other devices share
  device-specific requirements so far

- a more complex spec impacting two devices

- strong coupling between net and RTC driver implementation (or code
  duplication)

- using custom interfaces (between net and RTC driver), rather than generic
  interfaces

- RTC could not use device configuration

- unrecognized issues, or future generic changes to the Virtio standard,
  may inhibit future development of the RTC device or the net device

- unclear responsibility and timeline for net device related spec and
  proof-of-concept

To me the main argument in favor of appending the RTC device to the net
device seems to be resource consumption related, so might better be
addressed in a generic way at the PCI transport level.

That said, by also changing the spec to merge the RTC readq and controlq,
in my understanding all the issues which were raised about compatibility
with the net spec have been addressed for now.

Last, thank you very much for your valuable remarks! I hope the resulting
spec will be able to address diverse use cases, including interoperability
with the net device.

Best regards,

Peter

