virtio-comment message

Subject: Re: [virtio-comment] [RFC PATCH v2] virtio-rtc: Add device specification.
From: Peter Hilber <peter.hilber@opensynergy.com>
To: Parav Pandit <parav@nvidia.com>, "virtio-comment@lists.oasis-open.org" <virtio-comment@lists.oasis-open.org>
Date: Fri, 29 Sep 2023 20:38:54 +0200
On 25.09.23 11:31, Parav Pandit wrote:
> Hi Peter,
> 

Hi Parav,

I'm sorry to be a bit late with my answer, since I only received your email
via the mailing list, which I do not check so often.

>> From: Peter Hilber <peter.hilber@opensynergy.com>
>> Sent: Wednesday, September 13, 2023 2:34 PM
> 
>>> Can you please explain how 64-bit better when the number of clocks per
>> device is not so much?
>>> Even with 256 or 64K clocks device wont care how the ids are issued.
>>
>> 64 bit width should address the potential future case where two clocks get hot-
>> plugged and hot-unplugged at a high rate. If the device hands out a new id each
>> time a clock is re-plugged, and if the driver suspends and resumes during this,
>> an 8 bit id could result in the driver interpreting ids wrongly.
>>
> In the proposed spec, in a given rtc device, total number of clocks are static at the device discovery time.
> Clock cannot be hotplug or unplugged dynamically.
> Did I understand right?

Yes. The 64 bit width was just to accommodate for potential future
extensions. This does not force the device to use effectively 64 bit wide
ids. I chose 64 bit over 8 bit because I did not expect that this would
become an issue.

> 
> If not, and if the clocks can be added/removed dynamically, if the clock id is not available, the device will wait to hotplug the new one, until the old one is removed.
> Does one need 64K clocks under single device?
> If so, can you please share some thoughts on example application that might find it useful?

The example scenario is the one which is quoted above: The device might not
know when the driver is suspended, and continuously hand out new ids (when
clocks are continuously re-plugged for some reason). If the driver later
resumes at the wrong moment, by chance the same id which was in use before
might now be assigned to another clock.

This and similar scenarios can be prevented by using 64 bit ids.

> 
>>>>> Both the modes are in use, hence we want to support #1 and #2 both
>>>>> without
>>>> duplicating the interfaces.
>>>>>
>>>>
>>>> What I wanted to refer to in my reply are the Linux kernel /dev/ptp*
>>>> interfaces, which are separate from the network device interfaces.
>>>>
>>> Sure. It can be.
>>> When the netdevice has the RTC features supports it will create /dev/ptp*
>> and /sys/class/net/netdev devices.
>>>
>>> All the plumbing that is done is for dual use.
>>> So command etc definitions is for dual use as I explained before.
>>> Those can be used for virtio-net device when it has in-built rtc.
>>>
>>> When there is only rtc device with its own device-type, same functionality
>> and commands are there.
>>> Driver will create only /dev/ptp* device.
>>>
>>> From SW point of view it is just matter of layering..
>>> Already done by more than 13+ industry devices..
>>
>> I think there was a misunderstanding. I was trying to say: When it is better to
>> show distinct devices to userspace, would it not also be better to show distinct
>> devices at the Virtio level?
>>
> If a rtc functionality is in the virtio-rtc only device, it will be as virtio-rtc device.
> If the net device has embedded this functionality, the user will still see a distinct device at the OS level.
> Every OS would create OS level device object to operate on for the underlying hw device.
> So it will be fine to end user.
>  
> 

I agree that this is irrelevant for the end user.

>>> PCI transport resource improvement goes just beyond PCI transport
>> improvement. Such as APIC, IOMMU, device registers and more.
>>> This is equally an overkill.
>>
>> I understand it would be more effort. But to me it looks better as a generic
>> solution, which avoids imposing restrictions on the RTC device.
>>
> There are zero restrictions imposed on the RTC device because the spec is being drafted now.
> May be I didnât convey the idea well.
> 
> The whole idea is the virtio-rtc can be independent device.
> A net-device can have the exact same functionality, so that virtio-rtc functionality can be embedded in it.
> Thereby, both model co-exists.
>  

I understand. I am just still concerned about the dual use of the spec for
the reasons which I gave below.

>>> And defining and duplicating exact same commands in net is equally sub-
>> optimal, specially when we know its need for last several months.
>>>
>>> This is the right junction to reuse functionality on two device types without
>> duplication.
>>
>> In my opinion it would still be preferable to have one RTC device, and have the
>> RTC device cooperate with the net device where needed.
> I read your below comments for the opinion.
> More below.
> 
>>
>>>>>> E.g., the
>>>>>> RTC device should also get an alarm, with an associated eventq, in
>>>>>> the
>>>> future.
>>>>> Sure, it is ok. it will be optional when it is RTC device.
>>>>>
>>>>>> Also, I wonder if the driver implementation should then work both
>>>>>> as a standalone Virtio driver and as part of the net driver.
>>>>>>
>>>>> Driver layer will be fine. Linux stack evolved over last 3 years to
>>>>> support
>>>> multi-functionality device.
>>>>> A RTC/PTP driver will be a stand alone entity that works on
>>>>> underlying layered
>>>> way to issue read cmd, config etc.
>>>>>
>>>>
>>>> As per my above reply, I think it would be better to tackle the
>>>> problem at the PCI transport level, then there would be no impact on
>> individual device drivers.
>>>>
>>> There is no impact by better definition usable on two devices.
>>
>> My main concerns about sharing the spec with the net device are:
>>
>> - the RTC device would act as a guinea pig, since no other devices share
>>   device-specific requirements so far
>>
> He he. No it should not. It is only matter of defining cmd opcode and status values that do not overlap one time.
>  

I don't know how spec sharing would be judged by the TC. At the moment, no
other devices do this.

>> - a more complex spec impacting two devices
>>
> Well it is the functionality, so once you look for slightly larger use cases, overall it is still less complexity than writing two different specs, one for net and one for rtc only device.
> 

Complexity-wise it would be best to have one RTC spec, and link the RTC
device to other devices through UUID if needed.

>> - strong coupling between net and RTC driver implementation (or code
>>   duplication)
>>
> Mostly no. it is the software layer. RTC layer operates over control VQ.
> So net device will provide the needed API. Same API would be available for dedicated rtc device lower layer too.
> No problem.
>  

But Virtio API will usually be time-proven, while drivers certainly have
opportunities to get the custom API wrong. 

>> - using custom interfaces (between net and RTC driver), rather than generic
>>   interfaces
>>
> Multi function devices are common in nature for functionality and scale.
> And OSes supports it already.
> 

Cf. above reply.

>> - RTC could not use device configuration
>>
> In your proposal all things are over cvq already.
> 

But it would prevent future use of the device configuration.

>> - unrecognized issues, or future generic changes to the Virtio standard,
>>   may inhibit future development of the RTC device or the net device
>>
> Unlikely as the reviews eliminate such issues.
>  
>> - unclear responsibility and timeline for net device related spec and
>>   proof-of-concept
>>
> All the responsible people of the technical committee are here.
> In fact the requirements are more closely tracked by its stake owners for net compared to other devices.
> So if you need help to find net side owner, let me know. Others and I are interested to have it in the net side.
> 

With "unclear responsibility" I meant to refer to the responsibility for
drafting a spec extension for virtio-net, and for creating a
proof-of-concept implementation for the virtio-net extension. As I wrote in
my previous email:

> I would prefer to first try to upstream the RTC spec without altering the
> net spec, while trying to avoid conflicts with the net spec. Maybe most of
> the RTC spec could then, as a second step, also be applied to the net
> device. The responsibility for this second step and the related proof of
> concept work would need to be clarified. To me it still seems better to not
> share the spec and address resource problems at the transport level.

>>>>>> Maybe the above use case could also be addressed by a standalone
>>>>>> RTC device as follows:
>>>>>>
>>>>>> 1) The net device exports a UUID object to which the RTC device
>>>>>> can refer to (so the drivers can link the net device to the RTC clock).
>>>>>>
>>>>>> 2) If needed, the RTC device aligns the lifetime of the clock to the net
>> device.
>>>>>> Then, clock lifetimes are no longer static, and can be aligned
>>>>>> with the net device. Through an eventq (to be added), the RTC
>>>>>> device announces clock creation/destruction, which can be
>>>>>> synchronized to the net
>>>> device state.
>>>>>>
>>>>> All of the infrastructure, cross device synchronization, its device
>>>>> side
>>>> management will come with equal complexity.
>>>>> And we will end up with sub-optimal way to using multiple PCI devices.
>>>>>
>>>>
>>>> As per my above reply, I think it would be better to tackle the
>>>> problem at the PCI transport level...
>>>>
>>>>> Net already has cvq required by RTC. So cvq commands can be common.
>>>>> Readq is additional, we need to define common between the two and
>>>>> way to
>>>> discover it.
>>>>>
>>>>
>>>>  I would still try to avoid mixing net and RTC details. RTC should
>>>> keep a dedicated cvq.
>>> Can you please explain why a dedicate cvq is needed on net device?
>>
>> I think the concern about redundant queues could be addressed by merging the
>> RTC readq and controlq. At the moment separating these queues would not
>> yield performance benefits, so I think I will just merge the readq and controlq
>> RTC queues in the next spec iteration.
>>
> Nice. Ok. lets have single cvq and related opcode range for RTC device and we will place them in the net side too.
> 

OK. In the net device, this queue should still be distinct from the net
controlq, which has a less predictable usage pattern than RTC control
requests.

>> As to why keeping the net and RTC queues separated: With a dedicated queue,
>> the RTC driver message processing could remain decoupled from the net driver.
>> Also, the net cvq format does not align well with the RTC messages.
>> RTC messages have detailed status codes, while net only has VIRTIO_NET_OK
>> and VIRTIO_NET_ERR.
>>
> We can extend the virtio net code too. If they are generic, it is likely that net will also find it useful.
> 
>>>> Then the RTC driver could use generic Virtio functionality to
>>>> send/receive rather than using an interface which is specific to the
>>>> net driver cvq.
>>>>
>>> Please explain why?
>>> We have dedicated vqs work ongoing for flow filters due to high rate of
>> requests requirement handled by hw.
>>>
>>> So would like to understand if we can extend similar dedicated cvq. Given
>> most of the queries done via cvq for rts are not time sensitive.
>>> Hence net cvq seems enough.
>>
>> This should have been addressed by merging the RTC readq and controlq, as per
>> my above reply.
>>
> Ok.
>  
>>>>>> Actually, the cross-timestamping support can be specific to the
>>>>>> combination of clock and hardware counter. This is explained in
>>>>>> more detail in the context of VIRTIO_RTC_M_READ_CROSS. I will
>>>>>> mention this
>>>> here as well.
>>>>>>
>>>>> Hardware counter is a misleading term here. It is really platform
>>>>> (system)
>>>> specific identifier.
>>>>>
>>>>
>>>> Maybe with the currently defined hardware counters there is a
>>>> one-to-one correspondence to platforms, but that need not be the
>>>> case. E.g. it might be more convenient for a device on an Intel x86
>>>> platform to read the Always Running Timer (ART) rather than the TSC
>>>> derived from the ART, and then the ART could become another HW counter
>> in the spec for x86.
>>>>
>>> Maybe I am wrong, but it looks like this is platform feature where driver can
>> device to read ART or ARM timer without any device knowledge?
>>
>> Not sure what is meant by "driver can device".
>>
> I meant to say that driver can read the clock from the system clock and device clock.
> At that point, cross timestamping convergence point is driver (and not device).
> Would that be sufficient?
> This way cross timestamping a driver level is still less accurate, but abstracted from the underlying system.
> 
> Or such level of accuracy is insufficient?
> 

Depends on the use case. Also, the driver-side cross-timestamping
performance will then depend a lot on the Virtio latency (while the
device-side cross-timestamping is very robust to Virtio latency).

The driver-side cross-timestamping can already be done now using
VIRTIO_RTC_M_READ. Arguably ioctl PTP_SYS_OFFSET_EXTENDED does such
cross-timestamping already with the current Linux kernel RFC driver [1], as
I described in an earlier email in this thread:

> The Linux kernel has ioctl PTP_SYS_OFFSET_EXTENDED, which will read the
> system time immediately before and after each PTP clock reading, leaving
> it up to userspace how to interpret the clock reading. This ioctl can
> also be used with the RFC Linux kernel driver implementing this spec
> [...]

But I think from the POV of the spec the driver-side cross-timestamping
would then be behavior internal to the driver, so does not need to be
discussed in the spec. In this case, Virtio only passes the device clock
reading via VIRTIO_RTC_M_READ.

Thank you for your comments,

Peter

[1] https://lore.kernel.org/all/20230818012014.212155-1-peter.hilber@opensynergy.com/T/#t
References:
- Re: [virtio-comment] [RFC PATCH v2] virtio-rtc: Add device specification.
  - From: Peter Hilber <peter.hilber@opensynergy.com>
- RE: [virtio-comment] [RFC PATCH v2] virtio-rtc: Add device specification.
  - From: Parav Pandit <parav@nvidia.com>