Subject: Re: [RFC] virtio-pmem: PMEM device spec


Hi Stefan,

Thank you for the review. Please find my reply inline.

> > diff --git a/virtio-pmem.tex b/virtio-pmem.tex
> > new file mode 100644
> > index 0000000..04e07bb
> > --- /dev/null
> > +++ b/virtio-pmem.tex
> > @@ -0,0 +1,134 @@
> > +\section{PMEM Device}\label{sec:Device Types / PMEM Device}
> > +
> > +The virtio pmem is a fake persistent memory (NVDIMM) device
> > +used to bypass the guest page cache and provide a virtio
> > +based asynchronous flush mechanism. This avoids the need
> 
> Is there anything "fake" about virtio-pmem from the perspective of the
> device interface?

I think I should remove the "fake" keyword here.

> 
> What you are describing is one use case.  But on a platform that doesn't
> have existing physical NVDIMM interfaces, ACPI, etc maybe virtio-pmem
> would be used to pass through physical NVDIMMs from the host?  If you
> agree, then it might make sense to describe the device simply as a
> persistent memory device and give "fake NVDIMM" as an example use case
> in a separate paragraph.  This would make the text more future-proof.

Yes, good point. I will add a separate paragraph describing "fake NVDIMM"
as an example use case.

> 
> > +of a separate page cache in guest and keeps page cache only
> > +in the host. Under memory pressure, the host makes use of
> > +effecient memory reclaim decisions for page cache pages
> 
> efficient

OK.

> 
> Please use a spell-checker like aspell or ispell.

Sure.

> 
> > +of all the guests. This helps to reduce the memory footprint
> > +and fit more guests in the host system.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / PMEM Device / Device ID}
> > +  25
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / PMEM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] req_vq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / PMEM Device / Feature
> > bits}
> > +
> > +There are currently no feature bits defined for this device.
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / PMEM
> > Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_pmem_config {
> > +	le64 start;
> > +	le64 size;
> > +};
> > +\end{lstlisting}
> > +
> > +\field{start} contains the guest physical address of the start of the
> > +physical address range to be hotplugged into the guest address space
> > +using the pmem API.
> 
> What pmem API?  This specification shouldn't be written from a Linux
> kernel internals point of view.  It should be possible to implement BSD,
> Windows, etc drivers too.

OK, I will change this to generic naming common across operating systems.

> 
> Just saying that this is the physical address of the start of the
> persistent memory range is enough.
> 
> > +\field{size} contains the length of this address range.
> 
> "in bytes" would be nice because it makes the units explicit.

Yes, I will make the units explicit.
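
For example, \field{start} = 0x100000000 and \field{size} = 0x40000000 would
then unambiguously describe a 1 GiB (0x40000000 byte) range starting at the
4 GiB guest physical address.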

> 
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / PMEM Device /
> > Device Initialization}
> > +
> > +Device hotplugs physical memory to guest address space. Persistent memory
> > device
> > +is emulated with file backed memory at host side.
> 
> File-backed memory is a detail of one possible use case.  The spec
> shouldn't be tied to this single use case.

OK, I will try to make the text here more generic as well. I think I need to
consider all the use cases, i.e. both real NVDIMMs and file-backed memory on a
regular disk?

> 
> > +
> > +\begin{enumerate}
> > +\item Guest vpmem start is read from \field{start}.
> > +\item Guest vpmem end is read from \field{size}.
> > +\end{enumerate}
> 
> "vpmem" is not defined.  Please use consistent terminology.

OK.

> 
> I suggest structuring this subsection as follows:
> 
>   Initialization proceeds as follows:
> 
>   \begin{enumerate}
>   \item The driver reads the physical start address from \field{start}.
>   \item The driver reads the length of the persistent memory range from
>   \field{size}.
>   \end{enumerate}
> 
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types /
> > PMEM Device / Device Initialization}
> > +
> > +File backed memory SHOULD be memory mapped to guest address space with
> > SHARED
> > +memory mapping.
> 
> I'm not sure this point is appropriate for the virtio spec since it's a
> device implementation detail.  This is outside the scope of the device
> interface and should be dropped.
> 
> If you'd like to describe the file-backed memory use case, please do it
> in a non-normative section and use the appropriate terminology
> (MAP_SHARED) and explanations (shared mapping so that changes will be
> written back to the file).

I omitted MAP_SHARED because I thought it is not generic across operating
systems (e.g. Windows).
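
For reference, here is a host-side sketch of the file-backed use case you
describe (not normative; error checking is left out, and the path argument
and function name are just placeholders):

  /* Host-side sketch: back the guest-visible pmem range with a shared
   * file mapping so that guest stores reach the backing file, and
   * persist them with fsync(fd) when a flush request arrives. */
  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  static void *map_backing_file(const char *path, size_t size, int *fd_out)
  {
      int fd = open(path, O_RDWR);
      void *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0); /* shared => written back to file */

      *fd_out = fd;  /* kept open for fsync(fd) on each flush request */
      return base;   /* to be mapped into the guest at virtio_pmem_config.start */
  }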

> 
> > +
> > +\subsection{Driver Initialization}\label{sec:Device Types / PMEM Driver /
> > Driver Initialization}
> > +
> > +Driver hotplugs the physical memory and registers associated
> > +region with the pmem API. Also, configures a flush callback
> > +function with the corresponding region.
> 
> There are Linux internals that don't need to be in the spec.  Some
> kernels may not have a similar pmem API with a flush callback,
> registration, etc.

Agree.

> 
> Please describe the functionality instead:
> 
>   Memory stores to the persistent memory range are not guaranteed to be
>   persistent without further action.  An explicit flush command is
>   required to ensure persistence.  The req_vq is used to perform flush
>   commands.

This is better.

> 
> > +
> > +\drivernormative{\subsubsection}{Driver Initialization: Filesystem direct
> > access}{Device Types / PMEM Driver / Driver Initialization / Direct
> > access}
> > +
> > +Driver SHOULD enable filesystem direct access operations for
> > +read/write on the device.
> 
> This is specific to one possible use case for virtio-pmem and it's a
> driver implementation detail, not a requirement of the device interface.
> 
> Imagine a unikernel application that doesn't have a file system.  They
> just want to access the persistent memory range, they don't care about
> filesystem direct access.

Agree.

> 
> Please focus on the requirements/constraints/assumptions of the device
> interface, not the implementation details of the device emulation or the
> Linux guest driver.  That way the specification will be generally useful
> and will age better as the use cases change.

OK.

> 
> > +
> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio
> > flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> > +
> > +Driver SHOULD add implement a virtio based flush callback.
> 
> If flush is optional, how should a device that does not implement it
> behave:
> 1. Don't implement req_vq
> Or
> 2. Fail flush commands with a certain error code
> ?

I think this needs to be restructured after considering the non-ACPI real
NVDIMM passthrough use case which you pointed out above.

> 
> > +Driver SHOULD disable other FLUSH/SYNC mechanisms for the device
> > +when virtio flush is configured.
> 
> What does this mean?

Either the virtio based flush is enabled, or an alternative flush mechanism
(e.g. MAP_SYNC) is enabled, but not both at the same time.
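
To illustrate with the Linux case (just an example; MAP_SYNC is
Linux-specific, and the helper below is a placeholder): on a real,
DAX-capable NVDIMM an application can map a file with MAP_SYNC and rely on
CPU cache flushes alone, but with virtio-pmem persistence still requires a
host-side fsync, so the driver has to reject MAP_SYNC and route persistence
through the virtio flush:

  /* Guest userspace on a real, DAX-capable NVDIMM (Linux-specific);
   * fd is an open file descriptor on a DAX filesystem. */
  #define _GNU_SOURCE
  #include <sys/mman.h>

  static void *map_sync(int fd, size_t len)
  {
      return mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
  }
  /* On virtio-pmem this mmap() must fail instead: persistence requires
   * a host-side fsync, which only the virtio flush request triggers. */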

> 
> > +
> > +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver /
> > Driver Operation}
> > +\drivernormative{\subsubsection}{Driver Operation: Virtqueue
> > command}{Device Types / PMEM Driver / Driver Operation / Virtqueue
> > command}
> > +
> > +Driver SHOULD send VIRTIO_FLUSH command on request virtqueue,
> > +blocks guest userspace process utill host completes fsync/flush.
> 
> userspace processes are outside the scope of the virtio specification.
> There may be no "userspace" inside the guest.  Please stick to
> describing the device interface.
> 
> I think the intention here is:
> 
>   The VIRTIO_FLUSH command persists memory writes that were performed
>   before the command was submitted.  Once the command completes the
>   writes are guaranteed to be persistent.
> 
> And the normative part is:
> 
>   The driver MUST submit a VIRTIO_FLUSH command after performing memory
>   writes that require persistence.
> 
>   The driver MUST wait for the VIRTIO_FLUSH command to complete before
>   assuming previous writes are persistent.

Thanks! This is better.
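
So the driver-side flow would be roughly the following (a sketch only; the
request/response layouts are placeholders in the style of the config struct
and are not part of the spec yet):

  /* Hypothetical request/response layouts -- not yet defined: */
  struct virtio_pmem_req  { le32 type; };  /* e.g. a FLUSH request type */
  struct virtio_pmem_resp { le32 ret;  };  /* 0 on success */

  /* The driver places a req/resp buffer pair on req_vq, notifies the
   * device, and only after the completion arrives does it treat the
   * earlier memory writes as persistent. */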

> 
> > +Driver SHOULD handle multiple fsync requests on files present
> > +on the device.
> 
> fsync is outside the scope of the device interface and, hence, this
> specification.  Perhaps:
> 
>   The driver MAY have multiple VIRTIO_FLUSH commands pending, though
>   each command completion only guarantees persistence of memory writes
>   that were performed before the command was submitted.

OK.

> 
> > +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver /
> > Device Operation}
> > +
> > +\devicenormative{\subsubsection}{Device Operations}{Device Types / PMEM
> > Device / Device Operation / Virtqueue flush}
> > +
> > +Device SHOULD handle multiple flush requests simultaneously using
> > +host filesystem fsync/flush call.
> 
> fsync/flush is an implementation detail.  I'll stop reviewing at this
> point to avoid repetition :).

Thank you very much for the detailed review. I will work on the suggestions.

Best regards,
Pankaj