OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] Re: [PATCH] add virtio-pmem device spec


> 
> > This patch proposes a virtio specification for new
> > virtio pmem device. Virtio pmem is a paravirtualized
> > device which solves two problems:
> > 
> > - Provides emulation of persistent memory on host regular
> >   (non NVDIMM) storage.
> > - Allows the guest to bypass the page cache.
> > 
> > This is changed version from previous RFC [1], incorporated
> > changes suggested by Stefan, Michael & Cornerlia.
> > 
> > [1] https://lists.oasis-open.org/archives/virtio-dev/201903/msg00083.html
> > 
> > Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
> > ---
> >  conformance.tex |  23 ++++++++++-
> >  content.tex     |   1 +
> >  virtio-pmem.tex | 118
> >  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 140 insertions(+), 2 deletions(-)
> >  create mode 100644 virtio-pmem.tex
> 
> (...)
> 
> > diff --git a/virtio-pmem.tex b/virtio-pmem.tex
> > new file mode 100644
> > index 0000000..fa64c6b
> > --- /dev/null
> > +++ b/virtio-pmem.tex
> > @@ -0,0 +1,118 @@
> > +\section{PMEM Device}\label{sec:Device Types / PMEM Device}
> > +
> > +virtio pmem is an emulated persistent memory device. The
> > +device is plugged to the guest using virtio bus.
> 
> "virtio bus" sounds too Linux-specific. Maybe just say "virtio pmem is
> an emulated persistent memory devices using virtio."?

o.k
> 
> > +
> > +Device works as fake nvdimm device when emulated on host regular
> 
> s/Device/The devices/

o.k

> 
> (more of these below)
> 
> > +(non NVDIMM) device. Device provides a virtio based asynchronous
> > +flush mechanism to persists the guest writes. This avoids the
> 
> s/persists/persist/

o.k

> 
> > +need of separate caching inside the guest and host side caching
> > +is used. Under memory pressure, host makes efficient memory reclaim
> 
> s/host/the host/
> 
> > +decisions on uniform view of memory.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / PMEM Device / Device ID}
> > +  27
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / PMEM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] req_vq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / PMEM Device / Feature
> > bits}
> > +
> > +There are currently no feature bits defined for this device.
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / PMEM
> > Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_pmem_config {
> > +	uint64_t start;
> > +	uint64_t size;
> > +};
> > +\end{lstlisting}
> > +
> > +\field{start} contains the physical address of the start of the persistent
> > memory range.
> > +\field{size} in bytes length of address range.
> 
> "contains the length of the address range in bytes"?

done.

> 
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / PMEM Device /
> > Device Initialization}
> > +
> > +Device hot-plugs physical memory to guest address space. Persistent memory
> > device
> > +is emulated at host side.
> > +
> > +\begin{enumerate}
> > +  \item The driver reads the physical start address from \field{start}.
> > +  \item The driver reads the length of the persistent memory range from
> > \field{size}.
> > +  \end{enumerate}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types /
> > PMEM Device / Device Initialization}
> > +
> > +Host memory region MUST be mapped to guest address space in a
> 
> s/Host memory region/The host memory region/
> 
> > +way so that updates are visible to other processes mapping the
> > +same memory region.
> > +
> > +\subsection{Driver Initialization}\label{sec:Device Types / PMEM Driver /
> > Driver Initialization}
> > +
> > +Memory stores to the persistent memory range are not guaranteed to be
> > +persistent without further action. An explicit flush command is
> > +required to ensure persistence. The req_vq is used to perform flush
> > +commands.
> > +
> > +\drivernormative{\subsubsection}{Driver Initialization: flush
> > callback}{Device Types / PMEM Driver / Driver Initialization / flush
> > callback}
> > +
> > +Driver MUST implement a virtio based flush callback, otherwise driver
> 
> s/Driver/The driver/
> 
> > +will fail to register.
> 
> I'm not sure what a "virtio based flush callback" is supposed to mean,
> though?
> 
> Also, what does it mean if a driver "fails to register"? I don't think
> you can translate this concept to the spec, it seems to be an
> implementation detail.

right. o.k Will remove this part.

> 
> > +
> > +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver /
> > Driver Operation}
> > +
> > +The VIRTIO_PMEM_REQ_TYPE_FLUSH command persists memory writes that were
> > performed
> 
> Maybe "all memory writes to the persistent memory range", to be more
> clear?
> 
> > +before the command was submitted. Once the command completes the
> 
> s/the/those/, maybe?

feel any one should be ok. 

> 
> > +writes are guaranteed to be persistent.
> > +
> > +\drivernormative{\subsubsection}{Driver Operation: Virtqueue
> > command}{Device Types / PMEM Driver / Driver Operation / Virtqueue
> > command}
> > +
> > +The driver MUST submit a VIRTIO_PMEM_REQ_TYPE_FLUSH command after
> > performing
> > +memory writes that require persistence.
> > +
> > +The driver MUST wait for the VIRTIO_PMEM_REQ_TYPE_FLUSH command to
> > complete before
> > +assuming previous writes are persistent.
> > +
> > +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver /
> > Device Operation}
> > +
> > +\devicenormative{\subsubsection}{Device Operations: Virtqueue
> > flush}{Device Types / PMEM Device / Device Operation / Virtqueue flush}
> > +
> > +Device SHOULD handle multiple flush requests simultaneously using
> > +corresponding host flush mechanisms.
> > +
> > +\devicenormative{\subsubsection}{Device operations: Virtqueue
> > return}{Device Types / PMEM Device / Device Operation / Virtqueue return}
> > +
> > +Device SHOULD return integer '0' for success and '1' for failure.
> > +These errors are converted to corresponding error codes by guest as
> > +per architecture.
> 
> I think what the driver actually does with the 0/1 return values is
> already an implementation detail. Also, why 'SHOULD'? The driver needs
> to be sure that it won't interpret success as failure, and vice versa.
> At least "return 0 on success, !0 otherwise" needs to be a MUST, I
> think.

Agree. Will remove text desrcribing implementation details.

> 
> > +
> > +\subsection{Possible security implications}\label{sec:Device Types / PMEM
> > Device / Possible Security Implications}
> > +
> > +If two devices actually share the same memory, this creates a
> 
> Maybe "Two devices actually sharing the same memory creates a..."?

changed.

> 
> > +potential information leak as access patterns of one driver could
> > +be observable by another driver.
> > +
> > +This can happen for example if two devices are implemented in software
> > +by a hyper-visor, and two drivers are parts of VMs running on the
> > +hyper-visor. In this case, the timing of access to device memory
> 
> I'm not sure whether hyper-visor or hypervisor is the more common
> spelling here.

changed back to hypervisor.

> 
> > +might leak information about access patterns from one VM to another.
> > +
> > +This can include, but might not be limited to:
> > +\begin{enumerate}
> > +\item Configurations sharing a single region of device memory (even in a
> > read-only configuration)
> > +\item Configurations with a shared cache between devices (e.g. linux page
> > cache)
> 
> s/linux/Linux/
> 
> > +\item Configurations with memory deduplication techniques such as KSM
> > similar side-channels
> > + might be present if the device memory is shared with another system, e.g.
> > information about
> > + the hyper-visor/host page cache might leak into a VM guest.
> 
> I'm not quite able to make sense of this sentence -- is there a
> semicolon missing after 'KSM'?

added 
> 
> > +\end{enumerate}
> > +
> > +\subsection{Countermeasures}\label{sec:Device Types / PMEM Device /
> > Possible Security Implications / Countermeasures}
> > +Solution is to avoid sharing resources between devices.
> > +\begin{enumerate}
> > +\item Each VM must have its own device memory, not shared with any other
> > VM or process.
> > +\item If VM workload is a special application and there is no risk, its
> > okay to share the device memory.
> 
> s/VM workload/the VM workload/
> s/its/it is/
> 
> > +\item Don't allow host cache eviction from VM when device memory is shared
> > with other VM or host process.
> > +\end{enumerate}
> 
> Generally, I think this has moved a lot into the right direction; some
> general nits from my side, and some remaining things that sound a bit
> too Linux-specific.

Thank you for the review. I have incorporated changes suggested by you.
Will post a v2 with the changes.

Thanks,
Pankaj


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]