OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] Re: [RFC] virtio-pmem: PMEM device spec


Hi Cornelia,

Thank you for the review. Please find my reply inline.

> 
> > This patch proposes a virtio specification for new
> > virtio pmem device. Virtio pmem is a paravirtualized
> > device which allows the guest to bypass page cache.
> > Previous posting of kernel driver is [1] and Qemu
> > device is [2]. Specification has introduction of
> > virtio pmem device with other implementation details.
> > 
> > I have also listed concerns with page cache side channel
> > attacks in previous kernel driver posting [1] and
> > possible countermeasures based on discussion [4].
> > I have also created an virtio-spec issue [5] for this.
> > 
> > Request to provide feedback on device specification.
> > 
> > [1] https://lkml.org/lkml/2019/1/9/471
> > [2] https://marc.info/?l=qemu-devel&m=153555721901824&w=2
> > [3] https://lkml.org/lkml/2019/1/9/541
> > [4] https://lkml.org/lkml/2019/2/4/1151
> > [5] https://github.com/oasis-tcs/virtio-spec/issues/38
> > 
> > Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
> > ---
> >  conformance.tex |  14 ++++++
> >  content.tex     |   3 ++
> >  virtio-pmem.tex | 134
> >  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 151 insertions(+)
> >  create mode 100644 virtio-pmem.tex
> > 
> > diff --git a/conformance.tex b/conformance.tex
> > index ad7e82e..2133a7c 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -181,6 +181,20 @@ A socket driver MUST conform to the following
> > normative statements:
> >  \item \ref{drivernormative:Device Types / Socket Device / Device Operation
> >  / Device Events}
> >  \end{itemize}
> >  
> > +\subsection{PMEM Driver Conformance}\label{sec:Conformance / Driver
> > Conformance / PMEM Driver Conformance}
> > +
> > +A PMEM driver MUST conform to the following normative statements:
> > +
> > +\begin{itemize}
> > +\item \ref{devicenormative:Device Types / PMEM Device / Device
> > Initialization}
> > +\item \ref{drivernormative:Device Types / PMEM Driver / Driver
> > Initialization / Direct access}
> > +\item \ref{drivernormative:Device Types / PMEM Driver / Driver
> > Initialization / Virtio flush}
> > +\item \ref{drivernormative:Device Types / PMEM Driver / Driver Operation /
> > Virtqueue command}
> > +\item \ref{devicenormative:Device Types / PMEM Device / Device Operation /
> > Virtqueue flush}
> > +\item \ref{devicenormative:Device Types / PMEM Device / Device Operation /
> > Virtqueue return}
> 
> This section must only list the _driver_normative statements. The
> devicenormative statements must be moved to a PMEM device conformance
> section below.

Sure, will change.

> 
> > +\end{itemize}
> > +
> > +
> >  \section{Device Conformance}\label{sec:Conformance / Device Conformance}
> >  
> >  A device MUST conform to the following normative statements:
> > diff --git a/content.tex b/content.tex
> > index ede0ef6..6077d32 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -2634,6 +2634,8 @@ Device ID  &  Virtio Device    \\
> >  \hline
> >  24         &   Memory device \\
> >  \hline
> > +25         &   PMEM device \\
> > +\hline
> 
> Probably better to split this out, as you already sent a out a patch
> reserving the id.

Yes. o.k.

> 
> I sure hope we'll be able to process the outstanding device id
> reservations quickly once 1.1 is out of the door :/

o.k. Thank you!

> 
> >  \end{tabular}
> >  
> >  Some of the devices above are unspecified by this document,
> > @@ -5595,6 +5597,7 @@ descriptor for the \field{sense_len},
> > \field{residual},
> >  \input{virtio-input.tex}
> >  \input{virtio-crypto.tex}
> >  \input{virtio-vsock.tex}
> > +\input{virtio-pmem.tex}
> >  
> >  \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> >  
> > diff --git a/virtio-pmem.tex b/virtio-pmem.tex
> > new file mode 100644
> > index 0000000..04e07bb
> > --- /dev/null
> > +++ b/virtio-pmem.tex
> > @@ -0,0 +1,134 @@
> > +\section{PMEM Device}\label{sec:Device Types / PMEM Device}
> > +
> > +The virtio pmem is a fake persistent memory (NVDIMM) device
> > +used to bypass the guest page cache and provide a virtio
> > +based asynchronous flush mechanism. This avoids the need
> > +of a separate page cache in guest and keeps page cache only
> 
> s/of/for/
> s/guest/the guest/
> s/page cache/the page cache/

will change this as suggested.

> 
> > +in the host. Under memory pressure, the host makes use of
> 
> s/makes use/is expected to make use/
> ?

o.k

> 
> > +effecient memory reclaim decisions for page cache pages
> > +of all the guests. This helps to reduce the memory footprint
> > +and fit more guests in the host system.
> > +
> > +\subsection{Device ID}\label{sec:Device Types / PMEM Device / Device ID}
> > +  25
> > +
> > +\subsection{Virtqueues}\label{sec:Device Types / PMEM Device / Virtqueues}
> > +\begin{description}
> > +\item[0] req_vq
> > +\end{description}
> > +
> > +\subsection{Feature bits}\label{sec:Device Types / PMEM Device / Feature
> > bits}
> > +
> > +There are currently no feature bits defined for this device.
> > +
> > +\subsection{Device configuration layout}\label{sec:Device Types / PMEM
> > Device / Device configuration layout}
> > +
> > +\begin{lstlisting}
> > +struct virtio_pmem_config {
> > +	le64 start;
> > +	le64 size;
> > +};
> > +\end{lstlisting}
> > +
> > +\field{start} contains the guest physical address of the start of the
> > +physical address range to be hotplugged into the guest address space
> > +using the pmem API.
> 
> I think we need a reference to what the pmem API is somewhere.

Sure.

> 
> > +\field{size} contains the length of this address range.
> > +
> > +\subsection{Device Initialization}\label{sec:Device Types / PMEM Device /
> > Device Initialization}
> > +
> > +Device hotplugs physical memory to guest address space. Persistent memory
> > device
> 
> s/Device/The device/

o.k

> 
> > +is emulated with file backed memory at host side.
> 
> Is "file backed memory" generic enough, or is the concept too
> unix-specific?

I will change it to "file backed mapping"?

> 
> (I won't comment further on the memory-specific wording in here, will
> leave that to folks more familiar with that area.)
> 
> > +
> > +\begin{enumerate}
> > +\item Guest vpmem start is read from \field{start}.
> > +\item Guest vpmem end is read from \field{size}.
> > +\end{enumerate}
> > +
> > +\devicenormative{\subsubsection}{Device Initialization}{Device Types /
> > PMEM Device / Device Initialization}
> > +
> > +File backed memory SHOULD be memory mapped to guest address space with
> > SHARED
> > +memory mapping.
> > +
> > +\subsection{Driver Initialization}\label{sec:Device Types / PMEM Driver /
> > Driver Initialization}
> > +
> > +Driver hotplugs the physical memory and registers associated
> 
> s/Driver/The driver/
> 
> (more instances of that below)

o.k

> 
> > +region with the pmem API. Also, configures a flush callback
> > +function with the corresponding region.
> > +
> > +\drivernormative{\subsubsection}{Driver Initialization: Filesystem direct
> > access}{Device Types / PMEM Driver / Driver Initialization / Direct
> > access}
> > +
> > +Driver SHOULD enable filesystem direct access operations for
> > +read/write on the device.
> > +
> > +\drivernormative{\subsubsection}{Driver Initialization: Virtio
> > flush}{Device Types / PMEM Driver / Driver Initialization / Virtio flush}
> > +
> > +Driver SHOULD add implement a virtio based flush callback.
> > +Driver SHOULD disable other FLUSH/SYNC mechanisms for the device
> > +when virtio flush is configured.
> > +
> > +\subsection{Driver Operations}\label{sec:Device Types / PMEM Driver /
> > Driver Operation}
> > +\drivernormative{\subsubsection}{Driver Operation: Virtqueue
> > command}{Device Types / PMEM Driver / Driver Operation / Virtqueue
> > command}
> > +
> > +Driver SHOULD send VIRTIO_FLUSH command on request virtqueue,
> > +blocks guest userspace process utill host completes fsync/flush.
> 
> s/utill/until/

o.k

> 
> > +Driver SHOULD handle multiple fsync requests on files present
> > +on the device.
> > +
> > +\subsection{Device Operations}\label{sec:Device Types / PMEM Driver /
> > Device Operation}
> > +
> > +\devicenormative{\subsubsection}{Device Operations}{Device Types / PMEM
> > Device / Device Operation / Virtqueue flush}
> > +
> > +Device SHOULD handle multiple flush requests simultaneously using
> 
> s/Device/The device/
> 
> (more below)

o.k

> 
> > +host filesystem fsync/flush call.
> > +
> > +\devicenormative{\subsubsection}{Device operations}{Device Types / PMEM
> > Device / Device Operation / Virtqueue return}
> > +
> > +Device SHOULD return integer '0' for success and '-1' for failure.
> > +These errors are converted to corresponding error codes by guest as
> > +per architecture.
> 
> I think that is an implementation-specific detail that does not touch
> the device/driver interface?

yes. I thought we should explain about device return values and accordingly
driver handles them.

Best regards,
Pankaj

> 
> > +
> > +\subsection{Possible security implications}\label{sec:Device Types / PMEM
> > Device / Possible Security Implications}
> > +
> > +There could be potential security implications depending on how
> > +memory mapped host backing file is used. By default device emulation
> > +is done with SHARED mapping. There is a contract between guest and host
> > +process to access same backing file for read/write operations.
> > +
> > +If a malicious guest or host userspace map the same backing file,
> > +attacking process can make use of known cache side channel attacks
> > +to predict the current state of shared page cache page. If both
> > +attacker and victim somehow execute same shared code after a
> > +flush/evict call, with difference in execution timing attacker
> > +could infer another guest local data or host data. Though this is
> > +not easy and same challenges exist as with bare metal host system
> > +when userspace share same backing file.
> > +
> > +\subsection{Countermeasures}\label{sec:Device Types / PMEM Device /
> > Possible Security Implications / Countermeasures}
> > +
> > +\subsubsection{ With SHARED mapping}\label{sec:Device Types / PMEM Device
> > / Possible Security Implications / Countermeasures / SHARED}
> > +
> > +If device backing backing file is shared with multiple guests or host
> > +processes, this may act as a metric for page cache side channel attack.
> > +As a counter measure every guest should have its own(not shared with
> > +another guest) SHARED backing file and gets populated a per host process
> > +page cache pages.
> > +
> > +\subsubsection{ With PRIVATE mapping}\label{sec:Device Types / PMEM Device
> > / Possible Security Implications / Countermeasures / PRIVATE}
> > +There maybe be chances of side channels attack with PRIVATE
> > +memory mapping similar to SHARED with read-only shared mappings.
> > +PRIVATE is not used for virtio pmem making this usecase
> > +irrelevant.
> > +
> > +\subsubsection{ Workload specific mapping}\label{sec:Device Types / PMEM
> > Device / Possible Security Implications / Countermeasures / Workload}
> > +For SHARED mapping, if workload is single application inside
> > +guest and there is no risk with sharing of data between guests.
> > +Guest sharing same backing file with SHARED mapping can be
> > +used as a valid configuration.
> > +
> > +\subsubsection{ Prevent cache eviction}\label{sec:Device Types / PMEM
> > Device / Possible Security Implications / Countermeasures / Cache
> > eviction}
> > +Don't allow cache evict from guest filesystem trim/discard command
> > +with virtio pmem. This rules out any possibility of evict-reload
> > +page cache side channel attacks if backing disk is shared(SHARED)
> > +with mutliple guests. Though if we use per device backing file with
> > +shared mapping this countermeasure is not required.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> 
> 


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]