OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]

Subject: virtio-blk vs. caching on FVP Base


(1) So, I looked into the virtio-blk + cache modelling problem. To
recap, the problem can be stated as follows:

- Binary to run: "ArmVExpress-FVP-AArch64" build of current upstream
- Platform: FVP Base Model, Aarch64
- Platform setting: -C cache_state_modelled=1

When the first virtio-blk request is submitted via the virtio ring (by
writing the Queue Notify MMIO register), the Model issues the following

  WARNING: Virtio available ring flags unrecognised (0xafaf). Not
  fetching descriptor.

and the virtio transfer doesn't progress.

When "cache_state_modelled" is zero, this doesn't happen, and the
virtio-block device works correctly.

(2) The guest call chain triggering the warning follows:

VirtioBlkReadBlocks()            [OvmfPkg/VirtioBlkDxe/VirtioBlk.c]
  SynchronousRequest()           [OvmfPkg/VirtioBlkDxe/VirtioBlk.c]
    VirtioFlush()                [OvmfPkg/Library/VirtioLib/VirtioLib.c]
      VirtioMmioSetQueueNotify() [OvmfPkg/Library/VirtioMmioDeviceLib/VirtioMmioDeviceFunctions.c]
        MmioWrite32()            [MdePkg/Library/BaseIoLibIntrinsic/IoLibArm.c]

I hex-dumped the ring contents (all three pages) inside the guest. It
looked normal; in particular the available ring flags were valid.
Nowhere in the ring could I find the reported 0xafaf value.

(3) The virtio guest drivers take care to perform all synchronization /
barriers that the virtio specification [1] lists. In particular please
see the "MUST perform a suitable memory barrier" references in:

  3.2   Device Operation
  3.2.1 Supplying Buffers to The Device

(Side note: I just noticed a typo there in step 4, namely "suitable a

The edk2 guest drivers conform to this requirement by:
- accessing the ring area through pointers to volatile objects,
- implementing the "suitable memory barrier" by calling the
  MemoryFence() BaseLib function.

The "Driver Writer's Guide for UEFI 2.3.1" [2] lists these two tools for
memory synchronization, in:

  4     General Driver Design Guidelines
  4.2   Maximize Platform Compatibility
  4.2.6 Memory ordering

MemoryFence() is also recommended in

  18     PCI Driver Design Guidelines
  18.5   PCI DMA
  18.5.2 Weakly ordered memory transactions
  18.5.3 Bus Master Read and Write Operations

(4) On the platform in question, the MemoryFence() function is
implemented as

  MemoryFence() [MdePkg/Library/BaseLib/AArch64/MemoryFence.S]
    DMB SY

(5) After trying to parse parts of the ARMv8-A Architecture Reference
Manual [3], I had the idea to strengthen this primitive by replacing the
DMB with DSB. This idea didn't help at all, the FVP Base model printed
the same warning at the same spot.

(6) I grepped the ARM-specific parts of the edk2 tree for cache
invalidation primitives. I found out that I could re-qualify the virtio
ring area as uncacheable memory, by utilizing the
SetMemorySpaceAttributes() Global Coherency Domain DXE service. (See
Volume 2, Driver Execution Environment Core Interface, of the Platform
Initialization Specification [4].)

This approach is visible in the first attached patch. The call in
question resolves as:

VirtioBlkInit()                           [OvmfPkg/VirtioBlkDxe/VirtioBlk.c]
  VirtioRingInit()                        [OvmfPkg/Library/VirtioLib/VirtioLib.c]
    CoreSetMemorySpaceAttributes()        [MdeModulePkg/Core/Dxe/Gcd/Gcd.c]
      CoreConvertSpace()                  [MdeModulePkg/Core/Dxe/Gcd/Gcd.c]
        SetMemoryAttributes()             [ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c]
          ArmCleanInvalidateDataCache()   [ArmPkg/Library/ArmLib/AArch64/AArch64Lib.c]
          ArmInvalidateInstructionCache() [ArmPkg/Library/ArmLib/AArch64/AArch64Support.S]
          ArmInvalidateTlb()              [ArmPkg/Library/ArmLib/Common/AArch64/ArmLibSupport.S]

Note that in

  2.3     Calling Conventions
  2.3.6   AArch64 Platforms Memory types

the UEFI 2.4 Specification [5] lists the following memory type bindings:

  EFI Memory Type  ARM Memory Type:         ARM Memory Type:
                   MAIR attribute encoding  Meaning
                   Attr<n> [7:4] [3:0]
  ---------------  -----------------------  ------------------------------
  EFI_MEMORY_UC    0000 0000                Device-nGnRnE
  (Not cacheable)                           (Device non-Gathering,
                                            non-Reordering, no Early Write

  EFI_MEMORY_WC    0100 0100                Normal Memory
  (Write combine)                           Outer non-cacheable
                                            Inner non-cacheable

  ...              ...                      ...


They both worked in the sense that each got me over the immediate FVP
Base model warning, and *something* was loaded from the virtio-blk disk.
However that "something" (which should have been a grub2 binary) could
not be executed, implying that further corruption was happening.

The straightforward idea is that it's not enough to make the ring area
itself uncacheable: the scatter-gather memory areas *referenced* by the
descriptors located inside the ring have to be uncacheable too.

This would be very inconvenient, however, because the "scatter-gather
memory areas" in question are almost always local variables, of which
are many, and whose size is mostly smaller than a page.

Hence I didn't pursue this further.

(In any case, if this were deemed the "proper" solution, then I'd
suggest to extend the virtio specification [1] with a note about memory
caching attributes, somewhere near the barrier language.)

(7) The problem was ultimately solved (as a proof-of-concept) with the
second attached patch. This patch sort of proves that the current
Aarch64 implementation of MemoryFence() -- ie. DMB SY -- is not strong
enough for MemoryFence()'s advertised purpose.

Once I replaced VirtioLib's MemoryFence() primitives with

    MemoryFence()                       [MdePkg/Library/BaseLib/AArch64/MemoryFence.S]
    ArmCleanInvalidateDataCache()       [ArmPkg/Library/ArmLib/AArch64/AArch64Lib.c]
    MemoryFence()                       [MdePkg/Library/BaseLib/AArch64/MemoryFence.S]

everything started to work.

(8) Hence I'm proposing / requesting that the MemoryFence()
implementation on Aarc64() be strengthened so that it also flushes the
data cache completely. Unfortunately I can't do this myself (ie. post a
patch) because I don't know enough about ARM.

The PoC in (7) is clearly overkill. I only squeezed the
ArmCleanInvalidateDataCache() call between two DMB SY's because I found
something frightening in the ARMv8-A Architecture Reference Manual [3]
about ordering between Data Cache manipulation and DMB instructions.
Hacking the PoC like this was easier than actually understanding the

That said, ArmCleanInvalidateDataCache() seems to execute

  dc      cisw, x0    // Clean and Invalidate this line
  dsb     sy

in some kind of a loop. (I did find DC CISW in the manual [3], but had
no idea how to say "all lines at once" in x0. Which is probably what the
loop is for.) It already includes DSB SY, which should be stronger than
the sole DMB SY that MemoryFence() currently consists of.

This suggests that reimplementing MemoryFence() as a single call to
ArmCleanInvalidateDataCache() would suffice.

Olivier, what do you think?

(9) I've included the virtio-comment list because of the typo notice in
(3), and because of the suggestion to *maybe* mention memory caching
attributes. Sorry for the noise.


[1] http://docs.oasis-open.org/virtio/virtio/v1.0/csprd01/virtio-v1.0-csprd01.pdf
[2] http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=UEFI_Driver_Writer%27s_Guide
[3] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.subset.architecture.reference/index.html
[4] http://www.uefi.org/specs/download
[5] http://www.uefi.org/specs/download

From 45f2043b6f79e5384b9b61bd8cb473a44a9a9dc8 Mon Sep 17 00:00:00 2001
From: Laszlo Ersek <lersek@redhat.com>
Date: Wed, 22 Jan 2014 23:55:47 +0100
Subject: [PATCH] VirtioLib: re-qualify the virtio ring area as UC / WC

Marking the ring area as EFI_MEMORY_UC or EFI_MEMORY_WC prevents it from
being cached. It doesn't solve the whole visibility problem though because
all areas *referenced* by the descriptors would have to undergo the same
treatment. Since that's extremely impractical, this is a no-go. Instead,
MemoryFence() should just work.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
 OvmfPkg/Library/VirtioLib/VirtioLib.inf   |  1 +
 OvmfPkg/Include/IndustryStandard/Virtio.h |  1 +
 OvmfPkg/Library/VirtioLib/VirtioLib.c     | 38 +++++++++++++++++++++++++++++--
 3 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/OvmfPkg/Library/VirtioLib/VirtioLib.inf b/OvmfPkg/Library/VirtioLib/VirtioLib.inf
index fb5897a..27d1412 100644
--- a/OvmfPkg/Library/VirtioLib/VirtioLib.inf
+++ b/OvmfPkg/Library/VirtioLib/VirtioLib.inf
@@ -32,5 +32,6 @@
+  DxeServicesTableLib
diff --git a/OvmfPkg/Include/IndustryStandard/Virtio.h b/OvmfPkg/Include/IndustryStandard/Virtio.h
index fcf7b67..df001b9 100644
--- a/OvmfPkg/Include/IndustryStandard/Virtio.h
+++ b/OvmfPkg/Include/IndustryStandard/Virtio.h
@@ -148,10 +148,11 @@ typedef struct {
 #pragma pack()
 typedef struct {
   UINTN               NumPages;
   VOID                *Base;     // deallocate only this field
+  UINT64              OrigAttr;  // ... and restore these GCD attributes
   volatile VRING_DESC *Desc;     // QueueSize elements
   VRING_AVAIL         Avail;
   VRING_USED          Used;
   UINT16              QueueSize;
diff --git a/OvmfPkg/Library/VirtioLib/VirtioLib.c b/OvmfPkg/Library/VirtioLib/VirtioLib.c
index 54cf225..b4ea4ae 100644
--- a/OvmfPkg/Library/VirtioLib/VirtioLib.c
+++ b/OvmfPkg/Library/VirtioLib/VirtioLib.c
@@ -18,10 +18,12 @@
 #include <Library/BaseLib.h>
 #include <Library/BaseMemoryLib.h>
 #include <Library/DebugLib.h>
 #include <Library/MemoryAllocationLib.h>
 #include <Library/UefiBootServicesTableLib.h>
+#include <Pi/PiDxeCis.h>
+#include <Library/DxeServicesTableLib.h>
 #include <Library/VirtioLib.h>
@@ -43,10 +45,16 @@
   @retval EFI_OUT_OF_RESOURCES  AllocatePages() failed to allocate contiguous
                                 pages for the requested QueueSize. Fields of
                                 Ring have indeterminate value.
+  @return                       Error codes from the SetMemorySpaceAttributes()
+                                Global Coherency Domain service, if the ring
+                                area could not be re-qualified as uncacheable
+                                memory. Fields of Ring have indeterminate
+                                value.
   @retval EFI_SUCCESS           Allocation and setup successful. Ring->Base
                                 (and nothing else) is responsible for
@@ -55,12 +63,14 @@ EFIAPI
 VirtioRingInit (
   IN  UINT16 QueueSize,
   OUT VRING  *Ring
-  UINTN          RingSize;
-  volatile UINT8 *RingPagesPtr;
+  UINTN                           RingSize;
+  EFI_STATUS                      Status;
+  volatile UINT8                  *RingPagesPtr;
   RingSize = ALIGN_VALUE (
                sizeof *Ring->Desc            * QueueSize +
                sizeof *Ring->Avail.Flags                 +
                sizeof *Ring->Avail.Idx                   +
@@ -78,10 +88,26 @@ VirtioRingInit (
   Ring->NumPages = EFI_SIZE_TO_PAGES (RingSize);
   Ring->Base = AllocatePages (Ring->NumPages);
   if (Ring->Base == NULL) {
+  Status = gDS->GetMemorySpaceDescriptor (
+                  (EFI_PHYSICAL_ADDRESS)(UINTN) Ring->Base,
+                  &Descriptor);
+  Ring->OrigAttr = Descriptor.Attributes;
+  Status = gDS->SetMemorySpaceAttributes (
+                  (EFI_PHYSICAL_ADDRESS)(UINTN) Ring->Base,
+                  RingSize,
+                  EFI_MEMORY_WC);
+  if (EFI_ERROR (Status)) {
+    FreePages (Ring->Base, Ring->NumPages);
+    return Status;
+  }
   SetMem (Ring->Base, RingSize, 0x00);
   RingPagesPtr = Ring->Base;
   Ring->Desc = (volatile VOID *) RingPagesPtr;
   RingPagesPtr += sizeof *Ring->Desc * QueueSize;
@@ -134,10 +160,18 @@ VOID
 VirtioRingUninit (
+  EFI_STATUS Status;
+  Status = gDS->SetMemorySpaceAttributes (
+                  (EFI_PHYSICAL_ADDRESS)(UINTN) Ring->Base,
+                  Ring->NumPages * EFI_PAGE_SIZE,
+                  Ring->OrigAttr);
   FreePages (Ring->Base, Ring->NumPages);
   SetMem (Ring, sizeof *Ring, 0x00);

From 83b74495ef9853374bcc0a41591c8b80df4eba66 Mon Sep 17 00:00:00 2001
From: Laszlo Ersek <lersek@redhat.com>
Date: Thu, 23 Jan 2014 00:30:57 +0100
Subject: [PATCH] VirtioLib: flush data cache as part of MemoryFence() on

Just a proof of concept (that actually works). Note that the VirtioNetDxe
driver contains custom ring massaging (ie. explicit MemoryFence() calls),
and that is not updated here. However, for the VirtioBlkDxe case, this
patch suffices.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
 OvmfPkg/Library/VirtioLib/VirtioLib.inf |  2 ++
 OvmfPkg/Library/VirtioLib/VirtioLib.c   | 22 +++++++++++++++++-----
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/OvmfPkg/Library/VirtioLib/VirtioLib.inf b/OvmfPkg/Library/VirtioLib/VirtioLib.inf
index fb5897a..6decc95 100644
--- a/OvmfPkg/Library/VirtioLib/VirtioLib.inf
+++ b/OvmfPkg/Library/VirtioLib/VirtioLib.inf
@@ -25,12 +25,14 @@
+  ArmPkg/ArmPkg.dec
+  ArmLib
diff --git a/OvmfPkg/Library/VirtioLib/VirtioLib.c b/OvmfPkg/Library/VirtioLib/VirtioLib.c
index 54cf225..d6b2d68 100644
--- a/OvmfPkg/Library/VirtioLib/VirtioLib.c
+++ b/OvmfPkg/Library/VirtioLib/VirtioLib.c
@@ -18,13 +18,25 @@
 #include <Library/BaseLib.h>
 #include <Library/BaseMemoryLib.h>
 #include <Library/DebugLib.h>
 #include <Library/MemoryAllocationLib.h>
 #include <Library/UefiBootServicesTableLib.h>
+#include <Library/ArmLib.h>
 #include <Library/VirtioLib.h>
+MemoryFenceX (
+  )
+  MemoryFence ();
+  ArmCleanInvalidateDataCache ();
+  MemoryFence ();
   Configure a virtio ring.
@@ -280,18 +292,18 @@ VirtioFlush (
     Indices->HeadDescIdx % Ring->QueueSize;
   // virtio-0.9.5, Updating the Index Field
-  MemoryFence();
+  MemoryFenceX();
   *Ring->Avail.Idx = NextAvailIdx;
   // virtio-0.9.5, Notifying the Device -- gratuitous notifications are
   // OK.
-  MemoryFence();
+  MemoryFenceX();
   Status = VirtIo->SetQueueNotify (VirtIo, VirtQueueId);
   if (EFI_ERROR (Status)) {
     return Status;
@@ -302,18 +314,18 @@ VirtioFlush (
   // synchronous, lock-step progress.
   // Keep slowing down until we reach a poll period of slightly above 1 ms.
   PollPeriodUsecs = 1;
-  MemoryFence();
+  MemoryFenceX();
   while (*Ring->Used.Idx != NextAvailIdx) {
     gBS->Stall (PollPeriodUsecs); // calls AcpiTimerLib::MicroSecondDelay
     if (PollPeriodUsecs < 1024) {
       PollPeriodUsecs *= 2;
-    MemoryFence();
+    MemoryFenceX();
-  MemoryFence();
+  MemoryFenceX();
   return EFI_SUCCESS;

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]