[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-dev] [RFC PATCH] virtio-blk: add zoned block device specification
On Sat, Jun 04, 2022 at 12:09:46AM +0000, Dmitry Fomichev wrote: > On Fri, 2022-06-03 at 09:08 +0100, Stefan Hajnoczi wrote: > > (Resent from my subscriber email.) > > > > On Fri, 3 Jun 2022 at 03:30, Dmitry Fomichev <Dmitry.Fomichev@wdc.com> wrote: > > > > > > On Thu, 2022-06-02 at 08:23 +0100, Stefan Hajnoczi wrote: > > > > (Resending from my mailing list subscriber address so this makes it onto > > > > the list.) > > > > > > > > On Thu, 2 Jun 2022 at 00:56, Dmitry Fomichev <dmitry.fomichev@wdc.com> > > > > wrote: > > > > Here is the non-union representation of the structs. Note that this > > > > means existing request processing code can handle IN/OUT/FLUSH even > > > > when VIRTIO_BLK_F_ZONED is negotiated. We don't need two different > > > > code paths for these common commands: > > > > > > > > /* IN/OUT/FLUSH/DISCARD/WRITE_ZEROES */ > > > > struct virtio_blk_req { > > > > le32 type; > > > > le32 reserved; > > > > le64 sector; > > > > u8 data[]; > > > > u8 status; > > > > }; > > > > > > > > /* ZONE_APPEND */ > > > > struct virtio_blk_req_zone_append { > > > > le32 type; > > > > le32 reserved; > > > > le64 sector; > > > > u8 data[]; > > > > le64 zone_result; > > > > u8 status; > > > > }; > > > > > > > > /* ZONE_OPEN, ZONE_CLOSE, ZONE_FINISH, ZONE_RESET */ > > > > struct virtio_blk_req_zone_mgmt_send { > > > > le32 type; > > > > le32 reserved; > > > > le64 sector; > > > > /* ALL zone operation flag */ > > > > __u8 all; > > > > __u8 unused1[3]; > > > > u8 data[]; > > > > u8 status; > > > > }; > > > > > > > > /* ZONE_REPORT */ > > > > struct virtio_blk_req_zone_mgmt_receive { > > > > le32 type; > > > > le32 reserved; > > > > le64 sector; > > > > /* Partial zone report flag */ > > > > __u8 partial; > > > > __u8 unused2[3]; > > > > u8 data[]; > > > > u8 status; > > > > }; > > > > > > We felt that it is less confusing to have a single ZBD request structure so > > > there > > > is 1:1 relationship between the VIRTIO_BLK_F_ZONED bit value and the choice > > > of > > > the request - > > > > > > VIRTIO_BLK_F_ZONED off -> virtio_blk_req > > > VIRTIO_BLK_F_ZONED on -> virtio_blk_zoned_req > > > > > > We also want to make sure that zone management/send/receive and append > > > requests > > > are always the same size and it should be easier with a single request > > > structure. > > > > > > It is possible to use a simpler, but less structured union here - > > > > > > struct virtio_blk_zoned_req { > > > le32 type; > > > le32 reserved; > > > le64 sector; > > > union zoned_params { > > > /* ALL zone operation flag */ > > > __u8 all; > > > /* Partial zone report flag */ > > > __u8 partial; > > > __u32 rsvd1; > > > } zone; > > > u8 data[]; > > > le64 zone_result; > > > u8 reserved1[3]; > > > u8 status; > > > } > > > > > > Using the single code path is very important and the PoC Linux driver code > > > that I > > > have sketched up achieves that. Since the request header is identical for the > > > both cases, I hope that it can be done in QEMU too. > > > > There are two possibilities for optimizing the request layout: > > 1. For drivers and devices that implement both !VIRTIO_BLK_F_ZONED and > > VIRTIO_BLK_F_ZONED. Here it's best to share the same request layout > > for commands available in both modes. In other words IN and OUT > > requests to zoned devices should use struct virtio_blk_req so code can > > be shared and only commands unique to zoned devices use struct > > virtio_blk_zoned_req. > > 2. For drivers and devices that implement only one of > > !VIRTIO_BLK_F_ZONED and VIRTIO_BLK_F_ZONED. Here consistent request > > layout is best so the code parses all commands in the same way. In > > other words, IN and OUT requests to zoned devices use struct > > virtio_blk_zoned_req so that the implementation doesn't have to bother > > with struct virtio_blk_req at all. > > > > I think you chose #2 and I gravitated towards #1. Which one is optimal > > depends on the use case. Feel free to stick with #2. > > > > OK. We certainly considered #1, but we found that in Linux virtio-blk driver > implementation, the request size is fixed. There is the member called vblk- > >tag_set.cmd_size and it is set to sizeof(struct virtblk_req) + <sg list size> as > a part of the device scan routine. Since the Zone Append request has to be larger > than the regular (non-zoned) one by at least 8 bytes to accommodate the ALBA > field, we can't mix and match the two for the same device. Here is how this > device initialization part can be modified to handle the VIRTIO_BLK_F_ZONED > feature bit if we stick with #2 - > > > @@ -810,13 +1287,18 @@ static int virtblk_probe(struct virtio_device *vdev) > } > > memset(&vblk->tag_set, 0, sizeof(vblk->tag_set)); > - vblk->tag_set.ops = &virtio_mq_ops; > vblk->tag_set.queue_depth = queue_depth; > vblk->tag_set.numa_node = NUMA_NO_NODE; > vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; > - vblk->tag_set.cmd_size = > - sizeof(struct virtblk_req) + > - sizeof(struct scatterlist) * VIRTIO_BLK_INLINE_SG_CNT; > + if (virtio_has_feature(vdev, VIRTIO_BLK_F_ZONED)) { > + vblk->tag_set.ops = &virtio_mq_zoned_ops; > + vblk->tag_set.cmd_size = sizeof(struct virtblk_zoned_req); > + } else { > + vblk->tag_set.ops = &virtio_mq_ops; > + vblk->tag_set.cmd_size = sizeof(struct virtblk_req); > + }; > + vblk->tag_set.cmd_size += > + sizeof(struct scatterlist) * VIRTIO_BLK_INLINE_SG_CNT; > vblk->tag_set.driver_data = vblk; > vblk->tag_set.nr_hw_queues = vblk->num_vqs; I've included a #1 solution below in case you like it. cmd_size doesn't need to be changed, sizeof(struct virtblk_req) is fine. Here is struct virtblk_req extended to handle the ALBA field: struct virtblk_req { struct virtio_blk_outhdr out_hdr; + struct { + __virtio64 alba; + u8 status; + } in; - u8 status; struct sg_table sg_table; struct scatterlist sg[]; }; The in struct could also be a union of several structs. I kept it simple assuming just one request (ZONE_APPEND) has additional device-to-driver (in) fields. The request layout variation is taken into account when building the virtqueue scatterlist: static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr, - struct scatterlist *data_sg, bool have_data) + struct scatterlist *data_sg, bool have_data, bool have_alba) { - struct scatterlist hdr, status, *sgs[3]; + struct scatterlist hdr, in, *sgs[3]; unsigned int num_out = 0, num_in = 0; sg_init_one(&hdr, &vbr->out_hdr, sizeof(vbr->out_hdr)); sgs[num_out++] = &hdr; if (have_data) { if (vbr->out_hdr.type & cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT)) sgs[num_out++] = data_sg; else sgs[num_out + num_in++] = data_sg; } + if (have_alba) + sg_init_one(&in, &vbr->in, sizeof(vbr->in)); + else + sg_init_one(&in, &vbr->in.status, sizeof(vbr->in.status)); + sgs[num_out + num_in++] = ∈ - sg_init_one(&status, &vbr->status, sizeof(vbr->status)); - sgs[num_out + num_in++] = &status; return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC); }
Attachment:
signature.asc
Description: PGP signature
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]