From: Martin K. Petersen on 15 Jul 2010 12:10

>>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:

Christof> While experimenting with the data integrity support in the
Christof> Linux kernel, i found that the block layer integrity code can
Christof> send integrity data segments for a request that do not adhere
Christof> to the queue limits. The integrity data segment can be larger
Christof> than queue_max_segment_size and the segment does not adhere to
Christof> the queue_segment_boundary.

Correct. That was a deliberate design decision.

Modern HBAs allow essentially indefinite chaining and our block layer
segmentation controls are to some extent legacy baggage. I did not want
to put in a set of constraints on the DI scatterlist because I was
afraid it would encourage vendors to actually use them.

Christof> It appears to me that the right way would be to apply the same
Christof> restrictions that are in place for data segments also to
Christof> integrity data segments. The patch works for my experiments
Christof> and applies on top of the current Linux tree (2.6.35-rc5).

Who says constraints on the integrity scatterlist are the same as on the
data ditto? In my experience they are not.

If you must do this, then the DI constraints should be separate from the
data segmentation ones. But I'm interested in what motivated this
change to begin with.

Your change also has repercussions when merging requests and bios. We'd
need to honor the DI segmentation constraints when merging. Otherwise
we may end up going beyond the controller limits when mapping the sgl.

--
Martin K. Petersen	Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Martin K. Petersen on 15 Jul 2010 12:40

>>>>> "Jens" == Jens Axboe <axboe(a)kernel.dk> writes:

Jens> That sounds like a very bad design decision. Either the limits
Jens> are explicitly given and different, or if not we have to assume
Jens> that they are the same as the data limits at least.

Imagine a controller that has a 4KB segment, 1 entry limit. If we
capped the DI sgl the same way as the data we'd only be able to issue
512-byte requests unless the DI entries happened to be contiguous in
memory.

For several types of I/O the DI sgl is much longer than the data sgl,
especially if the submitter is using buffer_heads to map 512-byte
blocks.

Consequently, we require vendors to be able to handle the pathological
case in which any data scatterlist honoring the segmentation constraints
given by the driver can be matched with an integrity scatterlist in
which there is a separate entry for each logical block.

No vendor has had any problems with this. Therefore there are no block
layer data integrity queue limits. If a device appears that does in
fact have constraints, I have no problem introducing a set of suitable
knobs.

--
Martin K. Petersen	Oracle Linux Engineering
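
The arithmetic behind the "4KB segment, 1 entry limit" example can be sketched in a few lines of Python. This is only an illustration of the reasoning in the message above, not block layer code: the function names are hypothetical, and the sketch assumes 512-byte logical blocks with one integrity scatterlist entry per block (the pathological case described above).

```python
LOGICAL_BLOCK_SIZE = 512  # bytes per logical block (assumption for this sketch)


def worst_case_integrity_entries(request_bytes, block_size=LOGICAL_BLOCK_SIZE):
    """Integrity sg entries needed when no two tuples are contiguous in memory.

    In the pathological case there is one separate integrity entry per
    logical block of the request.
    """
    if request_bytes % block_size != 0:
        raise ValueError("request must be a whole number of logical blocks")
    return request_bytes // block_size


def max_request_if_di_capped(max_entries, block_size=LOGICAL_BLOCK_SIZE):
    """Largest request guaranteed to map if the integrity scatterlist were
    capped at max_entries entries, again assuming non-contiguous tuples."""
    return max_entries * block_size


# A single 4 KB data segment can need eight integrity entries.
print(worst_case_integrity_entries(4096))
# With a 1-entry cap applied to the integrity list as well, only
# 512-byte requests are guaranteed to fit.
print(max_request_if_di_capped(1))
```
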

From: Martin K. Petersen on 20 Jul 2010 00:50

>>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:

Christof,

Christof> The motivation stems from research how the integrity data can
Christof> be mapped to the hardware interface used by the zfcp
Christof> driver. When passing data segments to the zfcp hardware
Christof> controller, there is the constraint that each data segment has
Christof> a maximum size of 4k and a segment must not cross a 4k
Christof> boundary.

Ok.

Christof> Right now, this is done by reporting the maximum segment size
Christof> and segment boundary accordingly from zfcp. When issuing a
Christof> request, zfcp simply walks the sg list and passes the segments
Christof> to the hardware controller, no mapping or readjustment is
Christof> necessary in the driver.

In that case I don't really have a problem with adhering to the queue
segment size and segment boundary for the integrity metadata. As long
as we avoid using the max number of data segments to cap the integrity
scatterlist because that'll definitely break a lot of stuff.

Does the zfcp hardware have scatterlist length constraints?

>> Your change also has repercussions when merging requests and bios.
>> We'd need to honor the DI segmentation constraints when merging.
>> Otherwise we may end up going beyond the controller limits when
>> mapping the sgl.

Christof> Meaning the integrity data sg list would have more entries
Christof> than max_segments? I have not seen this during my experiments,
Christof> but then i likely have not hit every case of a possible
Christof> request layout.

It happens all the time.

--
Martin K. Petersen	Oracle Linux Engineering
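
As an illustration of the two zfcp constraints quoted above (at most 4k per segment, no segment crossing a 4k boundary), the following Python sketch splits one contiguous buffer into segments that satisfy both. It is a hedged model of the idea only: `split_on_limits` is a hypothetical name, the addresses are plain integers, and none of this is the kernel's actual mapping code. The `boundary_mask` parameter follows the usual convention where a 4k boundary is expressed as the mask 0xFFF.

```python
def split_on_limits(addr, length, max_segment_size=4096, boundary_mask=0xFFF):
    """Split the range [addr, addr + length) into (address, size) segments.

    Each segment is at most max_segment_size bytes and never crosses an
    address boundary described by boundary_mask (0xFFF means 4k).
    """
    if length < 0 or max_segment_size <= 0:
        raise ValueError("length must be >= 0 and max_segment_size > 0")
    segments = []
    while length > 0:
        # Bytes left before the next boundary, counted from addr.
        room = (addr | boundary_mask) + 1 - addr
        chunk = min(length, max_segment_size, room)
        segments.append((addr, chunk))
        addr += chunk
        length -= chunk
    return segments


# A 16-byte integrity buffer that straddles a 4k boundary becomes two segments.
for seg_addr, seg_len in split_on_limits(0x1FF8, 16):
    print(hex(seg_addr), seg_len)
```

Note that such a split can only increase the number of entries, which is why the entry count of the integrity scatterlist matters as well.
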

From: Martin K. Petersen on 21 Jul 2010 00:30

>>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:

Christof> To have a simple approach that covers the case with one
Christof> integrity data segment per user data segment, we only report
Christof> half the size for the scatterlist length when running
Christof> DIX. This guarantees that the other half can be used for
Christof> integrity data.

Yup, a few of our partners did something similar.

My concern is the scenario where we submit lots of 512-byte writes that
get merged into (in your case) 4 KB segments. Each of those 512-byte
writes could come with an 8-byte integrity metadata tuple. And so you'd
need 8 DI scatterlist elements per data element.

Christof> Meaning the integrity data sg list would have more entries
Christof> than max_segments? I have not seen this during my experiments,
Christof> but then i likely have not hit every case of a possible
Christof> request layout.

dd to the block device is usually a good way to issue long scatterlists.

Christof> Ok, i have to look into that as well. It would be an issue
Christof> with the approach we are looking at now: If there are
Christof> max_segments data segments, and more than max_segments
Christof> integrity data segments, we will overrun the hardware
Christof> constraint.

Ok.

--
Martin K. Petersen	Oracle Linux Engineering
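
The concern about reporting half the scatterlist length can be checked with a small calculation. The Python sketch below is illustrative only: the hardware limit of 256 total entries is a made-up number (the thread does not state the zfcp value), the function names are hypothetical, and the conservative bound at the end is simple arithmetic on the worst case described above, not a proposal made in the thread.

```python
BLOCK_SIZE = 512      # bytes per logical block, as in the 512-byte writes above
SEGMENT_SIZE = 4096   # merged data segment size (4 KB in the zfcp case)


def integrity_entries_per_data_segment(segment_size=SEGMENT_SIZE,
                                       block_size=BLOCK_SIZE):
    """Worst case: one non-contiguous 8-byte tuple per 512-byte block."""
    return segment_size // block_size


def worst_case_total_entries(data_segments):
    """Data entries plus worst-case integrity entries for one request."""
    return data_segments * (1 + integrity_entries_per_data_segment())


def conservative_data_segment_limit(hw_total_entries):
    """Data segment count that still fits the hardware total in the worst case."""
    return hw_total_entries // (1 + integrity_entries_per_data_segment())


HW_TOTAL = 256                 # hypothetical combined hardware limit
reported = HW_TOTAL // 2       # the "report half" approach
print(worst_case_total_entries(reported), "entries needed vs", HW_TOTAL)
print(conservative_data_segment_limit(HW_TOTAL), "data segments fit in the worst case")
```
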
From: Martin K. Petersen on 3 Aug 2010 00:50

>>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:

Christof> To summarize the limits i see in the zfcp hardware:
Christof> - Maximum size of 4k per segment
Christof> - The segments must not cross page boundaries
Christof> - The number of segments per request is limited

The interesting thing here is that your hw has a limit for the total
number of segments whereas other DIX HBAs have separate limits for data
and integrity scatterlists.

Christof> What would be the preferred approach for handling the
Christof> integrity data limits in the block layer? Introduce new queue
Christof> limits for integrity data, or assume that the limits for
Christof> integrity data are the same as for user data? I can update my
Christof> patch accordingly and include a check for the maximum number
Christof> of segments.

I've been messing with this tonight. It's not entirely trivial because
of the housekeeping involved: having to accommodate different types of
hardware, having to avoid breaking existing setups, and having to work
both with and without integrity support compiled in. My first attempt
at this got quite messy.

I think I have found a way but it's bedtime here. Give me a day or two
to get back to you with something that'll hopefully work for everyone.

--
Martin K. Petersen	Oracle Linux Engineering