From: jasee on
Anahata wrote:
> On Sat, 05 Dec 2009 13:11:04 +0000, jasee wrote:
>
>> Paul Martin wrote:
>>> Set a block size. The default is to read a character at a time.
>>> That's inefficient.
>>
>> I'm setting it to 4096, can I set it higher?
>
> Yes. I think I used 1M once, but I didn't verify whether it actually
> helped with speed.


Something's got to help: there's about 7 gigs of data, and at the current rate
it's taken 8 hrs to do 40 megs, so it's going to take over a month :-).


From: Denis McMahon on
Anahata wrote:
> On Sat, 05 Dec 2009 13:11:04 +0000, jasee wrote:
>
>> Paul Martin wrote:
>>> Set a block size. The default is to read a character at a time. That's
>>> inefficient.
>> I'm setting it to 4096, can I set it higher?
>
> Yes. I think I used 1M once, but I didn't verify whether it actually
> helped with speed.

I'd imagine a good starting value for block size would be the minimum
allocation unit on the volume, or a 2^n multiple thereof.
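
For instance (a sketch only; the device name and output path below are
placeholders, not taken from this thread):

  dd if=/dev/sdb1 of=/mnt/backup/part.img bs=64k

bs= sets both the input and output block size in one go; ibs= and obs= can
set them separately if you want to read and write in different sizes.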

Rgds

Denis McMahon
From: Ian Northeast on
On Sat, 05 Dec 2009 15:23:58 +0000, jasee wrote:

> Anahata wrote:
>> On Sat, 05 Dec 2009 13:11:04 +0000, jasee wrote:
>>
>>> Paul Martin wrote:
>>>> Set a block size. The default is to read a character at a time. That's
>>>> inefficient.
>>>
>>> I'm setting it to 4096, can I set it higher?
>>
>> Yes. I think I used 1M once, but I didn't verify whether it actually
>> helped with speed.
>
>
> Something's got to help: there's about 7 gigs of data, and at the current
> rate it's taken 8 hrs to do 40 megs, so it's going to take over a month :-).

That's excessively slow regardless of block size. Copying to /dev/null I
can get 90MB/s with a recentish SATA disc. Copying to another
partition/file shouldn't be significantly slower. This suggests that you
have a bad disc not just a corrupt partition. Are you seeing I/O errors in
the log?
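
(As a rough check of the raw read rate, something like this, purely
illustrative, with /dev/sdb standing in for the source disc:

  dd if=/dev/sdb of=/dev/null bs=1M count=1000

reads the first 1000 MiB and prints a throughput figure when it finishes.)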

Regards, Ian

From: jasee on
Ian Northeast wrote:
> On Sat, 05 Dec 2009 15:23:58 +0000, jasee wrote:
>
>> Anahata wrote:
>>> On Sat, 05 Dec 2009 13:11:04 +0000, jasee wrote:
>>>
>>>> Paul Martin wrote:
>>>>> Set a block size. The default is to read a character at a time.
>>>>> That's inefficient.
>>>>
>>>> I'm setting it to 4096, can I set it higher?
>>>
>>> Yes. I think I used 1M once, but I didn't verify whether it actually
>>> helped with speed.
>>
>>
>> Something's got to help: there's about 7 gigs of data, and at the current
>> rate it's taken 8 hrs to do 40 megs, so it's going to take over a month :-).
>
> That's excessively slow regardless of block size. Copying to
> /dev/null I can get 90MB/s with a recentish SATA disc. Copying to
> another partition/file shouldn't be significantly slower. This
> suggests that you have a bad disc not just a corrupt partition. Are
> you seeing I/O errors in the log?
>

Yes, I'm seeing a series of I/O errors. I know I have a bad disk! If I
didn't have a bad disk I wouldn't have bad clusters, and I wouldn't be trying
to get the data off! (Catch 22!)

I think it may be because the partition is marked as dirty (it's NTFS), so it
can't be mounted. Usually Linux is more tolerant than NT in this respect.*

However, it doesn't entirely explain why it starts at a reasonable rate and
then gradually slows to a snail's pace.
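
(If the slowdown is down to retries on bad sectors, one common sketch, with
the device and image names here as placeholders, is to tell dd to carry on
past read errors and pad the failed blocks:

  dd if=/dev/sda1 of=/mnt/backup/part.img bs=4096 conv=noerror,sync

conv=noerror keeps going after a read error, and conv=sync pads the short
block with zeros so the image stays aligned; GNU ddrescue is another tool
built for exactly this kind of job.)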

*PS: if I let NT's chkdsk at it, it'll neatly convert all the quite useful
files into chk000* folders in the root, where I'll never be able to sort them
out, and delete _all_ the important ones.


From: Richard Kettlewell on
Paul Martin <pm(a)nowster.org.uk> writes:
> jasee wrote:

>> The main problem with any of these facilities is the slowness,
>> which is exactly what you don't want with disk recovery. dd disk
>> access is apparently 100 times slower than normal disk access and
>> from what I've done I can
>
> Set a block size. The default is to read a character at a
> time. That's inefficient.

Experimentally, the default for coreutils dd is 512 bytes at a time,
although if this is documented I don't immediately see where.

For cached data (including for block devices) larger block sizes do
significantly improve performance.

bs=512 240MB/s
bs=1024 440MB/s
bs=2048 770MB/s
bs=4096 1.3GB/s
bs=8192 1.8GB/s

The obvious interpretation is that the cost of system calls dominates.

I'd expect that IO would dominate for uncached data, but haven't done
the experiment.
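
(Roughly the sort of invocation that produces figures like the above, as a
sketch only, with the file name and count as placeholders:

  dd if=/some/large/file of=/dev/null bs=4096 count=262144

Reading a file that's already in the page cache means the throughput mostly
reflects per-call overhead rather than the disc itself.)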

--
http://www.greenend.org.uk/rjk/