From: Alan Chandler on 19 Jun 2010 14:30

I have a server with a pair of raided (RAID1) disks, using partitions 1, 2 and 4 as /boot, root and an LVM volume respectively. The two disks are /dev/sda and /dev/sdb. They have just replaced two smaller disks where the root partition was NOT a raid device - it was just /dev/sda2 - although there was a raided boot partition in the first partition. The hardware only supports 2 SATA channels.

I wanted to revert the root partition to the same state as the one I had just taken out, so I failed and removed sdb:

mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
mdadm /dev/md1 --fail /dev/sdb2 --remove /dev/sdb2
mdadm /dev/md2 --fail /dev/sdb4 --remove /dev/sdb4

for each of the partitions, and shut the machine down. I unplugged /dev/sdb, plugged the old disk in its place and booted up Knoppix. I asked Knoppix to recreate the md devices with

mdadm --assemble --scan

and it found 4 raid devices: the three on sda and the one from the old sda, now on sdb. So I mounted /dev/md1 and /dev/sdb2 and reverted the root partition. I shut the machine down again. I then removed the old disk and plugged back in the new /dev/sdb that I had failed and removed in the first step.

HOWEVER (the punch line): when this system booted, it was not the old reverted one but how it was before I started this cycle. In other words it looked as though the disk which I had failed and removed was being used.

If I did

mdadm --detail /dev/md1

(or any of the other devices) it showed /dev/sdb as the only device in the raid pair. To sync up again I am having to add the various /dev/sda partitions back in.

SO THE QUESTION IS: what went wrong? How does a failed device end up being used to build the operational arrays, while the other devices end up not being included?

--
Alan Chandler
http://www.chandlerfamily.org.uk
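One way to see which member md would trust at assembly time is to compare the superblock metadata written on each partition before plugging drives back in. A minimal sketch, using the device names from the post above (the grep pattern assumes the usual field names in mdadm --examine output):

  # Dump the md superblock of each member of the root array and compare
  # the array UUID, event counter and device state.
  mdadm --examine /dev/sda2 /dev/sdb2 | grep -E 'UUID|Events|State'

  # The same information seen from the array side, once it is assembled.
  mdadm --detail /dev/md1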
From: Stan Hoeppner on 19 Jun 2010 19:00

Alan Chandler put forth on 6/19/2010 1:20 PM:
>
> I have a server with a pair of raided (RAID1) disks, using partitions 1, 2
> and 4 as /boot, root and an LVM volume respectively. The two disks
> are /dev/sda and /dev/sdb. They have just replaced two smaller disks
> where the root partition was NOT a raid device - it was just /dev/sda2 -
> although there was a raided boot partition in the first partition.
> The hardware only supports 2 SATA channels.
>
> I wanted to revert the root partition to the same state as the one I had
> just taken out, so I failed and removed sdb

_why_? This doesn't make any sense.

--
Stan
From: Andrew Reid on 19 Jun 2010 22:20

On Saturday 19 June 2010 14:20:27 Alan Chandler wrote:

[ Details elided ]

> HOWEVER (the punch line): when this system booted, it was not the old
> reverted one but how it was before I started this cycle. In other words
> it looked as though the disk which I had failed and removed was being used.
>
> If I did mdadm --detail /dev/md1 (or any of the other devices) it showed
> /dev/sdb as the only device in the raid pair. To sync up again I am
> having to add the various /dev/sda partitions back in.
>
> SO THE QUESTION IS: what went wrong? How does a failed device end up
> being used to build the operational arrays, while the other devices end up
> not being included?

My understanding of how mdadm re-arranges the array (including for
failures, etc.) is that it writes metadata into the various partitions,
so I agree with you that this is weird -- I would have expected the
RAID array to come up with the sda devices as the only devices present.

There are two things I can think of, neither quite right, but maybe
they'll motivate someone else to figure it out:

(1) Device naming can be tricky when you're unplugging drives.
Maybe the devices now showing up as "sdb" actually are the original
"sda" devices. Can you check UUIDs? This explanation also requires
that you didn't actually revert the disk, you only thought you did,
but then didn't catch it because the conjectural device renaming
convinced you that the RAID was being weird.

(2) How did you revert the root partition? If you copied all the
files, then I have nothing else to add. If you did "dd" between the
partitions, however, you may have creamed the md metadata and caused
the system to think the sdb device was the "good" one. This
explanation is unsatisfactory because, even if it's right, it only
explains why that partition should be reversed, not the others,
although if you didn't revert the others, they're copies, and you
can't tell them apart anyway.

Also, what happened to /etc/mdadm/mdadm.conf on the reverted root
partition? Is it nonexistent on the one you're now booting from?
There's potential for confusion there also, although I think the
initramfs info will suffice until the next kernel update.

				-- A.
--
Andrew Reid / reidac(a)bellatlantic.net
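A quick way to check point (1) without trusting the kernel's sda/sdb naming is to map the device nodes back to physical drives and to the md member UUIDs. A rough sketch along the lines Andrew suggests, nothing in it specific to this machine beyond the partition names already quoted:

  # Map kernel names to physical drives via the udev by-id symlinks
  # (these include the drive serial numbers).
  ls -l /dev/disk/by-id/ | grep -E 'sda|sdb'

  # Filesystem and md-member UUIDs as blkid sees them.
  blkid /dev/sda2 /dev/sdb2

  # Arrays as mdadm would scan them right now, to compare against the
  # ARRAY lines in /etc/mdadm/mdadm.conf.
  mdadm --examine --scan
  cat /etc/mdadm/mdadm.conf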
From: Alan Chandler on 20 Jun 2010 03:40

On 19/06/10 23:54, Stan Hoeppner wrote:
> Alan Chandler put forth on 6/19/2010 1:20 PM:
>>
>> I have a server with a pair of raided (RAID1) disks, using partitions 1, 2
>> and 4 as /boot, root and an LVM volume respectively. The two disks
>> are /dev/sda and /dev/sdb. They have just replaced two smaller disks
>> where the root partition was NOT a raid device - it was just /dev/sda2 -
>> although there was a raided boot partition in the first partition.
>> The hardware only supports 2 SATA channels.
>>
>> I wanted to revert the root partition to the same state as the one I had
>> just taken out, so I failed and removed sdb
>
> _why_? This doesn't make any sense.
>

The new system I had just built used the Nouveau driver for my GeForce graphics chip, and that, in combination with standard settings for the Hauppauge Nova-T 500, was stuttering and then locking up when watching TV with MythTV. The symptoms were that the Nova-T stuff was failing.

The old system was built with the proprietary NVIDIA driver and maybe (I can't remember) the Nova-T stuff built from source. That had worked perfectly for the last 6 months or so with no locking up. The (supposed) quickest way to try it was to revert to that system. But the disks I had taken out were too small for the other jobs I need this box to do, so it was a question of copying the fully configured system over.

--
Alan Chandler
http://www.chandlerfamily.org.uk
From: Alan Chandler on 20 Jun 2010 03:50

On 20/06/10 02:15, Andrew Reid wrote:
> On Saturday 19 June 2010 14:20:27 Alan Chandler wrote:
>
> [ Details elided ]
>
>> HOWEVER (the punch line): when this system booted, it was not the old
>> reverted one but how it was before I started this cycle. In other words
>> it looked as though the disk which I had failed and removed was being used.
>>
>> If I did mdadm --detail /dev/md1 (or any of the other devices) it showed
>> /dev/sdb as the only device in the raid pair. To sync up again I am
>> having to add the various /dev/sda partitions back in.
>>
>> SO THE QUESTION IS: what went wrong? How does a failed device end up
>> being used to build the operational arrays, while the other devices end up
>> not being included?
>
> My understanding of how mdadm re-arranges the array (including for
> failures, etc.) is that it writes metadata into the various partitions,
> so I agree with you that this is weird -- I would have expected the
> RAID array to come up with the sda devices as the only devices present.
>
> There are two things I can think of, neither quite right, but maybe
> they'll motivate someone else to figure it out:
>
> (1) Device naming can be tricky when you're unplugging drives.
> Maybe the devices now showing up as "sdb" actually are the original
> "sda" devices. Can you check UUIDs? This explanation also requires
> that you didn't actually revert the disk, you only thought you did,
> but then didn't catch it because the conjectural device renaming
> convinced you that the RAID was being weird.

Of course that was my first thought. But I was doing this via SSH from another machine, so the terminal screen contents survived the power down. It was clear what I had done and which disks had failed etc.

>
> (2) How did you revert the root partition? If you copied all the
> files, then I have nothing else to add.

Yes, I did a file copy (using rsync -aH) ...

>
> Also, what happened to /etc/mdadm/mdadm.conf on the reverted root
> partition? Is it nonexistent on the one you're now booting from?
> There's potential for confusion there also, although I think the
> initramfs info will suffice until the next kernel update.
>

This point is a possibility as I didn't check the mdadm.conf file, but the initramfs was the same one throughout.

I got into more trouble, because in order to correct stuff (but before the failed disk had even started to be resynced - I had asked it to, but a much bigger partition was in the process, so it hadn't started) I powered down, removed both disks from the system, put an old disk back, powered up, copied some files across to a third disk, then powered down and put the two raided disks back.

When I powered up again, it switched again and said the two disks were in sync on the partitions that hadn't started. This left the file system in an unusable state. Fortunately the more important big partition that was only partially synced carried on syncing in the same configuration (although I believe it started again from scratch rather than carrying on from where it left off).

What I think was happening was that the BIOS was changing the boot order whenever I changed the disks, and I then ended up booting from an incorrectly synced partition.

--
Alan Chandler
http://www.chandlerfamily.org.uk
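For completeness, re-adding the dropped sda partitions and watching the resync looks roughly like this. A hedged sketch only: the md/partition pairing follows the fail/remove commands in the first post, and the last two steps are the usual Debian way of keeping the boot-time assembly consistent with what is on disk, not something the thread itself prescribes.

  # Put the sda members back into their arrays; md resyncs them from the
  # sdb copies that the system is currently running from.
  mdadm /dev/md0 --add /dev/sda1
  mdadm /dev/md1 --add /dev/sda2
  mdadm /dev/md2 --add /dev/sda4

  # Watch the resync progress.
  cat /proc/mdstat

  # Regenerate the ARRAY lines, review them by hand against
  # /etc/mdadm/mdadm.conf, then rebuild the initramfs so that assembly at
  # boot matches the current arrays.
  mdadm --examine --scan
  update-initramfs -u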