From: Antonio Perez on
d.sastre.medina(a)gmail.com wrote:

> On Sat, May 29, 2010 at 05:44:22PM -0400, Tom H wrote:
>> On Sat, May 29, 2010 at 7:06 AM, David Sastre Medina
>> <d.sastre.medina(a)gmail.com> wrote:
>> >
>> > Grub2 is failing to boot a softRAID1 + LVM2 squeeze box.

I use an equivalent setup, and it was all set up correctly and
automatically by the `update-grub2` command (once the system had
booted correctly).

Keep reading.

I have 'md1' as '/boot' and an LVM2 '/' on 'md2'. This is what my system uses:

grub.cfg:
menuentry "Debian GNU/Linux, with Linux 2.6.32-trunk-amd64" --class debian --class gnu-linux --class gnu --class os {
        insmod raid
        insmod mdraid
        insmod ext2
        set root='(md1)'
        search --no-floppy --fs-uuid --set 1be9c4e5-70cd-4662-81e6-44e76cff20d8
        echo Loading Linux 2.6.32-trunk-amd64 ...
        linux /vmlinuz-2.6.32-trunk-amd64 root=UUID=25defa7a-93cb-40eb-9a76-c326f0b2dffc ro vga=792
        echo Loading initial ramdisk ...
        initrd /initrd.img-2.6.32-trunk-amd64
}

blkid: `blkid /dev/md[12]` (run `blkid -g` first to garbage-collect stale entries from the blkid cache)
/dev/md1: UUID="1be9c4e5-70cd-4662-81e6-44e76cff20d8" TYPE="ext2"
/dev/md2: UUID="25defa7a-93cb-40eb-9a76-c326f0b2dffc" TYPE="ext2"

grub-probe: `grub-probe -t fs_uuid /boot` and `grub-probe -t fs_uuid /`
1be9c4e5-70cd-4662-81e6-44e76cff20d8
25defa7a-93cb-40eb-9a76-c326f0b2dffc

mdadm: `sudo mdadm -D /dev/md[12] | grep UUID`
UUID : ff7e23a3:dc6327b6:73d158fc:63c6b3dd
UUID : 157b664b:7b41974f:73d158fc:63c6b3dd

It boots fine every time.
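
The same comparison can be repeated on any box by pulling the UUID out of
the `search` line and setting it next to what `grub-probe` reports for
/boot. What follows is only a sketch: the function name `grub_search_uuid`
is made up for illustration, and it assumes the UUID is the last word on
the `search` line, as in the stanza above.

```shell
# Hypothetical helper: print the UUID given on the first
# "search ... --fs-uuid ..." line of a grub.cfg read on stdin.
# Assumption: the UUID is the last word on that line.
grub_search_uuid() {
    awk '$1 == "search" && /--fs-uuid/ { print $NF; exit }'
}

# Example using the search line from the stanza above:
printf '%s\n' 'search --no-floppy --fs-uuid --set 1be9c4e5-70cd-4662-81e6-44e76cff20d8' | grub_search_uuid
```

On the running system, `grub_search_uuid < /boot/grub/grub.cfg` should print
the same value as `grub-probe -t fs_uuid /boot`. With a separate /boot, it
should not be the UUID of the root LV.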

>> > root(a)sysresccd /root % mdadm --detail /dev/md0
>> > /dev/md0:
>>
>> > UUID : 8052f7d4:54a97fbb:731031f6:bc3d041c

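That UUID is not the one grub will use to boot: `mdadm --detail` reports
the array UUID, whereas `search --fs-uuid` looks for the filesystem UUID
that blkid reports.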

>> I see two possible problems when looking at your grub.cfg.
>>
>> 1. There isn't an "insmod lvm" within the menuentry stanza. ext2,
>> raid, and mdraid are insmod'd twice in the header and once in the
>> menuentry and lvm is insmod'd just once in the header. (This is one of
>> the grub2 mysteries; why multiple insmods of the same modules?). I
>> doubt that this is the source of the problem (the first insmod must be
>> enough!) but you could add "insmod lvm" within the menuentry.
>
> Already tried that. No success.

That is not your problem IMO.

>> 2. In the uuid of the search line, what is
>> 785366b0-d597-4e9c-9284-b6b9161236ed? One of your /dev/sX1's uuid?
>> Since raid and mdraid are loaded, can't you/shouldn't you use the md0
>> uuid above?

> I also tried that. It fails.
> That UUID belongs to /root_vg-root_lv, where the root filesystem
> resides.
> The UUID can be confirmed at the grub prompt by issuing
> grub> ls (root_vg-root_lv)

No. From grub's point of view, the `root` partition is the one it boots
from, i.e. /boot. The kernel then needs its own `root` filesystem, and
that is where the UUID of root_vg-root_lv belongs: in the `root=`
parameter of the `linux` line.
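
To see the two roots side by side, the kernel's `root=` value can be pulled
out of the `linux` line and looked at separately from grub's `set root`. A
minimal sketch follows; the helper name `kernel_root_arg` is hypothetical,
and the example line is the `linux` line from the failing stanza quoted in
this thread.

```shell
# Hypothetical helper: print the value of the root= parameter from the
# first "linux" line of a grub.cfg stanza read on stdin.
kernel_root_arg() {
    awk '$1 == "linux" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^root=/) { print substr($i, 6); exit }
    }'
}

# Example using the linux line from the failing stanza:
printf '%s\n' 'linux /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv ro rootdelay=15 quiet' | kernel_root_arg
```

Whatever this prints is what the initramfs has to find (here the LV); the
`set root` and `search` lines only have to lead grub to /boot.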

> Note that `boot' is a multidisk partition (sda1 and sdb1, which assemble
> md0), thus root='(md0)' makes sense from a grub point of view.

Correct.

> And md1 is the result of assembling sda2 and sdb2. This md device has only
> one VG on top of it, root_vg, with several LVs in it, one of these LVs
> being my root_lv.

That looks OK.

> This is my default menuentry now:
> menuentry "Debian GNU/Linux, with Linux 2.6.32-3-686-bigmem" --class debian --class gnu-linux --class gnu --class os {
> insmod raid
> insmod mdraid
> insmod lvm
> insmod ext2
> set root='(md0)'
> search --no-floppy --fs-uuid --set 785366b0-d597-4e9c-9284-b6b9161236ed
> echo Loading Linux 2.6.32-3-686-bigmem ...
> linux /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv ro rootdelay=15 quiet
> echo Loading initial ramdisk ...
> initrd /initrd.img-2.6.32-3-686-bigmem
> }

> The `set root' entry says what is *root* for grub. I understand this as:
> where /boot/grub/grub.cfg, /vmlinuz-`uname -r` and /initrd.img-`uname
> -r` are. So IMHO it should be called boot='(md0)' for better understanding and
> disambiguation from the *other* root in the `linux' line.

Yes, that's exactly it.

> The GRUB root device is not the same as the Linux kernel root= parameter.
> BTW this command is undocumented in the wiki, still uses grub-legacy's
> info, which doesn't apply anymore, given the `root' command has been
> replaced.

But grub has nothing to do with this parameter. It is a kernel boot parameter
(well, more of an initrd boot parameter, but that is a different area):
http://www.mjmwired.net/kernel/Documentation/kernel-parameters.txt
line 2193.

> The `search' line, as stated in the grub wiki:

> Search devices by file, filesystem label or filesystem UUID. If --set
> is specified, the first device found is set to a variable. If HD
> variable name is specified, "root" is used.

I believe there is a mistake there and that `HD` should be `no`, meaning
that if no variable name is supplied, the value is assigned to the `root` variable.

In effect, this repeats what the previous `set root` command did, IMO.

> I take this to mean that the first device found _whose UUID is_ 785...
> (the UUID of my root_vg-root_lv) will be the `root' filesystem.

Well, the root for grub, not the root for the kernel.

> And yet another definition of `root' after the `linux' call.
> That one states that:

> root=/dev/mapper/root_vg-root_lv which could be written also as:
> root=LABEL=root or even
> root=UUID=785366b0-d597-4e9c-9284-b6b9161236ed

Yes, all are correct, and I strongly recommend using the UUID value from the blkid command.

Warning: run `blkid -g` first, so that stale entries are garbage-collected from the blkid cache.

> The three of them should be right. None of them work.

Your problem seems to be that the kernel can't find the root
filesystem; there is nothing grub can do to solve that.

> If I suppress the `quiet' option from the `linux' line, what I can see
> is LVM initializing *before* mdadm has got its job done:

> "Volume group "root_vg-root_lv not found
> Skipping volume group root_vg
> Unable to find LVM volume root_vg-swap_lv
> mdadm:/dev/md0 has been started with two drives
> mdadm:/dev/md1 has been started with two drives
> Gave up waiting for root device."

That confirms it: the problem is on the kernel side, which is not finding the correct `root` filesystem.
Use the UUID reported by blkid in the `linux` line.
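
Whether LVM really scanned before mdadm had started the arrays can be
checked mechanically from the boot messages. The sketch below is only an
illustration: the function name is made up, and the two message patterns
are assumptions taken from the output quoted above, so they may need
adjusting for other versions.

```shell
# Hypothetical helper: read boot messages on stdin and exit 0 if the
# first LVM "Volume group ... not found" message appears before the
# first "mdadm: ... has been started" message, non-zero otherwise.
lvm_ran_before_mdadm() {
    awk '
        /olume group .* not found/ && !lvm { lvm = NR }
        /mdadm:.*has been started/ && !md  { md = NR }
        END { exit !(lvm && md && lvm < md) }
    '
}

# Example with two messages in the order reported above:
if printf '%s\n' \
    'Volume group "root_vg" not found' \
    'mdadm: /dev/md1 has been started with two drives' | lvm_ran_before_mdadm
then
    echo "LVM scanned before the arrays were up"
fi
```

A match only shows the order of the messages; it does not say which
initramfs script is responsible for it.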

> So it looks like a timing issue *but*, I have tried to issue manually
> the commands in the right order at the grub prompt:
> 1) insmod-ing raid, mdraid, lvm and ext2; setting root to md0;
> 2) searching for devices (also a variant without this step);
> 3) calling linux with the right root device
> (all three variants of this step: dev name, UUID and LABEL and with
> different rootdelay timings, always without `quiet') and, finally;
> 4) calling initrd.

> Failure again. No way for root_vg to be found.

Once you have booted into this system, `update-grub` should generate
this file correctly; grub.cfg is regenerated on every kernel change.

Make sure `update-grub` has produced a good grub.cfg before you reboot.

> One further question: after a reboot, while at the grub screen, before
> doing anything else, if I enter the command line and type `ls' at the
> prompt, I can see all of my LVs, and listing any one of them returns:
> device name, filesystem type, label, last modification time and UUID.
> Where does this info come from? Supposedly, there aren't mods loaded to
> read that yet, until after `insmod' loads them, are there?

That's grub's 'core.img' code, which needs to read all the UUIDs
correctly in order to do its job.


--
Antonio Perez


--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/2301386.kC03pvyZki(a)rnqqfki.eternal-september.org
From: Tom H on
On Sun, May 30, 2010 at 7:13 AM, <d.sastre.medina(a)gmail.com> wrote:
> On Sat, May 29, 2010 at 05:44:22PM -0400, Tom H wrote:
>> On Sat, May 29, 2010 at 7:06 AM, David Sastre Medina
>> <d.sastre.medina(a)gmail.com> wrote:
>> >
>> > Grub2 is failing to boot a softRAID1 + LVM2 squeeze box.
>> >
>> > root(a)sysresccd /root % mdadm --detail /dev/md0
>> > /dev/md0:
>> > ...
>> >           UUID : 8052f7d4:54a97fbb:731031f6:bc3d041c
>>
>> I see two possible problems when looking at your grub.cfg.
>>
>> 1. There isn't an "insmod lvm" within the menuentry stanza. ext2,
>> raid, and mdraid are insmod'd twice in the header and once in the
>> menuentry and lvm is insmod'd just once in the header. (This is one of
>> the grub2 mysteries; why multiple insmods of the same modules?). I
>> doubt that this is the source of the problem (the first insmod must be
>> enough!) but you could add "insmod lvm" within the menuentry.
>
> Already tried that. No success.
>
>> 2. In the uuid of the search line, what is
>> 785366b0-d597-4e9c-9284-b6b9161236ed? One of your /dev/sX1's uuid?
>> Since raid and mdraid are loaded, can't you/shouldn't you use the md0
>> uuid above?
>
> I also tried that. It fails.
> That UUID belongs to /root_vg-root_lv, where the root filesystem
> resides.
> The UUID can be confirmed at the grub prompt by issuing
> grub> ls (root_vg-root_lv)
>
> Note that `boot' is a multidisk partition (sda1 and sdb1, which assemble
> md0), thus root='(md0)' makes sense from a grub point of view. And md1
> is the result of assembling sda2 and sdb2. This md device has only one VG
> on top of it, root_vg, with several LVs in it, one of these LVs being my
> root_lv.
>
> This is my default menuentry now:
>
> menuentry "Debian GNU/Linux, with Linux 2.6.32-3-686-bigmem" --class
> debian --class gnu-linux --class gnu --class os {
>        insmod raid
>        insmod mdraid
>        insmod lvm
>        insmod ext2
>        set root='(md0)'
>        search --no-floppy --fs-uuid --set 785366b0-d597-4e9c-9284-b6b9161236ed
>        echo    Loading Linux 2.6.32-3-686-bigmem ...
>        linux   /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv ro rootdelay=15 quiet
>        echo    Loading initial ramdisk ...
>        initrd  /initrd.img-2.6.32-3-686-bigmem
> }
>
> The `set root' entry says what is *root* for grub. I understand this as:
> where /boot/grub/grub.cfg, /vmlinuz-`uname -r` and /initrd.img-`uname -r` are.
> So IMHO it should be called boot='(md0)' for better understanding and
> disambiguation from the *other* root in the `linux' line.
> The GRUB root device is not the same as the Linux kernel root= parameter.
> BTW this command is undocumented in the wiki, still uses grub-legacy's
> info, which doesn't apply anymore, given the `root' command has been
> replaced.
>
> The `search' line, as stated in the grub wiki:
>
> Search devices by file, filesystem label or filesystem UUID. If --set
> is specified, the first device found is set to a variable. If HD
> variable name is specified, "root" is used.
>
> I take this to mean that the first device found _whose UUID is_ 785...
> (the UUID of my root_vg-root_lv) will be the `root' filesystem.
>
> And yet another definition of `root' after the `linux' call.
> That one states that:
>
> root=/dev/mapper/root_vg-root_lv  which could be written also as:
> root=LABEL=root  or even
> root=UUID=785366b0-d597-4e9c-9284-b6b9161236ed
>
> The three of them should be right. None of them work.
>
> If I suppress the `quiet' option from the `linux' line, what I can see
> is LVM initializing *before* mdadm has got its job done:
>
> "Volume group "root_vg-root_lv not found
>  Skipping volume group root_vg
>  Unable to find LVM volume root_vg-swap_lv
>  mdadm:/dev/md0 has been started with two drives
>  mdadm:/dev/md1 has been started with two drives
>  Gave up waiting for root device."
>
> So it looks like a timing issue *but*, I have tried to issue manually
> the commands in the right order at the grub prompt:
> 1) insmod-ing raid, mdraid, lvm and ext2; setting root to md0;
> 2) searching for devices (also a variant without this step);
> 3) calling linux with the right root device
>  (all three variants of this step: dev name, UUID and LABEL and with
>  different rootdelay timings, always without `quiet') and, finally;
> 4) calling initrd.
>
> Failure again. No way for root_vg to be found.
>
> One further question: after a reboot, while at the grub screen, before
> doing anything else, if I enter the command line and type `ls' at the
> prompt, I can see all of my LVs, and listing any one of them returns:
> device name, filesystem type, label, last modification time and UUID.
> Where does this info come from? Supposedly, there aren't mods loaded to
> read that yet, until after `insmod' loads them, are there?

1. The "set root=" and "search .." lines are setting the root for grub
as you said, which is the partition where grub.cfg is (I wish that the
grub developers had called it groot or grubroot). Since you have a
separate /boot partition, using your root-lv's uuid doesn't make
sense.

2. If you are going to the grub prompt from the grub menu by pressing
"c" then the various modules have been loaded.

At the grub prompt, if "ls" shows you (md0), can you see your kernel
and initrd with "ls (md0)/"? If you can, you should be able to boot with

    linux (md0)/vmlinuz... root=/dev/path/to/root-lv ro
    initrd (md0)/initrd...
    boot


From: d.sastre.medina on
Thanks for the comments and help.

I have edited /boot/grub/grub.cfg to match my devices by UUID as
proposed, taking the UUIDs from blkid /dev/md{0,1} (after first running
blkid -g):

menuentry "Debian GNU/Linux, with Linux 2.6.32-3-686-bigmem" --class
debian --class gnu-linux --class gnu --class os {
insmod raid
insmod mdraid
(insmod lvm present in some tests)
insmod ext2
set root='(md0)'
search --no-floppy --fs-uuid --set <here blkid /dev/md0>
echo Loading Linux 2.6.32-3-686-bigmem ...
linux /vmlinuz-2.6.32-3-686-bigmem root=UUID=<here blkid /dev/md1> ro rootdelay=15
echo Loading initial ramdisk ...
initrd /initrd.img-2.6.32-3-686-bigmem
}

It doesn't boot. Note that I added `rootdelay=15' and removed `quiet' to be
able to see what happens.
LVM still tries to initialize before mdadm (is that correct, anyway?),
and I end up waiting for the root filesystem until I'm dropped to busybox.
At the initramfs prompt, simply issuing `vgchange -ay' and
exiting (Ctrl-D) lets me boot the box. Luckily, I had a PS/2 keyboard
around, because the USB one wouldn't work.

The output of `vgchange -ay' might be informative. Here is one
example, but one line is output for each LV:

udevd-work[number]:kernel provided name 'dm-2' and
NAME='mapper/root_vg-var_lv' disagree, please use SYMLINK+= or change
the kernel to provide the proper name.

This looks like #581715 or #581593.
In both reports, however, it is said it's a harmless warning.
I can confirm that, at least, while testing the

linux /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv

variant, I can boot the system from initramfs after
`vgchange -ay' + <Ctrl-D>
regardless of the warnings.

After booting, running `update-grub' or `update-grub2' doesn't help.
This looks like a problem elsewhere (udev, initramfs-tools, ...), but
not in grub.

--
Primary key fingerprint: 0FDA C36F F110 54F4 D42B D0EB 617D 396C 448B 31EB
From: Antonio Perez on
d.sastre.medina(a)gmail.com wrote:

> Thanks for the comments and help.
>
> I have edited /boot/grub/grub.cfg to match my devices by UUID as
> proposed, taking the UUIDs from blkid /dev/md{0,1} (after first running
> blkid -g):

Good.

> It doesn't boot.

Bummer.

> Note that I added `rootdelay=15' and removed `quiet' to be
> able to see what happens.

Good thinking.

> LVM still tries to initialize before mdadm (is that correct, anyway?),
> and I end up waiting for the root filesystem until I'm dropped to busybox.
> At the initramfs prompt, simply issuing `vgchange -ay' and
> exiting (Ctrl-D) lets me boot the box. Luckily, I had a PS/2 keyboard
> around, because the USB one wouldn't work.

Ah, good. So grub correctly finds the kernel and initrd files and loads
them. The kernel then takes control and runs the initramfs, which is the
stage you report as failing.

That is further along in the boot process than either grub or the kernel itself.

Have you tried the

update-initramfs -u

command to rebuild the initramfs image in use?


> The output of `vgchange -ay' might be informative. Here is one
> example, but one line is output for each LV:

> udevd-work[number]:kernel provided name 'dm-2' and
> NAME='mapper/root_vg-var_lv' disagree, please use SYMLINK+= or change
> the kernel to provide the proper name.
>
> This looks like #581715 or #581593.
> In both reports, however, it is said it's a harmless warning.
> I can confirm that, at least, while testing the

Yes, it's a harmless warning.

> linux /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv

> variant, I can boot the system from initramfs after
> `vgchange -ay' + <Ctrl-D>
> regardless of the warnings.

> After booting, running `update-grub' or `update-grub2' doesn't help.
> This looks like a problem elsewhere (udev, initramfs-tools, ...), but
> not in grub.

Try the `update-initramfs -u` command; it may solve the issue. If that
doesn't work, contact the initramfs-tools maintainer or file a bug, so that
the problem can be fixed.
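
After rebuilding, it may be worth checking that the image actually carries
both mdadm and lvm. The following is a rough sketch only: the helper name
`missing_boot_tools` is made up, and the paths sbin/mdadm and sbin/lvm are
assumptions about how the image is laid out.

```shell
# Hypothetical helper: read an initramfs file listing on stdin and print
# the name of each tool whose binary is missing from it.
# Assumption: the binaries live at a path ending in sbin/mdadm and sbin/lvm.
missing_boot_tools() {
    listing=$(cat)
    for tool in mdadm lvm; do
        printf '%s\n' "$listing" | grep -q "sbin/$tool\$" || echo "$tool"
    done
}

# Example with a made-up listing that lacks lvm:
printf '%s\n' 'bin/busybox' 'sbin/mdadm' | missing_boot_tools
```

On a live system the listing would come from something like
`lsinitramfs /boot/initrd.img-2.6.32-3-686-bigmem` (lsinitramfs ships with
initramfs-tools). Anything printed suggests the image was built without
that tool; empty output says nothing about the order in which the
initramfs scripts run.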

Good luck.


--
Antonio Perez


From: Stan Hoeppner on
d.sastre.medina(a)gmail.com put forth on 5/30/2010 4:45 PM:
> On Sun, May 30, 2010 at 02:24:41PM -0500, Stan Hoeppner wrote:
>> What happens when you use LILO instead of Grub?
>
> I haven't tried that yet.
>
> First thing would be to know if the bootloader is to blame for not
> having a bootable system. As of now, it looks like some timing issue
> related to initramfs-tools' scripts (wild guess).
>
> Then, I'd need to know if LILO supports the configuration described
> before, i.e., md0 contains /boot and md1 contains LVs, one of them
> being /dev/mapper/root_vg-root_lv.
>
> After that, I'd need to test the proper way to install/uninstall
> software from an unbootable machine. I guess d-i allows installing
> LILO on top of grub.
> The purpose is either reinstalling grub-pc, downgrading to grub-legacy,
> or installing LILO.
> Another option would be to use a rescue CD, chroot into my system and install
> from there. Suggestions are welcome.
>
> But first I'll need to refresh my LILO skills. It's been a while :)

This is very old, but I think the basic information is still valid. BIOSes have
changed considerably in the past 10 years, and I'd assume they are smarter across
the sea of manufacturers. Take that into account regarding the
enumeration of drives and the BIOS boot behavior.

http://www.faqs.org/docs/Linux-mini/Boot+Root+Raid+LILO.html#ss3.1

--
Stan


