From: Martin Paul on
After playing around and experimenting with Live Upgrade, I'm now
planning the first LU of a production server. This leaves me with the
difficult job of ensuring that I end up with an upgraded system
without losing any configuration data. I've collected some notes; maybe
somebody with more experience with LU can comment on one or another:

The basic procedure (as documented) is simple: Create a new boot
environment (BE), run luupgrade, activate the new BE, reboot and let LU
Sync take care of syncing files from the old BE to the new BE. IMHO,
it's a little more complicated, though:

For the luupgrade step, one has to distinguish between files which come in
packages and those which have been created (by the admin or the system).
The upgrade will only affect the packaged files, so I ran "pkgchk" to
find out which ones were modified - and got overwhelmed. There are many
files which were modified by the system, to keep configuration data from
installation time (e.g. /etc/inet/hosts, /etc/lvm/md.cf, ...) or which
keep state of a current configuration (/etc/path_to_inst,
/etc/printers.conf) or are there for logging purposes (/var/adm/wtmpx).
For many files it's completely unclear why they aren't in sync with the
contents file; I'd call that a bug.

After the upgrade I will have to check all of these, to see whether they
haven't been replaced by new versions. The same applies to those files
which I have modified for configuration purposes (e.g. /etc/.login,
/etc/default/power, /etc/logadm.conf, /etc/mail/sendmail.cf, ...). I
already verified that the upgrade will e.g. overwrite sendmail.cf,
killing my modifications. I cannot simply overwrite the new files with
my own copies, as the original configuration file might have changed
between Solaris releases, and I have to track these changes in my copy
of the file.

So after I have verified that all my configuration changes have
been correctly integrated into the new BE, I face the next problem: In
the meantime, files will have changed on the original BE, and I have to
track these changes (I will mention Live Upgrade Sync later). I've
collected a list of files which have changed in the last 7 days on the
affected machine, and again that's a pretty large list that I'm facing.
Many are log files, but things like /var/spool/calendar/, /var/yp/,
/var/dhcp/, /etc/svc/ definitely have to be taken care of.
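
A minimal sketch of how such a list could be collected, assuming a Python 3
interpreter is available (that is an assumption, not something from my
setup notes). recently_changed is a hypothetical helper; it walks a tree
and reports everything that is not a directory and was modified within the
given number of days. Unlike "find -mount" it does not stop at file system
boundaries, so the result still has to be filtered by hand:

```python
import os
import time


def recently_changed(root, days=7, now=None):
    """Return sorted paths of non-directory entries under root
    that were modified within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: look at the entry itself, do not follow symlinks
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if st.st_mtime >= cutoff:
                changed.append(path)
    return sorted(changed)
```

Calling recently_changed("/etc") and recently_changed("/var") separately
keeps the output manageable; log files can then be weeded out manually.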

This is where Live Upgrade Sync should jump in, but IMO it has two problems:

(1) The list in /etc/lu/synclist is far from complete by default, I will
have to come up with my own list.

(2) The LU Sync takes place at the very end of booting the new BE. If a
file is updated in the new BE during booting, changes will be lost
(PREPEND and APPEND are only partly useful here). I'm especially
concerned about the state of SMF services (/etc/svc/) here. I think it
would have been better for the sync to happen at the end of the shutdown
of the old BE or at beginning of the boot of the new BE. I will probably
run my own sync process right before the "init 6" to overcome these
problems - the only thing lost will be files written in the old BE
during the shutdown.
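
Such a private sync step could look roughly like the sketch below. It
assumes Python 3.8 or later and assumes the new BE has been mounted
somewhere beforehand (for example with lumount); sync_paths and its
arguments are hypothetical names of mine. It only implements plain
overwriting, nothing like APPEND or PREPEND, it does not remove files that
were deleted in the old BE, and it does not restore ownership:

```python
import os
import shutil


def sync_paths(paths, old_root, new_root):
    """Copy each listed path from old_root into new_root, overwriting.

    Returns the list of paths that were copied; paths that do not
    exist below old_root are skipped.
    """
    copied = []
    for path in paths:
        rel = path.lstrip("/")
        src = os.path.join(old_root, rel)
        dst = os.path.join(new_root, rel)
        if os.path.isdir(src):
            # Merge the directory tree into the new BE, keeping symlinks as links
            shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)
        elif os.path.isfile(src):
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copies contents, mode and timestamps
        else:
            continue
        copied.append(path)
    return copied
```

Called with the running root as old_root and the mount point of the new BE
as new_root, the return value doubles as a log of what was actually copied.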

If somebody is still reading here, you probably agree with me that all
those issues are caused by the fact that there is a messy mixture of
files in the root system of the OS. There should be a strict separation
between:

- Binaries, libs, data, which are read-only
- System configuration files, again read-only
- Local configuration files, maybe in a shadow-copy of /etc, written
only by the admin, not the system. Local configuration can append to
or overwrite system configuration files
- State data which should survive reboots and upgrades (e.g. SMF state)
- State data which gets zeroed at reboot (think /var/run/)

This would seriously enhance the upgrade process, as the places and
files to look at are minimized. For every file it's clear whether it
should survive an upgrade or not. After looking at all the files to take
care of for upgrade, I think that all of them could be put into one of
the above categories. I don't expect that to happen in Solaris, but I
really hope that this topic has been dealt with in IPS/OpenSolaris.

mp.
--
SysAdmin | Institute of Scientific Computing, University of Vienna
PCA | Analyze, download and install patches for Solaris
| http://www.par.univie.ac.at/solaris/pca/
From: Thomas Maier-Komor on
Martin Paul wrote:
> (1) The list in /etc/lu/synclist is far from complete by default, I will
> have to come up with my own list.

Hi Martin,

What makes you think the default synclist is far from complete? In my
experience all system files are handled correctly. You should only make
sure that after lumake/lucreate no further package updates are performed
on the live system. All shared file systems will be kept untouched.

One big difficulty is correctly handling shared file systems that have
packages installed (e.g. /opt/csw). ZFS is very helpful here...

BTW: do you plan to do a live upgrade with zfs or with multiple ufs slices?

- Thomas
From: Casper H.S. Dik on
Martin Paul <map(a)par.univie.ac.at> writes:

>For the luupgrade step, one has to distinguish between files which come in
>packages and those which have been created (by the admin or the system).
>The upgrade will only affect the packaged files, so I ran "pkgchk" to
>find out which ones were modified - and got overwhelmed. There are many
>files which were modified by the system, to keep configuration data from
>installation time (e.g. /etc/inet/hosts, /etc/lvm/md.cf, ...) or which
>keep state of a current configuration (/etc/path_to_inst,
>/etc/printers.conf) or are there for logging purposes (/var/adm/wtmpx).
>For many files it's completely unclear why they aren't in sync with the
>contents file, I'd call that a bug.

It is pretty much the same as what a standard upgrade does; you will
see that many of those files are "volatile" and the upgrade generally
handles them just fine.

Make sure you run "pkgchk -n"; that should cut down the pkgchk output.

>After the upgrade I will have to check all of these, to see whether they
>haven't been replaced by new versions. The same applies to those files
>which I have modified for configuration purposes (e.g. /etc/.login,
>/etc/default/power, /etc/logadm.conf, /etc/mail/sendmail.cf, ...). I
>already verified that the upgrade will e.g. overwrite sendmail.cf,
>killing my modifications. I cannot simply overwrite the new files with
>my own copies, as the original configuration file might have changed
>between Solaris releases, and I have to track these changes in my copy
>of the file.

The best way to make a new sendmail.cf is to create your own
m4 template and rebuild from it after the system has been upgraded.

Most of the other files have changed mainly because earlier
upgrades or the install itself have changed them. Changing them again
is typically not needed.

>So after I have verified that all my configuration changes have
>been correctly integrated into the new BE, I face the next problem: In
>the meantime, files will have changed on the original BE, and I have to
>track these changes (I will mention Live Upgrade Sync later). I've
>collected a list of files which have changed in the last 7 days on the
>affected machine, and again that's a pretty large list that I'm facing.
>Many are log files, but things like /var/spool/calendar/, /var/yp/,
>/var/dhcp/, /etc/svc/ definitely have to be taken care of.

There is a list of files which are copied when you switch boot environments.

>This is where Live Upgrade Sync should jump in, but IMO it has two problems:

>(1) The list in /etc/lu/synclist is far from complete by default, I will
>have to come up with my own list.

Agreed, it is not complete. Typically I avoid this by rebooting
immediately after an upgrade.

>(2) The LU Sync takes place at the very end of booting the new BE. If a
>file is updated in the new BE during booting, changes will be lost
>(PREPEND and APPEND are only partly useful here). I'm especially
>concerned about the state of SMF services (/etc/svc/) here. I think it
>would have been better for the sync to happen at the end of the shutdown
>of the old BE or at beginning of the boot of the new BE. I will probably
>run my own sync process right before the "init 6" to overcome these
>problems - the only thing lost will be files written in the old BE
>during the shutdown.

It happens twice, I believe. It tries to cater for the fact that the
earlier boot environment is now defunct and you can boot from the
new environment without shutting the old down properly.

>This would seriously enhance the upgrade process, as the places and
>files to look at are minimized. For every file it's clear whether it
>should survive an upgrade or not. After looking at all the files to take
>care of for upgrade, I think that all of them could be put into one of
>the above categories. I don't expect that to happen in Solaris, but I
>really hope that this topic has been dealt with in IPS/OpenSolaris.

I'm afraid not really.

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
From: Martin Paul on
Thomas Maier-Komor wrote:
> Martin Paul wrote:
>> (1) The list in /etc/lu/synclist is far from complete by default, I will
>> have to come up with my own list.
>>
> What makes you think the default synclist is far from complete? In my
> experience all system files are handled correctly. You should only make
> sure that after lumake/lucreate no further package updates are performed
> on the live system. All shared file-system will be kept untouched.

Here's some files/directories which are missing and which would affect me:

/var/yp/ which includes /var/yp/src with the source files for the NIS
server. If a file there is changed after the lucreate, I end up without
the change in the new BE.

/etc/sfw/private/smbpasswd - when a user changes his Samba password
between lucreate and luactivate, the change is lost.

/var/spool/calendar/ contains the CDE calendar files. Conservative users
here still use that. Entries added between lucreate and luactivate will
be lost.

Don't get me wrong - I can take care of all of these; it's just that LU
would silently drop data in these cases, and if it does that in just one
small area, I have to verify everything.
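
To find such gaps systematically, something like the following sketch
could be used. It assumes Python 3 and assumes that each synclist line
consists of a path and an action keyword separated by whitespace; skipping
"#" lines is my own precaution, and parse_synclist and uncovered are
hypothetical helpers:

```python
def parse_synclist(text):
    """Parse synclist-style text into {path: action}.

    Blank lines, '#' lines and lines without exactly two fields are ignored.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            continue
        path, action = fields
        entries[path.rstrip("/")] = action
    return entries


def uncovered(paths, entries):
    """Return the paths that are neither listed themselves
    nor located below a listed directory."""
    missing = []
    for path in paths:
        norm = path.rstrip("/")
        covered = any(norm == e or norm.startswith(e + "/") for e in entries)
        if not covered:
            missing.append(path)
    return missing
```

Feeding it the contents of /etc/lu/synclist and the list of paths I care
about gives the entries that still have to be added or synced by hand.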

> BTW: do you plan to do a live upgrade with zfs or with multiple ufs slices?

It's an upgrade from 10 8/07 to 10 10/09. The machine has SVM root now
which I plan to keep. I think I read that going to a ZFS root would need
a more recent OS release in the current BE, anyway.

mp.
From: Martin Paul on
Casper H.S. Dik wrote:
> Martin Paul <map(a)par.univie.ac.at> writes:
>> The upgrade will only affect the packaged files, so I ran "pkgchk" to
>> find out which ones were modified - and got overwhelmed.
>
> It is pretty much the same as a standard upgrade will do;

I never used that, as I didn't trust it :) LU has the big advantage that
I can verify the result of the upgrade without downtime.

> you will see that many of those files are "volatile" and the upgrade
> generally handles them just fine.

All the files I modify in my usual post-installation scripts are "e" or
"f". The editable files are mostly fine, but some have "renamold" or
other scripts (no idea what "cronroot" does, for example) which prefer
the new version over the old. Most problematic are the "f" files, but I
can justify all the changes I make to these files, as there's no other
way to set the needed configuration or apply a fix. And for some files
it seems to be a bug that they are "f" and not "e", e.g.:

/etc/default/power
/etc/default/sys-suspend
/opt/staroffice8/share/psprint/psprint.conf
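
The type of a given file can be looked up in the contents file. A minimal
sketch, assuming Python 3 and assuming that each line of
/var/sadm/install/contents starts with the path (or path=target for links)
followed by the file type; file_types is a hypothetical helper, not part
of the packaging tools:

```python
def file_types(contents_text, wanted):
    """Map each wanted path to its file type field (e.g. 'f', 'e', 'v').

    Paths that do not appear in the contents text are left out of the result.
    """
    wanted = set(wanted)
    found = {}
    for line in contents_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        # Links are assumed to be stored as path=target; keep only the path
        path = fields[0].split("=", 1)[0]
        if path in wanted:
            found[path] = fields[1]
    return found
```

Run against the list of files touched by a post-installation script, this
shows at a glance which of them are "f" and therefore likely to be
replaced by the upgrade.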

> The best way to make a new sendmail.cf is creating your own
> m4 template and rebuild it after the system was upgraded.

The rebuild is what I do, once for each new Solaris release. Then I keep
the *.cf files in my post-install script. Of course this problem could
have been solved a long time ago in the sendmail init.d/SMF script: It
would create the *.cf files from cf/local/*.mc files if they exist, and
from the cf/cf/*.mc files delivered with the OS otherwise. The *.cf
could then be volatile.
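
The selection logic I have in mind is trivial. A sketch in Python 3 (the
real thing would live in the init.d/SMF script; pick_mc_source and its
arguments are hypothetical, and the actual m4 build step is left out):

```python
import os


def pick_mc_source(name, local_dir, vendor_dir):
    """Return the .mc file that <name>.cf should be built from.

    A local file takes precedence over the one delivered with the OS;
    None is returned if neither exists.
    """
    for directory in (local_dir, vendor_dir):
        candidate = os.path.join(directory, name + ".mc")
        if os.path.isfile(candidate):
            return candidate
    return None
```

With local_dir pointing at cf/local and vendor_dir at cf/cf, an upgrade
could replace everything under cf/cf without touching local changes.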

> Most of the other files are changed mostly because earlier
> upgrades or the install itself have changed them. Changing them again
> is typically not needed.

Yes, but you probably know the purpose of every single one of the 161249
files in my contents file. Others have to investigate, or trust LU/Sun :)

>> (1) The list in /etc/lu/synclist is far from complete by default, I will
>> have to come up with my own list.
>
> Agree, it is not complete. Typically I avoid this by rebooting
> immediately after an upgrade.

I already came to the conclusion that the times between lucreate
<-> luupgrade <-> luactivate;init 6 should be kept as short as possible.

But as I said, for me the big advantage is that I can take time to
verify the result of the upgrade before rebooting.

>> (2) The LU Sync takes place at the very end of booting the new BE.
>
> It happens twice, I believe. It tries to cater for the fact that the
> earlier boot environment is now defunct and you can boot from the
> new environment without shutting the old down properly.

I see it only mentioned once in the log file, at the end of the boot,
but I must admit that I haven't verified that yet.

>> I don't expect that to happen in Solaris, but I
>> really hope that this topic has been dealt with in IPS/OpenSolaris.
>
> I'm afraid not really.

Sorry to hear that. After all the years, OS upgrades are the only thing
I'm afraid of as it's hard to control the outcome. I really think that
this problem could (and should) be solved in a modern OS.

Thanks,

mp.