recovering from failed system disk


Hey,

I had a 9k machine running HP-UX 10.20, with one 4G disk and three 9G disks.
The 4G was the system disk and it failed. I unfortunately misplaced the HP-UX
CD-ROMs, but fortunately had another machine with a 2G system disk, so I
swapped the 4G for the 2G and booted single-user.

# ioscan -fnCdisk
Class     I  H/W Path     Driver      S/W State H/W Type  Description
======================================================================
disk      4  10/0.3.0     sdisk       CLAIMED   DEVICE    SEAGATE ST19171W
                         /dev/dsk/c3t3d0   /dev/rdsk/c3t3d0
disk      5  10/0.4.0     sdisk       CLAIMED   DEVICE    SEAGATE ST19171W
                         /dev/dsk/c3t4d0   /dev/rdsk/c3t4d0
disk      6  10/0.5.0     sdisk       CLAIMED   DEVICE    SEAGATE ST19171W
                         /dev/dsk/c3t5d0   /dev/rdsk/c3t5d0
disk      7  10/0.6.0     sdisk       CLAIMED   DEVICE    HP      C2490WD
                         /dev/dsk/c3t6d0   /dev/rdsk/c3t6d0
disk      8  10/12/5.2.0  sdisk       CLAIMED   DEVICE    TOSHIBA CD-ROM XM-5701TA
                         /dev/dsk/c4t2d0   /dev/rdsk/c4t2d0

# vgscan
vgscan: Warning: couldn't query physical volume "/dev/dsk/c0t4d0": # that's disk from another machine
The specified path does not correspond to physical volume attached to 
this volume group
vgscan: Warning: couldn't query physical volume "/dev/dsk/c0t5d0": # that's disk from another machine
The specified path does not correspond to physical volume attached to 
this volume group 
vgscan: Warning: couldn't query all of the physical volumes.
Physical Volume "/dev/dsk/c3t6d0" contains no LVM information
Physical Volume "/dev/dsk/c4t2d0" contains no LVM information

Following Physical Volumes belong to one Volume Group.
Unable to match these Physical Volumes to a Volume Group.
Use the vgimport command to complete the process.
/dev/dsk/c3t3d0
/dev/dsk/c3t4d0

Following Physical Volumes belong to one Volume Group.
Unable to match these Physical Volumes to a Volume Group.
Use the vgimport command to complete the process.
/dev/dsk/c3t5d0

The Volume Group /dev/vg00/group was not matched with any Physical Volumes.

# mkdir /dev/vg01
# mknod /dev/vg01/group c 64 0x030000
# vgimport -v /dev/vg01 /dev/dsk/c3t5d0
Beginning the import process on Volume Group "/dev/vg01".
vgimport: Warning:  Volume Group belongs to different CPU ID.
Can not determine if Volume Group is in use on another system. Continuing.
Logical volume "/dev/vg01/lvol1" has been successfully created
with lv number 1.
Volume group "/dev/vg01" has been successfully created.
Warning: A backup of this volume group may not exist on this machine.
Please remember to take a backup using the vgcfgbackup command after activating the volume group.
# vgchange -a y vg01
Activated volume group 

# mkdir /dev/vg02
# mknod /dev/vg02/group c 64 0x020000
# vgimport /dev/vg02 /dev/dsk/c3t3d0 /dev/dsk/c3t4d0
vgimport: Warning:  Volume Group belongs to different CPU ID.
Can not determine if Volume Group is in use on another system. Continuing.
Warning: A backup of this volume group may not exist on this machine.
Please remember to take a backup using the vgcfgbackup command after activating the volume group.


# ls /dev/vg01
group   lvol1   rlvol1   # this seems to be correct
# ls /dev/vg02
group   lvol1   lvol2   lvol3   rlvol1  rlvol2  rlvol3  # so is this

# fsck -F vxfs /dev/vg01/rlvol1
log replay in progress
file system is not clean, full fsck required
a full file system check may be required
# fsck -F vxfs -o full /dev/vg01/rlvol1
pass0 - checking structural files
vxfs fsck: structural inode 97 (Primary Ilist 1) failed validation clear? (ynq)q

At this point I am a little wary. Did I do something wrong? Should I let
fsck run? The point is, of course, to mount vg01/lvol1 and vg02/lvol[1-3]
to recover data from them.

p.

-- 
Beware of he who would deny you access to information, for in his
heart he dreams himself your master.   -- Commissioner Pravin Lal
Reply chopin (4) 12/28/2009 9:44:11 PM
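
For anyone following the import steps in the post above, here is a rough dry-run sketch of the same sequence. It only prints the commands. The helper names group_minor and import_plan are made up for illustration, and numbering the group file minor after the volume group (vg01 -> 0x010000) is just a common convention assumed here; the post itself used 0x030000 for vg01, which also works as long as each minor is unique on the system.

```shell
#!/bin/sh
# Dry-run sketch: prints the LVM import commands instead of running them.
# group_minor and import_plan are hypothetical helper names.

# group_minor N -> minor number for the group file, e.g. 1 -> 0x010000
group_minor() {
    printf '0x%02x0000\n' "$1"
}

# import_plan VGNUM DISK... -> print the import steps for one volume group
import_plan() {
    n=$1; shift
    vg=$(printf '/dev/vg%02d' "$n")
    echo "mkdir $vg"
    echo "mknod $vg/group c 64 $(group_minor "$n")"
    echo "vgimport -v $vg $*"
    echo "vgchange -a y $vg"
}

# Disk paths taken from the vgscan output in the post.
import_plan 1 /dev/dsk/c3t5d0
import_plan 2 /dev/dsk/c3t3d0 /dev/dsk/c3t4d0
```

Review the printed lines against your own vgscan output before typing any of them for real.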

Piotr,

Did you ever make a tape of the disk with the utility disk (ODE. Back,
Restore, Etc)?

Boot
sea
bo px
ode
copyutil

Then you should only have to change the drive and then, using the utility
disk, copy from the tape back to the new hard drive (same size).

Reboot, and all you should have to do, if you have not done a routine
backup, is touch up a few things.

I like HP ODE, or imaging the system disk, for disaster restore. Depending
on size it takes about 2 hours. Of course, while it is running the system
is not accessible.

If you have a working system disk I personally would make a backup copy of
it before modifying it. Nothing worse than wrecking your only workable disk.

Using the HP ODE disk: if you have a D-9000, ODE PA0603 March 2006 should
work (PA RISC 5791-4295).

-Ken



Reply Ken_Old_Unix_Guy 12/29/2009 12:54:47 PM
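
On the "make a backup copy before modifying it" advice above: one generic way to do that from a running system is a raw dd copy. This is a dry-run sketch that only prints the command. It is not the ODE copyutil procedure; image_cmd is a made-up helper name, and the tape device /dev/rmt/0m and the 64k block size are assumptions for illustration. The destination has to be large enough to hold the whole disk.

```shell
#!/bin/sh
# Dry-run sketch: prints the dd command rather than running it.
# image_cmd is a hypothetical helper; /dev/rmt/0m is an assumed tape device.

# image_cmd RAW_DISK DEST -> raw copy of a whole disk to a tape or file
image_cmd() {
    echo "dd if=$1 of=$2 bs=64k"
}

image_cmd /dev/rdsk/c3t5d0 /dev/rmt/0m
```

Reading from the raw device (/dev/rdsk/...) rather than the block device is the usual choice for a whole-disk copy.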


Ken_Old_Unix_Guy <bcopanel-ggroup@yahoo.com> wrote:
>> # fsck -F vxfs -o full /dev/vg01/rlvol1
>> pass0 - checking structural files
>> vxfs fsck: structural inode 97 (Primary Ilist 1) failed validation clear? (ynq)q
>> At this point I am a little wary. Did I do wrong? Shall I let fsck run?

I tried the other lvols, and two out of four finished CLEAN on fsck and
could be mounted, so I eventually answered "y" on the "bad" one and let it
run (it has been on pass1 (checking inode sanity and blocks) for a few
hours already).

> Did you ever make a tape of the disk with the utility disk (ODE. Back,
> Restore, Etc) ?

I did, but it turns out it was so long ago that the tape was lost. :)
These are the last moments of this system; we were going to copy everything
off it and shut it down, but unfortunately it went down shortly before. :)

p.

Reply Piotr 12/29/2009 2:01:02 PM


Yes, I had that happen to me several years ago. They were about to retire
a 9000 with a Nike 20. There was a single failed drive in a RAID 5 array.
I was on vacation, and my sub replaced the failed drive but did not allow
it to rebuild. Panicked by the disk activity, he inserted another drive
and, not allowing the system to catch up, powered down the whole thing
(oops). He jiggled the controller cables inside the cabinet and powered up
the system. The system tried to fail over to the working Nike controller.
He wanted to retry the failed drive and pulled the wrong one. Oops!

To make a long story short, he managed to screw things up so badly that
the system would not boot!!!

Upon my return the system was down, with multiple drives backed out of the
Nike 20. I asked him where they belonged. He was not sure of the slot they
came from!!! I said, "You didn't even tag them. This is a pile of shit."

Needless to say, with the system being retired, it got retired early...

Hint: Whether you need it or not, ODE your system boot disks to tape as a
safety measure. I used to do it every four months during a maintenance
cycle at night. I also ran fbackup nightly to catch loose ends.

Good luck.

-Ken



Reply Ken_Old_Unix_Guy 12/30/2009 5:47:21 PM

You can never have too many backups.

On a 10.20 system you can run Ignite backups, which do not require an
outage and give you a bootable restore tape.
When I was looking after 10.20 systems we ran an Ignite backup every
weekend and kept each one for six weeks before overwriting.

Regards,

Ted.

==============================================================
| Ted Linnell                 <edlinnell@acslink.net.au>     |
|                                  |
| Nunawading, Victoria , Australia                           |
==============================================================
Reply Ted 12/30/2009 11:20:35 PM

Piotr KUCHARSKI <chopin@sgh.waw.pl> wrote:
> I tried other lvols and two out of four finished CLEAN on fsck and allowed
> to be mounted, so I eventually answered "y" on that "bad" one and let it run
> (it's few hours already on pass1 (checking inode sanity and blocks)).

It took like 20 hours (on a 9 GB disk!)

# fsck -F vxfs -o full /dev/vg02/lvol3
pass0 - checking structural files
vxfs fsck: structural inode 97 (Primary Ilist 1) failed validation clear? (ynq)y
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
fileset 999 inode 52546 contains invalid directory blocks  clear? (ynq)y
pass3 - checking reference counts
fileset 999 unreferenced file, ino 52547, reconnect? (ynq)y
fileset 999 unreferenced file, ino 346790, reconnect? (ynq)y
rebuild structural files? (ynq)y
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
fileset 999 inode 1003 contains invalid directory blocks  clear? (ynq)y
fileset 999 inode 31352 contains invalid directory blocks  clear? (ynq)y
fileset 999 directory 41314 block 120355 offset 2 references free inode
             ino 52546 remove entry? (ynq)y
fileset 999 directory 41314 block 120355 rebuild header? (ynq)y
pass3 - checking reference counts
fileset 999 unreferenced file, ino 54215, reconnect? (ynq)y
fileset 999 unreferenced file, ino 2125, reconnect? (ynq)y
fileset 999 unreferenced file, ino 2254, reconnect? (ynq)y
....
fileset 999 inode 41314 link count is 25 should be 24 adjust? (ynq)y
fileset 999 unreferenced file, ino 41329, reconnect? (ynq)y
fileset 999 unreferenced file, ino 41368, reconnect? (ynq)y
fileset 999 unreferenced file, ino 41552, reconnect? (ynq)y
fileset 999 unreferenced file, ino 41641, reconnect? (ynq)y
fileset 999 unreferenced file, ino 41726, reconnect? (ynq)y
fileset 999 unreferenced file, ino 41774, reconnect? (ynq)y
fileset 999 unreferenced file, ino 42296, reconnect? (ynq)y
fileset 999 lost+found is full
sorry cannot reconnect clear? (ynq)

Sigh. I guess I have no choice but to press 'y'.

p.

Reply Piotr 12/31/2009 12:05:40 AM

Piotr KUCHARSKI wrote:
> Sigh. I guess I have no choice but press 'y'.

fsck has a -y option to do that.
Reply Dennis 1/5/2010 4:34:59 AM
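
To spell out the -y suggestion above, here is a dry-run sketch that prints a non-interactive full check for each logical volume from the thread. fsck_cmd is a made-up helper name. Keep in mind that -y answers "yes" to every prompt, including the "clear?" ones that throw data away, so it saves typing but not files.

```shell
#!/bin/sh
# Dry-run sketch: prints the fsck commands instead of running them.
# fsck_cmd is a hypothetical helper name.

# fsck_cmd RAW_LVOL -> full VxFS check that answers "y" to all prompts
fsck_cmd() {
    echo "fsck -F vxfs -o full -y $1"
}

# Raw logical volumes listed earlier in the thread.
for lv in /dev/vg01/rlvol1 /dev/vg02/rlvol1 /dev/vg02/rlvol2 /dev/vg02/rlvol3; do
    fsck_cmd "$lv"
done
```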

Dennis Handly <dhandly@convex.hp.com> wrote:
>> Sigh. I guess I have no choice but press 'y'.
> fsck has a -y option to do that.

I know. I was more sighing about being forced to lose some data. :)

p.

Reply Piotr 1/5/2010 6:29:05 PM
