Filesystem "automatically" damaged?

Hi all,

we're running a SunFire 280R using Solaris 9. It is connected to an
external SCSI RAID. As this RAID system gets filled up more and more, we
connected another SCSI RAID, ran format, created two UFS file systems
using newfs and put some data there.
At a reboot, the system complained about file system errors, and fsck
could only repair one FS.
I gave it another try and recreated the file systems, provided them with
some data, unmounted them and invoked fsck one day later - again there
were several defects. There are no errors displayed at the RAID.

Has anyone got a clue what I should take a special look at?

'uname -a' reports SunOS 5.9 Generic_122300-34 sun4u sparc
SUNW,Sun-Fire-280R.
'format' shows the RAID unit as:
c3t1d0 <easyRAID-Q16+-R0.0 cyl 38130 alt 2 hd 128 sec 128>  easyraid 
/pci@8,700000/LSILogic,scsi@3/sd@1,0
It's a RAID 5 consisting of three disks with one hot-spare disk.

I appreciate any hints!

Regards,
Christian
Reply Christian 5/26/2009 11:38:23 AM

Christian Schmidt wrote:
> Hi all,
> 
> we're running a SunFire 280R using Solaris 9. It is connected to an
> external SCSI RAID. As this RAID system gets filled up more and more, we
> connected another SCSI RAID, ran format, created two UFS file systems
> using newfs and put some data there.
> At a reboot, the system complained about file system errors, and fsck
> could only repair one FS.
> I gave it another try and recreated the file systems, provided them with
> some data, unmounted them and invoked fsck one day later - again there
> were several defects. There are no errors displayed at the RAID.
> 
> Has anyone got a clue what I should take a special look at?
> 
> 'uname -a' reports SunOS 5.9 Generic_122300-34 sun4u sparc
> SUNW,Sun-Fire-280R.
> 'format' shows the RAID unit as:
> c3t1d0 <easyRAID-Q16+-R0.0 cyl 38130 alt 2 hd 128 sec 128>  easyraid 
> /pci@8,700000/LSILogic,scsi@3/sd@1,0
> It's a RAID 5 consisting of three disks with one hot-spare disk.
> 
> I appreciate any hints!
> 
> Regards,
> Christian

Hi Christian,

Are there any i/o errors reported by "iostat -En"?

Reply solx 5/26/2009 12:13:59 PM


solx <nospam@example.net> wrote:
> 
> Are there any i/o errors reported by "iostat -En"?

This is what it reports:

c3t1d0          Soft Errors: 2 Hard Errors: 0 Transport Errors: 2 
Vendor: easyRAID Product:  Q16+            Revision: R0.0 Serial No:  
Size: 319.87GB <319866011648 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 2 Predictive Failure Analysis: 0 

Can this be caused by bad SCSI cables?
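
For anyone reading such output by script, here is a minimal Python sketch that picks out the nonzero counters. It assumes a single device block laid out exactly like the one above; the names SAMPLE, COUNTER and nonzero_counters are made up for illustration, and other Solaris releases may format the output differently.

```python
import re

# Sample copied from the "iostat -En" output quoted above (one device only).
SAMPLE = """c3t1d0          Soft Errors: 2 Hard Errors: 0 Transport Errors: 2
Vendor: easyRAID Product:  Q16+            Revision: R0.0 Serial No:
Size: 319.87GB <319866011648 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
"""

# A counter is a capitalised label, a colon, and a whole number that is
# followed by whitespace or end of text (so "Size: 319.87GB" is skipped).
COUNTER = re.compile(r"([A-Z][A-Za-z ]*?): (\d+)(?!\S)")


def nonzero_counters(text):
    """Return {label: count} for every counter in text that is not zero."""
    counters = {name: int(value) for name, value in COUNTER.findall(text)}
    return {name: count for name, count in counters.items() if count}


if __name__ == "__main__":
    for name, count in nonzero_counters(SAMPLE).items():
        print(f"{name}: {count}")
```

Run against the sample, this lists Soft Errors, Transport Errors and Illegal Request, each with a count of 2.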

Thanks & Regards,
Christian

Reply Christian 5/26/2009 12:30:32 PM

Christian Schmidt wrote:
> solx <nospam@example.net> wrote:
>> Are there any i/o errors reported by "iostat -En"?
> 
> This is what it reports:
> 
> c3t1d0          Soft Errors: 2 Hard Errors: 0 Transport Errors: 2 
> Vendor: easyRAID Product:  Q16+            Revision: R0.0 Serial No:  
> Size: 319.87GB <319866011648 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
> Illegal Request: 2 Predictive Failure Analysis: 0 
> 
> Can this be caused by bad SCSI cables?
> 
> Thanks & Regards,
> Christian
> 

Yes. Do you have any replacement cables?
Reply solx 5/26/2009 4:44:57 PM

Christian Schmidt wrote:
> solx <nospam@example.net> wrote:
>> Are there any i/o errors reported by "iostat -En"?
> 
> This is what it reports:
> 
> c3t1d0          Soft Errors: 2 Hard Errors: 0 Transport Errors: 2 
> Vendor: easyRAID Product:  Q16+            Revision: R0.0 Serial No:  
> Size: 319.87GB <319866011648 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
> Illegal Request: 2 Predictive Failure Analysis: 0 
> 
> Can this be caused by bad SCSI cables?
> 
> Thanks & Regards,
> Christian
> 

It could be cables. It could also be a disk that is developing a
problem! If it IS developing a problem, you may want to talk to the hardware
service people about getting it replaced!

Sometimes these little glitches mean nothing! You get an error message
or two and that's the end of it. When you get a message, another the
next day, two messages the day after that and then three error messages,
IT'S TRYING TO TELL YOU SOMETHING!!! If you are wise you will retire
that disk and use it as a doorstop, a football, a bookend or some
other service in which your data is not at risk!
Reply Richard 5/26/2009 6:25:17 PM

On May 26, 6:38 am, Christian Schmidt <use...@siebenbergen.de> wrote:
> At a reboot, the system complained about file system errors, and fsck
> could only repair one FS.

SCSI disk errors aside, there is a known bug with veritas+solaris
9+UFS logging=on.  Make sure your sol9 system has all of the latest
patches (as well as the latest veritas patches if you run veritas SF).
Reply Jim 5/26/2009 7:30:32 PM

Jim Leonard <MobyGamer@gmail.com> wrote:

> On May 26, 6:38 am, Christian Schmidt <use...@siebenbergen.de> wrote:
>> At a reboot, the system complained about file system errors, and fsck
>> could only repair one FS.
> 
> SCSI disk errors aside, there is a known bug with veritas+solaris
> 9+UFS logging=on.  Make sure your sol9 system has all of the latest
> patches (as well as the latest veritas patches if you run veritas SF).

We're not using veritas. We'll try replacing the SCSI cables.

Thanks a lot!

Regards,
Christian
Reply Christian 5/27/2009 3:10:14 PM

Christian Schmidt <usenet@siebenbergen.de> wrote:

> we're running a SunFire 280R using Solaris 9. It is connected to an
> external SCSI RAID. As this RAID system gets filled up more and more, we
> connected another SCSI RAID, ran format, created two UFS file systems
> using newfs and put some data there.
> At a reboot, the system complained about file system errors, and fsck
> could only repair one FS.
> I gave it another try and recreated the file systems, provided them with
> some data, unmounted them and invoked fsck one day later - again there
> were several defects. There are no errors displayed at the RAID.
> 
> Has anyone got a clue what I should take a special look at?

I took a closer look at the fs setup today and noticed that the
cylinders of my two partitions "overlapped": The first partition covered
cylinders 0-8169, the second one cylinders 16-16199. :-|

I'm quite sure that this was the problem.

Stupid me... Next time, I'll take some more care when creating the
partitions.
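
For anyone who wants to check a layout before running newfs, the overlap test can be sketched in a few lines of Python. This is only an illustration: the slice names s0 and s1 are hypothetical, the cylinder ranges are the ones given above, and slice 2 (which conventionally spans the whole disk on Solaris) should be left out of the input.

```python
def cylinders_overlap(a, b):
    """Return True if two inclusive (first, last) cylinder ranges share a cylinder."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_overlaps(slices):
    """Return every pair of slice names whose cylinder ranges overlap."""
    names = sorted(slices)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if cylinders_overlap(slices[x], slices[y])
    ]


# Hypothetical slice names; ranges as reported in this post.
layout = {"s0": (0, 8169), "s1": (16, 16199)}

if __name__ == "__main__":
    print(find_overlaps(layout))  # prints [('s0', 's1')]
```

With the second slice starting at cylinder 8170 instead of 16, the same check returns an empty list.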

Thanks for your help!!

Regards,
Christian
Reply Christian 6/2/2009 11:00:11 AM
