Solaris 10 amd64 - pollsys loop of a Java program

Hi!

I've got a strange problem: my Java command-line application connects
to a socket (on which another Java application is listening), does its
job (currently transferring an object through the connection, reading
it, and returning output to STDOUT), and after that falls into a kind
of pollsys loop:

truss -p 23312
/2:     pollsys(0x00000000, 0, 0xFBC8BE58, 0x00000000)  = 0
/2:     pollsys(0x00000000, 0, 0xFBC8BE58, 0x00000000)  = 0
(...)


It keeps making this syscall, and after a while (anywhere from several
minutes(!) to hours; the times differ) it ends OK. The process receives
signals but doesn't handle them, including SIGKILL (sic! I've never seen
that on Solaris before).

^C/1:       Received signal #2, SIGINT, in lwp_cond_wait() [caught]

It's a widely tested and used production application, and we never had
such problems with it under SPARC Solaris 9 before migrating (test
period) to the amd64 Opteron 244 TYAN platform with Solaris 10 x86.
Well, nothing but pain so far.

Can anybody give me a hint about what to do with it? Is it likely to be
a hardware problem? How can I track it further? Maybe someone could
suggest a DTrace solution? Any ideas?
Reply Piotr 4/13/2005 9:50:27 AM


Piotr Kowalski wrote:
> [...]
> Can anybody give me a hint about what to do with it? Is it likely to be
> a hardware problem? How can I track it further? Maybe someone could
> suggest a DTrace solution? Any ideas?
Try Sun's application certification test suite, appcert and apptrace in
particular. See:
http://www.sun.com/software/solaris/programs/abi/documentation/index.xml
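
On the "how to track it further" part: a pollsys() call with no file
descriptors and only a timeout is effectively a timed sleep, so LWP 2
may simply be a thread sleeping in a loop while something else keeps the
JVM alive (a non-daemon thread that never finishes would do that). I
can't tell from the truss output alone, so treat this as a guess. A
minimal sketch, assuming Java 5 or later, that dumps every live thread
from inside the application; the class name ThreadDumpSketch is made up
for illustration and is not part of the original program:

```java
import java.util.Map;

public class ThreadDumpSketch {
    /** Returns a text dump of every live thread: name, daemon flag, state, stack. */
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        // Thread.getAllStackTraces() is available since Java 5.
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" daemon=").append(t.isDaemon())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Write to stderr so the dump does not mix with the program's STDOUT output.
        System.err.print(dump());
    }
}
```

Calling ThreadDumpSketch.dump() right after the transfer is finished
(just before main returns) should show which threads are still alive
and whether any of them is non-daemon. If the process still reacts to
signals at that point, kill -QUIT <pid> makes the JVM print a similar
dump without any code changes.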
Reply Charles 4/13/2005 5:57:19 PM

