use LD_PRELOAD with Java VM

I need to monitor (and potentially change) some system calls made by a
java program. I tried to use LD_PRELOAD to substitute my own version
for the libc system call wrappers. I wrote a braindead mylib.so which
simply prints out the system call name when a system call is made and a
LOAD message when it's loaded. The process id is also printed along
with the messages. A couple of strange things happened when I tested
it.

When I executed LD_PRELOAD=mylib.so java helloWorld, to my surprise,
the LOAD message was printed *twice* with the same process id. How come
the library was loaded twice? But the write system call was correctly
caught.

I then ran LD_PRELOAD=mylib.so java echo. The echo program is a
classical echo server. I was expecting the listen and accept calls
to be intercepted, but neither of them was caught. The
command "strace java echo" showed that it did indeed call listen and
accept.

Can anybody shed some light on these peculiarities? Please cc me so
that I won't miss your post. Thanks!

Ronghua

3/30/2005 5:06:37 AM
comp.unix.programmer


ronghuazhang@gmail.com writes:

> I need to monitor (and potentially change) some system calls made by a
> java program. I tried to use LD_PRELOAD to substitute my own version
> for the libc system call wrappers.

Note that this will only intercept/monitor calls to libc, not
_system_ calls.
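
To illustrate the distinction, here is a minimal sketch (the two helper
names are made up for this example). The first function goes through the
libc getpid() symbol, which a preloaded library can replace. The second
asks the kernel by system call number, so a preloaded getpid() never sees
it, although strace reports both.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: calls the libc wrapper symbol, which an
 * LD_PRELOAD library defining getpid() would replace. */
long pid_via_libc(void)
{
    return (long)getpid();
}

/* Hypothetical helper: issues the system call by number. A preloaded
 * getpid() is bypassed; only a tracer such as strace observes it. */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}
```

Note that syscall() is itself a libc function, so this only shows that
interposing on one wrapper does not cover every path into the kernel.
Code that issues the trap instruction inline is out of reach of
LD_PRELOAD altogether.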

> When I executed LD_PRELOAD=mylib.so java helloWorld, to my surprise,
> the LOAD message was printed *twice* with the same process id. 

This is happening because (some versions of) java checks to see if
LD_LIBRARY_PATH is properly set in the environment, and re-exec()s
itself with proper setting if it was not already set.

So your library is preloaded into the original java process, and
then again into the newly-exec()ed instance. PID stays constant
because execve() doesn't change the PID.
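
One way to tell the two loads apart is to count them in an environment
variable, since the environment normally survives execve() along with
the PID. The sketch below assumes the java launcher passes its
environment through to the re-exec()ed image; the variable name and
function names are invented for the example.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical variable name, chosen only for this sketch. */
#define GEN_VAR "PRELOAD_DEMO_GENERATION"

/* Increment the load counter kept in the environment and return the
 * new value: 1 for the first program image, 2 after one re-exec, ... */
int bump_generation(void)
{
    const char *old = getenv(GEN_VAR);
    int gen = old ? atoi(old) + 1 : 1;
    char buf[16];

    snprintf(buf, sizeof buf, "%d", gen);
    setenv(GEN_VAR, buf, 1);
    return gen;
}

/* Runs each time the library is mapped into a fresh program image. */
__attribute__((constructor))
static void report_load(void)
{
    int gen = bump_generation();

    fprintf(stderr, "LOAD #%d in pid %d\n", gen, (int)getpid());
}
```

Child processes inherit the counter as well, so the PID still has to be
compared to separate a re-exec from a fork followed by exec.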

> I then ran LD_PRELOAD=mylib.so java echo. The echo program is a
> classical echo server. I was expecting the listen and accept calls
> to be intercepted, but neither of them was caught. The
> command "strace java echo" showed that it indeed called listen and
> accept.

The Solaris j2sdk1.4.2_02 libjvm.so indeed references listen()
and accept(), so I'd expect them to be intercepted.

If you are on Linux, or using a different version of java, the
story might be quite different.

Posting exact system details, as well as minimal test case, may
get you better answers.

Cheers,
-- 
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
Paul
3/31/2005 1:59:18 AM
Paul,
  Thank you for the reply. I reexamined my code and found a bug. Now
both listen and accept can be intercepted, but they are reported twice.
I can understand why the LOAD message is printed twice, but I can't
understand why listen and accept are reported twice.


  I was using Sun JDK 1.5 Update 2 on Red Hat Linux 9. The source code
for the dummy mylib.so is:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

void init(void) __attribute__((constructor));
void cleanup(void) __attribute__((destructor));

void init(void)
{
    fprintf(stderr, "initialize libcSubstitute\n");
}

void cleanup(void)
{
    fprintf(stderr, "libcSubstitute cleanup\n");
}

int write(int fd, const void *buf, size_t count)
{       static int (*real_write)(int, const void*, size_t) = NULL;
        int ret;

        if (real_write == NULL){
                real_write = dlsym(RTLD_NEXT, "write");
        }

        if (!real_write){
                fprintf(stderr, "cannot find write\n");
                return -1;
        }
        fprintf(stderr, "intercept write from process %d\n", getpid());

        ret = (*real_write)(fd, buf, count);
        return ret;
}

int listen(int fd, int backlog)
{       static int (*real_listen)(int,int) = NULL;

        if (real_listen == NULL) real_listen = dlsym(RTLD_NEXT, "listen");
        if (!real_listen){
                fprintf(stderr, "cannot resolve listen()\n");
                return -1;
        }
        fprintf(stderr, "intercept listen from process %d\n", getpid());

        return (*real_listen)(fd, backlog);
}

int accept(int fd, struct sockaddr *addr, socklen_t *addrlen)
{       static int (*real_accept)(int,struct sockaddr*,socklen_t*) = NULL;

        if (real_accept == NULL) real_accept = dlsym(RTLD_NEXT, "accept");
        if (!real_accept){
                fprintf(stderr, "cannot resolve accept()\n");
                return -1;
        }
        fprintf(stderr, "intercept accept from process %d\n", getpid());

        return (*real_accept)(fd, addr, addrlen);
}

I used the following command to compile it
gcc -g -fPIC -Wall -I.    -c -o lib.o lib.c
gcc -shared -o libcsubst.so lib.o -ldl

The code for the java echo server is:

import java.io.*;
import java.net.*;

public class echo{
    public static void main(String args[]) {
        ServerSocket echoServer = null;
        String line;
        DataInputStream is;
        PrintStream os;
        Socket clientSocket = null;

        try {
            echoServer = new ServerSocket(9999);
        }
        catch (IOException e) {
            System.out.println(e);
        }

        System.out.print("listening\n");

        try {
            clientSocket = echoServer.accept();
            is = new DataInputStream(clientSocket.getInputStream());
            os = new PrintStream(clientSocket.getOutputStream());
            while (true) {
                line = is.readLine();
                os.println(line);
            }
        }
        catch (IOException e) {
            System.out.println(e);
        }
    }
}


I used the command LD_PRELOAD=libcsubst.so java echo, and observed the
following output:

initialize libcSubstitute
initialize libcSubstitute
intercept listen from process 32211
intercept accept from process 32211
intercept listen from process 32211
intercept write from process 32211
listening
intercept accept from process 32211
libcSubstitute cleanup

ronghuazhang
3/31/2005 7:52:04 PM
ronghuazhang@gmail.com writes:

> I reexamined my code and found a bug. Now
> both listen and accept can be intercepted, but they are reported twice.
> I can understand why the LOAD message is printed twice, but I can't
> understand why listen and accept are reported twice.

On my RH-9 machine:

  $ strace -e trace=listen /usr/java/jdk1.5.0_01/bin/java echo
  listen(3, 1)                            = 0
  listen(3, 50)                           = 0
  listening

Given the output above, I don't think you have any reason to be
surprised that you get 2 listen() intercepts.
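
If the intercept printed its arguments the way strace does, the two
calls would be easy to tell apart. A possible variant of the listen()
wrapper follows, as a sketch only; describe_listen() is a helper
invented for this example.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: formats one call in strace-like notation and
 * returns the number of characters snprintf() wanted to write. */
int describe_listen(char *buf, size_t len, int fd, int backlog)
{
    return snprintf(buf, len, "listen(%d, %d)\n", fd, backlog);
}

int listen(int fd, int backlog)
{
    static int (*real_listen)(int, int);
    char line[64];

    /* Look up the next definition of listen() after this library. */
    if (real_listen == NULL)
        real_listen = (int (*)(int, int))dlsym(RTLD_NEXT, "listen");
    if (real_listen == NULL) {
        errno = ENOSYS;
        return -1;
    }

    describe_listen(line, sizeof line, fd, backlog);
    fputs(line, stderr);
    return real_listen(fd, backlog);
}
```

With the strace output above, one would expect the two intercept lines
to differ in the backlog argument (1 and 50).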

Cheers,
-- 
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
Paul
3/31/2005 11:42:40 PM
ronghuazhang@gmail.com writes:

> int write(int fd, const void *buf, size_t count)
> {       static int (*real_write)(int, const void*, size_t) = NULL;
....
>         fprintf(stderr, "intercept write from process %d\n", getpid());

Note that fprintf(stderr, ...) calls write(2, ...) internally,
and if you'd managed to intercept all write()s, the code above
would have run out of stack via infinite recursion.

Fortunately for you, glibc makes intercepting such "internal"
calls quite difficult. But that also means that your intercept
routine is going to miss a lot of write()s.
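
One way to rule out the recursion regardless of what libc does
internally is to keep stdio out of the wrapper and log through the real
write() directly. A minimal sketch, assuming a dynamically linked
program (older glibc also needs -ldl); intercepted_writes() and its
counter are additions for illustration only.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*write_fn)(int, const void *, size_t);

static int intercept_count;     /* demo counter, not thread-safe */

/* Hypothetical accessor: how many calls the wrapper has seen. */
int intercepted_writes(void)
{
    return intercept_count;
}

ssize_t write(int fd, const void *buf, size_t count)
{
    static write_fn real_write;
    char msg[64];
    int n;

    if (real_write == NULL)
        real_write = (write_fn)dlsym(RTLD_NEXT, "write");
    if (real_write == NULL) {
        errno = ENOSYS;
        return -1;
    }

    intercept_count++;

    /* Format into a local buffer and emit it with the real write(),
     * so the log message can never re-enter this wrapper. */
    n = snprintf(msg, sizeof msg, "intercept write from process %d\n",
                 (int)getpid());
    if (n > 0) {
        if ((size_t)n >= sizeof msg)
            n = (int)sizeof msg - 1;
        real_write(STDERR_FILENO, msg, (size_t)n);
    }

    return real_write(fd, buf, count);
}
```

This version also returns ssize_t, matching the prototype in <unistd.h>.
It still only sees calls that reach write() through the dynamic symbol,
so the caveat about glibc's internal calls applies unchanged.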

Cheers,
-- 
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
Paul
4/1/2005 1:16:30 AM