I am trying to write a simple TCP-based job queue server. Its purpose is
to get commands from a client and store them in a queue (STL). The server
has a thread which periodically checks the queue, executes the commands
in it, and removes them from the queue once they are done. It also has a
thread for every client connection. I am using the low-level SOCKET
API. My server is a Win32 console app and my client is MFC. The
following struct is passed through the sockets:
typedef struct
{
    int iCmdId;
    char szJobID[JOBIDLENGTH];
    char szWorkPath[PATHLENGTH];
    char szUser[USERLENGTH];
} SERVERCMD;
The client can query the server about the jobs currently queued. The
server then sends each item (the whole struct) in the queue to the
client. The weird problem with this function is that for a while
everything seems fine: I get all queued jobs listed in my client's
CListBox. But after repeatedly calling the ListJobs() function my data
somehow gets corrupted (although the queue doesn't change). Below I post
the source code for the server's and client's ListJobs() functions and
the debugging output, which clearly shows the problem.
server:
void ListJobs( SOCKET client )
{
    int bytesSent = 0;
    char buffer[5];
    itoa( CmdQueue.size(), buffer, 10 );

    bytesSent = send( client, (char*)&buffer, sizeof(buffer), 0 );
    LogMessage( "ListJobs: sent %d bytes -> nJobs=%d\n", bytesSent,
                CmdQueue.size() );

    for (int i=0; i<CmdQueue.size(); i++)
    {
        SERVERCMD job = CmdQueue[i];
        bytesSent = send( client, (char*)&job, sizeof(SERVERCMD), 0 );
        LogMessage( "ListJobs: sent %d bytes of jobdata\n", bytesSent );
    }
    LogMessage( "\n" );
}
client:
SERVERCMD * ListJobs( int * nJobs )
{
    if ( sockClient == NULL )
        return NULL;

    int bytesSent,bytesRecv;
    SERVERCMD * jobs;

    // send list request to server
    SERVERCMD job;
    job.iCmdId = CMD_LSTJOB;
    bytesSent = send( sockClient, (char*)&job, sizeof(SERVERCMD), 0 );
    if ( bytesSent == SOCKET_ERROR )
        return NULL;

    // receive number of jobs
    *nJobs = 0;
    char buffer[5];
    bytesRecv = recv( sockClient, (char*)&buffer, sizeof(buffer), 0 );

    if ( bytesRecv != SOCKET_ERROR )
    {
        *nJobs = atoi( buffer );
        debug("ListJobs: reveiced %d bytes -> nJobs=%d\n", bytesRecv, *nJobs);

        // allocate space for jobs
        size_t jobsSize = *nJobs * sizeof(SERVERCMD);
        jobs = (SERVERCMD*)malloc( jobsSize );
        debug("ListJobs: allocating %d bytes\n", jobsSize);

        // receive jobs
        for (int i=0;i<*nJobs; i++)
        {
            bytesRecv = recv( sockClient, (char*)&(jobs[i]),
                              sizeof(SERVERCMD), 0 );
            if ( bytesRecv == SOCKET_ERROR ) return NULL;
            debug("ListJobs: received %d bytes of jobdata\n", bytesRecv);
        }
        debug("\n");
    }
    else
        return NULL;

    return jobs;
}
In the following debugging output, 2 jobs were put into the queue. The
queue was then queried multiple times.

server:
ListJobs: sent 5 bytes -> nJobs=1 <-- one job in queue
ListJobs: sent 1228 bytes of jobdata

ListJobs: sent 5 bytes -> nJobs=2 <-- two jobs in queue
ListJobs: sent 1228 bytes of jobdata
ListJobs: sent 1228 bytes of jobdata

ListJobs: sent 5 bytes -> nJobs=2 <-- two jobs in queue
ListJobs: sent 1228 bytes of jobdata
ListJobs: sent 1228 bytes of jobdata
(...)
ListJobs: sent 5 bytes -> nJobs=2
ListJobs: sent 1228 bytes of jobdata
ListJobs: sent 1228 bytes of jobdata

client:
ListJobs: reveiced 5 bytes -> nJobs=1 <-- one job in queue
ListJobs: allocating 1228 bytes <-- size of 1*SERVERCMD
ListJobs: received 1228 bytes of jobdata <-- this is OK

ListJobs: reveiced 5 bytes -> nJobs=2 <-- two jobs in queue
ListJobs: allocating 2456 bytes <-- OK
ListJobs: received 1228 bytes of jobdata <-- OK
ListJobs: received 1228 bytes of jobdata <-- OK

(after querying the queue multiple times)

ListJobs: reveiced 5 bytes -> nJobs=2
ListJobs: allocating 2456 bytes
ListJobs: received 1228 bytes of jobdata
ListJobs: received 232 bytes of jobdata <-- Woho, what's that?

ListJobs: reveiced 5 bytes -> nJobs=0 <-- HELP!!!
ListJobs: allocating 0 bytes

I really need help with that. Maybe this is a thread thing, but in my
test I used only one client.
__
Tom
bauer (27), 8/8/2005 4:03:31 PM

I forgot to mention that this effect only appears when I run the server
on a different host. When both server and client are on one host, the
problem does not occur.
bauer (27), 8/8/2005 4:12:00 PM

On Mon, 8 Aug 2005, Tom wrote:
[original post quoted in full; snipped]
... And the rest of your post:
> I forgot to mention that this effect only appears when I run the server
> on a different host. When both, server and client, are on one host the
> problem do not occure.
comp.lang.c isn't the right newsgroup to post to for your problem. I've a
feeling that c.u.p. might be more appropriate.
X-post & f.u.2 where relevant
--
"I hate computers: they always do what I say, never what I want!"
"The obvious mathematical breakthrough would be development of an easy
way to factor large prime numbers." (Bill Gates, The Road Ahead)
Stephane, 8/8/2005 4:22:01 PM

Tom wrote:
> I forgot to mention that this effect only appears when I run the server
> on a different host. When both server and client are on one host, the
> problem does not occur.
Hi,
Your code looks rather like C++ to me... But your problems are likely
to do with send/recv, which aren't really on topic in comp.lang.c++
either. I suggest you ask in comp.os.ms-windows.programmer.win32; you
will get better help there (or in comp.unix.programmer, assuming your
problems aren't Windows-specific).
<OT> Look at the recv documentation for your system:
you may not get as many bytes as you ask for,
in which case you need to try again until you do. recv
takes flags that modify this behavior; look at them. </OT>
-David
lndresnick (326), 8/8/2005 4:35:10 PM

Oooooooooops. My bad, sorry, bad fu2.
Stephane, 8/8/2005 5:01:40 PM

I apologize for posting my issue on this group. I already posted it to
comp.os.ms-windows.programmer.win32,
microsoft.public.win32.programmer.networks and alt.winsock.programming.
But I thought maybe I could get help here too.
bauer (27), 8/8/2005 5:45:19 PM

On 8 Aug 2005 09:03:31 -0700, "Tom" <bauer@b3s.de> wrote:
> I am trying to write a simple TCP-based job queue server. Its purpose is
> to get commands from a client and store them in a queue (STL). The server
As others have noted the STL part is C++ and offtopic in c.l.C, but
this is relatively small and easily ignorable. "Mixed" declarations
(that is, not solely at the beginning of a block) and double-slash
comments which were introduced in C++ _are_ standard in C as of C99,
but not yet universally implemented. And double-slash comments are
still unwise in news postings, since those may get line breaks (wraps)
added at various points, and this breaks // comments but does not harm
/* */ comments. (It also harms long preprocessor #directives, the only
other place lines are significant.)
> has a thread which periodically checks the queue, executes the commands
> in it, and removes them from the queue once they are done. It also has a
> thread for every client connection. I am using the low-level SOCKET
Server thread per connection/client scales poorly; see any week (of
the last N years) of comp.programming.threads. But leave that for now.
> API. My server is a Win32 console app and my client is MFC. The
> following struct is passed through the sockets:
> typedef struct
> {
> int iCmdId;
> char szJobID[JOBIDLENGTH];
> char szWorkPath[PATHLENGTH];
> char szUser[USERLENGTH];
> } SERVERCMD;
>
In general it's a poor idea to send C-language structs over a network;
compilers on different types of systems can lay them out differently,
and sometimes also different compilers or the same compiler with
different options on the same system type. This is why you see proper
network protocols specified in terms of actual bits (or nowadays
usually octets) on the wire and not in C or other HLL. But since you
apparently are using Wintel and probably the same compiler at both
(all) endpoints, and structs that are _mostly_ chars, leave that also.
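
A minimal sketch of what a layout specified in octets could look like for this struct. The field lengths are assumptions (the post does not give JOBIDLENGTH, PATHLENGTH or USERLENGTH), the command id is sent as four octets with the most significant first, and encode_cmd/decode_cmd are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes; the original post does not state them. */
#define JOBIDLENGTH 16
#define PATHLENGTH  64
#define USERLENGTH  16
#define WIRE_SIZE   (4 + JOBIDLENGTH + PATHLENGTH + USERLENGTH)

typedef struct {
    int  iCmdId;
    char szJobID[JOBIDLENGTH];
    char szWorkPath[PATHLENGTH];
    char szUser[USERLENGTH];
} SERVERCMD;

/* Write cmd into out[WIRE_SIZE] using a fixed layout without padding. */
static void encode_cmd(const SERVERCMD *cmd, unsigned char *out)
{
    uint32_t id;
    assert(cmd != NULL && out != NULL);
    id = (uint32_t)cmd->iCmdId;
    out[0] = (unsigned char)(id >> 24);   /* most significant octet first */
    out[1] = (unsigned char)(id >> 16);
    out[2] = (unsigned char)(id >> 8);
    out[3] = (unsigned char)id;
    memcpy(out + 4, cmd->szJobID, JOBIDLENGTH);
    memcpy(out + 4 + JOBIDLENGTH, cmd->szWorkPath, PATHLENGTH);
    memcpy(out + 4 + JOBIDLENGTH + PATHLENGTH, cmd->szUser, USERLENGTH);
}

/* Rebuild a SERVERCMD from in[WIRE_SIZE]; assumes non-negative ids. */
static void decode_cmd(const unsigned char *in, SERVERCMD *cmd)
{
    uint32_t id;
    assert(in != NULL && cmd != NULL);
    id = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
       | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    cmd->iCmdId = (int)id;
    memcpy(cmd->szJobID, in + 4, JOBIDLENGTH);
    memcpy(cmd->szWorkPath, in + 4 + JOBIDLENGTH, PATHLENGTH);
    memcpy(cmd->szUser, in + 4 + JOBIDLENGTH + PATHLENGTH, USERLENGTH);
    /* Force termination in case the peer sent unterminated text. */
    cmd->szJobID[JOBIDLENGTH - 1] = '\0';
    cmd->szWorkPath[PATHLENGTH - 1] = '\0';
    cmd->szUser[USERLENGTH - 1] = '\0';
}
```

Both ends would then send and receive WIRE_SIZE octets per job instead of sizeof(SERVERCMD), so padding and byte order no longer depend on the compiler.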
> The client can query the server about the jobs currently queued. The
> server sends then each item (the whole struct) in the queue to the
> client. The wired problem with this function is that for a while
> everything seems fine. I get all queued jobs listed in my clients
> CListBox. But after repeatingly calling the ListJobs() function my data
> get somehow corrupted (although the queue doesn't change). In the
> following I post the sourcode for the server's and client's ListJobs()
> function and the debuging output which clearly shows the problem.
>
> server:
> void ListJobs( SOCKET client )
> {
> int bytesSent = 0;
> char buffer[5];
> itoa( CmdQueue.size(), buffer, 10 );
>
itoa() is not standard in C or C++. sprintf() is. If the number of
jobs is > 9999 this overflows the buffer, formally causing Undefined
Behavior although in practice on nearly all if not all machines this
particular UB doesn't actually cause harm until you get to at least 6
digits and probably 8 digits. And the latter at least probably won't
happen because you'll hit other limits first. (Like uptime!)
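
For illustration, a sketch of filling the 5-byte count field with a standard function and an explicit overflow check. It assumes a C99 snprintf() is available; format_count and COUNT_FIELD are hypothetical names, not part of the posted code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COUNT_FIELD 5   /* 4 digits plus the terminating '\0', as in the post */

/* Returns 0 on success, -1 if the count does not fit in the field. */
static int format_count(size_t count, char field[COUNT_FIELD])
{
    int n;
    assert(field != NULL);
    memset(field, 0, COUNT_FIELD);        /* no stray stack bytes on the wire */
    n = snprintf(field, COUNT_FIELD, "%lu", (unsigned long)count);
    return (n < 0 || n >= COUNT_FIELD) ? -1 : 0;
}
```

The server could then refuse to send (or log an error) when format_count() returns -1 instead of writing past the end of the buffer.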
> bytesSent = send( client, (char*)&buffer, sizeof(buffer), 0 );
> LogMessage( "ListJobs: sent %d bytes -> nJobs=%d\n", bytesSent,
> CmdQueue.size() );
>
> for (int i=0; i<CmdQueue.size(); i++)
> {
> SERVERCMD job = CmdQueue[i];
> bytesSent = send( client, (char*)&job, sizeof(SERVERCMD), 0 );
Perhaps clearer to use sizeof(job); for an object or expression (but
not a type name) you can also omit the parentheses: sizeof job.
> LogMessage( "ListJobs: sent %d bytes of jobdata\n", bytesSent
> );
> }
> LogMessage( "\n" );
> }
>
> client:
> SERVERCMD * ListJobs( int * nJobs )
> {
> if ( sockClient == NULL )
> return NULL;
>
> int bytesSent,bytesRecv;
> SERVERCMD * jobs;
>
> // send list request to server
> SERVERCMD job;
> job.iCmdId = CMD_LSTJOB;
> bytesSent = send( sockClient, (char*)&job, sizeof(SERVERCMD), 0 );
> if ( bytesSent == SOCKET_ERROR )
> return NULL;
>
> // receive number of jobs
> *nJobs = 0;
> char buffer[5];
> bytesRecv = recv( sockClient, (char*)&buffer, sizeof(buffer), 0 );
>
> if ( bytesRecv != SOCKET_ERROR )
> {
> *nJobs = atoi( buffer );
> debug("ListJobs: reveiced %d bytes -> nJobs=%d\n", bytesRecv,
> *nJobs);
>
If the server numjobs was > 9999 this reads an unterminated (and
possibly very wrong) value on which atoi() isn't safe. In fact atoi()
isn't safe in the presence of almost any error; strtol (and strtoul, plus
strtoll and strtoull on C99 systems) handles some errors but not
unterminated input.
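
One way to cover both problems, sketched under the assumption that the count still travels as a 5-byte text field: copy the field into a buffer that is one byte larger, terminate it, and let strtol() report malformed input. parse_count and COUNT_FIELD are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define COUNT_FIELD 5

/* Parse a COUNT_FIELD-byte field that may lack a terminating '\0'.
   Returns 0 and stores the value on success, -1 on malformed input. */
static int parse_count(const char field[COUNT_FIELD], int *out)
{
    char tmp[COUNT_FIELD + 1];
    char *end;
    long v;

    assert(field != NULL && out != NULL);
    memcpy(tmp, field, COUNT_FIELD);
    tmp[COUNT_FIELD] = '\0';              /* guarantee termination */

    errno = 0;
    v = strtol(tmp, &end, 10);
    if (end == tmp || *end != '\0')       /* no digits, or trailing garbage */
        return -1;
    if (errno == ERANGE || v < 0 || v > INT_MAX)
        return -1;
    *out = (int)v;
    return 0;
}
```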
> // allocate space for jobs
> size_t jobsSize = *nJobs * sizeof(SERVERCMD);
> jobs = (SERVERCMD*)malloc( jobsSize );
The cast wouldn't be needed in C, but is in C++, and you don't check
the result is nonnull before using it (below). In C++ it is briefer,
clearer, and more robust to just write jobs = new SERVERCMD [*nJobs],
which also throws (by default) on allocation failure.
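
If malloc() is kept, a checked allocation might look like the sketch below. It is written in C, and alloc_records is a hypothetical helper. Besides the null check it also rejects a negative count, which atoi() can return for a corrupted field, and a size that would overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for count records of rec_size bytes each.
   Returns NULL if count is negative, the size overflows, or malloc fails. */
static void *alloc_records(int count, size_t rec_size)
{
    assert(rec_size > 0);
    if (count < 0)
        return NULL;
    if ((size_t)count > SIZE_MAX / rec_size)
        return NULL;
    /* malloc(0) may legally return NULL; ask for at least one byte so
       that NULL always means failure. */
    return malloc(count == 0 ? 1 : (size_t)count * rec_size);
}
```

The caller still has to free() the block on every exit path; as posted, the early return inside the client's receive loop would leak it.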
> debug("ListJobs: allocating %d bytes\n", jobsSize);
>
%d expects an int (signed) but jobsSize is size_t which is definitely
unsigned and may be a different size than int or unsigned int. In C99
there is a specific modifier for this; in C90 and C++ probably best to
cast to and specify unsigned long, or perhaps unsigned long long if
you have it. Or in C++ use << instead, to a stringstream if necessary.
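
Both C variants, sketched with snprintf() into a caller-supplied buffer so the result can be inspected (the posted debug() function is not shown, so it is not used here; the function names are hypothetical). The C99 modifier referred to is z, as in %zu.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* C99: the z length modifier matches size_t. */
static int describe_alloc_z(char *out, size_t cap, size_t bytes)
{
    assert(out != NULL);
    return snprintf(out, cap, "ListJobs: allocating %zu bytes", bytes);
}

/* Without %zu: convert explicitly and print as unsigned long. */
static int describe_alloc_cast(char *out, size_t cap, size_t bytes)
{
    assert(out != NULL);
    return snprintf(out, cap, "ListJobs: allocating %lu bytes",
                    (unsigned long)bytes);
}
```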
> // receive jobs
> for (int i=0;i<*nJobs; i++)
> {
> bytesRecv = recv( sockClient, (char*)&(jobs[i]),
> sizeof(SERVERCMD), 0 );
> if ( bytesRecv == SOCKET_ERROR ) return NULL;
> debug("ListJobs: received %d bytes of jobdata\n",
> bytesRecv);
> }
> debug("\n");
> }
> else
> return NULL;
>
> return jobs;
> }
>
> In the following debugging output 2 jobs were put into the queue. The
> queue was then queried multible times.
>
As already noted, recv() on a bytestream protocol like TCP can receive
only part of what you asked for (and was sent). You need to use
MSG_WAITALL if available, which I don't think it is on (most?) Winsock,
or be prepared to read multiple pieces and combine them.
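
The read-and-combine approach can be sketched as a small loop. This is only an illustration, written against POSIX sockets so it can be tried on any Unix-like system; recv_all is a hypothetical helper name. Under Winsock the socket would be a SOCKET, the length and return value are int, and the header is winsock2.h. It also accounts for the client log: the second record arrived as 232 bytes, so the other 996 bytes of that 1228-byte struct were still in the stream, and the next 5-byte read took job data for the count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling recv() until exactly len bytes have arrived.
   Returns 0 on success, -1 on error or if the peer closed early. */
static int recv_all(int sock, void *buf, size_t len)
{
    char *p = buf;
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);
        if (n < 0 && errno == EINTR)
            continue;                     /* interrupted, just retry */
        if (n <= 0)
            return -1;                    /* 0: peer closed, <0: error */
        p += n;                           /* skip what we already have */
        len -= (size_t)n;
    }
    return 0;
}
```

With such a helper, each recv() in the client's ListJobs() would become a recv_all() call for sizeof(SERVERCMD) bytes, and the 5-byte count would be read the same way.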
- David.Thompson1 at worldnet.att.net
david.thompson1 (1042), 8/14/2005 8:18:05 AM

Dave Thompson <david.thompson1@worldnet.att.net> writes:
[...]
> As others have noted the STL part is C++ and offtopic in c.l.C, but
> this is relatively small and easily ignorable. "Mixed" declarations
> (that is, not solely at the beginning of a block) and double-slash
> comments which were introduced in C++ _are_ standard in C as of C99,
> but not yet universally implemented. And double-slash comments are
> still unwise in news postings, since those may get line breaks (wraps)
> added at various points, and this breaks // comments but does not harm
> /* */ comments. (It also harms long preprocessor #directives, the only
> other place lines are significant.)
Lines are also significant in string literals. E.g., "splitting this
across lines" will cause problems.
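
A small sketch of the usual workaround: adjacent string literals are concatenated by the compiler, so a long format string can be split across source lines without a line break ending up inside a literal. format_sent is a hypothetical helper; the message text is taken from the posted log output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The two literals below are joined into one format string at compile
   time, so each source line stays short. */
static int format_sent(char *out, size_t cap, int bytes, int jobs)
{
    assert(out != NULL);
    return snprintf(out, cap,
                    "ListJobs: sent %d bytes "
                    "-> nJobs=%d",
                    bytes, jobs);
}
```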
[...]
> In general it's a poor idea to send C-language structs over a network;
> compilers on different types of systems can lay them out differently,
> and sometimes also different compilers or the same compiler with
> different options on the same system type. This is why you see proper
> network protocols specified in terms of actual bits (or nowadays
> usually octets) on the wire and not in C or other HLL. But since you
> apparently are using Wintel and probably the same compiler at both
> (all) endpoints, and structs that are _mostly_ chars, leave that also.
It's not uncommon for structs to be laid out in the same way for
different compilers on the same platform, so there's some hope of
combining code compiled with different compilers, but generally it's
not wise to depend on it.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
kst-u (21460), 8/14/2005 5:32:19 PM