


Possible ways of implementing an HTTP proxy with priority mechanism

I'm working on a TCP HTTP proxy: I need to turn it into a priority proxy.

My standard proxy used to handle every connection with a fork, the child process dealing with the connection. Now I've implemented a priority queueing mechanism where packets coming from both sides of a connection are queued and dequeued: packets coming from clients and remote servers are queued in my proxy, then removed from the queues and sent to clients or to remote servers. Packets are inserted with a priority algorithm of mine, and the packets removed are guaranteed to be the ones with the highest priority.

I thought of three possible implementations of the "receive packets from clients and servers and in the meantime send high-priority packets to their destination" mechanism, but none of the three satisfies me, or it has conceptual issues.

1) The forking proxy spawns child processes, each of which must "talk" with the priority queues in the main process: this would require some kind of IPC, and for an exchange of data as fast as a TCP connection, IPC doesn't seem a good idea.

2) I may keep everything in the main process: loop with a select() over all the socket file descriptors (client and server sides) stored in a reading fd_set (so I can receive packets coming to the proxy and store them in the priority queues). Packets that must be sent to a client or server are first removed from the queues, but the socket fd I will use to send the packet may block: so, should I make the socket non-blocking, or spawn a thread that will send just one packet and then die?

3) The third option is one thread per connection: again, each thread would have to talk to the priority queues, inserting packets received from clients and servers into the queues and sending packets taken from the queues to their destination. But a CONNECT HTTP request is not just a simple receiving loop on the server fd: it requires a select() loop where packets received from outside are queued, while in the meantime I have to send dequeued packets to their destination.
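The non-blocking-socket variant of option 2 is easy to sketch (this is my illustration, not code from the thread; `set_nonblocking` is a hypothetical helper):

```cpp
#include <fcntl.h>
#include <sys/socket.h>

// Put a socket into non-blocking mode so a later send() cannot stall the
// select() loop. sock_fd is assumed to be an already-connected TCP socket.
bool set_nonblocking(int sock_fd) {
    int flags = fcntl(sock_fd, F_GETFL, 0);  // read current file status flags
    if (flags == -1)
        return false;
    return fcntl(sock_fd, F_SETFL, flags | O_NONBLOCK) != -1;
}
```

With this, send() returns -1 with errno set to EAGAIN/EWOULDBLOCK instead of blocking, so a packet can simply stay at the head of its queue and be retried when select() reports the descriptor writable.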

I'm not the "do my stuff for me" guy, but I came up with only three solutions and can't figure out other ways.

Working on Ubuntu 14.04, C++11.

Here's the pseudocode of the method the removal thread calls for every packet removed from the priority queues: http://pastebin.com/h2ZGj4ED

As you can see, it is quite a questionable method: there are send and recv calls, and these syscalls may block the thread calling this method, which is the removal thread; in the meantime the priority queues may be filling up while the removal thread is blocked on some send or recv.

It may be more convenient if the removal thread could just remove packets and let something else deal with them: for each dequeued packet, the removal thread could spawn a thread that will open a new connection (if the packet is an HTTP request) or just send the packet to its destination. But by doing so, lots of threads would be born and die very quickly, causing more context switches than usual.

To put it in a nutshell, my forking proxy worked like this:

for every new request from clients {
    // child process will deal with the connection
    if (fork() == 0) {
        // connect to remote server
        // send request (GET or CONNECT) to remote server
        if (request == GET) {
            while (recv from server) {
                send to client;
            }
        } else if (request == CONNECT) {
            while (true) {
                select over client and server file descriptors;
                if (read from client)
                    send to server; // and vice versa
                // break if send or recv <= 0
            }
        }
    }
}
But now, sending and receiving packets is determined by the priority-queue algorithm: every packet received from a client or server must be inserted into the queues, and the proxy can only send packets removed from the queues.
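A single-process skeleton honouring that constraint might look like this sketch of mine (`enqueue_from` and `dequeue_for` are hypothetical hooks into the priority machinery, stubbed out here): select() only fills the queues, and sending happens only for packets the priority algorithm releases, restricted to writable sockets.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <vector>

// Hypothetical hooks into the priority machinery (stubs for illustration).
void enqueue_from(int /*fd*/, const char* /*data*/, ssize_t /*len*/) {}
void dequeue_for(int /*fd*/) {}

// One iteration of the loop: every received chunk is enqueued, never
// forwarded directly; dequeued chunks are sent only to writable sockets.
void event_loop_once(const std::vector<int>& all_fds) {
    fd_set rfds, wfds;
    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    int maxfd = -1;
    for (int fd : all_fds) {
        FD_SET(fd, &rfds);
        FD_SET(fd, &wfds);
        if (fd > maxfd) maxfd = fd;
    }
    if (select(maxfd + 1, &rfds, &wfds, nullptr, nullptr) <= 0)
        return;
    for (int fd : all_fds) {
        if (FD_ISSET(fd, &rfds)) {
            char buf[4096];
            ssize_t n = recv(fd, buf, sizeof buf, 0);
            if (n > 0)
                enqueue_from(fd, buf, n);  // queue it, do not forward yet
        }
        if (FD_ISSET(fd, &wfds))
            dequeue_for(fd);               // send only what the queues release
    }
}
```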
elmazzun
11/9/2016 10:50:13 AM
comp.unix.programmer

On 2016-11-09, elmazzun <mazzocchiandrea24@gmail.com> wrote:
> But now, sending and receiving packets is determined by the algorithm
> of priority queues: every packet received from client or server must
> be inserted in queues and the proxy can only send packets removed
> from queues.

In that case I would make it all single threaded based around a poll()
event loop, or possibly using a pool of threads.

To simplify the coding, threads could be used to handle the proxy
handshake parts of the protocol (to avoid doing state machine work). But
then for the bulk transfers, the connections are thrown into a pool
which is subject to the required queuing rules, rather than serviced
by dedicated threads.
Kaz
11/9/2016 3:01:38 PM
> > But now, sending and receiving packets is determined by the algorithm
> > of priority queues: every packet received from client or server must
> > be inserted in queues and the proxy can only send packets removed
> > from queues.
>
> In that case I would make it all single threaded based around a poll()
> event loop, or possibly using a pool of threads.
>
> To simplify the coding, threads could be used to handle the proxy
> handshake parts of the protocol (to avoid doing state machine work). But
> then for the bulk transfers, the connections are thrown into a pool
> which is subject to the required queuing rules, rather than serviced
> by dedicated threads.

Sounds nice. Let's imagine a select() polling client and server file descriptors inserted in a reading fd_set: if a packet is received, its payload is copied into a custom struct of mine together with other stats like direction, timestamp, priority and others. This select() does not take care of incoming connections, accepting connections, or getting the first request from the client, because that is done by another class whose task is to loop with accept(), read the request from the client, craft a packet and insert it in the matching priority queue, nothing more.

But if select()'s duty is to receive packets and enqueue them, something else must send the packets removed from the priority queues, unless there's a way of passing data to a looping select().
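There is in fact a standard way of "passing data to a looping select()": the self-pipe trick. The dequeuing side writes one byte into a pipe whose read end sits in the same fd_set as the sockets, which wakes the loop up. A minimal sketch (names are mine, not from the thread):

```cpp
#include <unistd.h>
#include <sys/select.h>

int pipe_fds[2]; // [0] = read end, watched by select(); [1] = write end

bool init_notify_pipe() {
    return pipe(pipe_fds) == 0;
}

// Called by the thread that enqueues packets: one byte wakes select().
void notify_loop() {
    char b = 1;
    (void)write(pipe_fds[1], &b, 1);
}

// Called inside the select() loop: consume pending wake-ups so the pipe
// does not stay permanently readable. (In production the read end should
// also be made non-blocking.)
bool drain_notifications(fd_set* readfds) {
    if (!FD_ISSET(pipe_fds[0], readfds))
        return false;
    char buf[64];
    (void)read(pipe_fds[0], buf, sizeof buf);
    return true; // at least one packet is waiting to be handled
}
```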

I hope you can see my actual dilemma now: I keep a looping select() that receives packets and enqueues them (and, if a recv returns 0, signals it by crafting a special connection-close packet), but then who will send packets? Should a thread be born for every removed packet, just for that one packet? Not efficient. Maybe it is better to have a thread that takes care of sending packets: if the packet removed from the queues is the first of some connection, connect(), update the reading fd_set with the new file descriptor, and start reading from it and enqueuing packets; if the packet belongs to an already existing connection, hand it to the sender thread and let that thread take care of sending it, because the thread removing packets can't also be busy sending them; something else must do it. A thread pool, maybe, but with how many threads? My priority mechanism already uses two threads, so with a thread pool, two threads of the implementation's total number of concurrent threads are already taken.
elmazzun
11/10/2016 1:26:32 PM
elmazzun, in message
<7a539138-9bff-453f-9d25-4e5c5f2996c3@googlegroups.com>, wrote:
> Sounds nice, let's imagine a select() polling clients and servers file
> descriptors inserted in reading fd_set

To get this out of the way: you would be better off using an existing
event library instead of reimplementing your own. They include support
for non-portable high-performance versions of the multiplexing APIs.

>						  this select() does not
> take care about incoming connections, accepting connections and
> getting the first request from client because that is done by another
> class

IMNSHO, bad design. Once you have a loop, use it for everything.

> unless there's a way for passing data to a looping select().

I do not know what that means. select() and other multiplexing APIs tell
you when you can read or write on a file descriptor. What you do with
that information is entirely up to you, but of course, it often
involves, you know, reading and writing.
Nicolas
11/10/2016 1:46:32 PM
On 10 Nov 2016 13:46:32 GMT
Nicolas George <nicolas$george@salle-s.org> wrote:
>elmazzun, in message
><7a539138-9bff-453f-9d25-4e5c5f2996c3@googlegroups.com>, wrote:
>> Sounds nice, let's imagine a select() polling clients and servers file
>> descriptors inserted in reading fd_set
>
>To get this out of the way: you would be better of using an existing
>event library instead of reimplementing your own. They include support
>for non-portable high-performance versions of the multiplexing APIs.
>
>>						  this select() does not
>> take care about incoming connections, accepting connections and
>> getting the first request from client because that is done by another
>> class
>
>IMNSHO, bad design. Once you have a loop, use it for everything.

Agreed, but some people love to make things more complicated than they need to
be. I presume his other class is in a separate thread, which probably creates
far more problems than it solves.

-- 
Spud

spud
11/10/2016 1:55:04 PM
> > unless there's a way for passing data to a looping select().
>
> I do not know what that means. select() and other multiplexing APIs tell
> you when you can read or write on a file descriptor. What you do with
> that information is entirely up to you, but of course, it often
> involves, you know, reading and writing.

Sure, reading is easy, but what about writing? I don't care if the client or server socket fd is ready to receive data: the proxy sends only packets removed from the priority queues. There is no sense in checking FD_ISSET(serverFD, &writeset) and sending a packet if the packet removed from the priority queue does not belong to that connection.
elmazzun
11/10/2016 2:31:00 PM
> Agreed but some people love to make things more complicated than they need to
> be. I presume his other class is in a seperate thread which probably creates
> far more problems than it solves.
> 
> -- 
> Spud

Dear Spud, please do not answer my question in such an unconstructive way.
elmazzun
11/10/2016 2:35:50 PM
elmazzun, in message
<f5f5ebea-5cde-4df0-9d82-04fe81bd6899@googlegroups.com>, wrote:
> Sure, reading is easy, but what with writing?

It is easy too.

>						I don't care if socket
> fd cient or server are ready to receive data:

You should care, otherwise your proxy will become blocked in a write
operation and not react any longer.

>						the proxy sends only
> packets removed from prio queues. No sense in checking
> FD_ISSET(serverFD, &writeset) and send it a packet if the packet
> removed from prio queue does not belong to that connection.

The queues need to be organized per client. Priority is not everything:
a fast client should be served immediately, even if a higher-priority one
is present, when the higher-priority one is blocked.
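That organization can be sketched like this (my illustration; the struct and names are hypothetical): one FIFO per connection, since TCP forbids reordering inside a stream, and priority only decides which of the currently writable connections is served next.

```cpp
#include <deque>
#include <map>
#include <vector>

struct Packet {
    int priority;
    std::vector<char> payload;
};

// One FIFO per connection, keyed by socket fd: order inside a connection
// is preserved, priority only arbitrates *between* connections.
std::map<int, std::deque<Packet>> queues;

// Among the connections that select() reported writable, pick the one whose
// head-of-queue packet has the highest priority. Blocked peers are simply
// not in writable_fds, so they cannot starve the others.
int pick_connection(const std::vector<int>& writable_fds) {
    int best_fd = -1, best_prio = -1;
    for (int fd : writable_fds) {
        auto it = queues.find(fd);
        if (it == queues.end() || it->second.empty())
            continue;
        if (it->second.front().priority > best_prio) {
            best_prio = it->second.front().priority;
            best_fd = fd;
        }
    }
    return best_fd; // -1: nothing sendable right now
}
```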
Nicolas
11/10/2016 3:35:42 PM
elmazzun <mazzocchiandrea24@gmail.com> writes:
>> > But now, sending and receiving packets is determined by the algorithm
>> > of priority queues: every packet received from client or server must
>> > be inserted in queues and the proxy can only send packets removed
>> > from queues.
>> 
>> In that case I would make it all single threaded based around a poll()
>> event loop, or possibly using a pool of threads.
>> 
>> To simplify the coding, threads could be used to handle the proxy
>> handshake parts of the protocol (to avoid doing state machine work). But
>> then for the bulk transfers, the connections are thrown into a pool
>> which is subject to the required queuing rules, rather than serviced
>> by dedicated threads.
>
> Sounds nice, let's imagine a select() polling clients and servers file
> descriptors inserted in reading fd_set: if a packet is received, its
> payload is copied in a custom struct of mine together with other stats
> like direction, timestamp, priority and others; this select() does not
> take care about incoming connections, accepting connections and
> getting the first request from client because that is done by another
> class whose task is to loop with accept(), read the request from
> client, craft a packet and insert it in the matching priority queue,
> nothing more.
>
> But if select()'s duty is to receive packets and enqueue them,
> something else must send the packets removed from prio queues, unless
> there's a way for passing data to a looping select().

This can be handled with select() (or any saner I/O readiness notification
mechanism) as well, since it will also tell you which file descriptors are
ready for sending data: just loop over these, too. Your send queues will
need to support a peek operation for this. If the message that's
currently due for a connection can be sent without blocking, it can be
removed from its queue. Otherwise, it just stays there, and sending will
be retried the next time the descriptor becomes writable.
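That peek-and-retry step might look like this sketch (my names, not from the thread; MSG_DONTWAIT makes the send non-blocking even on a blocking socket):

```cpp
#include <sys/socket.h>
#include <cerrno>
#include <deque>
#include <vector>

struct OutPacket {
    std::vector<char> data;
};

// Flush as much of one connection's send queue as the kernel will take
// without blocking. A packet is removed only once fully accepted; on a
// short write the unsent tail stays at the head for the next round.
void try_send(int fd, std::deque<OutPacket>& q) {
    while (!q.empty()) {
        OutPacket& p = q.front();                    // peek, don't remove yet
        ssize_t n = send(fd, p.data.data(), p.data.size(), MSG_DONTWAIT);
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return;                              // retry when writable again
            q.pop_front();                           // hard error: drop packet
            return;
        }
        if (static_cast<size_t>(n) < p.data.size()) {
            p.data.erase(p.data.begin(), p.data.begin() + n); // keep the tail
            return;
        }
        q.pop_front();                               // fully sent
    }
}
```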
Rainer
11/10/2016 4:53:31 PM
On 2016-11-10, Nicolas George <nicolas$george@salle-s.org> wrote:
> elmazzun, in message
><f5f5ebea-5cde-4df0-9d82-04fe81bd6899@googlegroups.com>, wrote:
>> Sure, reading is easy, but what with writing?
>
> It is easy too.
>
>>						I don't care if socket
>> fd cient or server are ready to receive data:
>
> You should care, otherwise your proxy will become blocked in a write
> operation and not react any longer.
>
>>						the proxy sends only
>> packets removed from prio queues. No sense in checking
>> FD_ISSET(serverFD, &writeset) and send it a packet if the packet
>> removed from prio queue does not belong to that connection.
>
> The queues needs to be organized per client. Priority is not everything:
> a fast client should be served immediately even if a higher priority one
> is present, if the higher priority one is blocked.

You're just changing the priority function.

Whenever you identify something to process first according to some
rules, then you have priority. (Only, the structure member called
"priority" in those objects isn't necessarily that priority.)
Kaz
11/10/2016 5:37:28 PM
In article <20161110093456.309@kylheku.com>,
 Kaz Kylheku <221-501-9011@kylheku.com> wrote:

> On 2016-11-10, Nicolas George <nicolas$george@salle-s.org> wrote:
> > elmazzun, in message
> ><f5f5ebea-5cde-4df0-9d82-04fe81bd6899@googlegroups.com>, wrote:
> >> Sure, reading is easy, but what with writing?
> >
> > It is easy too.
> >
> >>						I don't care if socket
> >> fd cient or server are ready to receive data:
> >
> > You should care, otherwise your proxy will become blocked in a write
> > operation and not react any longer.
> >
> >>						the proxy sends only
> >> packets removed from prio queues. No sense in checking
> >> FD_ISSET(serverFD, &writeset) and send it a packet if the packet
> >> removed from prio queue does not belong to that connection.
> >
> > The queues needs to be organized per client. Priority is not everything:
> > a fast client should be served immediately even if a higher priority one
> > is present, if the higher priority one is blocked.
> 
> You're just changing the priority function.
> 
> Whenever you identify something to process first according to some
> rules, then you have priority. (Only, the structure member called
> "priority" in those objects isn't necessarily that priority.)

But you should only rank things by priority if you can actually do 
something with them. If the send queue for the higher-priority process 
is full, it doesn't make sense to keep it prioritized higher than a 
client you can actually send data to.

-- 
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Barry
11/11/2016 5:04:07 PM
Barry Margolin, in message
<barmar-B149CD.12040711112016@88-209-239-213.giganet.hu>, wrote:
>> You're just changing the priority function.
>> 
>> Whenever you identify something to process first according to some
>> rules, then you have priority. (Only, the structure member called
>> "priority" in those objects isn't necessary that priority.)
> 
> But you should only rank things by priority if you can actually do 
> something with them. If the send queue for the higher-priority process 
> is full, it doesn't make sense to keep it prioritized higher than a 
> client you can actually send data to.

I think that what Kaz is trying to say is that when writing to a peer is
blocking, all packets for that peer should have their priority lowered
below that of packets for available peers.

To study the problem in theoretical terms it makes sense: always send
the packet with the highest "priority", and the issue of blocking or not
is taken care of by the "priority" function.

But in terms of actual implementation and algorithmics, it is a very bad
idea. The blocking/ready state of clients changes at completely
different times than their actual priority; handling them together would
be more complex.
Nicolas
11/11/2016 7:07:16 PM
On Thu, 10 Nov 2016 17:37:28 +0000 (UTC)
Kaz Kylheku <221-501-9011@kylheku.com> wrote:

> > The queues needs to be organized per client. Priority is not
> > everything: a fast client should be served immediately even if a
> > higher priority one is present, if the higher priority one is
> > blocked.
> 
> You're just changing the priority function.
> 
> Whenever you identify something to process first according to some
> rules, then you have priority. (Only, the structure member called
> "priority" in those objects isn't necessary that priority.)

https://www.freebsd.org/cgi/man.cgi?query=scheduler&sektion=9

Conventionally, "priority" is orthogonal to "runnable".  Blocked tasks
can't be run, no matter their priority.  Priority determines which of
the runnable tasks runs next.  

Some kind of signal indicates when a task moves from blocked to
runnable state.  Whatever the signalling mechanism, adjusting the
task's priority would only be redundant.  

<war story>
ISTR that in VMS, processes blocked on I/O temporarily had their
priority elevated, so that as soon as the device responded, the I/O was
completed. Then their priority dropped back down.

I'm sure it worked well normally.  I remember when it didn't: if you
popped out the tape drive while in use, the interrupted task was in a
privileged state and couldn't be cancelled.  Nothing else could acquire
the drive without rebooting.  Ask me how I know.  
</war story>

--jkl


James
11/12/2016 10:47:29 PM
In article <582616e3$0$5272$426a74cc@news.free.fr>,
 Nicolas George <nicolas$george@salle-s.org> wrote:

> Barry Margolin, in message
> <barmar-B149CD.12040711112016@88-209-239-213.giganet.hu>, wrote:
> >> You're just changing the priority function.
> >> 
> >> Whenever you identify something to process first according to some
> >> rules, then you have priority. (Only, the structure member called
> >> "priority" in those objects isn't necessary that priority.)
> > 
> > But you should only rank things by priority if you can actually do 
> > something with them. If the send queue for the higher-priority process 
> > is full, it doesn't make sense to keep it prioritized higher than a 
> > client you can actually send data to.
> 
> I think that what Kaz is trying to say is that when writing to a peer is
> blocking, all packets for that peer should have their priority lowered
> below that of packets for available peers.

I was thinking that priority is assigned to clients/sockets, not to 
individual packets.

-- 
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Barry
11/14/2016 2:21:06 AM
Barry Margolin, in message
<barmar-82BE1D.21210613112016@88-209-239-213.giganet.hu>, wrote:
> I was thinking that priority is assigned to clients/sockets, not to 
> individual packets.

Considering that the protocols used in this discussion do not allow
reordering, anything else would make absolutely no sense. But the OP
always spoke about prioritizing packets; I did not want to confuse him
more by adding yet another consideration to the discussion.
Nicolas
11/14/2016 10:52:53 AM