Network traffic

Hi

I have a simple program that opens a DBF on a network drive with the RMDBFCDX RDD, reads several million records and totals them into another DBF on the local drive.

Before the loop begins, a network traffic meter shows 180/89 Kbps for download/upload.

Once the reading loop begins, the figures rise to 4.2/4.0 Mbps, although my program writes nothing to the network drive (at least nothing coded by me). If I pause the loop, traffic drops both ways.

Does the RDD or any other lower-level routine use a paging/temp file on the network drive?

Thanks
Claudio H
Claudio
12/6/2016 6:25:41 PM

Dear Claudio H:

On Tuesday, December 6, 2016 at 11:25:42 AM UTC-7, Claudio H wrote:
....
> I have a simple program that opens a DBF on a network
> drive with the RMDBFCDX RDD, reads several million records
> and totals them into another DBF on the local drive.

OK, so it opens this distant file, reads each record of this file, from beginning to end, and sums just one field of that whole record it read.

> Before the loop begins a network traffic meter shows
> 180/89 Kbps for download/upload.

OK.
 
> Once the reading loop begins, the figures rise to
> 4.2/4.0 Mbps, although my program writes nothing to
> the network drive (at least nothing coded by me).

Oh, *yes* it does.  As coded by you, you copy every record of that file, one at a time, into the memory of your workstation computer.  Then you probably generate network traffic to lock / unlock files and/or records, plus flushing buffers.

> If I pause the loop, traffic lowers both ways.

Yes, so you know what I have said is true.

> Does the RDD or any other lower-level routine
> use a paging/temp file on the network drive?

It cannot possibly matter.  You are not asking a SQL or ADS server, or terminal services, to sum the data at the server.  You are asking for the entire file to be sent to the workstation.  One record at a time.

David A. Smith
dlzc
12/6/2016 8:26:55 PM
As David explains, whenever you are using file sharing (no database server), all the records you are reading from or writing into are copied to your local workstation, plus the appropriate sections of the index files.
Ella
12/6/2016 8:41:50 PM
Dear Claudio H:

On Tuesday, December 6, 2016 at 11:25:42 AM UTC-7, Claudio H wrote:
....
> ...a DBF in a network drive with RMDBFCDX rdd,

Is the field being summed part of an index key somewhere?  There might be some savings to be had there, since you are using RMDBFCDX.  I am not sure a Clipper routine can read it directly from the index file, however.

David A. Smith
dlzc
12/6/2016 11:41:04 PM
Err... would someone dumb this down for me and explain:

'Before the loop begins a network traffic meter shows 180/89 Kbps for download/upload'

Is it something to do with a cloud-based xhb app?

thanks
timecosting
12/7/2016 9:00:29 AM
Dear timec...:

On Wednesday, December 7, 2016 at 2:00:30 AM UTC-7, timec...@gmail.com wrote:
....
> err...would someone dumb it down to me in explaining:

Imagine that they monitor network traffic at {a managed switch in the network} or {get status from the server network card} between server and the network at large

> 'Before the loop begins a network traffic meter
> shows 180/89 Kbps for download/upload' 

So this is the average amount of traffic on this network, as seen by whatever device is compiling statistics.  Such a tool is sometimes called a "packet sniffer", though that term is more usual for internet traffic.

> is it something to do with a cloudbased xhb app.?

It could be, but using my crystal ball, I suspect Claudio H, or his client, noticed a slowdown in a *local* wired network response, when this "simple summation program" was being run.

Speeding this up could be done a number of ways:
- switch to SQL or ADS;
- implement "terminal services";
- have a summation program running on the server in the background that updates a file that can be read quickly, with sums, counts, even by-customer statistics;
- implement a hardier / faster network architecture, with multiple paths to the server, and either gigabit LAN architecture, or fiber.

David A. Smith
dlzc
12/7/2016 2:15:12 PM
Hello Claudio H,

I think the traffic in this scenario is the same in both local and LAN settings. Upload/download ratio is close to 1. One can accept the download number as being reasonable but the upload is of concern when a table is being read but not written to. RDDs like LetoDB do achieve a higher Upload/download ratio pointing to possible efficiency issues in RMDBFCDX.

My 2 pennies worth.

Regards.
Ash


Ash
12/7/2016 7:03:08 PM
'Imagine that they monitor network traffic at {a managed switch in the network} or {get status from the server network card} between server and the network at large'

- So *some* type of hardware is required; it is not purely software driven, right?
- Can the speed/flow be accessed from *within* an xhb app (I mean, coded in xhb or in an xhb-compatible library's syntax)?
- What if other apps are running at the same time as your xhb app; would it show a different flow for each app?

thanks
timecosting
12/8/2016 9:00:13 AM
In Windows:
- hit the search icon on the Task Bar
- type in "Resource Monitor"
- you can track resource usage in total or per process
Ella
12/8/2016 1:13:18 PM
Dear timec...:

On Thursday, December 8, 2016 at 2:00:14 AM UTC-7, timec...@gmail.com wrote:
....
> -so, *some* type of hardware is required, it is not
> purely software driven - right ?

Unless you bought the cheapest possible hardware, there is software that can read network traffic.
http://www.techrepublic.com/blog/five-apps/five-free-dead-easy-ip-traffic-monitoring-tools/

> - Can the speed/flow be accessed from *within* an xhb
> app (I mean, coded in xhb or in an xhb-compatible
> library's syntax)?

Possible, but why?  If you still have hair, you will have less when you are done figuring this out.

Consider that each record read, in however much time it took, generated network traffic proportional to the size of the record, plus some fixed overhead for the call.
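To put a rough number on that proportionality, here is a minimal sketch rather than a measurement. EstimateBytes() and SecondsAtRate() are hypothetical helper names; Header(), RecSize() and LastRec() are the standard work-area functions. It counts only the header and the record images, so locking, protocol overhead and index pages would all come on top.

```xharbour
// Rough lower bound on server->client payload for one full pass over a DBF.
// EstimateBytes() and SecondsAtRate() are illustrative helpers, not library calls.

PROCEDURE Main()
   LOCAL nBytes

   USE TEST NEW READONLY              // TEST.DBF as in the thread
   nBytes := EstimateBytes( LastRec(), RecSize(), Header() )
   USE

   ? "Minimum payload in bytes:", nBytes
   ? "Seconds at 4.2 Mbps:", SecondsAtRate( nBytes, 4.2 )
   RETURN

// File header plus one fixed-length record image per record
FUNCTION EstimateBytes( nRecords, nRecSize, nHeader )
   RETURN nHeader + nRecords * nRecSize

// nMbps is megabits per second; 1 Mbps = 125,000 bytes per second
FUNCTION SecondsAtRate( nBytes, nMbps )
   RETURN nBytes / ( nMbps * 125000 )
```

As an illustration with made-up sizes: one million 100-byte records is about 100 MB, which is roughly 190 seconds at a sustained 4.2 Mbps.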

> - What if other apps are running at the same time
> as your xhb app; would it show a different flow
> for each app?

.... sounds like you want "Sysmon"...
https://technet.microsoft.com/en-us/sysinternals/bb545021.aspx

Not sure "Resource Monitor" shows network burden, by application.

David A. Smith
dlzc
12/8/2016 2:11:46 PM
Ella/David

I do understand that if I read millions of records there will be huge network traffic from server to workstation.

My question was about the traffic from workstation to server in a situation where there is NO writing to the DBF.

In case it is needed to understand my question:
nCount:=0
USE TEST NEW
DBGOTOP()
DO WHILE !EOF()
   nCount++
   SELECT TEST
   DBSKIP()
ENDDO
USE

Regards
Claudio H
Claudio
12/8/2016 2:47:43 PM
Dear Claudio H:

On Thursday, December 8, 2016 at 7:47:44 AM UTC-7, Claudio H wrote:
> Ella/David
> 
> I do understand that if I read millions of records
> there'll be a huge network traffic from server to
> workstation.
> 
> My question was about the traffic from workstation
> to server in a situation where there is NO writing
> to the dbf.

Your "intent" is that no changes are made, but the scatter / gather operation (not under your control) might lose the information as to whether or not any particular field's content changed.

Your client workstation additionally still has to communicate the following with the server...
- reposition record in dbf, fpt (if any), cdx;
- reposition any related dbf/fpt/cdx;
- apply / control locks;
- the server has to flush buffers, and negotiate buffers with the client;
- status checks between the two for determining whether the network is still alive.

Not all of this is under control of your program, or even the xHarbour core code.  There is a *load* of overhead in updating (or just reading) a large shared Excel spreadsheet too...

Can you:
- disable indexes on opening (AUTOPEN OFF), and
- open the dbf file readonly:
http://www.itlnet.net/programming/program/Reference/c53g01c/ng3575a.html

And see what the network hit is?
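One way to run that comparison is a small timing harness, sketched below. ScanSeconds() is a hypothetical helper, TEST is the table from the thread, and opening SHARED is an assumption about how the real program runs. It only measures elapsed time; the traffic figures still have to be read off the network meter while it runs.

```xharbour
// Time one full sequential pass, with and without READONLY.
// ScanSeconds() is an illustrative helper, not a library call.

PROCEDURE Main()
   SET AUTOPEN OFF                    // do not auto-open the production index
   ? "Shared read/write:", ScanSeconds( .F. ), "s"
   ? "Shared read-only :", ScanSeconds( .T. ), "s"
   SET AUTOPEN ON
   RETURN

FUNCTION ScanSeconds( lReadOnly )
   LOCAL nStart, nCount := 0

   IF lReadOnly
      USE TEST NEW SHARED READONLY
   ELSE
      USE TEST NEW SHARED
   ENDIF

   nStart := Seconds()
   DBGOTOP()
   DO WHILE !EOF()
      nCount++
      DBSKIP()
   ENDDO
   USE

   // Seconds() restarts at midnight; this sketch ignores that rollover.
   RETURN Seconds() - nStart
```

Running each variant more than once helps, since the second pass may be served partly from cache.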

David A. Smith
dlzc
12/8/2016 7:12:24 PM
Dear Claudio H:

On Thursday, December 8, 2016 at 7:47:44 AM UTC-7, Claudio H wrote:
....
> nCount:=0
....
> DBGOTOP() 
> DO WHILE !EOF()
>    nCount++
>    SELECT TEST
>    DBSKIP()
> ENDDO

You might change that loop to one of the commands
COUNT TO nCount
.... or ...
SUM 1, <NumericField> TO nCount, nTotal

.... I don't think the read only nature of these commands on their own, will in any way reduce the amount of traffic.  But they sure read easier, and don't take any longer to process.
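Spelled out as a complete program, the SUM form might look like the sketch below. CountAndSum() is a hypothetical helper and AMOUNT is a made-up numeric field name; the thread never names the field being totalled.

```xharbour
// Count and total in a single pass using the SUM command.
// CountAndSum() and the AMOUNT field are illustrative assumptions.

PROCEDURE Main()
   LOCAL aResult

   USE TEST NEW READONLY
   aResult := CountAndSum()
   USE

   ? "Records:", aResult[ 1 ], "Total:", aResult[ 2 ]
   RETURN

// One pass over the current work area: the constant 1 accumulates the
// record count while AMOUNT is totalled alongside it.
FUNCTION CountAndSum()
   LOCAL nCount := 0, nTotal := 0

   SUM 1, FIELD->AMOUNT TO nCount, nTotal
   RETURN { nCount, nTotal }
```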

This is what I mean about C-ifying a language.  It becomes normal to overlook the power / simplicity of the dBase language.

David A. Smith
dlzc
12/9/2016 1:47:59 PM
Hi Claudio

What would happen if you open the file as:

> USE TEST NEW READONLY
?

Regards!
Claudio

CV
12/10/2016 2:08:47 AM
Hey Claudio H:

On Friday, December 9, 2016 at 7:08:48 PM UTC-7, CV wrote:
> Hi Claudio
> 
> What would happen if you open the file as:
> 
> > USE TEST NEW READONLY
> ?

Any results?  I'd also not open the index file, since you have no need to slide through the DBF in any particular order.  This should further reduce the server->client data load, and the client -> server data load a little bit too.

Might even get some significant savings by only:
   SET ORDER TO 0
.... outside the loop, but after opening.

Windows cannot know what to queue in file buffers for the DBF, since it cannot comprehend index and ordering.  So it will tend to queue (say) records 1 thru 100, when you might read record 1, 2300, 1000000, 2...
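To see how far apart consecutive records really are under an index, a small sketch like the following lists the physical record numbers in each order. FirstRecNos() is a hypothetical helper. DBFCDX is used as a stand-in because RMDBFCDX may not be linked everywhere, and the sketch assumes TEST has a production index with at least one tag that AUTOPEN opens.

```xharbour
// Compare the physical record numbers visited in index order and natural order.
// FirstRecNos() is an illustrative helper, not a library call.

REQUEST DBFCDX

PROCEDURE Main()
   RddSetDefault( "DBFCDX" )          // stand-in for RMDBFCDX
   USE TEST NEW READONLY              // assumes AUTOPEN is ON and TEST.CDX exists

   SET ORDER TO 1                     // first tag controls navigation
   ? "Index order  :"
   AEval( FirstRecNos( 10 ), {| n | QQOut( "", n ) } )

   SET ORDER TO 0                     // natural order; the index stays open
   ? "Natural order:"
   AEval( FirstRecNos( 10 ), {| n | QQOut( "", n ) } )

   USE
   RETURN

// Physical record numbers of the first nHowMany records in the current order
FUNCTION FirstRecNos( nHowMany )
   LOCAL aRecs := {}

   DBGOTOP()
   DO WHILE !EOF() .AND. Len( aRecs ) < nHowMany
      AAdd( aRecs, RecNo() )
      DBSKIP()
   ENDDO
   RETURN aRecs
```

If the index-order list jumps around while the natural-order list runs 1, 2, 3 and so on, read-ahead buffering is unlikely to help the indexed scan much.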

David A. Smith
dlzc
12/13/2016 2:01:45 PM
David

The piece of code I posted was just a simple way to show that the loop is not explicitly writing to the DBF.
The real code actually operates on the data read.

The READONLY clause decreased the processing time by about 15%. There is no way to test the real code without indexes.

Regards
Claudio H
Claudio
12/13/2016 4:00:03 PM
Dear Claudio H:

On Tuesday, December 13, 2016 at 9:00:04 AM UTC-7, Claudio H wrote:
....
> The piece of code I posted was just a simple way
> to show that the loop is not explicitly writing
> to the DBF.  The real code actually operates with
> the data read.

But an active index will cause a large chunk of data to be sent server->client many times over, because records that sit next to each other in the DBF are not next to each other in the controlling index order.

> The READONLY clause decreased the processing time
> by about 15%,

OK, so you saw only a 15% decrease (say, from 4.2 Mbps to 3.6 Mbps), or did it decrease to 15%?  Did it "slog down the network" for 10 seconds before, and 8.5 seconds now?  Did it reduce the client->server rate by 15%, or what?

> no way to test the real code without indexes.

You can:
   SET ORDER TO 0
.... or you can
   SET AUTOPEN OFF && turns off automatic index opening
   USE TEST NEW ALIAS QUICKY READONLY
   SELECT QUICKY
   ...  <loop or SUM code>
   USE
   SET AUTOPEN ON && turns on automatic index opening

So are we to assume that you feel that none of our further suggestions have merit, and that we are wasting your time by trying to "help"?

David A. Smith
dlzc
12/13/2016 4:40:57 PM
On Tuesday, December 13, 2016 at 1:40:58 PM UTC-3, dlzc wrote:

> So are we to assume that you feel that none of our further suggestions have merit, and that we are wasting your time by trying to "help"?

David

I'm so sorry if something I wrote made you feel that way. I don't feel at all that your suggestions have no merit, or that you or anyone else is wasting their time.

Maybe I need to say again that there's no way to avoid the use of an index. The DBF has 19,558,369 records and I need to process about 1/3rd of them and need the index to do it.
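For readers following along, an index-driven partial scan of this kind is often written with a scope on the controlling order, as in the sketch below. This is only an illustration, not the real program: ScanScope(), the BYYEAR tag, and the YEAR and AMOUNT fields are all hypothetical, and DBFCDX stands in for RMDBFCDX.

```xharbour
// Visit only the records whose key falls inside a range of the controlling order.
// ScanScope(), tag BYYEAR and fields YEAR/AMOUNT are illustrative assumptions.

REQUEST DBFCDX

PROCEDURE Main()
   LOCAL aResult

   RddSetDefault( "DBFCDX" )          // stand-in for RMDBFCDX
   USE TEST NEW READONLY
   SET ORDER TO TAG BYYEAR            // hypothetical tag keyed on a numeric YEAR field

   aResult := ScanScope( 2016, 2016 )
   USE

   ? "Records in scope:", aResult[ 1 ], "Total:", aResult[ 2 ]
   RETURN

FUNCTION ScanScope( xLow, xHigh )
   LOCAL nCount := 0, nTotal := 0

   OrdScope( 0, xLow )                // 0 = top scope
   OrdScope( 1, xHigh )               // 1 = bottom scope
   DBGOTOP()                          // first record inside the scope
   DO WHILE !EOF()
      nCount++
      nTotal += FIELD->AMOUNT         // hypothetical numeric field
      DBSKIP()
   ENDDO
   OrdScope( 0, NIL )                 // clear both scopes again
   OrdScope( 1, NIL )
   RETURN { nCount, nTotal }
```

The scope keeps the loop from ever requesting records outside the wanted range, but each record inside it is still fetched across the network in index order.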

Of course I tested your READONLY suggestion. The total time needed to process the same number of records was about 15% less.

Thank you again, and please don't stop suggesting.
Regards
Claudio H

Claudio
12/15/2016 7:24:40 PM
Dear Claudio H:

On Thursday, December 15, 2016 at 12:24:43 PM UTC-7, Claudio H wrote:
....
> I'm so sorry if something I wrote made you feel that
> way. I don't feel at all that your suggestions have
> no merit or that you or anyone else is  wasting your
> time.

My tiny little feelings are not important.  What is important, is that your needs be addressed.  I don't think we are there quite yet.

> Maybe I need to say again that there's no way to
> avoid the use of an index. The DBF has 19,558,369
> records and I need to process about 1/3rd of them

So, your simple program, completely left out that information.  Index files are small, and DBF files are large.  If you "SET ORDER TO 0", the OS can guess easily which DBF record is next in line, even though it may thrash around on the index file (assuming this is not loaded entirely on opening).

So if you can turn the order off, set it to 0, you might not slow the flood of data requested server->client, but it might be done sooner.  So less noticeable.

> and need the index to do it.

Yes, but you don't need it to be "turned on", just speed up the filter process.

> Of course I tested your READONLY suggestion. The
> total time needed to process the same number of
> records was about 15% less.

How did the read/write "data rates" look?  You balked at the fact that this simple program could slow down the network, and the data rates were very high both directions.  This is not a good measure of "how many bytes sent each direction", but if averaged over "one second", might be an indication of whether or not we could reduce the client->server data load.

> Thank you again, and please don't stop suggesting.

I just hope we can get to some metrics, and more to the point, get your customer happier.

You might add a delay, assuming you use a hand-coded loop, to "wait" until the tenths-of-a-second digit is "odd".  This will cause your loop to get data for 0.1 seconds, wait 0.1 seconds, and repeat until done.  Smaller chunks of data, and maybe a happier customer.  It will take you twice as long, but the load on the network will be halved.

David A. Smith
dlzc
12/15/2016 9:30:25 PM
SQL engines do ignore the indexes and start sequential table scanning when their statistics show that more than 5% of the records are "eligible" by the filtering condition(s).
Ella
12/15/2016 10:13:20 PM
Dear Ella Stern:

On Thursday, December 15, 2016 at 3:13:24 PM UTC-7, Ella Stern wrote:
> On Thursday, December 15, 2016 at 11:30:26 PM UTC+2, dlzc wrote:
....
> > > and need the index to do it.
> > 
> > Yes, but you don't need it to be "turned on",
> > just speed up the filter process.
> 
> SQL engines do ignore the indexes and start sequential
> table scanning when their statistics show that more
> than 5% of the records are "eligible" by the filtering
> condition(s).

In this case, he is using RMDBFCDX.

David A. Smith
dlzc
12/16/2016 12:06:52 AM
To the Group:

> You might add a delay, assuming you use a hand-coded
> loop, to "wait" until the tenths-of-a-second digit is
> "odd".  This will cause your loop to get data for
> 0.1 seconds, wait 0.1 seconds, and repeat until done.

I thought we had even/odd functions, but I cannot seem to find any.  How about:

FUNCTION IsEven( nUmber )
RETURN ( nUmber == Int( 0.5 * nUmber ) * 2 )

So:
nCount := 0
SET AUTOPEN OFF
USE TEST NEW READONLY
SELECT TEST
GO TOP
DO WHILE !EOF()
   nCount++
   DBSKIP()
   && hold off while the tenths-of-a-second digit is even
   DO WHILE IsEven( Int( Seconds() * 10 ) )
      millisecs( 11 ) && assumed ~11 ms delay helper (CA-Tools spells it MilliSec()); don't want inkey() here...
   ENDDO && WHILE IsEven( int( seconds() * 10 ))
ENDDO && WHILE !EOF()
USE
SET AUTOPEN ON

Should do the job in twice the time, and lug the network down half as much.
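A variation on the same idea, sketched here under assumptions, pauses after a fixed batch of records instead of watching the clock digit. This is a different technique from the loop above, not a correction of it. ThrottledCount() and Pause() are hypothetical helpers. Pause() busy-waits on Seconds(), which burns CPU but sends nothing over the network; a real program might prefer whatever sleep function its runtime provides.

```xharbour
// Throttled scan: read nBatch records, then pause nPause seconds.
// ThrottledCount() and Pause() are illustrative helpers, not library calls.

PROCEDURE Main()
   USE TEST NEW READONLY
   ? "Records read:", ThrottledCount( 1000, 0.1 )
   USE
   RETURN

FUNCTION ThrottledCount( nBatch, nPause )
   LOCAL nCount := 0

   DBGOTOP()
   DO WHILE !EOF()
      nCount++
      DBSKIP()
      IF nCount % nBatch == 0         // end of a batch: give the network a rest
         Pause( nPause )
      ENDIF
   ENDDO
   RETURN nCount

// Busy-wait on the clock. Seconds() restarts at midnight, so stop
// waiting if the clock appears to run backwards.
PROCEDURE Pause( nSecs )
   LOCAL nStart := Seconds()

   DO WHILE Seconds() >= nStart .AND. Seconds() - nStart < nSecs
   ENDDO
   RETURN
```

The batch size and pause length set the duty cycle: the longer the pause relative to the time a batch takes to read, the lower the average network load and the longer the whole run.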

David A. Smith
dlzc
12/17/2016 9:22:50 PM