Memory and Scan Rate



Greetings!
We are running Solaris 10 3/05.
The system has run out of memory a few times due to different causes,
such as memory leaks and dodgy scripts written by some developers.
What's interesting is that vmstat's scan rate (the sr column) shows 0.
Shouldn't the page scanner have kicked in and Scan Rate be a non-zero
value?

Cheers.

Reply noident 12/4/2006 5:53:30 AM

noident@my-deja.com wrote:
> Greetings!
> We are running Solaris 10 3/05.
> The system has run out of memory a few times due to different causes,
> such as memory leaks and dodgy scripts written by some developers.
> What's interesting is that vmstat's scan rate (the sr column) shows 0.
> Shouldn't the page scanner have kicked in and Scan Rate be a non-zero
> value?

It depends on exactly how you ran out of memory.

The page scanner tries to find free RAM.  If you reserve a lot of memory
but don't immediately use it (think malloc() ), then you'll run out of
virtual memory.  The page scanner can't do anything about that.

So if you just malloc() all over the place, you'll run out of VM, and I
would expect the page scanner never to get involved.
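
To make the reserve-versus-use distinction concrete, here is a minimal
sketch.  The function name reserve_block and the 4096-byte fallback are
made up for illustration; nothing here is specific to Solaris.  With
touch set to 0 the block is only reserved; with touch set to 1 every
page is written once, which is what actually consumes RAM.

```c
#include <stdlib.h>
#include <unistd.h>

/* Reserve `bytes` of virtual memory.  If `touch` is nonzero, also write
   one byte per page so each page has to be backed by real memory.
   Returns the block (caller frees it), or NULL if the reservation failed. */
char *reserve_block(size_t bytes, int touch)
{
    char *p = malloc(bytes);
    if (p == NULL)
        return NULL;   /* the reservation itself failed: out of VM */

    if (touch) {
        long page = sysconf(_SC_PAGESIZE);
        /* 4096 is only a fallback guess if the page size is unavailable */
        size_t step = (page > 0) ? (size_t)page : 4096;
        for (size_t off = 0; off < bytes; off += step)
            p[off] = 1;   /* first write to a page forces it to be backed */
    }
    return p;
}
```

Calling reserve_block(size, 0) in a loop should exhaust VM while free RAM
barely moves; calling it with touch set to 1 should drain free RAM first.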

Do you have any more detail about exactly what happened in the past?

-- 
Darren Dunham                                           ddunham@taos.com
Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >
Reply Darren 12/4/2006 8:11:38 AM


> Do you have any more detail about exactly what happened in the past?
Well, the first time we ran out of memory was when an sql query
triggered what I think was a memory leak - where mysqld's size and
rsize were increasing for about an hour until nothing on the system
could allocate any more memory. The process was increasing its memory
size at about 2 MB per second. With 16GB RAM, I would expect the page
scanner to kick in once the free RAM falls below 250MB, which I believe
it had plenty of time to do. But it didn't somehow...
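
As a rough check on those figures, here is a small sketch of the
arithmetic.  It assumes the scanner threshold (lotsfree) defaults to
1/64 of physical memory, which is the documented Solaris default as far
as I know but is not confirmed for this particular box; both function
names are made up for illustration.

```c
#include <stdint.h>

/* Assumed default: the scanner threshold (lotsfree) is 1/64 of RAM. */
uint64_t lotsfree_estimate_mb(uint64_t physmem_mb)
{
    return physmem_mb / 64;
}

/* Seconds until free memory falls from free_mb to threshold_mb while a
   process consumes rate_mb_per_s.  Returns 0 if free memory is already
   at or below the threshold, or if the rate is zero. */
uint64_t seconds_until_threshold(uint64_t free_mb, uint64_t threshold_mb,
                                 uint64_t rate_mb_per_s)
{
    if (rate_mb_per_s == 0 || free_mb <= threshold_mb)
        return 0;
    return (free_mb - threshold_mb) / rate_mb_per_s;
}
```

For 16384 MB of RAM the estimate is 256 MB, in line with the 250MB
figure above.  At 2 MB per second, a completely free 16 GB would take
8064 seconds (over two hours) to reach that threshold, so a failure
after about an hour would mean much of the RAM was already in use.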

The 2nd time around it was a dodgy perl script that put too much stuff
in hashes and ran out of memory in about 5 minutes.

Reply noident 12/4/2006 7:58:24 PM

noident@my-deja.com wrote:
>> Do you have any more detail about exactly what happened in the past?
> Well, the first time we ran out of memory was when an sql query
> triggered what I think was a memory leak - where mysqld's size and
> rsize were increasing for about an hour until nothing on the system
> could allocate any more memory. The process was increasing its memory
> size at about 2 MB per second. With 16GB RAM, I would expect the page
> scanner to kick in once the free RAM falls below 250MB, which I
> believe it had plenty of time to do. But it didn't somehow...

At the end of it, what kind of error did you have?  How much free swap
(estimated) did you have?  (No free swap, nothing for the page scanner
to do).

If you run out of VM, you'll get a "mem alloc failed" or something like
that.  If you instead run out of RAM, nothing should happen, but it
might be slow until the page scanner finds some space.  The process
shouldn't die (or notice).
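
For reference, this is roughly how the out-of-VM case looks from inside
a program.  It is a minimal sketch: try_alloc is a made-up helper name,
and the exact message text depends on the platform's strerror().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Try to allocate `bytes`.  On success store the block in *out and
   return 0.  On failure set *out to NULL, print the reason, return -1.
   Note that malloc(0) may legitimately return NULL on some systems. */
int try_alloc(size_t bytes, void **out)
{
    errno = 0;
    void *p = malloc(bytes);
    if (p == NULL) {
        fprintf(stderr, "malloc(%zu) failed: %s\n", bytes, strerror(errno));
        *out = NULL;
        return -1;
    }
    *out = p;
    return 0;
}
```

A process that runs short of RAM but not VM never takes the failure
branch; it just gets slower while the page scanner finds pages for it.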

> The 2nd time around it was a dodgy perl script that put too much stuff
> in hashes and ran out of memory in about 5 minutes.

If I remember, I'll try to write a few one-liners to kill a machine
these ways.  You can then watch the numbers as they drop and either see
it die or the page scanner kick in.  I don't have a non-production
machine to run it on at the moment.

The idea is to just malloc() all over the place (should run out of VM,
but not RAM), and then again with calloc() instead (should dry up RAM
first).

Reply Darren 12/4/2006 11:11:54 PM

> At the end of it, what kind of error did you have?  How much free swap
> (estimated) did you have?  (No free swap, nothing for the page scanner
> to do).
>
> If you run out of VM, you'll get a "mem alloc failed" or something like
> that.  If you instead run out of RAM, nothing should happen, but it
> might be slow until the page scanner finds some space.  The process
> shouldn't die (or notice).

Sorry, I should have been more specific: we ran out of VM completely.
The error messages were "Out of memory", "Couldn't allocate memory",
etc.
That is, malloc() and similar returned NULL.

Reply noident 12/4/2006 11:30:47 PM

noident@my-deja.com wrote:
>> At the end of it, what kind of error did you have?  How much free swap
>> (estimated) did you have?  (No free swap, nothing for the page scanner
>> to do).
>>
>> If you run out of VM, you'll get a "mem alloc failed" or something like
>> that.  If you instead run out of RAM, nothing should happen, but it
>> might be slow until the page scanner finds some space.  The process
>> shouldn't die (or notice).

> Sorry, I should have been more specific: we ran out of VM completely.
> The error messages were "Out of memory", "Couldn't allocate memory",
> etc.
> That is, malloc() and similar returned NULL.

So, I'm not sure why you didn't see page scanning beforehand, but
running out of VM isn't an issue that involves the page scanner at all.
It can't create any more VM.

If you get to that point, then things must be working (in some sense).
The system has successfully allocated all VM pages.  

Reply Darren 12/5/2006 1:38:28 AM

Darren Dunham <ddunham@redwood.taos.com> wrote:
> If I remember, I'll try to write a few one-liners to kill a machine
> these ways.  You can then watch the numbers as they drop and either see
> it die or the page scanner kick in.  I don't have a non-production
> machine to run it on at the moment.

Okay.  Here is a run with such a program.  The program is at the end so
you can try it out.

The following is a run where I'm allocating ( malloc() ) 10 megabytes a
second, and I'm running vmstat with a 5 second interval.  The machine
has 128 MB of RAM and a 2 GB swap partition.  Let's skip to the end.

[...]
 0 0 0 376504 15792   0   2  0  0  0  0  0  0  0  0  0  404   24   44  0  1 99
 0 0 0 325248 15656   0   2  0  0  0  0  0  0  0  0  0  404   27   46  0  1 99
 0 0 0 276048 15536   0   2  0  0  0  0  0  0  0  0  0  402   32   50  0  1 99
 0 0 0 233000 15424   0   2  0  0  0  0  0  0  0  0  0  405   23   42  0  1 99
 0 0 0 181752 15296   0   2  0  0  0  0  0  0  0  0  0  404   26   45  0  1 99
 0 0 0 130504 15168   0   2  0  0  0  0  0  0  0  0  0  403   26   43  0  1 99
 0 0 0  79248 15032   0   2  0  0  0  0  0  0  0  0  0  405   28   52  0  1 99
 0 0 0  28000 14904   0   2  0  0  0  0  0  0  0  0  0  404   36   47  0  1 99
 0 0 0 2106832 18248  0   0  0  0  0  0  0  0  0  0  0  401   15   37  0  1 99

It exits immediately after a failed alloc and returns the memory.  The
free RAM is bouncing around a little bit, but not 10MB/sec.
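
As a sanity check on the numbers above, here is a small sketch that
turns a series of vmstat samples into an average rate of decline.  The
helper name avg_drop_kb_per_sec is made up for illustration, and the
result uses integer division, so it is rounded down.

```c
/* Average drop in KB per second between the first and last of n samples
   taken interval_s seconds apart.  Returns 0 if fewer than two samples
   are given or the interval is not positive. */
long avg_drop_kb_per_sec(const long *samples, int n, int interval_s)
{
    if (n < 2 || interval_s <= 0)
        return 0;
    return (samples[0] - samples[n - 1]) / ((long)(n - 1) * interval_s);
}
```

Fed the eight swap values above (376504 down to 28000) with a 5 second
interval, it gives 9957 KB/s, close to the 10240 KB/s being allocated.
The free column over the same rows (15792 down to 14904) gives only
25 KB/s.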

Next, I use calloc() instead of malloc().

# vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr dd s1 -- --   in   sy   cs us sy id
 0 0 0 2047824 15040  4  21 21  0  0  0  6  3  0  0  0  408  284  182  1  2 97
 0 0 0 2096040 9552   1 451  0  0 48  0 771 0  0  0  0  405   47   62  2  6 92
 0 0 0 2074376 2440   0 806  9 4433 7041 0 3013 50 0 0 0 503  24  187  3 14 83
 0 0 0 2037720 2048   0 699 66 4942 5416 0 3975 49 0 0 0 501  18  246  3 13 84
 0 0 0 2006608 1872   1 664 60 5158 5172 0 4145 50 0 0 0 502  17  263  3 13 85
 0 0 0 1975608 1704   1 618 168 5196 5225 0 4288 61 0 0 0 526 18  278  2 13 85
[...]
 0 0 0 140392  1992   0 653 110 4966 4982 0 3874 60 0 0 0 511 20  297  3 13 85
 0 0 0 110592  1304   1 662 268 5346 5401 0 4797 178 0 0 0 763 18 553  3 16 81
 0 0 0  77448  1816   0 670 108 5424 5508 0 4632 99 0 0 0 601 19  375  3 14 83
 0 0 0  47560  2128   0 634 114 4843 4849 0 3651 78 0 0 0 560 20  345  2 13 85
 0 0 0  15424  5088   2 283 124 3491 3548 0 3238 216 0 0 0 835 10 383  1 12 87
 5 0 0   9384 14672 237  15 84  0  0  0  0  8  0  0  0  418   14   35  0 88 12
 0 0 0 2087376 63280  1  48 492 0  0  0  0 48  0  0  0  500   36  135  0  2 98


Wham..  Because this is a little, underpowered Ultra 5, it runs out of
free RAM almost immediately and has to begin scanning to gather free
pages.  Also notice the machine had a lot more work to do when it came
time to actually free all those pages.

I'm not sure I understand the numbers, though.  Since I'm allocating
10240 KB/s, I expected to see a similar figure in 'po', but it's only
about half that; not sure why.

Here's the program.

/* overalloc.c
   Attempt to allocate memory until error occurs
   Then print allocation amount and exit. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

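int main(void)
{
   size_t alloc_size = 1024 * 1024 * 10;  /* 10 MB per allocation */
   size_t nelem = 1;
   int alloc_num = 0;                     /* successful allocations so far */
   void *buf_ptr;

   /* The blocks are deliberately never freed; exiting returns them. */
   while (1)
   {
     /* buf_ptr = malloc(alloc_size); */   /* Change comments for */
     buf_ptr = calloc(nelem, alloc_size);  /* calloc vs malloc    */

     if (buf_ptr == NULL)
     {
       printf ("Alloc failed after %i megabytes\n", alloc_num * 10);
       exit(0);
     }
     alloc_num++;     /* count only allocations that succeeded */
     printf ("%i\n", alloc_num);
     sleep (1);       /* pace the run at about 10 MB per second */
   }
}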

Reply Darren 12/5/2006 6:27:44 AM
