Re: index and queries using '<' '>'

At 16:23 18/11/2004 +0100, Martijn van Oosterhout wrote:
> The system seems to think that a scan is cheap because the table is so
> small.
The table currently contains just over 10000 elements. So 238 rows is a
small part of it.

> Have you ever ANALYZEd that table?
Yes. That and reindex (in case of a problem).
I have also tried ALTER TABLE ALTER COLUMN SET STATISTICS, but without
improvement.
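
For readers following along, a minimal sketch of that statistics change. The table `events` and column `ts` are hypothetical names (nothing in this thread gives the real ones), and the target of 100 is only an example value:

```sql
-- Hypothetical table/column names; 100 is an example statistics target.
ALTER TABLE events ALTER COLUMN ts SET STATISTICS 100;

-- The new target only takes effect once the table is analyzed again.
ANALYZE events;

-- Inspect what the planner now knows about the column.
SELECT n_distinct, histogram_bounds
FROM pg_stats
WHERE tablename = 'events' AND attname = 'ts';
```

If the histogram looks reasonable and the plan still does not change, the row estimate is probably not the issue and the cost settings are the next thing to look at.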

> Also, EXPLAIN ANALYZE gives a much better idea of what is going on...
You mean ANALYZE VERBOSE?


Since the table is still growing, and the number of rows returned by
the query is quite uniform (it's not dependent on the size of the table), I
hope that the statistics will evolve to a state that will force the use of
the index.


thanks

-- 
Marc


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

pgml
11/19/2004 1:45:04 PM
comp.postgresql.general

On Fri, Nov 19, 2004 at 14:45:04 +0100,
  Marc Boucher <pgml@gmx.net> wrote:
> At 16:23 18/11/2004 +0100, Martijn van Oosterhout wrote:
> > The system seems to think that a scan is cheap because the table is so
> > small.
> The table currently contains just over 10000 elements. So 238 rows is a
> small part of it.

No, small is typically less than 1%. This depends on the size of the rows,
on how much faster sequential access to disk blocks is than random access
in your environment, and on the size of your cache.
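
As a quick check of the fraction in question, using only the two figures quoted above (238 rows out of roughly 10000):

```sql
-- Percentage of the table that the query returns, from the quoted figures.
SELECT round(238.0 / 10000 * 100, 2) AS percent_of_table;
```

That is about 2.4% of the table, which is above the rough 1% guideline.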

Because your table is so small it will probably all be cached after being
read through once, so you may want to tune your config settings to
say that random disk access costs only a little more than sequential
access. However, you need to be careful if your table is going to grow
a lot larger.

> Since the table is still growing, and the number of rows returned by
> the query is quite uniform (it's not dependent on the size of the table), I
> hope that the statistics will evolve to a state that will force the use of
> the index.

Index scans aren't always faster than sequential scans.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

bruno
11/19/2004 2:16:48 PM
At 08:16 19/11/2004 -0600, Bruno Wolff III wrote:
>> The table currently contains just over 10000 elements. So 238 rows is a
>> small part of it.
> No, small is typically less than 1%. This depends on the size of the rows,
> on how much faster sequential access to disk blocks is than random access
> in your environment, and on the size of your cache.
PG runs on an old computer (200 MHz, 64 MB RAM); this is probably part of my
"problem". With modern hard drives, a sequential scan could be faster.

> Because your table is so small it will probably all be cached after being
> read through once, so you may want to tune your config settings to
> say that random disk access costs only a little more than sequential
> access.
I think that the indexes are all cached after a while, but I doubt that the
tables can be.

> However, you need to be careful if your table is going to grow
> a lot larger.
The whole database is quite large (that is, for the computer it is on).

>> Since the table is still growing, and the number of rows returned by
>> the query is quite uniform (it's not dependent on the size of the table), I
>> hope that the statistics will evolve to a state that will force the use of
>> the index.
> Index scans aren't always faster than sequential scans.
I know that, but I have made some comparisons with other queries. And someone
advised me to try "set enable_seqscan=off;". Using the index takes 50-60%
less time (I checked just now). Unfortunately I can't use this
setting, the query being part of a larger query (joins), and the time
gained on this particular index is partially lost on the joins.


-- 
Marc


pgml
11/19/2004 2:47:21 PM
On Fri, Nov 19, 2004 at 15:47:21 +0100,
  Marc Boucher <pgml@gmx.net> wrote:
> I know that, but I have made some comparisons with other queries. And someone
> advised me to try "set enable_seqscan=off;". Using the index takes 50-60%
> less time (I checked just now). Unfortunately I can't use this
> setting, the query being part of a larger query (joins), and the time
> gained on this particular index is partially lost on the joins.

Tweaking random_page_cost may help. The default is 4, which is often too
high. It is supposed to represent the relative cost of retrieving disk
blocks in a random order as compared to retrieving them in sequential
order. Lowering this value will make index scans look better. That value
should not be lowered below 1.
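
A minimal sketch of trying this out in a single session. The table `events`, the column `ts`, and the date range are hypothetical, and the value 2 is only a starting point to experiment with:

```sql
-- Report the value currently in effect (4 unless it has been changed).
SHOW random_page_cost;

-- Lower it for this session only; other connections are not affected.
SET random_page_cost = 2;

-- Check whether the planner now picks the index for the range query.
EXPLAIN
SELECT * FROM events WHERE ts > '2004-11-01' AND ts < '2004-11-08';

-- Go back to the configured value when done experimenting.
RESET random_page_cost;
```

Unlike enable_seqscan, this only shifts the cost estimates, so the planner remains free to choose a sequential scan where that really is cheaper.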

bruno
11/19/2004 3:16:28 PM
On Fri, 19 Nov 2004, Marc Boucher wrote:

> At 16:23 18/11/2004 +0100, Martijn van Oosterhout wrote:
> > The system seems to think that a scan is cheap because the table is so
> > small.
> The table currently contains just over 10000 elements. So 238 rows is a
> small part of it.
>
> > Have you ever ANALYZEd that table?
> Yes. That and reindex (in case of a problem).
> I have also tried ALTER TABLE ALTER COLUMN SET STATISTICS, but without
> improvement.
>
> > Also, EXPLAIN ANALYZE gives a much better idea of what is going on...
> You mean ANALYZE VERBOSE ?

EXPLAIN ANALYZE is similar to EXPLAIN except that it runs the query
and gives real time per step and real row counts rather than the
estimates.  It's very handy.
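
A minimal sketch of the difference, assuming a hypothetical table `events` with an indexed timestamp column `ts` (these names and the date range are illustrative, not taken from this thread):

```sql
-- Plan only: shows estimated costs and row counts, does not run the query.
EXPLAIN
SELECT * FROM events WHERE ts > '2004-11-01' AND ts < '2004-11-08';

-- Runs the query and adds actual time and actual row counts per plan node.
EXPLAIN ANALYZE
SELECT * FROM events WHERE ts > '2004-11-01' AND ts < '2004-11-08';
```

Because EXPLAIN ANALYZE really executes the statement, anything that modifies data should be wrapped in BEGIN ... ROLLBACK if its effects are not wanted.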

If random page reads are consistently less expensive than postgres
currently thinks, you may want to lower random_page_cost.  EXPLAIN ANALYZE
is useful here because it lets you compare the real times versus the
estimated costs of both scan types when used with enable_seqscan=off.
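
One way that comparison might look, again with a hypothetical table `events` and column `ts` standing in for the real ones:

```sql
-- 1. The plan the planner prefers on its own (here, presumably a seq scan).
EXPLAIN ANALYZE
SELECT * FROM events WHERE ts > '2004-11-01' AND ts < '2004-11-08';

-- 2. Discourage sequential scans so the index plan gets chosen instead.
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT * FROM events WHERE ts > '2004-11-01' AND ts < '2004-11-08';

-- 3. Restore the default so later queries are planned normally.
RESET enable_seqscan;
```

If the index plan has the higher estimated cost but the lower actual time, random page reads are being overestimated and random_page_cost is a candidate for lowering. Run each statement more than once, since the first run also pays for loading pages into cache.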

sszabo
11/19/2004 3:20:50 PM
Marc Boucher <pgml@gmx.net> writes:
>> Index scans aren't always faster than sequential scans.

> I know that, but I have made some comparisons with other queries. And someone
> advised me to try "set enable_seqscan=off;". Using the index takes 50-60%
> less time (I checked just now). Unfortunately I can't use this
> setting, the query being part of a larger query (joins), and the time
> gained on this particular index is partially lost on the joins.

A less brute-force way of encouraging the planner to use indexscans is
to reduce the random_page_cost setting.  It defaults to 4 but many
people find that values nearer 2 are more representative of what happens
in their environments.
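
If changing the setting globally feels too risky, one possible approach (a sketch, not something tested against the query in this thread) is to limit it to the transaction that runs the problem query. The tables `events` and `sources` and their columns are hypothetical:

```sql
BEGIN;

-- SET LOCAL lasts only until the end of the current transaction.
SET LOCAL random_page_cost = 2;

-- The larger query with its joins goes here; this one is a placeholder.
SELECT e.*
FROM events e
JOIN sources s ON s.id = e.source_id
WHERE e.ts > '2004-11-01' AND e.ts < '2004-11-08';

COMMIT;
```

After COMMIT (or ROLLBACK) the session is back to its previous random_page_cost, so other queries keep their usual plans.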

			regards, tom lane

tgl
11/19/2004 3:22:32 PM