[Babase] Re: Postgresql server badly needs tuneing!

Karl O. Pinc babase@www.eco.princeton.edu
Tue, 12 Oct 2004 22:00:37 -0500


On 2004.10.12 16:55 Hunter Matthews wrote:
> It is indeed the default.
> 
> I will reset this now. HOWEVER,

Good.  Thanks.

Did you restart the server?  My conversion and
anything else that happened to be running died.
I thought that sighup would re-read the config without
interrupting things.   ?

> 
> Please dig around a bit and discover all(1) of the tunings we should
> do
> - I'd like to apply them as a block.
> 
> 1. I fully realize things may well be discovered the second after you
> hit send - I'm flexible.

Here's my take.  Bear in mind that while I've oodles of database
experience, I don't have oodles of postgresql experience so,
while this is not exactly seat-of-the-pants, I wouldn't go so
far as to say I Know What I'm Doing.

-------------------------------------

EFFECTIVE_CACHE_SIZE (floating point)

     Sets the optimizer's assumption about the effective size of the 
disk cache (that is, the portion of the kernel's disk cache that will 
be used for PostgreSQL data files). This is measured in disk pages, 
which are normally 8 kB each.

(current value 1000)

---------------------

DEBUG_PRINT_PARSE (boolean)
DEBUG_PRINT_REWRITTEN (boolean)
DEBUG_PRINT_PLAN (boolean)
DEBUG_PRETTY_PRINT (boolean)

     These flags enable various debugging output to be sent to the 
server log. For each executed query, print either the query text, the 
resulting parse tree, the query rewriter output, or the execution plan. 
DEBUG_PRETTY_PRINT indents these displays to produce a more readable 
but much longer output format.

LOG_DURATION (boolean)

     Causes the duration of every completed statement to be logged. To 
use this option, enable LOG_STATEMENT and LOG_PID so you can link the 
statement to the duration using the process ID. 
LOG_PID (boolean)

     Prefixes each server message in the log file with the process ID 
of the backend process. This is useful to sort out which messages 
pertain to which connection. The default is off. This parameter does 
not affect messages logged via syslog, which always contain the process 
ID. LOG_STATEMENT (boolean)

     Causes each SQL statement to be logged. LOG_TIMESTAMP (boolean)

     Prefixes each server log message with a time stamp. The default is 
off.


**
I want to turn these on in my databases for testing.  Soon I may 
_really_ want them on.
I believe I can do this but I don't have access to the server logs to 
see the results.
I want to dynamically turn this off and on as I do things in the 
database and see
the query plans generated, in a real-time basis.  What can we do about 
this?  Any way
I can have access to the server logs?  Without this it can be a real 
pain to figure out where the slowness is.

-------------------------
SHOW_STATEMENT_STATS (boolean)
SHOW_PARSER_STATS (boolean)
SHOW_PLANNER_STATS (boolean)
SHOW_EXECUTOR_STATS (boolean)

     For each query, write performance statistics of the respective 
module to the server log. This is a crude profiling instrument.


STATS_COMMAND_STRING (boolean)
STATS_BLOCK_LEVEL (boolean)
STATS_ROW_LEVEL (boolean)

     These flags determine what information backends send to the 
statistics collector process: current commands, block-level activity 
statistics, or row-level activity statistics. All default to off. 
Enabling statistics collection costs a small amount of time per query, 
but is invaluable for debugging and performance tuning.

**
These might be useful to me too.

------------------------------
SYSLOG (integer)

     PostgreSQL allows the use of syslog for logging. If this option is 
set to 1, messages go both to syslog and the standard output. A setting 
of 2 sends output only to syslog. (Some messages will still go to the 
standard output/error.) The default is 0, which means syslog is off. 
This option must be set at server start. SYSLOG_FACILITY (string)

     This option determines the syslog "facility" to be used when 
syslog is enabled. You may choose from LOCAL0, LOCAL1, LOCAL2, LOCAL3, 
LOCAL4, LOCAL5, LOCAL6, LOCAL7; the default is LOCAL0. See also the 
documentation of your system's syslog.

**
These might be a way to deliver logs to me.  Maybe these can be set on a
per database basis (I can actually set them as DB owner, maybe.)  ?
I can set lots of other stuff on a per-database basis so why not these?
(Ok, so why not facility at least?)

Along these lines, I want to write to the syslog on the web server
and archive the results so we can keep records of what programs where
run when by whom to do what, like good little scientists should.
Whatdaysay?
-----------------------------------
SORT_MEM (integer)

     Specifies the amount of memory to be used by internal sorts and 
hashes before switching to temporary disk files. The value is specified 
in kilobytes, and defaults to 1024 kilobytes (1 MB). Note that for a 
complex query, several sorts might be running in parallel, and each one 
will be allowed to use as much memory as this value specifies before it 
starts to put data into temporary files. Also, each running backend 
could be doing one or more sorts simultaneously, so the total memory 
used could be many times the value of SORT_MEM. Sorts are used by ORDER 
BY, merge joins, and CREATE INDEX.

**
Uh, I've no idea if this should be increased.  It is 1024 now.

--------------------------------------------
VACUUM_MEM (integer)

     Specifies the maximum amount of memory to be used by VACUUM to 
keep track of to-be-reclaimed tuples. The value is specified in 
kilobytes, and defaults to 8192 kilobytes. Larger settings may improve 
the speed of vacuuming large tables that have many deleted tuples.

**
You mentioned vacuum time limits.  Tweaking this might help.


> 
> 
> On Tue, 2004-10-12 at 14:46, Karl O. Pinc wrote:
> > Hi,
> >
> > I see that shared_buffers (presumeably in postgresql.conf)
> > is set to the default, 64.  This allocates only .5MB of
> > RAM (64 * 8KB blocks).  They recommend a setting of several thousand
> > at least.  I leave it to you to set it based on available RAM.
> >
> > See:
> > http://www.postgresql.org/docs/7.3/static/runtime-config.html
> >
> > You can send the server SIGHUP to reset this without interrupting
> > service.
> >
> > I'd appreciate it if you'd get to it ASAP as I'm working
> > on improving performance and there's no point in proceeding
> > until the server's tuned.

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                  -- Robert A. Heinlein