[Babase] query for marie

Karl O. Pinc kop at meme.com
Fri May 11 10:00:12 EDT 2007


On 05/11/2007 08:02:23 AM, Susan Alberts wrote:

  We should use you for harder problems, defined as problems
> that several of us can't solve after a day of trying or so. Does this  
> make sense? See more below.

Makes sense to me.

>>> Secondary problem: we selected on members.date instead of  
>>> cycpoints.date. I am not sure whether this was a problem or not but  
>>> on thinking about it I realized that the date in cycpoints was the  
>>> relevant date (it was the D date) and not the date in members.
>> 
>> This is not a problem, so long as members.date = cycpoints.date.  :-)
> 
> Thanks for this input -- I figured that this was the case, but it  
> seemed to me that the query would be more efficient if we selected on  
> cycpoints.date rather than members.date -- in other words, it might  
> run quicker. Is this right, or is it really not an issue?

It's not supposed to be an issue.  There are cases, particularly
with very complex queries with subqueries and so forth,
where it can make a difference.  The optimizer is smart but
does not do absolutely everything.  For instance, historically,
it has a preference for constants appearing on the right side
of = operators, at least in some databases.

Having said that, the only time it's worth thinking about is when
the query's going to be run over and over.  Otherwise it's
most likely not worth spending your time on.  It's also rather
rare to come across a case where fiddling with stuff like that
will actually make a difference.  I could be wrong so you might
want to experiment a little.  Put "EXPLAIN ANALYZE" in front of
your SELECT statements to do benchmarks.  Always run the SELECT
twice this way and take the results from the second run, otherwise
you won't get reproducible results because of disk read caching.
It's also a good idea to do benchmarks on a more idle system,
so today wouldn't be good.

What _is_ worth thinking about is the kind of optimization you made,
where you did members.date = cycpoints.date and so cut the number
of rows down by a couple of factors of 10, at least, rather than
doing what I did and later eliminating the rows using DISTINCT.

Karl <kop at meme.com>
Free Software:  "You don't pay back, you pay forward."
                  -- Robert A. Heinlein



More information about the Babase mailing list