Diffstat (limited to 'docs-2.0/manual.xml')
-rw-r--r-- docs-2.0/manual.xml 3927
1 files changed, 3927 insertions, 0 deletions
diff --git a/docs-2.0/manual.xml b/docs-2.0/manual.xml
new file mode 100644
index 0000000..9019e80
--- /dev/null
+++ b/docs-2.0/manual.xml
@@ -0,0 +1,3927 @@
1<?xml version="1.0" encoding="UTF-8"?>
2<?xml-stylesheet href="file://localhost/home/urkle/Documents/DocBook/docbook.css" type="text/css"?>
3<!DOCTYPE article PUBLIC "-//OOOCC//DTD Simplified DocBook XML V1.1 Variant V1.0//EN" "http://outoforder.cc/dtds/odocbook/1.1/odocbook.dtd" [
4<!ENTITY EmailContact "<email>urkle &lt;at&gt; outoforder &lt;dot&gt; cc</email>">
5]>
6<article>
7 <articleinfo>
8 <title>mod_log_sql Manual</title>
9 <author>
10 <firstname>Edward</firstname>
11 <surname>Rudd</surname>
12 <contrib>Conversion from Lyx to DocBook</contrib>
13 <contrib>Current Maintainer</contrib>
14 <authorblurb>
15 <simpara>
16 &EmailContact;
17 </simpara>
18 </authorblurb>
19 </author>
20 <author>
21 <firstname>Christopher</firstname>
22 <othername>B.</othername>
23 <surname>Powell</surname>
24 <contrib>Original documentation author.</contrib>
25 <authorblurb>
26 <simpara>
27 <email>chris &lt;at&gt; grubbybaby &lt;dot&gt; com</email>
28 </simpara>
29 </authorblurb>
30 </author>
31 <copyright>
32 <year>2001</year>
33 <year>2002</year>
34 <year>2003</year>
35 <holder>Christopher B. Powell</holder>
36 </copyright>
37 <copyright>
38 <year>2004</year>
39 <year>2005</year>
40 <year>2006</year>
41 <holder>Edward Rudd</holder>
42 </copyright>
43 <revhistory>
44 <revision>
45 <revnumber>1.5</revnumber>
46 <date>2006-11-04</date>
47 <revremark>Added documentation about logio parameters and added DBParam Mysql driver parameters (including tabletype)</revremark>
48 </revision>
49 <revision>
50 <revnumber>1.4</revnumber>
51 <date>2006-02-13</date>
52 <revremark>Added missing logformat types, switched to simplified docbook 1.1</revremark>
53 </revision>
54 <revision>
55 <revnumber>1.3</revnumber>
56 <date>2005-01-11</date>
57 <revremark>Updated for mod_log_sql v1.100</revremark>
58 </revision>
59 <revision>
60 <revnumber>1.2</revnumber>
61 <date>2004-04-08</date>
62 <revremark>Updated for mod_log_sql v1.97</revremark>
63 </revision>
64 <revision>
65 <revnumber>1.1</revnumber>
66 <date>2004-03-02</date>
67 <revremark>Updated for mod_log_sql v1.96</revremark>
68 </revision>
69 <revision>
70 <revnumber>1.0</revnumber>
71 <date>2004-01-22</date>
72 <revremark>Initial Conversion from Lyx to Docbook</revremark>
73 </revision>
74 </revhistory>
75 </articleinfo>
76 <section>
77 <title>Introduction</title>
78 <section tocstyle="fragment">
79 <title>Summary</title>
80 <para>
81 This Apache module will permit you to log to a SQL database; it
82 can log each access request as well as data associated with each
83 request: cookies, notes, and inbound/outbound headers. Unlike
84 logging to a flat text file -- which is standard in Apache -- a
85 SQL-based log exhibits tremendous flexibility and power of data
86 extraction. (See FAQ entry
87 <xref linkend="FAQ.WhyLogToSQL" />
88 for further discussion and examples of the advantages to SQL.)
89 </para>
90 <para>
91 This module can either replace or happily coexist with
92 mod_log_config, Apache's text file logging facility. In addition
93 to being more configurable than the standard module, mod_log_sql
94 is much more flexible.
95 </para>
96 </section>
97 <section tocstyle="fragment">
98 <title>Approach</title>
99 <para>
100 This project was formerly known as "mod_log_mysql." It was
101 renamed "mod_log_sql" in order to reflect the project goal of
102 database independence. The module currently supports MySQL,
103 but support for other database back-ends is underway.
104 </para>
105 <para>
106 To reduce connection overhead, links are kept alive in
107 between queries. This module uses one dedicated SQL link per
108 httpd child, opened by each child process when it is born. Among
109 other things, this means that this module supports logging into
110 only one MySQL server, and for now, also, only one SQL database.
111 But that's a small tradeoff for the speed gained by reusing
112 the link. Error reporting is robust throughout the module and
113 will inform the administrator of database issues in the Apache
114 ErrorLog for the server/virtual server.
115 </para>
116 <para>
117 Virtual hosts are supported in the same manner they are in the
118 regular logging modules. The administrator defines some basic
119 'global' directives in the main server config, then defines more
120 specific 'local' directives inside each VirtualHost stanza.
121 </para>
122 <para>
123 A robust "preserve" capability has now been implemented. This
124 permits the module to preserve any failed INSERT commands to a
125 local file on its machine. In any situation that the database is
126 unavailable -- e.g. the network fails or the database host is
127 rebooted -- mod_log_sql will note this in the error log and
128 begin appending its log entries to the preserve file (which is
129 created with the user and group ID of the running Apache
130 process, e.g. "nobody/nobody" on many Linux installations). When
131 database availability returns, mod_log_sql seamlessly resumes
132 logging to it. When convenient for the sysadmin, he/she can
133 easily import the preserve file into the database because it is
134 simply a series of SQL insert statements.
135 </para>
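 <para>
 As a rough sketch of that import step, assume the preserve
 file was written to the hypothetical path
 <filename>/var/log/httpd/mod_log_sql-preserve</filename>
 (the real location depends on your configuration), that the
 logging database is named "apachelogs", and that the MySQL
 account you connect with is allowed to insert into it:
 </para>
 <programlisting># mysql -uadmin -p apachelogs &lt; /var/log/httpd/mod_log_sql-preserve
Enter password:</programlisting>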
136 </section>
137 <section tocstyle="fragment">
138 <title>What gets logged by default?</title>
139 <para>
140 All the data that would be contained in the "Combined Log
141 Format" is logged by default, plus a little extra. Your best bet
142 is to begin by accepting this default, then later customize the
143 log configuration based on your needs. The documentation of the
144 run-time directives includes a full explanation of what you can
145 log, including examples -- see section
146 <xref endterm="Sect.ConfigReference.title"
147 linkend="Sect.ConfigReference" />
148 .
149 </para>
150 </section>
151 <section tocstyle="fragment">
152 <title>Miscellaneous Notes</title>
153 <itemizedlist>
154 <listitem>
155 <para>
156 Note which directives go in the 'main server config' and
157 which directives apply to the 'virtual host config'. This is
158 made clear in the directive documentation.
159 </para>
160 </listitem>
161 <listitem>
162 <para>
163 The 'time_stamp' field is stored in an UNSIGNED INTEGER
164 format, in the standard unix "seconds since the epoch"
165 format. This is superior to storing the access time as a
166 string due to size requirements: an UNSIGNED INT requires 4
167 bytes, whereas an Apache date string (e.g.
168 "18/Nov/2001:13:59:52 -0800") requires 26 bytes: those extra
169 22 bytes become significant when multiplied by thousands of
170 accesses on a busy server. Besides, an INT type is far more
171 flexible for comparisons, etc.
172 </para>
173 <para>
174 In MySQL 3.21 and above you can easily convert this to a
175 human readable format using from_unixtime(), e.g.:
176 </para>
177 <programlisting>SELECT remote_host,request_uri,from_unixtime(time_stamp)
178FROM access_log;</programlisting>
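 <para>
 Because the column is a plain integer, range comparisons
 are straightforward. A minimal sketch, assuming you want
 the requests logged on one arbitrarily chosen day;
 unix_timestamp() is MySQL's inverse of from_unixtime():
 </para>
 <programlisting>SELECT remote_host,request_uri
FROM access_log
WHERE time_stamp &gt;= unix_timestamp('2001-11-18 00:00:00')
 AND time_stamp &lt; unix_timestamp('2001-11-19 00:00:00');</programlisting>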
179 <para>
180 The enclosed perl program "make_combined_log.pl" extracts
181 your access log in a format that is completely compatible
182 with the Combined Log Format. You can then feed this to your
183 favorite web log analysis tool.
184 </para>
185 </listitem>
186 <listitem>
187 <para>
188 The table's string values can be CHAR or VARCHAR, at a
189 length of your choice. VARCHAR is superior because it stores
190 only the characters actually present; CHAR types are fixed-length and will
191 be padded with spaces, resulting in waste. Just like the
192 time_stamp issue described above, that kind of space waste
193 multiplies over thousands of records.
194 </para>
195 </listitem>
196 <listitem>
197 <para>
198 Be careful not to go overboard setting fields to NOT NULL.
199 If a field is marked NOT NULL then it must contain data in
200 the INSERT statement, or the INSERT will fail. These
201 mysterious failures can be quite frustrating and difficult
202 to debug.
203 </para>
204 </listitem>
205 <listitem>
206 <para>
207 When Apache logs a numeric field, it uses a '-' character to
208 mean "not applicable," e.g. the number of bytes returned on
209 a 304 (unchanged) request. Since '-' is an illegal character
210 in an SQL numeric field, such fields are assigned the value
211 0 instead of '-' which, of course, makes perfect sense
212 anyway.
213 </para>
214 </listitem>
215 </itemizedlist>
216 </section>
217 <section tocstyle="fragment">
218 <title>Author / Maintainer</title>
219 <para>
220 The actual logging code was taken from the already existing flat
221 file text modules, so all that credit goes to the Apache
222 Software Foundation.
223 </para>
224 <para>
225 The MySQL routines and directives were added by Zeev Suraski
226 &lt;bourbon@netvision.net.il&gt;.
227 </para>
228 <para>
229 All changes from 1.06+ and the new documentation were added by
230 Chris Powell
231 <email>chris &lt;at&gt; grubbybaby &lt;dot&gt; com</email>
232 . It seems that the module had fallen into the "un-maintained"
233 category -- it had not been updated since 1998 -- so Chris
234 adopted it as the new maintainer.
235 </para>
236 <para>
237 In December of 2003, Edward Rudd
238 &EmailContact;
239 began porting the module to Apache 2.0, cleaning up the code,
240 converting the documentation to DocBook, optimizing the main
241 logging loop, and adding the much anticipated database
242 abstraction layer.
243 </para>
244 <para>
245 As of February 2004, Chris Powell handed maintenance of the
246 module over to Edward Rudd. So you should contact Edward Rudd
247 about the module from now on.
248 </para>
249 </section>
250 <section id="Sect.MailingLists" tocstyle="fragment">
251 <title id="Sect.MailingLists.title">Mailing Lists</title>
252 <para>
253 A general discussion and support mailing list is provided for
254 mod_log_sql at lists.outoforder.cc. To subscribe to the mailing
255 list send a blank e-mail to
256 mod_log_sql-subscribe@lists.outoforder.cc. The list archives can
257 be accessed via Gmane.org's mailing list gateway via any news
258 reader
259 <ulink
260 url="news://news.gmane.org/gmane.comp.apache.mod-log-sql">
261 news://news.gmane.org/gmane.comp.apache.mod-log-sql
262 </ulink>
263 , or via a web browser at
264 <ulink
265 url="http://news.gmane.org/gmane.comp.apache.mod-log-sql">
266 http://news.gmane.org/gmane.comp.apache.mod-log-sql
267 </ulink>
268 .
269 </para>
270 </section>
271 </section>
272 <section>
273 <title>Installation</title>
274 <section tocstyle="fragment">
275 <title>Requirements</title>
276 <itemizedlist>
277 <listitem>
278 <para>
279 A compatible system. mod_log_sql was authored and tested on
280 systems based on Red Hat Linux (Red Hat, Mandrake), but the
281 module should easily adapt to any modern distribution.
282 mod_log_sql has also been ported successfully to Solaris and
283 FreeBSD.
284 </para>
285 </listitem>
286 <listitem>
287 <para>
288 Apache 1.3 or 2.0. Apache 1.2 is no longer supported, but may still
289 compile. Ideally you should already have successfully
290 compiled Apache and understand the process, but this
291 document tries to make it simple for beginners.
292 </para>
293 </listitem>
294 <listitem>
295 <para>
296 The MySQL development headers. This package is called
297 different things on different distributions. For example,
298 Red Hat 6.x calls this RPM "MySQL-devel" whereas Mandrake
299 calls it "libmysql10-devel." Both MySQL 3.23.x and 4.x are
300 supported.
301 </para>
302 </listitem>
303 <listitem>
304 <para>
305 MySQL &gt;= 3.23.15 configured, installed and running on
306 either localhost or an accessible networked machine. You
307 should already have a basic understanding of MySQL and how
308 it functions.
309 </para>
310 </listitem>
311 <listitem>
312 <para>
313 Optionally, if you want to be able to log SSL information
314 such as keysize or cipher, you need OpenSSL and mod_ssl
315 installed.
316 </para>
317 </listitem>
318 </itemizedlist>
319 </section>
320 <section>
321 <title>Compiling and Installing</title>
322 <orderedlist>
323 <listitem>
324 <para>Unpack the archive into a working directory.</para>
325 <programlisting>$ tar -xzf mod_log_sql-1.94.tar.gz
326$ cd mod_log_sql-1.94</programlisting>
327 </listitem>
328 <listitem>
329 <para>Run configure to configure the source directory.</para>
330 <programlisting>$ ./configure</programlisting>
331 <para>
332 The
333 <filename>configure</filename>
334 script should automatically detect all the required
335 libraries and programs if they are installed in standard
336 locations. If it returns an error, here is a description of
337 the arguments you can specify when you run
338 <filename>configure</filename>
339 .
340 </para>
341 <variablelist>
342 <varlistentry>
343 <term>--with-apxs=/usr/sbin/apxs</term>
344 <listitem>
345 <para>
346 This is the full path to the apxs binary, or the
347 directory which contains the program. This program is
348 part of the Apache 1.3 and 2.0 installation.
349 </para>
350 <para>
351 The default is to search
352 <filename>/usr/bin/apxs</filename>
353 and
354 <filename>/usr/sbin/apxs</filename>
355 .
356 </para>
357 <para>
358 Specifying a directory here will search
359 $directory/apxs, $directory/bin/apxs, and
360 $directory/sbin/apxs.
361 </para>
362 <para>
363 If you have more than one version of Apache installed,
364 you need to specify the correct apxs binary for the
365 one you wish to compile for.
366 </para>
367 </listitem>
368 </varlistentry>
369 <varlistentry>
370 <term>--with-mysql=/path/to/mysql</term>
371 <listitem>
372 <para>
373 This is the directory to search for the
374 <filename>libmysqlclient</filename>
375 library and the
376 <application>MySQL</application>
377 headers.
378 </para>
379 <para>
380 The default is to search
381 <filename>/usr/include</filename>
382 ,
383 <filename>/usr/include/mysql</filename>
384 ,
385 <filename>/usr/local/include</filename>
386 , and
387 <filename>/usr/local/include/mysql</filename>
388 for
389 <application>MySQL</application>
390 headers, and
391 <filename>/usr/lib</filename>
392 ,
393 <filename>/usr/lib/mysql</filename>
394 ,
395 <filename>/usr/local/lib</filename>
396 , and
397 <filename>/usr/local/lib/mysql</filename>
398 for the
399 <application>MySQL</application>
400 libraries.
401 </para>
402 <para>
403 Specifying this argument will search
404 $directory/include and $directory/mysql for
405 <application>MySQL</application>
406 headers, and $directory/lib and $directory/lib/mysql
407 for
408 <application>MySQL</application>
409 libraries.
410 </para>
411 </listitem>
412 </varlistentry>
413 <varlistentry>
414 <term>--enable-ssl</term>
415 <listitem>
416 <para>
417 Specifying this argument will enable the search for
418 mod_ssl and SSL headers, and if found will enable
419 compilation of SSL support into mod_log_sql. SSL
420 support is compiled into a separate module that can be
421 loaded after the main mod_log_sql.
422 </para>
423 </listitem>
424 </varlistentry>
425 <varlistentry>
426 <term>--with-ssl-inc=/usr/include/openssl</term>
427 <listitem>
428 <para>
429 This is the path to the SSL toolkit header files that
430 were used to compile mod_ssl. If you want SSL support
431 you most likely need to specify this.
432 </para>
433 <para>
434 The default is to search
435 <filename>/usr/include</filename>
436 and
437 <filename>/usr/include/openssl</filename>
438 .
439 </para>
440 <para>
441 Specifying this argument will search that directory
442 for the SSL headers.
443 </para>
444 </listitem>
445 </varlistentry>
446 <varlistentry>
447 <term>--with-db-inc=/usr/include/db1</term>
448 <listitem>
449 <para>
450 This argument is only needed when compiling SSL
451 support for Apache 1.3, and needs to be the directory
452 which contains the ndbm.h header file. You can find
453 this by using
454 </para>
455 <programlisting>$ locate ndbm.h
456/usr/include/db1/ndbm.h
457/usr/include/gdbm/ndbm.h</programlisting>
458 <para>
459 As far as I can tell, there is no difference as to
460 which you specify, but it should be the one that you
461 compiled mod_ssl with.
462 </para>
463 <para>
464 The default is
465 <filename>/usr/include/db1</filename>
466 , which should work on most systems.
467 </para>
468 </listitem>
469 </varlistentry>
470 <varlistentry>
471 <term>--disable-apachetest</term>
472 <listitem>
473 <para>
474 This will disable the Apache version test. As a side
475 effect, the configure script will not be able to
476 determine which version of Apache you are compiling
477 for, so don't specify this. If you
478 are having trouble with the script detecting your
479 Apache version, then send a bug report along with your
480 system OS version and versions of related packages.
481 </para>
482 </listitem>
483 </varlistentry>
484 <varlistentry>
485 <term>--disable-mysqltest</term>
486 <listitem>
487 <para>
488 This will disable the MySQL compile test. Specify this
489 if for some reason the test fails but you know you have
490 specified the correct directories. If mod_log_sql also
491 fails to compile, report a bug along with your system
492 OS version and versions of related packages.
493 </para>
494 </listitem>
495 </varlistentry>
496 </variablelist>
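 <para>
 For illustration, a single invocation that combines several
 of these arguments might look like the following. The paths
 are examples only (/usr/local/mysql in particular is an
 assumed install prefix); substitute the locations on your
 own system.
 </para>
 <programlisting>$ ./configure --with-apxs=/usr/sbin/apxs \
 --with-mysql=/usr/local/mysql \
 --enable-ssl \
 --with-ssl-inc=/usr/include/openssl</programlisting>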
497 </listitem>
498 <listitem>
499 <para>
500 Now compile the module with GNU make. You may have to
501 specify gmake on some systems like FreeBSD.
502 </para>
503 <programlisting>$ gmake</programlisting>
504 </listitem>
505 <listitem>
506 <para>
507 If there were no errors, you can now install the module(s).
508 If you compiled as a non-root user you may need to switch
509 users with
510 <application>su</application>
511 or
512 <application>sudo</application>
513 .
514 </para>
515 <programlisting>$ su -c "gmake install"
516Password:</programlisting>
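 <para>
 On systems where you use sudo rather than su, the
 equivalent would be along these lines:
 </para>
 <programlisting>$ sudo gmake install</programlisting>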
517 </listitem>
518 <listitem>
519 <para>
520 Now edit your Apache configuration and load the modules.
521 </para>
522 <note>
523 <itemizedlist>
524 <listitem>
525 <para>
526 If you are loading the SSL logging module, you need to
527 make sure it is loaded after mod_ssl and mod_log_sql.
528 </para>
529 </listitem>
530 <listitem>
531 <para>
532 If you have previously used mod_log_sql version 1.18,
533 the name of the module has changed from sql_log_module
534 to log_sql_module (the first parameter to LoadModule).
535 </para>
536 </listitem>
537 <listitem>
538 <para>
539 If you are upgrading from any release earlier than
540 1.97 you need to add an extra LoadModule directive to
541 load the database driver (i.e. mysql).
542 </para>
543 </listitem>
544 </itemizedlist>
545 </note>
546 <orderedlist>
547 <listitem>
548 <para>
549 Insert these lines into either the main
550 <filename>httpd.conf</filename>
551 or a file included via an include directive.
552 </para>
553 <programlisting>LoadModule log_sql_module modules/mod_log_sql.so
554LoadModule log_sql_mysql_module modules/mod_log_sql_mysql.so
555&lt;IfModule mod_ssl.c&gt;
556LoadModule log_sql_ssl_module modules/mod_log_sql_ssl.so
557&lt;/IfModule&gt;</programlisting>
558 <note>
559 <para>
560 If you did not compile SSL support in mod_log_sql, do
561 not include the lines between the &lt;IfModule&gt;
562 directives.
563 </para>
564 </note>
565 </listitem>
566 <listitem>
567 <para>
568 If you are using Apache 1.3 you may need to add these lines
569 later in the configuration.
570 </para>
571 <programlisting>AddModule mod_log_sql.c
572AddModule mod_log_sql_mysql.c
573&lt;IfModule mod_ssl.c&gt;
574AddModule mod_log_sql_ssl.c
575&lt;/IfModule&gt;</programlisting>
576 <note>
577 <para>
578 If you did not compile SSL support in mod_log_sql, do
579 not include the lines between the &lt;IfModule&gt;
580 directives.
581 </para>
582 </note>
583 </listitem>
584 </orderedlist>
585 </listitem>
586 </orderedlist>
587 </section>
588 </section>
589 <section id="Sect.Configuration">
590 <title id="Sect.Configuration.title">Configuration</title>
591 <section id="Sect.Preperation">
592 <title id="Sect.Preperation.title">
593 Preparing MySQL for logging
594 </title>
595 <para>
596 You have to prepare the database to receive data from
597 <application>mod_log_sql</application>
598 , and set up run-time directives in
599 <filename>httpd.conf</filename>
600 to control how and what
601 <application>mod_log_sql</application>
602 logs.
603 </para>
604 <para>
605 This section will discuss how to get started with a basic
606 configuration. Full documentation of all available run-time
607 directives is available in section
608 <xref endterm="Sect.ConfigReference.title"
609 linkend="Sect.ConfigReference" />
610 .
611 </para>
612 <orderedlist>
613 <listitem>
614 <para>
615 mod_log_sql can make its own tables on-the-fly, or you can
616 pre-make the tables by hand. The advantage of letting the
617 module make the tables is ease-of-use, but for raw
618 performance you will want to pre-make the tables in order to
619 save some overhead. In this basic setup we'll just let the
620 module create tables for us.
621 </para>
622 </listitem>
623 <listitem>
624 <para>
625 We still need to have a logging database created and ready,
626 so run the MySQL command line client and create a database:
627 </para>
628 <programlisting># mysql -uadmin -p
629Enter password:
630mysql&gt; create database apachelogs;</programlisting>
631 </listitem>
632 <listitem id="Item.CreateTable">
633 <para>
634 If you want to hand-create the tables, run the enclosed
635 'create-tables' SQL script as follows ("create_tables.sql"
636 needs to be in your current working directory).
637 </para>
638 <programlisting>mysql&gt; use apachelogs
639Database changed
640mysql&gt; source create_tables.sql</programlisting>
641 </listitem>
642 <listitem>
643 <para>
644 Create a specific
645 <application>MySQL</application>
646 userid that
647 <application>httpd</application>
648 will use to authenticate and enter data. This userid need
649 not be an actual Unix user. It is a userid internal to
650 <application>MySQL</application>
651 with specific privileges. In the following example command,
652 "apachelogs" is the database, "loguser" is the userid to
653 create, "my.apachemachine.com" is the name of the Apache
654 machine, and "l0gger" is the password to assign. Choose
655 values that are different from these examples.
656 </para>
657 <programlisting>mysql&gt; grant insert,create on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger';</programlisting>
658 </listitem>
659 <listitem>
660 <para>
661 You may be especially security-paranoid and want "loguser"
662 to not have "create" capability within the "apachelogs"
663 database. You can disable that privilege, but the cost is
664 that you will not be able to use the module's on-the-fly
665 table creation feature. If that cost is acceptable,
666 hand-create the tables as described in step
667 <xref linkend="Item.CreateTable" />
668 and use the following GRANT statement instead of the one
669 above:
670 </para>
671 <programlisting>mysql&gt; grant insert on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger';</programlisting>
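 <para>
 To double-check what the account ended up with, you can ask
 MySQL to list its privileges. This sketch reuses the example
 user and host from above; the exact output format varies
 between MySQL versions.
 </para>
 <programlisting>mysql&gt; show grants for loguser@my.apachemachine.com;</programlisting>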
672 </listitem>
673 <listitem id="Item.EnableLogging">
674 <para>
675 Enable full logging of your
676 <application>MySQL</application>
677 daemon (at least temporarily for debugging purposes) if you
678 don't do this already. Edit /etc/my.cnf and add the
679 following line to your [mysqld] section:
680 </para>
681 <programlisting>log=/var/log/mysql-messages</programlisting>
682 <para>
683 Then restart
684 <application>MySQL</application>
685 </para>
686 <programlisting># /etc/rc.d/init.d/mysql restart</programlisting>
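 <para>
 While testing, it can help to watch that log in a second
 terminal so you can see connections and INSERT statements
 arrive. A sketch, assuming the log path configured above:
 </para>
 <programlisting># tail -f /var/log/mysql-messages</programlisting>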
687 </listitem>
688 </orderedlist>
689 </section>
690 <section>
691 <title>A very basic logging setup in Apache</title>
692 <orderedlist>
693 <listitem>
694 <para>
695 Tell the module what database to use and the appropriate
696 authentication information.
697 </para>
698 <para>
699 So, edit httpd.conf and insert the following lines somewhere
700 after any LoadModule / AddModule statements. Make sure these
701 statements are "global," i.e. not inside any VirtualHost
702 stanza. You will also note that you are embedding a password
703 in the file. Therefore you are advised to "chmod 660
704 httpd.conf" to prevent unauthorized regular users from
705 viewing your database user and password.
706 </para>
707 <para>
708 Use the
709 <application>MySQL</application>
710 database called "apachelogs" running on "dbmachine.foo.com".
711 Use username "loguser" and password "l0gger" to authenticate
712 to the database. Permit the module to create tables for us.
713 </para>
714 <example>
715 <title>Basic Example</title>
716 <programlisting>LogSQLLoginInfo mysql://loguser:l0gger@dbmachine.foo.com/apachelogs
717LogSQLCreateTables on</programlisting>
718 </example>
719 <para>
720 If your database resides on localhost instead of another
721 host, specify the MySQL server's socket file as follows:
722 </para>
723 <programlisting>LogSQLDBParam socketfile /your/path/to/mysql.sock</programlisting>
724 <para>
725 If your database is listening on a port other than 3306,
726 specify the correct TCP port as follows:
727 </para>
728 <programlisting>LogSQLDBParam port 1234</programlisting>
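 <para>
 Putting these pieces together, a global configuration for a
 database on localhost might look like the sketch below. The
 password placeholder and the socket path are assumptions;
 use the values from your own MySQL installation.
 </para>
 <programlisting>LogSQLLoginInfo mysql://loguser:YOUR_PASSWORD@localhost/apachelogs
LogSQLDBParam socketfile /var/lib/mysql/mysql.sock
LogSQLCreateTables on</programlisting>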
729 </listitem>
730 <listitem>
731 <para>
732 The actual logging is set up on a per-virtual-host
733 basis. So, skip down to the virtual host you want to set up.
734 Instruct this virtual host to log entries to the table
735 "access_log" by inserting a LogSQLTransferLogTable
736 directive. (The LogSQLTransferLogTable directive is the
737 minimum required to log -- other directives that you will
738 learn about later simply tune the module's behavior.)
739 </para>
740 <programlisting>&lt;VirtualHost 1.2.3.4&gt;
741 [snip]
742 LogSQLTransferLogTable access_log
743 [snip]
744&lt;/VirtualHost&gt;</programlisting>
745 </listitem>
746 <listitem>
747 <para>Restart Apache.</para>
748 <programlisting># /etc/rc.d/init.d/httpd stop
749# /etc/rc.d/init.d/httpd start</programlisting>
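 <para>
 If Apache refuses to start, a syntax check of the
 configuration usually points at the offending directive.
 This assumes the apachectl script that ships with Apache is
 in your path:
 </para>
 <programlisting># apachectl configtest</programlisting>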
750 </listitem>
751 </orderedlist>
752 </section>
753 <section>
754 <title>Testing the basic setup</title>
755 <orderedlist>
756 <listitem>
757 <para>
758 Visit your web site in a browser to trigger some hits, then
759 confirm that the entries are being successfully logged:
760 </para>
761 <programlisting># mysql -hdbmachine.foo.com -umysqladmin -p -e "SELECT * FROM access_log" apachelogs
762Enter password:</programlisting>
763 <para>
764 Several lines of output should follow, corresponding to your
765 hits on the site. You now have basic functionality. Don't
766 disable your regular Apache logs until you feel comfortable
767 that the database is behaving as you'd like and that things
768 are going well. If you do not see any entries in the
769 access_log, please consult section
770 <xref linkend="FAQ.NothingLogged" />
771 of the FAQ on how to debug and fix the situation.
772 </para>
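 <para>
 On a busy site the full table dump gets long quickly. A
 narrower query, shown here as a sketch to run from the
 mysql client with the "apachelogs" database selected,
 lists only the ten most recent hits:
 </para>
 <programlisting>mysql&gt; SELECT remote_host,request_uri,from_unixtime(time_stamp)
 -&gt; FROM access_log
 -&gt; ORDER BY time_stamp DESC LIMIT 10;</programlisting>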
773 </listitem>
774 <listitem>
775 <para>
776 You can now activate the advanced features of mod_log_sql,
777 which are described in the next section.
778 </para>
779 </listitem>
780 </orderedlist>
781 </section>
782 <section>
783 <title>How to tune logging with run-time directives</title>
784 <section tocstyle="fragment">
785 <title>Instructing the module what to log</title>
786 <para>
787 The most basic tuning directive for the module is
788 LogSQLTransferLogFormat, which tells the module which
789 information to send to the database; if it is omitted, the
790 module falls back to its default format. Place a
791 LogSQLTransferLogFormat directive in the VirtualHost stanza of
792 each virtual host whose logging you want to customize.
793 </para>
794 <para>
795 After LogSQLTransferLogFormat you supply a string of
796 characters that tell the module what information to log. In
797 the configuration directive reference (section
798 <xref linkend="Conf.LogSQLTransferLogFormat" />
799 ) there is a table which clearly defines all the possible
800 things to log. Let's say you want to log only the "request
801 time," the "remote host," and the "request"; you'd use:
802 </para>
803 <programlisting>LogSQLTransferLogFormat hUS</programlisting>
804 <para>But a more appropriate string to use is</para>
805 <programlisting>LogSQLTransferLogFormat AbHhmRSsTUuv</programlisting>
806 <para>
807 which logs all the information required to be compatible with
808 the Combined Log Format (CLF).
809 </para>
810 <para>
811 If you don't choose to log everything that is available,
812 that's fine. Fields in the unused columns in your table will
813 simply contain NULL.
814 </para>
815 <para>
816 Some of the LogSQLTransferLogFormat characters require a
817 little extra configuration:
818 </para>
819 <itemizedlist>
820 <listitem>
821 <para>
822 If you specify 'c' to indicate that you want to log the
823 cookie value, you must also tell the module which cookie
824 you mean by using LogSQLWhichCookie -- after all, there
825 could be many cookies associated with a given request.
826 Fail to specify LogSQLWhichCookie, and no cookie
827 information at all will be logged.
828 </para>
829 </listitem>
830 <listitem>
831 <para>
832 If you specify 'M' to indicate that you want to log the
833 machine ID, you must also tell the module this machine's
834 identity using the LogSQLMachineID directive. Fail to
835 specify LogSQLMachineID, and a simple '-' character will
836 be logged in the machine_id column.
837 </para>
838 </listitem>
839 </itemizedlist>
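 <para>
 As an illustration, the sketch below adds 'c' and 'M' to
 the Combined Log Format string. The cookie name "Clicks"
 and the machine ID "web01" are made-up values, and the
 placement of each directive (main server config versus
 VirtualHost stanza) should be checked against the
 directive reference.
 </para>
 <programlisting>LogSQLMachineID web01

&lt;VirtualHost 1.2.3.4&gt;
 LogSQLTransferLogTable access_log
 LogSQLTransferLogFormat AbHhmRSsTUuvcM
 LogSQLWhichCookie Clicks
&lt;/VirtualHost&gt;</programlisting>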
840 </section>
841 <section id="Sect.Ignore">
842 <title id="Sect.Ignore.title">
843 Instructing the module what NOT to log using filtering
844 directives
845 </title>
846 <para>
847 One "accept" and two "ignore" directives allow you to
848 fine-tune what the module should not log. These are very handy
849 for keeping your database as uncluttered as possible and
850 keeping your statistics free of unneeded numbers. Think of
851 each one as a gatekeeper.
852 </para>
853 <para>
854 <emphasis>
855 It is important to remember that each of these three
856 directives is purely optional. mod_log_sql's default is to
857 log everything.
858 </emphasis>
859 </para>
860 <para>
861 When a request comes in, the contents of LogSQLRequestAccept
862 are evaluated first. This optional, "blanket" directive lets
863 you specify that only certain things are to be accepted for
864 logging, and everything else discarded. Because it is
865 evaluated before LogSQLRequestIgnore and LogSQLRemhostIgnore
866 it can halt logging before those two filtering directives "get
867 their chance."
868 </para>
869 <para>
870 Once a request makes it past LogSQLRequestAccept, it still can
871 be excluded based on LogSQLRemhostIgnore and
872 LogSQLRequestIgnore. A good way to use LogSQLRemhostIgnore is
873 to prevent the module from logging the traffic that your
874 internal hosts generate. LogSQLRequestIgnore is great for
875 preventing things like requests for "favicon.ico" from
876 cluttering up your database, as well as excluding the various
877 requests that worms make, etc.
878 </para>
879 <para>
880 You can specify a series of strings after each directive. Do
881 not use any type of globbing or regular-expression syntax --
882 each string is considered a match
883 <emphasis>
884 if it is a substring of the larger request or remote-host;
885 the comparison is case-sensitive
886 </emphasis>
887 . This means that "LogSQLRemhostIgnore micro" will ignore
888 requests from "microsoft.com," "microworld.net,"
889 "mymicroscope.org," etc. "LogSQLRequestIgnore gif" will
890 instruct the module to ignore requests for "leftbar.gif,"
891 "bluedot.gif" and even "giftwrap.jpg" -- but "RED.GIF" and
892 "Tree.Gif" would still get logged because of case sensitivity.
893 </para>
894 <para>A summary of the decision flow:</para>
895 <orderedlist>
896 <listitem>
897 <para>
898 If LogSQLRequestAccept exists and a request does not match
899 anything in that list, it is discarded.
900 </para>
901 </listitem>
902 <listitem>
903 <para>
904 If a request matches anything in the LogSQLRequestIgnore
905 list, it is discarded.
906 </para>
907 </listitem>
908 <listitem>
909 <para>
910 If a request matches anything in the LogSQLRemhostIgnore
911 list, it is discarded.
912 </para>
913 </listitem>
914 <listitem>
915 <para>Otherwise the request is logged.</para>
916 </listitem>
917 </orderedlist>
918 <para>
919 This means that you can have a series of directives similar to
920 the following:
921 </para>
922 <programlisting>LogSQLRequestAccept .html .gif .jpg
923LogSQLRequestIgnore statistics.html bluedot.jpg</programlisting>
924 <para>
925 So the first line instructs the module to only log files with
926 html, gif and jpg suffixes; requests for "formail.cgi" and
927 "shopping-cart.pl" will never be considered for logging.
928 ("LeftArrow.JPG" will also never be considered for logging --
929 remember, the comparison is case sensitive.) The second line
930 prunes the list further -- you never want to log requests for
931 those two objects.
932 </para>
933 <note role="tip">
934 <itemizedlist>
935 <listitem>
936 <para>
937 If you want to match all the hosts in your domain such
938 as "host1.corp.foo.com" and "server.dmz.foo.com", simply
939 specify:
940 </para>
941 <programlisting>LogSQLRemhostIgnore foo.com</programlisting>
942 </listitem>
943 <listitem>
944 <para>
945 A great way to catch the vast majority of worm-attack
946 requests and prevent them from being logged is to
947 specify:
948 </para>
949 <programlisting>LogSQLRequestIgnore root.exe cmd.exe default.ida</programlisting>
950 </listitem>
951 <listitem>
952 <para>
953 To prevent the logging of requests for common graphic
954 types, make sure to put a '.' before the suffix to avoid
955 matches that you didn't intend:
956 </para>
957 <programlisting>LogSQLRequestIgnore .gif .jpg</programlisting>
958 </listitem>
959 </itemizedlist>
960 </note>
961 </section>
962 </section>
963 <section>
964 <title>Advanced logging scenarios</title>
965 <section tocstyle="fragment">
966 <title>Using the module in an ISP environment</title>
967 <para>mod_log_sql has three basic tiers of operation:</para>
968 <orderedlist>
969 <listitem>
970 <para>
971 The administrator creates all necessary tables by hand and
972 configures each Apache VirtualHost by hand.
973 (LogSQLCreateTables Off)
974 </para>
975 </listitem>
976 <listitem>
977 <para>
978 The module is permitted to create necessary tables
979 on-the-fly, but the administrator configures each Apache
980 VirtualHost by hand. (LogSQLCreateTables On)
981 </para>
982 </listitem>
983 <listitem>
984 <para>
985 The module is permitted to create all necessary tables and
986 to make intelligent, on-the-fly configuration of each
987 VirtualHost. (LogSQLMassVirtualHosting On)
988 </para>
989 </listitem>
990 </orderedlist>
991 <para>
992 Many users are happy to use the module in its most minimal
993 form: they hand-create any necessary tables (using
994 "create_tables.sql"), and they configure each VirtualHost by
995 hand to suit their needs. However, some administrators need
996 extra features due to a large and growing number of
997 VirtualHosts. The LogSQLMassVirtualHosting directive activates
998 module capabilities that make it far easier to manage an ISP
999 environment, or any situation characterized by a large and
1000 varying number of virtual servers.
1001 </para>
1002 <itemizedlist>
1003 <listitem>
1004 <para>
1005 the on-the-fly table creation feature is activated
1006 automatically
1007 </para>
1008 </listitem>
1009 <listitem>
1010 <para>
1011 the transfer log table name is dynamically set from the
1012 virtual host's name (example: a virtual host
1013 "www.grubbybaby.com" gets logged to table
1014 "access_www_grubbybaby_com")
1015 </para>
1016 </listitem>
1017 </itemizedlist>
1018 <para>
1019 There are numerous benefits. The admin will not need to create
1020 new tables for every new VirtualHost. (Although the admin will
1021 still need to drop the tables of virtual hosts that are
1022 removed.) The admin will not need to set
1023 LogSQLTransferLogTable for each virtual host -- it will be
1024 configured automatically based on the host's name. Because
1025 each virtual host will log to its own segregated table, data
1026 about one virtual server is kept separate from the others; an admin
1027 can grant users access to the tables they need, and they will
1028 be unable to view data about another user's virtual host.
1029 </para>
1030 <para>
1031 In an ISP scenario the admin is likely to have a cluster of
1032 many front-end webservers logging to a back-end database.
1033 mod_log_sql has a feature that permits analysis of how well
1034 the web servers are loadbalancing: the LogSQLMachineID
1035 directive. The administrator uses this directive to assign a
1036 unique identifier to each machine in the web cluster, e.g.
1037 "LogSQLMachineID web01," "LogSQLMachineID web02," etc. Used in
1038 conjunction with the 'M' character in LogSQLTransferLogFormat,
1039 each entry in the SQL log will include the machine ID of the
1040 machine that created the entry. This permits the administrator
1041 to count the entries made by each particular machine and
1042 thereby analyze the front-end loadbalancing algorithm.
1043 </para>
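 <para>
 As a rough sketch of the per-machine count described above, a
 query along the following lines could be used. It assumes the
 transfer log table is named "access_log" and that the machine ID
 is stored in a column named "machine_id"; adjust both to match
 your own schema.
 </para>
 <programlisting>SELECT machine_id, COUNT(*) AS entries
FROM access_log
GROUP BY machine_id
ORDER BY entries DESC;</programlisting>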
1044 </section>
1045 <section id="Sect.MultiTable">
1046 <title id="Sect.MultiTable.title">
1047 Logging many-to-one data in separate tables
1048 </title>
1049 <para>
1050 A given HTTP request can have a one-to-many relationship with
1051 certain kinds of data. For example, a single HTTP request can
1052 have 4 cookies, 3 headers and 5 "mod_gzip" notes associated
1053 with it. mod_log_sql is capable of logging these relationships
1054 due to the elegance of SQL relational data.
1055 </para>
1056 <para>
1057 You already have a single table containing access requests.
1058 One of the columns in that table is 'id' which is intended to
1059 contain the unique request ID supplied by the standard Apache
1060 module mod_unique_id -- all you need to do is compile in that
1061 module and employ the LogSQLTransferLogFormat character 'I'.
1062 Thereafter, each request gets a unique ID that can be thought
1063 of as a primary key within the database, useful for joining
1064 multiple tables. So let's envision several new tables: a notes
1065 table, a cookies table, and a table for inbound and outbound
1066 headers.
1067 </para>
1068 <table>
1069 <title>&lt;tblAcc&gt;access_log</title>
1070 <tgroup cols="6">
1071 <colspec colname="1" />
1072 <colspec colname="2" />
1073 <colspec colname="3" />
1074 <colspec colname="4" />
1075 <colspec colname="5" colwidth="40" />
1076 <colspec colname="6" colwidth="70" />
1077 <thead>
1078 <row>
1079 <entry colname="1">id</entry>
1080 <entry colname="2">remote_host</entry>
1081 <entry colname="3">request_uri</entry>
1082 <entry colname="4">time_stamp</entry>
1083 <entry colname="5">status</entry>
1084 <entry colname="6">bytes_sent</entry>
1085 </row>
1086 </thead>
1087 <tbody>
1088 <row>
1089 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1090 <entry colname="2">zerberus.aiacs.net</entry>
1091 <entry colname="3">/mod_log_sql/index.html</entry>
1092 <entry colname="4">1022493617</entry>
1093 <entry colname="5">200</entry>
1094 <entry colname="6">2215</entry>
1095 </row>
1096 </tbody>
1097 </tgroup>
1098 </table>
1099 <table>
1100 <title>&lt;tblNotes&gt;notes_log</title>
1101 <tgroup cols="3">
1102 <colspec colname="1" />
1103 <colspec colname="2" />
1104 <colspec colname="3" colwidth="30" />
1105 <thead>
1106 <row>
1107 <entry colname="1">id</entry>
1108 <entry colname="2">item</entry>
1109 <entry colname="3">val</entry>
1110 </row>
1111 </thead>
1112 <tbody>
1113 <row>
1114 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1115 <entry colname="2">mod_gzip_result</entry>
1116 <entry colname="3">OK</entry>
1117 </row>
1118 <row>
1119 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1120 <entry colname="2">mod_gzip_compression_ratio</entry>
1121 <entry colname="3">69</entry>
1122 </row>
1123 </tbody>
1124 </tgroup>
1125 </table>
1126 <table>
1127 <title>&lt;tblHdr&gt;headers_log</title>
1128 <tgroup cols="3">
1129 <colspec colname="1" colnum="1" />
1130 <colspec colname="2" colnum="2" />
1131 <colspec colname="3" colnum="3" />
1132 <thead>
1133 <row>
1134 <entry colname="1">id</entry>
1135 <entry colname="2">item</entry>
1136 <entry colname="3">val</entry>
1137 </row>
1138 </thead>
1139 <tbody>
1140 <row>
1141 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1142 <entry colname="2">Content-Type</entry>
1143 <entry colname="3">text/html</entry>
1144 </row>
1145 <row>
1146 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1147 <entry colname="2">Accept-Encoding</entry>
1148 <entry colname="3">gzip, deflate</entry>
1149 </row>
1150 <row>
1151 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1152 <entry colname="2">Expires</entry>
1153 <entry colname="3">Tue, 28 May 2002 10:00:18 GMT</entry>
1154 </row>
1155 <row>
1156 <entry colname="1">PPIDskBRH30AAGPtAsg</entry>
1157 <entry colname="2">Cache-Control</entry>
1158 <entry colname="3">max-age=86400</entry>
1159 </row>
1160 </tbody>
1161 </tgroup>
1162 </table>
1163 <para>
1164 We have a certain request, and its unique ID is
1165 "PPIDskBRH30AAGPtAsg". Within each separate table will be
1166 multiple entries with that request ID: several cookie entries,
1167 several header entries, etc. As you can see in tables
1168 [tblAcc], [tblNotes] and [tblHdr], you have a one-to-many
1169 relationship for request PPIDskBRH30AAGPtAsg: that one access
1170 has two associated notes and four associated headers. You can
1171 extract this data easily using the power of SQL's "select"
1172 statement and table joins. To see the notes associated with a
1173 particular request:
1174 </para>
1175 <programlisting>SELECT a.remote_host, a.request_uri, n.item, n.val
1176FROM access_log a JOIN notes_log n ON a.id=n.id
1177WHERE a.id='PPIDskBRH30AAGPtAsg';</programlisting>
1178 <table>
1179 <title>access_log joined to notes_log</title>
1180 <tgroup cols="4">
1181 <colspec colname="1" />
1182 <colspec colname="2" />
1183 <colspec colname="3" />
1184 <colspec colname="4" colwidth="30" />
1185 <thead>
1186 <row>
1187 <entry colname="1">remote_host</entry>
1188 <entry colname="2">request_uri</entry>
1189 <entry colname="3">item</entry>
1190 <entry colname="4">val</entry>
1191 </row>
1192 </thead>
1193 <tbody>
1194 <row>
1195 <entry colname="1">zerberus.aiacs.net</entry>
1196 <entry colname="2">/mod_log_sql/index.html</entry>
1197 <entry colname="3">mod_gzip_result</entry>
1198 <entry colname="4">OK</entry>
1199 </row>
1200 <row>
1201 <entry colname="1">zerberus.aiacs.net</entry>
1202 <entry colname="2">/mod_log_sql/index.html</entry>
1203 <entry colname="3">mod_gzip_compression_ratio</entry>
1204 <entry colname="4">69</entry>
1205 </row>
1206 </tbody>
1207 </tgroup>
1208 </table>
1209 <para>
1210 Naturally you can craft similar statements for the outbound
1211 headers, inbound headers and cookies, all of which can live in
1212 separate tables. Your statements are limited in power only by
1213 your skill with SQL.
1214 </para>
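 <para>
 For instance, a comparable statement for headers might look like
 the following sketch. It assumes the headers live in a table
 named "headers_log" with the same "id", "item" and "val" columns
 shown in the illustration above; your table names may differ.
 </para>
 <programlisting>SELECT a.remote_host, a.request_uri, h.item, h.val
FROM access_log a JOIN headers_log h ON a.id=h.id
WHERE a.id='PPIDskBRH30AAGPtAsg';</programlisting>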
1215 <para>
1216 In order to use this capability of mod_log_sql, you must do
1217 several things.
1218 </para>
1219 <itemizedlist>
1220 <listitem>
1221 <para>
1222 Compile mod_unique_id into Apache (statically or as a
1223 DSO). mod_log_sql employs the unique request ID that
1224 mod_unique_id provides in order to key between the
1225 separate tables. You can still log the data without
1226 mod_unique_id, but it will be completely uncorrelated and
1227 you will have no way to discern any meaning.
1228 </para>
1229 </listitem>
1230 <listitem>
1231 <para>
1232 Create the appropriate tables. This will be done for you
1233 if you permit mod_log_sql to create its own tables using
1234 LogSQLCreateTables On, or if you use the enclosed
1235 "create_tables.sql" script.
1236 </para>
1237 </listitem>
1238 <listitem>
1239 <para>
1240 Create a SQL index on the "id" column. Without this index,
1241 table joins will be deathly slow. I recommend you consult
1242 the MySQL documentation on the proper way to create a
1243 column index if you are not familiar with this operation.
1244 </para>
1245 </listitem>
1246 <listitem>
1247 <para>
1248 Within each appropriate VirtualHost stanza, use the
1249 LogSQLWhich* and LogSQL*LogTable directives to tell the
1250 module what and where to log the data. In the following
1251 example, I have overridden the name for the notes table
1252 whereas I have left the other table names at their
1253 defaults. I have then specified the cookies, headers and
1254 notes that interest me. (And as you can see, these
1255 directives do not require me to add any characters to
1256 LogSQLTransferLogFormat.)
1257 </para>
1258 <programlisting>&lt;VirtualHost 216.231.36.128&gt;
1259 (snip)
1260 LogSQLNotesLogTable notestable
1261 LogSQLWhichCookies bluecookie redcookie greencookie
1262 LogSQLWhichNotes mod_gzip_result mod_gzip_compression_ratio
1263 LogSQLWhichHeadersOut Expires Content-Type Cache-Control
1264 LogSQLWhichHeadersIn User-Agent Accept-Encoding Host
1265 (snip)
1266&lt;/VirtualHost&gt;</programlisting>
1267 </listitem>
1268 </itemizedlist>
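 <para>
 For the index step in the list above, a minimal sketch in MySQL
 syntax follows. The table names are the ones used in the
 illustration earlier in this section, and the index names are
 purely hypothetical. If your tables were created by a script
 that already defines an index on "id", these statements are
 unnecessary.
 </para>
 <programlisting>CREATE INDEX access_id_idx ON access_log (id);
CREATE INDEX notes_id_idx ON notes_log (id);
CREATE INDEX headers_id_idx ON headers_log (id);</programlisting>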
1269 </section>
1270 <section>
1271 <title>Using the same database for production and test</title>
1272 <para>
1273 Although sub-optimal, it is not uncommon to use the same
1274 back-end database for the "production" webservers as well as
1275 the "test" webservers (budgetary constraints, rack-space
1276 limits, etc.). Furthermore, an administrator in this situation
1277 may be unable to use LogSQLRemhostIgnore to exclude requests
1278 from the test servers -- perhaps the generated entries are
1279 genuinely useful for analytical or QA purposes, but their
1280 value after analysis is minimal.
1281 </para>
1282 <para>
1283 It is wasteful and potentially confusing to permit this
1284 internal test data to clutter the database, and a solution to
1285 the problem is the proper use of the LogSQLMachineID
1286 directive. Assume a scenario where the production webservers
1287 have IDs like "web01," "web02," and so on -- and the test
1288 webservers have IDs like "test01," "test02," etc. Because
1289 entries in the log database are distinguished by their source
1290 machine, an administrator may purge unneeded test data from
1291 the access log as follows:
1292 </para>
1293 <programlisting>DELETE FROM access_log WHERE machine_id like 'test%';</programlisting>
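 <para>
 Before running a purge like this, it may be worth checking how
 many rows it would remove. The following sketch uses the same
 table, column and "test" prefix as the DELETE statement above:
 </para>
 <programlisting>SELECT machine_id, COUNT(*) AS entries
FROM access_log
WHERE machine_id LIKE 'test%'
GROUP BY machine_id;</programlisting>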
1294 </section>
1295 <section id="Sect.DelayedInsert">
1296 <title id="Sect.DelayedInsert.title">
1297 Optimizing for a busy database
1298 </title>
1299 <para>
1300 A busy MySQL database will have SELECT statements running
1301 concurrently with INSERT and UPDATE statements. A long-running
1302 SELECT can in certain circumstances block INSERTs and
1303 therefore block mod_log_sql. A workaround is to enable
1304 mod_log_sql for "delayed inserts," which are described as
1305 follows in the MySQL documentation.
1306 </para>
1307 <para>
1308 The DELAYED option for the INSERT statement is a
1309 MySQL-specific option that is very useful if you have clients
1310 that can't wait for the INSERT to complete. This is a common
1311 problem when you use MySQL for logging and you also
1312 periodically run SELECT and UPDATE statements that take a long
1313 time to complete. DELAYED was introduced in MySQL Version
1314 3.22.15. It is a MySQL extension to ANSI SQL92.
1315 </para>
1316 <para>
1317 INSERT DELAYED only works with ISAM and MyISAM tables. Note
1318 that as MyISAM tables supports concurrent SELECT and INSERT,
1319 if there is no free blocks in the middle of the data file, you
1320 very seldom need to use INSERT DELAYED with MyISAM.
1321 </para>
1322 <para>
1323 When you use INSERT DELAYED, the client will get an OK at once
1324 and the row will be inserted when the table is not in use by
1325 any other thread.
1326 </para>
1327 <para>
1328 Another major benefit of using INSERT DELAYED is that inserts
1329 from many clients are bundled together and written in one
1330 block. This is much faster than doing many separate inserts.
1331 </para>
1332 <para>The general disadvantages of delayed inserts are</para>
1333 <orderedlist>
1334 <listitem>
1335 <para>
1336 The queued rows are only stored in memory until they are
1337 inserted into the table. If mysqld dies unexpectedly, any
1338 queued rows that were not written to disk are lost.
1339 </para>
1340 </listitem>
1341 <listitem>
1342 <para>
1343 There is additional overhead for the server to handle a
1344 separate thread for each table on which you use INSERT
1345 DELAYED.
1346 </para>
1347 </listitem>
1348 </orderedlist>
1349 <note role="warning">
1350 <para>
1351 The MySQL documentation concludes, "This means that you
1352 should only use INSERT DELAYED when you are really sure you
1353 need it!" Furthermore, the current state of error return
1354 from a failed INSERT DELAYED seems to be in flux, and may
1355 behave in unpredictable ways between different MySQL
1356 versions. See FAQ entry
1357 <xref linkend="FAQ.DelayedInsert" />
1358 -- you have been warned.
1359 </para>
1360 </note>
1361 <para>
1362 If you are experiencing issues which could be solved by
1363 delayed inserts, then set LogSQLDelayedInserts On in
1364 <filename>httpd.conf</filename>.
1365 All regular INSERT statements then become INSERT DELAYED, and
1366 you should see no more blocking of the module.
1367 </para>
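 <para>
 To illustrate the general shape of a delayed insert, here is a
 hand-written example using the columns and values from the
 access_log illustration earlier in this manual. The exact
 statement the module generates depends on your
 LogSQLTransferLogFormat, so treat this as a sketch rather than
 the module's literal output.
 </para>
 <programlisting>INSERT DELAYED INTO access_log
  (id, remote_host, request_uri, time_stamp, status, bytes_sent)
VALUES
  ('PPIDskBRH30AAGPtAsg', 'zerberus.aiacs.net', '/mod_log_sql/index.html',
   1022493617, 200, 2215);</programlisting>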
1368 </section>
1369 </section>
1370 <section id="Sect.ConfigReference">
1371 <title id="Sect.ConfigReference.title">
1372 Configuration Directive Reference
1373 </title>
1374 <para>
1375 It is imperative that you understand which directives are used
1376 only once in the main server config, and which are used inside
1377 VirtualHost stanzas and therefore multiple times within
1378 httpd.conf. The "context" listed with each entry informs you of
1379 this.
1380 </para>
1381 <section tocstyle="fragment">
1382 <title>DataBase Configuration</title>
1383 <variablelist>
1384 <varlistentry>
1385 <term>LogSQLLoginInfo</term>
1386 <listitem>
1387 <cmdsynopsis>
1388 <command>LogSQLLoginInfo</command>
1389 <arg choice="req">
1390 <replaceable>connection URI</replaceable>
1391 </arg>
1392 </cmdsynopsis>
1393 <simpara>
1394 Example: LogSQLLoginInfo
1395 mysql://logwriter:passw0rd@foobar.baz.com/Apache_log
1396 </simpara>
1397 <simpara>Context: main server config</simpara>
1398 <para>
1399 Defines the basic connection URI to connect to the
1400 database with. The format of the connection URI is
1401 </para>
1402 <simpara>
1403 driver://username[:password]@hostname[:port]/database
1404 </simpara>
1405 <variablelist>
1406 <varlistentry>
1407 <term>driver</term>
1408 <listitem>
1409 <simpara>
1410 The database driver to use (mysql, pgsql, etc.)
1411 </simpara>
1412 </listitem>
1413 </varlistentry>
1414 <varlistentry>
1415 <term>username</term>
1416 <listitem>
1417 <simpara>
1418 The database user to log in as. This user needs INSERT
1419 privileges on the logging table defined in
1420 LogSQLTransferLogTable.
1421 </simpara>
1422 </listitem>
1423 </varlistentry>
1424 <varlistentry>
1425 <term>password</term>
1426 <listitem>
1427 <simpara>
1428 The password for username. It can be
1429 omitted if there is no password.
1430 </simpara>
1431 </listitem>
1432 </varlistentry>
1433 <varlistentry>
1434 <term>hostname</term>
1435 <listitem>
1436 <simpara>
1437 The hostname or IP address of the database
1438 machine. This is simply "localhost" if the database
1439 lives on the same machine as Apache.
1440 </simpara>
1441 </listitem>
1442 </varlistentry>
1443 <varlistentry>
1444 <term>port</term>
1445 <listitem>
1446 <simpara>
1447 The port on hostname to connect to. If not
1448 specified, the default port for the database
1449 is used.
1450 </simpara>
1451 </listitem>
1452 </varlistentry>
1453 <varlistentry>
1454 <term>database</term>
1455 <listitem>
1456 <simpara>
1457 The database to connect to on the server.
1458 </simpara>
1459 </listitem>
1460 </varlistentry>
1461 </variablelist>
1462 <note>
1463 <para>
1464 This is defined only once in the
1465 <filename>httpd.conf</filename>
1466 file.
1467 </para>
1468 <para>
1469 This directive must be defined for logging to be
1470 enabled.
1471 </para>
1472 </note>
1473 </listitem>
1474 </varlistentry>
1475 <varlistentry>
1476 <term>LogSQLDBParam</term>
1477 <listitem>
1478 <cmdsynopsis sepchar=" ">
1479 <command>LogSQLDBParam</command>
1480 <arg choice="req">
1481 <replaceable>parameter-name</replaceable>
1482 </arg>
1483 <arg choice="req">
1484 <replaceable>value</replaceable>
1485 </arg>
1486 </cmdsynopsis>
1487 <simpara>
1488 Example: LogSQLDBParam socketfile
1489 /var/lib/mysql/mysql.socket
1490 </simpara>
1491 <simpara>Context: main server config</simpara>
1492 <para>
1493 This is the new method of specifying Database connection
1494 credentials and settings. This is used to define
1495 database driver specific options. For a list of options
1496 read the documentation for each specific database
1497 driver.
1498 </para>
1499 <table>
1500 <title>MySQL Driver parameters</title>
1501 <tgroup cols="3">
1502 <colspec colname="1" colnum="1" />
1503 <colspec colname="2" colnum="2" />
1504 <colspec colname="3" colnum="3" />
1505 <thead>
1506 <row>
1507 <entry colname="1">Parameter</entry>
1508 <entry colname="2">Meaning</entry>
1509 <entry colname="3">Default</entry>
1510 </row>
1511 </thead>
1512 <tbody>
1513 <row>
1514 <entry colname="1">hostname</entry>
1515 <entry colname="2">MySQL Server hostname</entry>
1516 <entry colname="3">none (use LogSQLLoginInfo to set)</entry>
1517 </row>
1518 <row>
1519 <entry colname="1">username</entry>
1520 <entry colname="2">The username to log in with</entry>
1521 <entry colname="3">none (use LogSQLLoginInfo to set)</entry>
1522 </row>
1523 <row>
1524 <entry colname="1">password</entry>
1525 <entry colname="2">The password to use</entry>
1526 <entry colname="3">none (use LogSQLLoginInfo to set)</entry>
1527 </row>
1528 <row>
1529 <entry colname="1">database</entry>
1530 <entry colname="2">Which database to connect to</entry>
1531 <entry colname="3">none (use LogSQLLoginInfo to set)</entry>
1532 </row>
1533 <row>
1534 <entry colname="1">port</entry>
1535 <entry colname="2">The TCP port to connect to the MySQL server over</entry>
1536 <entry colname="3">3306 (use LogSQLLoginInfo to set)</entry>
1537 </row>
1538 <row>
1539 <entry colname="1">socketfile</entry>
1540 <entry colname="2">The MySQL Unix socket file to use</entry>
1541 <entry colname="3">none</entry>
1542 </row>
1543 <row>
1544 <entry colname="1">tabletype</entry>
1545 <entry colname="2">MySQL Table Engine to use</entry>
1546 <entry colname="3">MySQL server default</entry>
1547 </row>
1548 </tbody>
1549 </tgroup>
1550 </table>
1551 <note>
1552 <para>
1553 Each parameter-name may only be defined once.
1554 </para>
1555 </note>
1556 </listitem>
1557 </varlistentry>
1558 <varlistentry>
1559 <term>LogSQLCreateTables</term>
1560 <listitem>
1561 <cmdsynopsis sepchar=" ">
1562 <command>LogSQLCreateTables</command>
1563 <arg choice="req">flag</arg>
1564 </cmdsynopsis>
1565 <simpara>Example: LogSQLCreateTables On</simpara>
1566 <simpara>Default: Off</simpara>
1567 <simpara>Context: main server config</simpara>
1568 <para>
1569 mod_log_sql has the ability to create its tables
1570 on-the-fly. The advantage to this is convenience: you
1571 don't have to execute any SQL by hand to prepare the
1572 table. This is especially helpful for people with lots
1573 of virtual hosts (who should also see the
1574 LogSQLMassVirtualHosting directive).
1575 </para>
1576 <para>
1577 There is a slight disadvantage: if you wish to activate
1578 this feature, then the userid specified in
1579 LogSQLLoginInfo must have CREATE privileges on the
1580 database. In an absolutely paranoid, locked-down
1581 situation you may only want to grant your mod_log_sql
1582 user INSERT privileges on the database; in that
1583 situation you are unable to take advantage of
1584 LogSQLCreateTables. But most people -- even the very
1585 security-conscious -- will find that granting CREATE on
1586 the logging database is reasonable.
1587 </para>
1588 <note>
1589 <para>
1590 This is defined only once in the
1591 <filename>httpd.conf</filename>
1592 file.
1593 </para>
1594 </note>
1595 </listitem>
1596 </varlistentry>
1597 <varlistentry>
1598 <term>LogSQLForcePreserve</term>
1599 <listitem>
1600 <cmdsynopsis sepchar=" ">
1601 <command>LogSQLForcePreserve</command>
1602 <arg choice="req">flag</arg>
1603 </cmdsynopsis>
1604 <simpara>Example: LogSQLForcePreserve On</simpara>
1605 <simpara>Default: Off</simpara>
1606 <simpara>Context: main server config</simpara>
1607 <para>
1608 You may need to perform debugging on your database and
1609 specifically want mod_log_sql to make no attempts to log
1610 to it. This directive instructs the module to send all
1611 its log entries directly to the preserve file and to
1612 make no database INSERT attempts.
1613 </para>
1614 <para>
1615 This is presumably a directive for temporary use only;
1616 it could be dangerous if you set it and forget it, as
1617 all your entries will simply pile up in the preserve
1618 file.
1619 </para>
1620 <note>
1621 <para>
1622 This is defined only once in the
1623 <filename>httpd.conf</filename>
1624 file.
1625 </para>
1626 </note>
1627 </listitem>
1628 </varlistentry>
1629 <varlistentry>
1630 <term>LogSQLDisablePreserve</term>
1631 <listitem>
1632 <cmdsynopsis>
1633 <command>LogSQLDisablePreserve</command>
1634 <arg choice="req">flag</arg>
1635 </cmdsynopsis>
1636 <simpara>Example: LogSQLDisablePreserve On</simpara>
1637 <simpara>Default: Off</simpara>
1638 <simpara>Context: main server config</simpara>
1639 <para>
1640 This option can be enabled to completely disable the
1641 preserve file fallback. This may be useful for servers
1642 where the file-system is read-only.
1643 </para>
1644 <para>
1645 If the database is not available those log entries will
1646 be lost.
1647 </para>
1648 <note>
1649 <para>
1650 This is defined only once in the
1651 <filename>httpd.conf</filename>
1652 file.
1653 </para>
1654 </note>
1655 </listitem>
1656 </varlistentry>
1657 <varlistentry>
1658 <term>LogSQLMachineID</term>
1659 <listitem>
1660 <cmdsynopsis sepchar=" ">
1661 <command>LogSQLMachineID</command>
1662 <arg choice="req">machineID</arg>
1663 </cmdsynopsis>
1664 <simpara>Example: LogSQLMachineID web01</simpara>
1665 <simpara>Context: main server config</simpara>
1666 <para>
1667 If you have a farm of webservers then you may wish to
1668 know which particular machine made each entry; this is
1669 useful for analyzing your load-balancing methodology.
1670 LogSQLMachineID permits you to distinguish each
1671 machine's entries if you assign each machine its own
1672 LogSQLMachineID: for example, the first webserver gets
1673 "LogSQLMachineID web01," the second gets
1674 "LogSQLMachineID web02," etc.
1675 </para>
1676 <note>
1677 <para>
1678 This is defined only once in the
1679 <filename>httpd.conf</filename>
1680 file.
1681 </para>
1682 </note>
1683 </listitem>
1684 </varlistentry>
1685 <varlistentry>
1686 <term>LogSQLPreserveFile</term>
1687 <listitem>
1688 <cmdsynopsis sepchar=" ">
1689 <command>LogSQLPreserveFile</command>
1690 <arg choice="req">
1691 <replaceable>filename</replaceable>
1692 </arg>
1693 </cmdsynopsis>
1694 <simpara>
1695 Example: LogSQLPreserveFile offline-preserve
1696 </simpara>
1697 <simpara>Default: /tmp/sql-preserve</simpara>
1698 <simpara>Context: virtual host</simpara>
1699 <para>
1700 mod_log_sql writes queries to this local preserve file
1701 in the event that it cannot reach the database, and thus
1702 ensures that your high-availability web frontend does
1703 not lose logs during a temporary database outage. This
1704 could happen for a number of reasons: the database goes
1705 offline, the network breaks, etc. You will not lose
1706 entries since the module has this backup. The file
1707 consists of a series of SQL statements that can be
1708 imported into your database at your convenience;
1709 furthermore, because the SQL queries contain the access
1710 timestamps you do not need to worry about out-of-order
1711 data after the import, which is done in a simple manner:
1712 </para>
1713 <programlisting format="linespecific"># mysql -uadminuser -p mydbname &lt; /tmp/sql-preserve</programlisting>
1714 <para>
1715 If you do not define LogSQLPreserveFile then all virtual
1716 servers will log to the same default preserve file (
1717 <filename>/tmp/sql-preserve</filename>
1718 ). You can redefine this on a virtual-host basis in
1719 order to segregate your preserve files if you desire.
1720 Note that segregation is not usually necessary, as the
1721 SQL statements that are written to the preserve file
1722 already distinguish between different virtual hosts if
1723 you include the 'v' character in your
1724 LogSQLTransferLogFormat directive. It is only necessary
1725 to segregate preserve files by virtual host if you also
1726 segregate access logs by virtual host.
1727 </para>
1728 <para>
1729 The module will log to Apache's ErrorLog when it notices
1730 a database outage, and upon database return. You will
1731 therefore know when the preserve file is being used,
1732 although it is your responsibility to import the file.
1733 </para>
1734 <para>
1735 The file does not need to be created in advance. It is
1736 safe to remove or rename the file without interrupting
1737 Apache, as the module closes the filehandle immediately
1738 after completing the write. The file is created with the
1739 user &amp; group ID of the running Apache process (e.g.
1740 'nobody' on many Linux distributions).
1741 </para>
1742 </listitem>
1743 </varlistentry>
1744 </variablelist>
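 <para>
 The privilege trade-off described under LogSQLCreateTables can
 be made concrete with a MySQL GRANT statement. The sketch below
 reuses the "logwriter" user and "Apache_log" database from the
 LogSQLLoginInfo example; the 'localhost' host part is an
 assumption, and the user is assumed to exist already. Drop
 CREATE from the list if you prefer to create the tables by hand.
 </para>
 <programlisting>GRANT INSERT, CREATE ON Apache_log.* TO 'logwriter'@'localhost';</programlisting>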
1745 </section>
1746 <section>
1747 <title>Table Names</title>
1748 <variablelist>
1749 <varlistentry>
1750 <term>LogSQLTransferLogTable</term>
1751 <listitem>
1752 <cmdsynopsis sepchar=" ">
1753 <command>LogSQLTransferLogTable</command>
1754 <arg choice="req">
1755 <replaceable>table-name</replaceable>
1756 </arg>
1757 </cmdsynopsis>
1758 <simpara>
1759 Example: LogSQLTransferLogTable access_log_table
1760 </simpara>
1761 <simpara>Context: virtual host</simpara>
1762 <para>
1763 Defines which table is used for logging of Apache's
1764 transfers; this is analogous to Apache's TransferLog
1765 directive. table-name must be a valid table within the
1766 database defined in the LogSQLLoginInfo connection URI.
1767 </para>
1768 <para>
1769 This directive is
1770 <emphasis>not</emphasis>
1771 necessary if you declare LogSQLMassVirtualHosting On,
1772 since that directive activates dynamically-named tables.
1773 If you attempt to use LogSQLTransferLogTable at the same
1774 time a warning will be logged and it will be ignored,
1775 since LogSQLMassVirtualHosting takes priority.
1776 </para>
1777 <note>
1778 <para>
1779 Required unless LogSQLMassVirtualHosting is set to On.
1780 </para>
1781 </note>
1782 </listitem>
1783 </varlistentry>
1784 <varlistentry>
1785 <term>LogSQLCookieLogTable</term>
1786 <listitem>
1787 <cmdsynopsis sepchar=" ">
1788 <command>LogSQLCookieLogTable</command>
1789 <arg choice="req">
1790 <replaceable>table-name</replaceable>
1792 </arg>
1793 </cmdsynopsis>
1794 <simpara>
1795 Example: LogSQLCookieLogTable cookie_log
1796 </simpara>
1797 <simpara>Default: cookies</simpara>
1798 <simpara>Context: virtual host</simpara>
1799 <para>
1800 Defines which table is used for logging of cookies.
1801 Working in conjunction with LogSQLWhichCookies, you can
1802 log many of each request's associated cookies to a
1803 separate table. For meaningful data retrieval the cookie
1804 table is keyed to the access table by the unique request
1805 ID supplied by the standard Apache module mod_unique_id.
1806 </para>
1807 <note>
1808 <para>
1809 You must create the table (see create_tables.sql,
1810 included in the package), or LogSQLCreateTables must
1811 be set to "on".
1812 </para>
1813 </note>
1814 </listitem>
1815 </varlistentry>
1816 <varlistentry>
1817 <term>LogSQLHeadersInLogTable</term>
1818 <listitem>
1819 <cmdsynopsis sepchar=" ">
1820 <command>LogSQLHeadersInLogTable</command>
1821 <arg choice="req">
1822 <replaceable>table-name</replaceable>
1823 </arg>
1824 </cmdsynopsis>
1825 <simpara>
1826 Example: LogSQLHeadersInLogTable headers
1827 </simpara>
1828 <simpara>Default: headers_in</simpara>
1829 <simpara>Context: virtual host</simpara>
1830 <para>
1831 Defines which table is used for logging of inbound
1832 headers. Working in conjunction with
1833 LogSQLWhichHeadersIn, you can log many of each request's
1834 associated headers to a separate table. For meaningful
1835 data retrieval the headers table is keyed to the access
1836 table by the unique request ID supplied by the standard
1837 Apache module mod_unique_id.
1838 </para>
1839 <note>
1840 <para>
1841 Note that you must create the table (see
1842 create-tables.sql, included in the package), or
1843 LogSQLCreateTables must be set to "on".
1844 </para>
1845 </note>
1846 </listitem>
1847 </varlistentry>
1848 <varlistentry>
1849 <term>LogSQLHeadersOutLogTable</term>
1850 <listitem>
1851 <cmdsynopsis sepchar=" ">
1852 <command>LogSQLHeadersOutLogTable</command>
1853 <arg choice="req">
1854 <replaceable>table-name</replaceable>
1855 </arg>
1856 </cmdsynopsis>
1857 <simpara>
1858 Example: LogSQLHeadersOutLogTable headers
1859 </simpara>
1860 <simpara>Default: headers_out</simpara>
1861 <simpara>Context: virtual host</simpara>
1862 <para>
1863 Defines which table is used for logging of outbound
1864 headers. Working in conjunction with
1865 LogSQLWhichHeadersOut, you can log many of each
1866 request's associated headers to a separate table. For
1867 meaningful data retrieval the headers table is keyed to
1868 the access table by the unique request ID supplied by
1869 the standard Apache module mod_unique_id.
1870 </para>
1871 <note>
1872 <para>
1873 Note that you must create the table (see
1874 create-tables.sql, included in the package), or
1875 LogSQLCreateTables must be set to "on".
1876 </para>
1877 </note>
1878 </listitem>
1879 </varlistentry>
1880 <varlistentry>
1881 <term>LogSQLNotesLogTable</term>
1882 <listitem>
1883 <cmdsynopsis sepchar=" ">
1884 <command>LogSQLNotesLogTable</command>
1885 <arg choice="req">
1886 <replaceable>table-name</replaceable>
1887 </arg>
1888 </cmdsynopsis>
1889 <simpara>Example: LogSQLNotesLogTable notes-log</simpara>
1890 <simpara>Default: notes</simpara>
1891 <simpara>Context: virtual host</simpara>
1892 <para>
1893 Defines which table is used for logging of notes.
1894 Working in conjunction with LogSQLWhichNotes, you can
1895 log many of each request's associated notes to a
1896 separate table. For meaningful data retrieval the notes
1897 table is keyed to the access table by the unique request
1898 ID supplied by the standard Apache module mod_unique_id.
1899 </para>
1900 <note>
1901 <para>
1902 This table must be created (see create-tables.sql
1903 included in the package), or LogSQLCreateTables must
1904 be set to 'On'.
1905 </para>
1906 </note>
1907 </listitem>
1908 </varlistentry>
1909 <varlistentry>
1910 <term>LogSQLMassVirtualHosting</term>
1911 <listitem>
1912 <cmdsynopsis sepchar=" ">
1913 <command>LogSQLMassVirtualHosting</command>
1914 <arg choice="req">flag</arg>
1915 </cmdsynopsis>
1916 <simpara>Example: LogSQLMassVirtualHosting On</simpara>
1917 <simpara>Default: Off</simpara>
1918 <simpara>Context: main server config</simpara>
1919 <para>
1920 If you administer a site hosting many, many virtual
1921 hosts then this option will appeal to you. If you turn
1922 on LogSQLMassVirtualHosting then several things happen:
1923 </para>
1924 <itemizedlist>
1925 <listitem>
1926 <para>
1927 the on-the-fly table creation feature is activated
1928 automatically
1929 </para>
1930 </listitem>
1931 <listitem>
1932 <para>
1933 the transfer log table name is dynamically set from
1934 the virtual host's name after stripping out
1935 SQL-unfriendly characters (example: a virtual host
1936 www.grubbybaby.com gets logged to table
1937 access_www_grubbybaby_com)
1938 </para>
1939 </listitem>
1940 <listitem>
1941 <para>
1942 which, in turn, means that each virtual host logs to
1943 its own segregated table. Because there is no data
1944 shared between virtual servers you can grant your
1945 users access to the tables they need; they will be
1946 unable to view others' data.
1947 </para>
1948 </listitem>
1949 </itemizedlist>
1950 <para>
1951 This is a huge boost in convenience for sites with many
1952 virtual servers. Activating LogSQLMassVirtualHosting
1953 obviates the need to create every virtual server's table
1954 and provides more granular security possibilities.
1955 </para>
1956 <note>
1957 <para>
1958 This is defined only once in the
1959 <filename>httpd.conf</filename>
1960 file.
1961 </para>
1962 </note>
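<para>
As a rough sketch of the behaviour described above (the
&lt;VirtualHost&gt; address is an illustrative assumption,
not taken from a tested configuration), a mass-hosting setup
might look like this:
</para>
<programlisting># Main server config: set once, outside any &lt;VirtualHost&gt;
LogSQLMassVirtualHosting On

&lt;VirtualHost *:80&gt;
    ServerName www.grubbybaby.com
    # No LogSQLTransferLogTable here: per the description above,
    # requests for this host are logged to access_www_grubbybaby_com.
&lt;/VirtualHost&gt;</programlisting>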
1963 </listitem>
1964 </varlistentry>
1965 </variablelist>
1966 </section>
1967 <section>
1968 <title>Configuring What Is Logged</title>
1969 <variablelist>
1970 <varlistentry id="Conf.LogSQLTransferLogFormat">
1971 <term>LogSQLTransferLogFormat</term>
1972 <listitem>
1973 <cmdsynopsis sepchar=" ">
1974 <command>LogSQLTransferLogFormat</command>
1975 <arg choice="req">
1976 <replaceable>format-string</replaceable>
1977 </arg>
1978 </cmdsynopsis>
1979 <simpara>Example: LogSQLTransferLogFormat huSUTv</simpara>
1980 <simpara>Default: AbHhmRSsTUuv</simpara>
1981 <simpara>Context: virtual host</simpara>
1982 <para>
1983 Each character in the format-string defines an attribute
1984 of the request that you wish to log. The default logs
1985 the information required to create Combined Log Format
1986 logs, plus several extras. Here is the full list of
1987 allowable keys, which sometimes resemble their Apache
1988 counterparts, but do not always:
1989 </para>
1990 <table>
1991 <title>Core LogFormat parameters</title>
1992 <tgroup cols="5">
1993 <colspec colname="1" colnum="1" />
1994 <colspec colname="2" colnum="2" />
1995 <colspec colname="3" colnum="3" />
1996 <colspec colname="4" colnum="4" />
1997 <colspec colname="5" colnum="5" />
1998 <thead>
1999 <row>
2000 <entry colname="1">Symbol</entry>
2001 <entry colname="2">Meaning</entry>
2002 <entry colname="3">DB Field</entry>
2003 <entry colname="4">Data Type</entry>
2004 <entry colname="5">Example</entry>
2005 </row>
2006 </thead>
2007 <tbody>
2008 <row>
2009 <entry colname="1">A</entry>
2010 <entry colname="2">User Agent</entry>
2011 <entry colname="3">agent</entry>
2012 <entry colname="4">varchar(255)</entry>
2013 <entry colname="5">
2014 Mozilla/4.0 (compat; MSIE 6.0; Windows)
2015 </entry>
2016 </row>
2017 <row>
2018 <entry colname="1">a</entry>
2019 <entry colname="2">CGI request arguments</entry>
2020 <entry colname="3">request_args</entry>
2021 <entry colname="4">varchar(255)</entry>
2022 <entry colname="5">
2023 user=Smith&amp;cart=1231&amp;item=532
2024 </entry>
2025 </row>
2026 <row>
2027 <entry colname="1">b</entry>
2028 <entry colname="2">Bytes transferred</entry>
2029 <entry colname="3">bytes_sent</entry>
2030 <entry colname="4">int unsigned</entry>
2031 <entry colname="5">32561</entry>
2032 </row>
2033 <row>
2034 <entry colname="1">
2035 c
2036 <xref linkend="Foot.LogCookie" xrefstyle="footer" />
2037 </entry>
2038 <entry colname="2">Text of cookie</entry>
2039 <entry colname="3">cookie</entry>
2040 <entry colname="4">varchar(255)</entry>
2041 <entry colname="5">
2042 Apache=sdyn.fooonline.net 1300102700823
2043 </entry>
2044 </row>
2045 <row>
2046 <entry>f</entry>
2047 <entry>Local filename requested</entry>
2048 <entry>request_file</entry>
2049 <entry>varchar(255)</entry>
2050 <entry>/var/www/html/books-cycroad.html</entry>
2051 </row>
2052 <row>
2053 <entry>H</entry>
2054 <entry>HTTP request protocol</entry>
2055 <entry>request_protocol</entry>
2056 <entry>varchar(10)</entry>
2057 <entry>HTTP/1.1</entry>
2058 </row>
2059 <row>
2060 <entry>h</entry>
2061 <entry>Name of remote host</entry>
2062 <entry>remote_host</entry>
2063 <entry>varchar(50)</entry>
2064 <entry>blah.foobar.com</entry>
2065 </row>
2066 <row>
2067 <entry>I</entry>
2068 <entry>Request ID (from mod_unique_id)</entry>
2069 <entry>id</entry>
2070 <entry>char(19)</entry>
2071 <entry>POlFcUBRH30AAALdBG8</entry>
2072 </row>
2073 <row>
2074 <entry>l</entry>
2075 <entry>Ident user info</entry>
2076 <entry>remote_logname</entry>
2077 <entry>varchar(50)</entry>
2078 <entry>bobby</entry>
2079 </row>
2080 <row>
2081 <entry>M</entry>
2082 <entry>
2083 Machine ID
2084 <xref linkend="Foot.MachineID" xrefstyle="footer" />
2085 </entry>
2086 <entry>machine_id</entry>
2087 <entry>varchar(25)</entry>
2088 <entry>web01</entry>
2089 </row>
2090 <row>
2091 <entry>m</entry>
2092 <entry>HTTP request method</entry>
2093 <entry>request_method</entry>
2094 <entry>varchar(10)</entry>
2095 <entry>GET</entry>
2096 </row>
2097 <row>
2098 <entry>P</entry>
2099 <entry>httpd child PID</entry>
2100 <entry>child_pid</entry>
2101 <entry>smallint unsigned</entry>
2102 <entry>3215</entry>
2103 </row>
2104 <row>
2105 <entry>p</entry>
2106 <entry>HTTP port</entry>
2107 <entry>server_port</entry>
2108 <entry>smallint unsigned</entry>
2109 <entry>80</entry>
2110 </row>
2111 <row>
2112 <entry>R</entry>
2113 <entry>Referer</entry>
2114 <entry>referer</entry>
2115 <entry>varchar(255)</entry>
2116 <entry>
2117 http://www.biglinks4u.com/linkpage.html
2118 </entry>
2119 </row>
2120 <row>
2121 <entry>r</entry>
2122 <entry>Request in full form</entry>
2123 <entry>request_line</entry>
2124 <entry>varchar(255)</entry>
2125 <entry>GET /books-cycroad.html HTTP/1.1</entry>
2126 </row>
2127 <row>
2128 <entry>S</entry>
2129 <entry>
2130 Time of request in UNIX time_t format
2131 </entry>
2132 <entry>time_stamp</entry>
2133 <entry>int unsigned</entry>
2134 <entry>1005598029</entry>
2135 </row>
2136 <row>
2137 <entry>s</entry>
2138 <entry>HTTP response status code</entry>
2139 <entry>status</entry>
2140 <entry>smallint</entry>
2141 <entry>200</entry>
2142 </row>
2143 <row>
2144 <entry>T</entry>
2145 <entry>Seconds to service request</entry>
2146 <entry>request_duration</entry>
2147 <entry>smallint unsigned</entry>
2148 <entry>2</entry>
2149 </row>
2150 <row>
2151 <entry>t</entry>
2152 <entry>Time of request in human format</entry>
2153 <entry>request_time</entry>
2154 <entry>char(28)</entry>
2155 <entry>[02/Dec/2001:15:01:26 -0800]</entry>
2156 </row>
2157 <row>
2158 <entry>U</entry>
2159 <entry>Request in simple form</entry>
2160 <entry>request_uri</entry>
2161 <entry>varchar(255)</entry>
2162 <entry>/books-cycroad.html</entry>
2163 </row>
2164 <row>
2165 <entry>u</entry>
2166 <entry>User info from HTTP auth</entry>
2167 <entry>remote_user</entry>
2168 <entry>varchar(50)</entry>
2169 <entry>bobby</entry>
2170 </row>
2171 <row>
2172 <entry>v</entry>
2173 <entry>Virtual host servicing the request</entry>
2174 <entry>virtual_host</entry>
2175 <entry>varchar(255)</entry>
2176 <entry>www.foobar.com</entry>
2177 </row>
2178 <row>
2179 <entry>V</entry>
2180 <entry>
2181 Requested virtual host name (mass
2182 virtual hosting)
2183 </entry>
2184 <entry>virtual_host</entry>
2185 <entry>varchar(255)</entry>
2186 <entry>www.foobar.org</entry>
2187 </row>
2188 </tbody>
2189 </tgroup>
2190 </table>
2191 <note>
2192 <simpara id="Foot.LogCookie">
2193 [1] You must also specify LogSQLWhichCookie for this
2194 to take effect.
2195 </simpara>
2196 <simpara id="Foot.MachineID">
2197 [2] You must also specify LogSQLMachineID for this to
2198 take effect.
2199 </simpara>
2200 </note>
2201 <table>
2202 <title>SSL LogFormat Parameters</title>
2203 <tgroup cols="5">
2204 <colspec colname="1" colnum="1" />
2205 <colspec colname="2" colnum="2" />
2206 <colspec colname="3" colnum="3" />
2207 <colspec colname="4" colnum="4" />
2208 <colspec colname="5" colnum="5" />
2209 <thead>
2210 <row>
2211 <entry colname="1">Symbol</entry>
2212 <entry colname="2">Meaning</entry>
2213 <entry colname="3">DB Field</entry>
2214 <entry colname="4">Data Type</entry>
2215 <entry colname="5">Example</entry>
2216 </row>
2217 </thead>
2218 <tbody>
2219 <row>
2220 <entry colname="1">z</entry>
2221 <entry colname="2">SSL cipher used</entry>
2222 <entry colname="3">ssl_cipher</entry>
2223 <entry colname="4">varchar(25)</entry>
2224 <entry colname="5">RC4-MD5</entry>
2225 </row>
2226 <row>
2227 <entry colname="1">q</entry>
2228 <entry colname="2">
2229 Keysize of the SSL connection
2230 </entry>
2231 <entry colname="3">ssl_keysize</entry>
2232 <entry colname="4">smallint unsigned</entry>
2233 <entry colname="5">56</entry>
2234 </row>
2235 <row>
2236 <entry colname="1">Q</entry>
2237 <entry colname="2">
2238 Maximum keysize supported
2239 </entry>
2240 <entry colname="3">ssl_maxkeysize</entry>
2241 <entry colname="4">smallint unsigned</entry>
2242 <entry colname="5">128</entry>
2243 </row>
2244 </tbody>
2245 </tgroup>
2246 </table>
2247 <table>
2248 <title>LogIO LogFormat Parameters</title>
2249 <tgroup cols="5">
2250 <colspec colname="1" colnum="1" />
2251 <colspec colname="2" colnum="2" />
2252 <colspec colname="3" colnum="3" />
2253 <colspec colname="4" colnum="4" />
2254 <colspec colname="5" colnum="5" />
2255 <thead>
2256 <row>
2257 <entry colname="1">Symbol</entry>
2258 <entry colname="2">Meaning</entry>
2259 <entry colname="3">DB Field</entry>
2260 <entry colname="4">Data Type</entry>
2261 <entry colname="5">Example</entry>
2262 </row>
2263 </thead>
2264 <tbody>
2265 <row>
2266 <entry colname="1">i</entry>
2267 <entry colname="2">Number of actual bytes transferred in with the request</entry>
2268 <entry colname="3">bytes_in</entry>
2269 <entry colname="4">int unsigned</entry>
2270 <entry colname="5">505</entry>
2271 </row>
2272 <row>
2273 <entry colname="1">o</entry>
2274 <entry colname="2">Number of actual bytes transferred out with the request</entry>
2275 <entry colname="3">bytes_out</entry>
2276 <entry colname="4">int unsigned</entry>
2277 <entry colname="5">4168</entry>
2278 </row>
2279 </tbody>
2280 </tgroup>
2281 </table>
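<para>
To make the mapping concrete, here is the example format
string from above annotated with the DB fields it selects
according to the Core LogFormat table. This is only a
reading aid; the comments are not part of the directive.
</para>
<programlisting># h: remote_host        u: remote_user
# S: time_stamp         U: request_uri
# T: request_duration   v: virtual_host
LogSQLTransferLogFormat huSUTv</programlisting>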
2282 </listitem>
2283 </varlistentry>
2284 <varlistentry>
2285 <term>LogSQLRemhostIgnore</term>
2286 <listitem>
2287 <cmdsynopsis sepchar=" ">
2288 <command>LogSQLRemhostIgnore</command>
2289 <arg choice="req" rep="repeat">
2290 <replaceable>hostname</replaceable>
2291 </arg>
2292 </cmdsynopsis>
2293 <simpara>
2294 Example: LogSQLRemhostIgnore localnet.com
2295 </simpara>
2296 <simpara>Context: virtual host</simpara>
2297 <para>
2298 Lists a series of strings that, if present in the
2299 REMOTE_HOST, will cause that request to
2300 <emphasis>not</emphasis>
2301 be logged. This directive is useful for cutting down on
2302 log clutter when you are certain that you want to ignore
2303 requests from certain hosts, such as your own internal
2304 network machines. See section
2305 <xref endterm="Sect.Ignore.title" linkend="Sect.Ignore" />
2306 for some tips for using this directive.
2307 </para>
2308 <para>
2309 In a &lt;VirtualHost&gt; context, each string may carry a
2310 + or - prefix, which adds (+) or removes (-) that string
2311 relative to the global configuration. If strings are
2312 given without a prefix in a &lt;VirtualHost&gt;, the
2313 global list is ignored and overridden for that
2314 &lt;VirtualHost&gt;.
2315 </para>
2316 <para>
2317 Each string is separated by a space, and no regular
2318 expressions or globbing are allowed. Each string is
2319 evaluated as a substring of the REMOTE_HOST using
2320 strstr(). The comparison is case sensitive.
2321 </para>
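<para>
A hedged sketch of how the prefixes might be combined. The
host names are placeholders, and the exact spelling of a
prefixed entry (the sign written directly before the string)
is an assumption based on the description above:
</para>
<programlisting># Global list: skip requests from the internal network
LogSQLRemhostIgnore localnet.com

&lt;VirtualHost *:80&gt;
    ServerName www.example.com
    # Keep the global entry and additionally skip any remote
    # host whose name contains "crawler.example.net"
    LogSQLRemhostIgnore +crawler.example.net
&lt;/VirtualHost&gt;</programlisting>
<para>
Because matching is a case-sensitive substring test, the
entry localnet.com would match host1.localnet.com but not
HOST1.LOCALNET.COM.
</para>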
2322 </listitem>
2323 </varlistentry>
2324 <varlistentry>
2325 <term>LogSQLRequestAccept</term>
2326 <listitem>
2327 <cmdsynopsis sepchar=" ">
2328 <command>LogSQLRequestAccept</command>
2329 <arg choice="req" rep="repeat">
2330 <replaceable>substring</replaceable>
2331 </arg>
2332 </cmdsynopsis>
2333 <simpara>
2334 Example: LogSQLRequestAccept .html .php .jpg
2335 </simpara>
2336 <simpara>
2337 Default: if not specified, all requests are 'accepted'
2338 </simpara>
2339 <simpara>Context: virtual host</simpara>
2340 <para>
2341 Lists a series of strings that, if present in the URI,
2342 will permit that request to be considered for logging
2343 (depending on additional filtering by the "ignore"
2344 directives). Any request that fails to match one of the
2345 LogSQLRequestAccept entries will be discarded.
2346 </para>
2347 <para>
2348 In a &lt;VirtualHost&gt; context, each string may carry a
2349 + or - prefix, which adds (+) or removes (-) that string
2350 relative to the global configuration. If strings are
2351 given without a prefix in a &lt;VirtualHost&gt;, the
2352 global list is ignored and overridden for that
2353 &lt;VirtualHost&gt;.
2354 </para>
2355 <para>
2356 This directive is useful for cutting down on log clutter
2357 when you are certain that you only want to log certain
2358 kinds of requests, and just blanket-ignore everything
2359 else. See section
2360 <xref endterm="Sect.Ignore.title" linkend="Sect.Ignore" />
2361 for some tips for using this directive.
2362 </para>
2363 <para>
2364 Each string is separated by a space, and no regular
2365 expressions or globbing are allowed. Each string is
2366 evaluated as a substring of the URI using strstr(). The
2367 comparison is case sensitive.
2368 </para>
2369 <para>
2370 This directive is completely optional. It is more
2371 general than LogSQLRequestIgnore and is evaluated before
2372 LogSQLRequestIgnore. If this directive is not used,
2373 <emphasis>all</emphasis>
2374 requests are accepted and passed on to the other
2375 filtering directives. Therefore, only use this directive
2376 if you have a specific reason to do so.
2377 </para>
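<para>
The evaluation order described above can be illustrated
with a small sketch. The /test/ path and the sample URIs in
the comments are made up for illustration:
</para>
<programlisting>LogSQLRequestAccept .html .php .jpg
LogSQLRequestIgnore /test/
# /index.html      matches an accept string, no ignore string: logged
# /test/page.html  matches an accept string, then an ignore string: not logged
# /style.css       matches no accept string: discarded before the ignore check</programlisting>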
2378 </listitem>
2379 </varlistentry>
2380 <varlistentry>
2381 <term>LogSQLRequestIgnore</term>
2382 <listitem>
2383 <cmdsynopsis sepchar=" ">
2384 <command>LogSQLRequestIgnore</command>
2385 <arg choice="req" rep="repeat">
2386 <replaceable>substring</replaceable>
2387 </arg>
2388 </cmdsynopsis>
2389 <simpara>
2390 Example: LogSQLRequestIgnore root.exe cmd.exe
2391 default.ida favicon.ico
2392 </simpara>
2393 <simpara>Context: virtual host</simpara>
2394 <para>
2395 Lists a series of strings that, if present in the URI,
2396 will cause that request to
2397 <emphasis>NOT</emphasis>
2398 be logged. This directive is useful for cutting down on
2399 log clutter when you are certain that you want to ignore
2400 requests for certain objects. See section
2401 <xref endterm="Sect.Ignore.title" linkend="Sect.Ignore" />
2402 for some tips for using this directive.
2403 </para>
2404 <para>
2405 In a &lt;VirtualHost&gt; context, each string may carry a
2406 + or - prefix, which adds (+) or removes (-) that string
2407 relative to the global configuration. If strings are
2408 given without a prefix in a &lt;VirtualHost&gt;, the
2409 global list is ignored and overridden for that
2410 &lt;VirtualHost&gt;.
2411 </para>
2412 <para>
2413 Each string is separated by a space, and no regular
2414 expressions or globbing are allowed. Each string is
2415 evaluated as a substring of the URI using strstr(). The
2416 comparison is case sensitive.
2417 </para>
2418 </listitem>
2419 </varlistentry>
2420 <varlistentry>
2421 <term>LogSQLWhichCookie</term>
2422 <listitem>
2423 <cmdsynopsis sepchar=" ">
2424 <command>LogSQLWhichCookie</command>
2425 <arg choice="req">
2426 <replaceable>cookiename</replaceable>
2427 </arg>
2428 </cmdsynopsis>
2429 <simpara>Example: LogSQLWhichCookie Clicks</simpara>
2430 <simpara>Context: virtual host</simpara>
2431 <para>
2432 In HTTP, cookies have names to distinguish them from
2433 each other. Using mod_usertrack, for example, you can
2434 give your user-tracking cookies a name with the
2435 CookieName directive.
2436 </para>
2437 <para>
2438 mod_log_sql allows you to log cookie information.
2439 LogSQLWhichCookie tells mod_log_sql which cookie to
2440 log. This is necessary because you will usually be
2441 setting and receiving more than one cookie from a
2442 client.
2443 </para>
2444 <note>
2445 <para>
2446 You must include a 'c' character in
2447 LogSQLTransferLogFormat for this directive to take
2448 effect.
2449 </para>
2450 <para>
2451 Although this was originally intended for people
2452 using mod_usertrack to create user-tracking cookies,
2453 you are not restricted in any way. You can choose
2454 which cookie you wish to log to the database - any
2455 cookie at all - and it does not necessarily have to
2456 have anything to do with mod_usertrack.
2457 </para>
2458 </note>
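<para>
A minimal sketch of the two directives working together. It
takes the default format string documented for
LogSQLTransferLogFormat and appends the 'c' key; the cookie
name Clicks comes from the example above:
</para>
<programlisting># Select the cookie to record in the access table's cookie field
LogSQLWhichCookie Clicks
# Default format string plus 'c' so the cookie is actually logged
LogSQLTransferLogFormat AbHhmRSsTUuvc</programlisting>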
2459 </listitem>
2460 </varlistentry>
2461 <varlistentry>
2462 <term>LogSQLWhichCookies</term>
2463 <listitem>
2464 <cmdsynopsis sepchar=" ">
2465 <command>LogSQLWhichCookies</command>
2466 <arg choice="req" rep="repeat">
2467 <replaceable>cookie-name</replaceable>
2468 </arg>
2469 </cmdsynopsis>
2470 <simpara>
2471 Example: LogSQLWhichCookies userlogin cookie1 cookie2
2472 </simpara>
2473 <simpara>Context: virtual host</simpara>
2474 <para>
2475 Defines the list of cookies you would like logged. This
2476 works in conjunction with LogSQLCookieLogTable. This
2477 directive does
2478 <emphasis>not</emphasis>
2479 require any additional characters to be added to the
2480 LogSQLTransferLogFormat string. The feature is activated
2481 simply by including this directive, upon which you will
2482 begin populating the separate cookie table with data.
2483 </para>
2484 <para>
2485 In a &lt;VirtualHost&gt; context, each string may carry a
2486 + or - prefix, which adds (+) or removes (-) that string
2487 relative to the global configuration. If strings are
2488 given without a prefix in a &lt;VirtualHost&gt;, the
2489 global list is ignored and overridden for that
2490 &lt;VirtualHost&gt;.
2491 </para>
2492 <note>
2493 <para>
2494 The table must be created (see create-tables.sql,
2495 included in the package), or LogSQLCreateTables must
2496 be set to 'On'.
2497 </para>
2498 </note>
2499 </listitem>
2500 </varlistentry>
2501 <varlistentry>
2502 <term>LogSQLWhichHeadersIn</term>
2503 <listitem>
2504 <cmdsynopsis sepchar=" ">
2505 <command>LogSQLWhichHeadersIn</command>
2506 <arg choice="req" rep="repeat">
2507 <replaceable>header-name</replaceable>
2508 </arg>
2509 </cmdsynopsis>
2510 <simpara>
2511 Example: LogSQLWhichHeadersIn User-Agent Accept-Encoding
2512 Host
2513 </simpara>
2514 <simpara>Context: virtual host</simpara>
2515 <para>
2516 Defines the list of inbound headers you would like
2517 logged. This works in conjunction with
2518 LogSQLHeadersInLogTable. This directive does not require
2519 any additional characters to be added to the
2520 LogSQLTransferLogFormat string. The feature is activated
2521 simply by including this directive, upon which you will
2522 begin populating the separate inbound-headers table with
2523 data.
2524 </para>
2525 <para>
2526 In a &lt;VirtualHost&gt; context, each string may carry a
2527 + or - prefix, which adds (+) or removes (-) that string
2528 relative to the global configuration. If strings are
2529 given without a prefix in a &lt;VirtualHost&gt;, the
2530 global list is ignored and overridden for that
2531 &lt;VirtualHost&gt;.
2532 </para>
2533 <note>
2534 <para>
2535 The table must be created (see create-tables.sql,
2536 included in the package), or LogSQLCreateTables must
2537 be set to 'On'.
2538 </para>
2539 </note>
2540 </listitem>
2541 </varlistentry>
2542 <varlistentry>
2543 <term>LogSQLWhichHeadersOut</term>
2544 <listitem>
2545 <cmdsynopsis sepchar=" ">
2546 <command>LogSQLWhichHeadersOut</command>
2547 <arg choice="req" rep="repeat">
2548 <replaceable>header-name</replaceable>
2549 </arg>
2550 </cmdsynopsis>
2551 <simpara>
2552 Example: LogSQLWhichHeadersOut Expires Content-Type
2553 Cache-Control
2554 </simpara>
2555 <simpara>Context: virtual host</simpara>
2556 <para>
2557 Defines the list of outbound headers you would like
2558 logged. This works in conjunction with
2559 LogSQLHeadersOutLogTable. This directive does not
2560 require any additional characters to be added to the
2561 LogSQLTransferLogFormat string. The feature is activated
2562 simply by including this directive, upon which you will
2563 begin populating the separate outbound-headers table
2564 with data.
2565 </para>
2566 <para>
2567 In a &lt;VirtualHost&gt; context, each string may carry a
2568 + or - prefix, which adds (+) or removes (-) that string
2569 relative to the global configuration. If strings are
2570 given without a prefix in a &lt;VirtualHost&gt;, the
2571 global list is ignored and overridden for that
2572 &lt;VirtualHost&gt;.
2573 </para>
2574 <note>
2575 <para>
2576 The table must be created (see create-tables.sql,
2577 included in the package), or LogSQLCreateTables must
2578 be set to 'On'.
2579 </para>
2580 </note>
2581 </listitem>
2582 </varlistentry>
2583 <varlistentry>
2584 <term>LogSQLWhichNotes</term>
2585 <listitem>
2586 <cmdsynopsis sepchar=" ">
2587 <command>LogSQLWhichNotes</command>
2588 <arg choice="req" rep="repeat">
2589 <replaceable>note-name</replaceable>
2590 </arg>
2591 </cmdsynopsis>
2592 <simpara>
2593 Example: LogSQLWhichNotes mod_gzip_result
2594 mod_gzip_compression_ratio
2595 </simpara>
2596 <simpara>Context: virtual host</simpara>
2597 <para>
2598 Defines the list of notes you would like logged. This
2599 works in conjunction with LogSQLNotesLogTable. This
2600 directive does not require any additional characters to
2601 be added to the LogSQLTransferLogFormat string. The
2602 feature is activated simply by including this directive,
2603 upon which you will begin populating the separate notes
2604 table with data.
2605 </para>
2606 <para>
2607 In a &lt;VirtualHost&gt; context, each string may carry a
2608 + or - prefix, which adds (+) or removes (-) that string
2609 relative to the global configuration. If strings are
2610 given without a prefix in a &lt;VirtualHost&gt;, the
2611 global list is ignored and overridden for that
2612 &lt;VirtualHost&gt;.
2613 </para>
2614 <note>
2615 <para>
2616 The table must be created (see create-tables.sql,
2617 included in the package), or LogSQLCreateTables must
2618 be set to 'On'.
2619 </para>
2620 </note>
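<para>
The LogSQLWhichCookies, LogSQLWhichHeadersIn,
LogSQLWhichHeadersOut and LogSQLWhichNotes directives all
follow the same pattern, so one sketch can cover them. The
table names below are the documented defaults. Adding 'I'
to the format string is an assumption made here so that the
access table records the mod_unique_id request ID that the
separate tables are keyed to; mod_unique_id is assumed to
be loaded.
</para>
<programlisting># Record the request ID alongside the default fields (assumption, see above)
LogSQLTransferLogFormat AbHhmRSsTUuvI

# Each list directive pairs with its own table directive
LogSQLCookieLogTable     cookies
LogSQLWhichCookies       userlogin

LogSQLHeadersInLogTable  headers_in
LogSQLWhichHeadersIn     User-Agent Accept-Encoding Host

LogSQLHeadersOutLogTable headers_out
LogSQLWhichHeadersOut    Expires Content-Type Cache-Control

LogSQLNotesLogTable      notes
LogSQLWhichNotes         mod_gzip_result mod_gzip_compression_ratio

# Let the module create the tables instead of running create-tables.sql
LogSQLCreateTables On</programlisting>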
2621 </listitem>
2622 </varlistentry>
2623 </variablelist>
2624 </section>
2625 <section>
2626 <title>Deprecated Commands</title>
2627 <variablelist>
2628 <varlistentry>
2629 <term>LogSQLSocketFile [Deprecated]</term>
2630 <listitem>
2631 <cmdsynopsis sepchar=" ">
2632 <command>LogSQLSocketFile</command>
2633 <arg choice="req">
2634 <replaceable>filename</replaceable>
2635 </arg>
2636 </cmdsynopsis>
2637 <simpara>
2638 Example: LogSQLSocketFile /tmp/mysql.sock
2639 </simpara>
2640 <simpara>Default: (database specific)</simpara>
2641 <simpara>
2642 Default (MySQL): /var/lib/mysql/mysql.sock
2643 </simpara>
2644 <simpara>Context: main server config</simpara>
2645 <para>
2646 At Apache runtime you can specify the MySQL socket file
2647 to use. Set this once in your main server config to
2648 override the default value. This value is irrelevant if
2649 your database resides on a separate machine.
2650 </para>
2651 <para>
2652 mod_log_sql will automatically employ the socket for db
2653 communications if the database resides on the local
2654 host. If the db resides on a separate host the module
2655 will automatically use TCP/IP. This is a function of the
2656 MySQL API and is not user-configurable.
2657 </para>
2658 <note>
2659 <para>
2660 This directive is deprecated in favor of LogSQLDBParam
2661 socketfile [socketfilename]
2662 </para>
2663 <para>
2664 This is defined only once in the
2665 <filename>httpd.conf</filename>
2666 file.
2667 </para>
2668 </note>
2669 </listitem>
2670 </varlistentry>
2671 <varlistentry>
2672 <term>LogSQLTCPPort [Deprecated]</term>
2673 <listitem>
2674 <cmdsynopsis sepchar=" ">
2675 <command>LogSQLTCPPort</command>
2676 <arg choice="req">
2677 <replaceable>port-number</replaceable>
2678 </arg>
2679 </cmdsynopsis>
2680 <simpara>Example: LogSQLTCPPort 3309</simpara>
2681 <simpara>Default: (database specific)</simpara>
2682 <simpara>Default (MySQL): 3306</simpara>
2683 <simpara>Context: main server config</simpara>
2684 <para>
2685 Your database may listen on a different port than the
2686 default. If so, use this directive to instruct the
2687 module which port to use. This directive only applies if
2688 the database is on a different machine connected via
2689 TCP/IP.
2690 </para>
2691 <note>
2692 <para>
2693 This directive is deprecated in favor of LogSQLDBParam
2694 tcpport [port-number]
2695 </para>
2696 <para>
2697 This is defined only once in the
2698 <filename>httpd.conf</filename>
2699 file.
2700 </para>
2701 </note>
2702 </listitem>
2703 </varlistentry>
2704 <varlistentry>
2705 <term>LogSQLDatabase [Deprecated]</term>
2706 <listitem>
2707 <cmdsynopsis sepchar=" ">
2708 <command>LogSQLDatabase</command>
2709 <arg choice="req">
2710 <replaceable>database</replaceable>
2711 </arg>
2712 </cmdsynopsis>
2713 <simpara>Example: LogSQLDatabase loggingdb</simpara>
2714 <simpara>Context: main server config</simpara>
2715 <para>
2716 Defines the database that is used for logging.
2717 "database" must be a valid db on the MySQL host defined
2718 in LogSQLLoginInfo.
2719 </para>
2720 <note>
2721 <para>
2722 This directive is deprecated in favor of the URI form
2723 of LogSQLLoginInfo.
2724 </para>
2725 <para>
2726 This is defined only once in the
2727 <filename>httpd.conf</filename>
2728 file.
2729 </para>
2730 </note>
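<para>
For migration, the three deprecated directives in this
section map onto their replacements roughly as sketched
below. The LogSQLDBParam lines follow the forms named in
the notes above; the user, password and host in the
LogSQLLoginInfo URI are placeholders, and the URI layout
shown is an assumption that should be checked against the
LogSQLLoginInfo entry.
</para>
<programlisting># Was: LogSQLSocketFile /tmp/mysql.sock
LogSQLDBParam socketfile /tmp/mysql.sock

# Was: LogSQLTCPPort 3309
LogSQLDBParam tcpport 3309

# Was: LogSQLDatabase loggingdb
# The database name now travels in the connection URI
LogSQLLoginInfo mysql://loguser:password@dbhost.example.com/loggingdb</programlisting>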
2731 </listitem>
2732 </varlistentry>
2733 </variablelist>
2734 </section>
2735 </section>
2736 </section>
2737 <section id="Sect.FAQ">
2738 <title>FAQ</title>
2739 <qandaset>
2740 <qandadiv>
2741 <title>General module questions</title>
2742 <qandaentry id="FAQ.WhyLogToSQL">
2743 <question>
2744 <para>Why log to an SQL database?</para>
2745 </question>
2746 <answer>
2747 <para>
2748 To begin with, let's get it out of the way: logging to a
2749 database is not a panacea. But while there are
2750 complexities with this solution, the benefit can be
2751 substantial for certain classes of administrator or people
2752 with advanced requirements:
2753 </para>
2754 <itemizedlist>
2755 <listitem>
2756 <para>
2757 Chores like log rotation go away, as you can DELETE
2758 records from the SQL database once they are no longer
2759 useful. For example, the excellent and popular
2760 log-analysis tool Webalizer (http://www.webalizer.com)
2761 does not need historic logs after it has processed
2762 them, enabling you to delete older logs.
2763 </para>
2764 </listitem>
2765 <listitem>
2766 <para>
2767 People with clusters of web servers (for high
2768 availability) will benefit the most - all their
2769 webservers can log to a single SQL database. This
2770 obviates the need to collate/interleave the many
2771 separate logfiles, which can be <emphasis>highly</emphasis> problematic.
2772 </para>
2773 </listitem>
2774 <listitem>
2775 <para>
2776 People acquainted with the power of SQL SELECT
2777 statements will know the flexibility of the extraction
2778 possibilities at their fingertips.
2779 </para>
2780 </listitem>
2781 </itemizedlist>
2782 <para>
2783 For example, do you want to see all your 404's? Do this:
2784 </para>
2785 <programlisting>SELECT remote_host, status, request_uri, bytes_sent, from_unixtime(time_stamp)
2786FROM acc_log_tbl WHERE status=404 ORDER BY time_stamp;</programlisting>
2787 <table>
2788 <title></title>
2789 <tgroup cols="5">
2790 <colspec colname="1" />
2791 <colspec colname="2" />
2792 <colspec colname="3" />
2793 <colspec colname="4" />
2794 <colspec colname="5" />
2795 <thead>
2796 <row>
2797 <entry colname="1">remote_host</entry>
2798 <entry colname="2">status</entry>
2799 <entry colname="3">request_uri</entry>
2800 <entry colname="4">bytes_sent</entry>
2801 <entry colname="5">from_unixtime(time_stamp)</entry>
2802 </row>
2803 </thead>
2804 <tbody>
2805 <row>
2806 <entry colname="1">marge.mmm.co.uk</entry>
2807 <entry colname="2">404</entry>
2808 <entry colname="3">/favicon.ico</entry>
2809 <entry colname="4">321</entry>
2810 <entry colname="5">2001-11-20 02:30:56</entry>
2811 </row>
2812 <row>
2813 <entry colname="1">62.180.239.251</entry>
2814 <entry colname="2">404</entry>
2815 <entry colname="3">/favicon.ico</entry>
2816 <entry colname="4">333</entry>
2817 <entry colname="5">2001-11-20 02:45:25</entry>
2818 </row>
2819 <row>
2820 <entry colname="1">212.234.12.66</entry>
2821 <entry colname="2">404</entry>
2822 <entry colname="3">/favicon.ico</entry>
2823 <entry colname="4">321</entry>
2824 <entry colname="5">2001-11-20 03:01:00</entry>
2825 </row>
2826 <row>
2827 <entry colname="1">212.210.78.254</entry>
2828 <entry colname="2">404</entry>
2829 <entry colname="3">/favicon.ico</entry>
2830 <entry colname="4">333</entry>
2831 <entry colname="5">2001-11-20 03:26:05</entry>
2832 </row>
2833 </tbody>
2834 </tgroup>
2835 </table>
2836 <para>
2837 Or do you want to see how many bytes you've sent within a
2838 certain directory or site? Do this:
2839 </para>
2840 <programlisting>SELECT request_uri,sum(bytes_sent) AS bytes, count(request_uri) AS howmany
2841FROM acc_log_tbl
2842WHERE request_uri LIKE '%mod_log_sql%'
2843GROUP BY request_uri ORDER BY howmany DESC;</programlisting>
2844 <table>
2845 <title></title>
2846 <tgroup cols="3">
2847 <colspec colname="1" />
2848 <colspec colname="2" />
2849 <colspec colname="3" />
2850 <thead>
2851 <row>
2852 <entry colname="1">request_uri</entry>
2853 <entry colname="2">bytes</entry>
2854 <entry colname="3">howmany</entry>
2855 </row>
2856 </thead>
2857 <tbody>
2858 <row>
2859 <entry colname="1">/mod_log_sql/style_1.css</entry>
2860 <entry colname="2">157396</entry>
2861 <entry colname="3">1288</entry>
2862 </row>
2863 <row>
2864 <entry colname="1">/mod_log_sql/</entry>
2865 <entry colname="2">2514337</entry>
2866 <entry colname="3">801</entry>
2867 </row>
2868 <row>
2869 <entry colname="1">
2870 /mod_log_sql/mod_log_sql.tar.gz
2871 </entry>
2872 <entry colname="2">9769312</entry>
2873 <entry colname="3">456</entry>
2874 </row>
2875 <row>
2876 <entry colname="1">/mod_log_sql/faq.html</entry>
2877 <entry colname="2">5038728</entry>
2878 <entry colname="3">436</entry>
2879 </row>
2880 </tbody>
2881 </tgroup>
2882 </table>
2883 <para>
2884 Or maybe you want to see who's linking to you? Do this:
2885 </para>
2886 <programlisting>SELECT count(referer) AS num,referer
2887FROM acc_log_tbl
2888WHERE request_uri='/mod_log_sql/'
2889GROUP BY referer ORDER BY num DESC;</programlisting>
2890 <table>
2891 <title></title>
2892 <tgroup cols="2">
2893 <colspec colname="1" />
2894 <colspec colname="2" />
2895 <thead>
2896 <row>
2897 <entry colname="1">num</entry>
2898 <entry colname="2">referer</entry>
2899 </row>
2900 </thead>
2901 <tbody>
2902 <row>
2903 <entry colname="1">271</entry>
2904 <entry colname="2">
2905 http://freshmeat.net/projects/mod_log_sql/
2906 </entry>
2907 </row>
2908 <row>
2909 <entry colname="1">96</entry>
2910 <entry colname="2">
2911 http://modules.apache.org/search?id=339
2912 </entry>
2913 </row>
2914 <row>
2915 <entry colname="1">48</entry>
2916 <entry colname="2">http://freshmeat.net/</entry>
2917 </row>
2918 <row>
2919 <entry colname="1">8</entry>
2920 <entry colname="2">http://freshmeat.net</entry>
2921 </row>
2922 </tbody>
2923 </tgroup>
2924 </table>
2925 <para>
2926 As you can see, there are myriad possibilities that can be
2927 constructed with the wonderful SQL SELECT statement.
2928 Logging to an SQL database can be really quite useful!
2929 </para>
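 <para>
 The log-rotation point in the list above can be handled with the
 same kind of one-liner. This is only a sketch: the 90-day window is
 an arbitrary assumption, and acc_log_tbl and time_stamp are simply
 the table and column names used in the queries above, so adjust
 them to your own schema.
 </para>
 <programlisting>-- Remove entries older than 90 days (time_stamp holds a Unix timestamp)
DELETE FROM acc_log_tbl
WHERE time_stamp &lt; UNIX_TIMESTAMP(NOW() - INTERVAL 90 DAY);</programlisting>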
2930 </answer>
2931 </qandaentry>
2932 <qandaentry>
2933 <question>
2934 <para>Why use MySQL? Are there alternatives?</para>
2935 </question>
2936 <answer>
2937 <para>
2938 MySQL is a robust, free, and very powerful
2939 production-quality database engine. It is well supported
2940 and comes with detailed documentation. Many 3rd-party
2941 software packages (e.g. Slashcode, the engine that powers
2942 Slashdot) run exclusively with MySQL. In other words, you
2943 will belong to a very robust and well-supported community
2944 by choosing MySQL.
2945 </para>
2946 <para>
2947 That being said, there are alternatives. PostgreSQL is
2948 probably MySQL's leading "competitor" in the free database
2949 world. There is also an excellent module available for
2950 Apache to permit logging to a PostgreSQL database, called
2951 <ulink url="http://www.digitalstratum.com/pglogd/">
2952 pgLOGd
2953 </ulink>.
2954 </para>
2955 </answer>
2956 <answer>
2957 <note>
2958 <para>
2959 Currently a database abstraction system is in the works
2960 to allow any database to be used with mod_log_sql.
2961 </para>
2962 </note>
2963 </answer>
2964 </qandaentry>
2965 <qandaentry>
2966 <question>
2967 <para>Is this code production-ready?</para>
2968 </question>
2969 <answer>
2970 <para>
2971 By all accounts it is. It is known to work without a
2972 problem on many-thousands-of-hits-per-day webservers. Does
2973 that mean it is 100% bug free? Well, no software is, but
2974 it is well-tested and believed to be fully compatible with
2975 production environments. (The usual disclaimers apply.
2976 This software is provided without warranty of any kind.)
2977 </para>
2978 </answer>
2979 </qandaentry>
2980 <qandaentry>
2981 <question>
2982 <para>Who's using mod_log_sql?</para>
2983 </question>
2984 <answer>
2985 <para>
2986 Good question! It would be great to find out! If you are a
2987 production-level mod_log_sql user, please contact eddie at
2988 &EmailContact;
2989 so that you can be mentioned here.
2990 </para>
2991 </answer>
2992 </qandaentry>
2993 <qandaentry>
2994 <question>
2995 <para>
2996 Why doesn't the module also replace the Apache ErrorLog?
2997 </para>
2998 </question>
2999 <answer>
3000 <para>
3001 There are circumstances when that would be quite unwise --
3002 for example, if Apache could not reach the MySQL server
3003 for some reason and needed to log that fact. Without a
3004 text-based error log you'd never know anything was wrong,
3005 because Apache would be trying to log a database
3006 connection error to the database... you get the point.
3007 </para>
3008 </answer>
3009 <answer>
3010 <para>
3011 Error logs are usually not very high-traffic and are
3012 really best left as text files on a web server machine.
3013 </para>
3014 </answer>
3015 <answer>
3016 <para>
3017 The error log is free-format text (with no specified
3018 formatting whatsoever), which makes it rather difficult to
3019 structure for storage in a database.
3020 </para>
3021 </answer>
3022 </qandaentry>
3023 <qandaentry>
3024 <question>
3025 <para>Does mod_log_sql work with Apache 2.x?</para>
3026 </question>
3027 <answer>
3028 <para>
3029 Yes. A port of mod_log_sql is available for Apache 2.x as
3030 of mod_log_sql 1.90.
3031 </para>
3032 </answer>
3033 </qandaentry>
3034 <qandaentry>
3035 <question>
3036 <para>
3037 Does mod_log_sql connect to MySQL via TCP/IP or a socket?
3038 </para>
3039 </question>
3040 <answer>
3041 <para>Quick answer: yes, both.</para>
3042 </answer>
3043 <answer>
3044 <para>
3045 It depends! This is not determined by mod_log_sql.
3046 mod_log_sql relies on a connection command that is
3047 supplied in the MySQL API, and that command is somewhat
3048 intelligent. How it works:
3049 </para>
3050 <itemizedlist>
3051 <listitem>
3052 <simpara>
3053 if the specified MySQL database is on the same
3054 machine, the connection command uses a socket to
3055 communicate with MySQL
3056 </simpara>
3057 </listitem>
3058 <listitem>
3059 <simpara>
3060 if the specified MySQL database is on a different
3061 machine, mod_log_sql connects using TCP/IP.
3062 </simpara>
3063 </listitem>
3064 </itemizedlist>
3065 <para>
3066 You have no direct control over which method is used.
3067 You can fine-tune some of the configuration, however. The
3068 LogSQLSocketFile runtime configuration directive overrides
3069 the default of "/var/lib/mysql/mysql.sock" for
3070 socket-based connections, whereas the LogSQLTCPPort
3071 directive allows you to override the default TCP port of
3072 3306 for TCP/IP connections.
3073 </para>
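 <para>
 Before overriding either default, it can help to confirm what your
 MySQL server is actually using. As a small sketch, assuming only
 that you can log in with any MySQL client, the server will report
 both values itself:
 </para>
 <programlisting>-- Path of the Unix socket file the server listens on
SHOW VARIABLES LIKE 'socket';
-- TCP port the server listens on
SHOW VARIABLES LIKE 'port';</programlisting>
 <para>
 If either value differs from the defaults named above, that is the
 value to hand to LogSQLSocketFile or LogSQLTCPPort.
 </para>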
3074 </answer>
3075 </qandaentry>
3076 <qandaentry>
3077 <question>
3078 <para>I have discovered a bug. Who can I contact?</para>
3079 </question>
3080 <answer>
3081 <para>
3082 Please contact Edward Rudd at
3083 &EmailContact;
3084 , or post a message to the mod_log_sql
3085 <xref endterm="Sect.MailingLists.title"
3086 linkend="Sect.MailingLists" />
3087 . Your comments, suggestions, bugfixes, bug catches, and
3088 usage testimonials are always welcome. As free software,
3089 mod_log_sql is intended to be a community effort -- any
3090 code contributions or other ideas will be fully and openly
3091 credited, of course.
3092 </para>
3093 </answer>
3094 </qandaentry>
3095 </qandadiv>
3096 <qandadiv>
3097 <title>Problems</title>
3098 <qandaentry>
3099 <question>
3100 <para>
3101 Apache segfaults or has other problems when using PHP and
3102 mod_log_sql
3103 </para>
3104 </question>
3105 <answer>
3106 <para>
3107 This occurs if you compiled PHP with MySQL database
3108 support. PHP utilizes its internal, bundled MySQL
3109 libraries by default. These conflict with the "real" MySQL
3110 libraries linked by mod_log_sql, causing the segmentation
3111 fault.
3112 </para>
3113 <para>
3114 PHP and mod_log_sql can be configured to happily coexist.
3115 The solution is to configure PHP to link against the real
3116 MySQL libraries: recompile PHP using
3117 --with-mysql=/your/path. Apache will run properly once the
3118 modules are all using the same version of the MySQL
3119 libraries.
3120 </para>
3121 </answer>
3122 </qandaentry>
3123 <qandaentry id="FAQ.NothingLogged">
3124 <question>
3125 <para>
3126 Apache appears to start up fine, but nothing is getting
3127 logged in the database
3128 </para>
3129 </question>
3130 <answer>
3131 <para>
3132 If you do not see any entries in the access_log, then
3133 something is preventing the inserts from happening. This
3134 could be caused by several things:
3135 </para>
3136 <itemizedlist>
3137 <listitem>
3138 <simpara>
3139 Improper privileges set up in the MySQL database
3140 </simpara>
3141 </listitem>
3142 <listitem>
3143 <simpara>
3144 You are not hitting a VirtualHost that has a
3145 LogSQLTransferLogTable entry
3146 </simpara>
3147 </listitem>
3148 <listitem>
3149 <simpara>
3150 You did not specify the right database host or login
3151 information
3152 </simpara>
3153 </listitem>
3154 <listitem>
3155 <simpara>
3156 Another factor is preventing a connection to the
3157 database
3158 </simpara>
3159 </listitem>
3160 </itemizedlist>
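 <para>
 For the first item in the list above, one quick check is to ask
 MySQL which privileges the logging account actually holds. The
 account loguser@my.apachemachine.com below is a placeholder
 assumption; substitute the user and host from your own
 LogSQLLoginInfo setting.
 </para>
 <programlisting>-- List the privileges granted to the (hypothetical) logging account
SHOW GRANTS FOR 'loguser'@'my.apachemachine.com';</programlisting>
 <para>
 The output should include at least INSERT on the logging database;
 if it does not, the inserts will be rejected.
 </para>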
3161 <note>
3162 <para>
3163 It is improper to ask for help before you have followed
3164 these steps.
3165 </para>
3166 </note>
3167 <para>
3168 First examine the MySQL log that you established in step
3169 <xref linkend="Item.EnableLogging" />
3170 of section
3171 <xref endterm="Sect.Preperation.title"
3172 linkend="Sect.Preperation" />
3173 . Ensure that the INSERT statements are not being rejected
3174 because of a malformed table name or other typographical
3175 error. By enabling that log, you instructed MySQL to log
3176 every connection and command it receives -- if you see no
3177 INSERT attempts in the log, the module isn't successfully
3178 connecting to the database. If you see nothing at all in
3179 the log -- not even a record of your administrative
3180 connection attempts -- then you did not enable the log
3181 correctly. If you do see INSERT attempts but they are
3182 failing, the log should tell you why.
3183 </para>
3184 <para>
3185 Second, confirm that your LogSQL* directives are all
3186 correct.
3187 </para>
3188 <para>
3189 Third, examine the Apache error logs for messages from
3190 mod_log_sql; the module will offer hints as to why it
3191 cannot connect, etc.
3192 </para>
3193 <para>
3194 The next thing to do is to change the LogLevel directive
3195 <emphasis>
3196 in the main server config as well as in each VirtualHost
3197 config:
3198 </emphasis>
3199 </para>
3200 <programlisting>LogLevel debug
3201ErrorLog /var/log/httpd/server-messages</programlisting>
3202 </answer>
3203 </qandaentry>
3204 <qandaentry>
3205 <question>
3206 <para>
3207 Why do I get the message "insufficient configuration info
3208 to establish database link" in my Apache error log?
3209 </para>
3210 </question>
3211 <answer>
3212 <para>
3213 At a minimum, LogSQLLoginInfo in the URL form and either
3214 LogSQLTableName or LogSQLMassVirtualHosting must be
3215 defined in order for the module to be able to establish a
3216 database link. If these are not defined or are incomplete
3217 you will receive this error message.
3218 </para>
3219 </answer>
3220 </qandaentry>
3221 <qandaentry>
3222 <question>
3223 <para>
3224 My database cannot handle all the open connections from
3225 mod_log_sql. Is there anything I can do?
3226 </para>
3227 </question>
3228 <answer>
3229 <para>
3230 The rule of thumb: if you have n webservers each
3231 configured to support y MaxClients, then your database
3232 must be able to handle n times y simultaneous connections
3233 in the worst case. Certainly you must use common sense,
3234 consider reasonable traffic expectations and structure
3235 things accordingly.
3236 </para>
3237 </answer>
3238 <answer>
3239 <para>
3240 Tweaking my.cnf to scale to high connection loads is
3241 imperative. But if hardware limitations prevent your MySQL
3242 server from gracefully handling the number of incoming
3243 connections, it would be beneficial to upgrade the memory
3244 or CPU on that server in order to handle the load.
3245 </para>
3246 </answer>
3247 <answer>
3248 <para>
3249 Jeremy Zawodny, a highly respected MySQL user and
3250 contributor to Linux Magazine, has this very helpful and
3251 highly appropriate article on tuning MySQL:
3252 <ulink
3253 url="http://jeremy.zawodny.com/blog/archives/000173.html">
3254 MySQL, Linux, and Thread Caching
3255 </ulink>
3256 </para>
3257 </answer>
3258 <answer>
3259 <para>
3260 Please remember that mod_log_sql's overriding principle is
3261 performance -- that is what the target audience demands
3262 and expects. Other database logging solutions do not open
3263 and maintain many database connections, but their
3264 performance suffers drastically. For example, pgLOGd
3265 funnels all log connections through a separate daemon that
3266 connects to the database, but that bottlenecks the entire
3267 process. mod_log_sql achieves performance numbers an order
3268 of magnitude greater than the alternatives because it
3269 dispenses with the overhead associated with rapid
3270 connection cycling, and it does not attempt to shoehorn
3271 all the database traffic through a single extra daemon or
3272 proxy process.
3273 </para>
3274 </answer>
3275 <answer>
3276 <note>
3277 <para>
3278 Currently connection pooling is being implemented as
3279 part of the Database Abstraction layer to allow multiple
3280 httpd processes to share connections.
3281 </para>
3282 </note>
3283 </answer>
3284 </qandaentry>
3285 <qandaentry>
3286 <question>
3287 <para>
3288 Why do I occasionally see a "lost connection to MySQL
3289 server" message in my Apache error log?
3290 </para>
3291 </question>
3292 <answer>
3293 <para>
3294 This message may appear every now and then in your Apache
3295 error log, especially on very lightly loaded servers. This
3296 does not mean that anything is necessarily wrong. Within
3297 each httpd child process, mod_log_sql will open (and keep
3298 open) a connection to the MySQL server. MySQL, however,
3299 will close connections that have not been used in a while;
3300 the default timeout is 8 hours. When this occurs,
3301 mod_log_sql will notice and re-open the connection. That
3302 event is what is being logged, and looks like this:
3303 </para>
3304 <programlisting>[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: first attempt failed,
3305 API said: error 2013, Lost connection to MySQL server during query
3306[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: reconnect successful
3307[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: second attempt successful</programlisting>
3308 <para>
3309 Reference:
3310 <ulink
3311 url="http://dev.mysql.com/doc/mysql/en/Gone_away.html">
3312 MySQL documentation
3313 </ulink>
3314 </para>
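 <para>
 If you are curious what idle timeout your own server enforces, the
 following sketch shows it. The value is reported in seconds, so the
 8-hour default mentioned above appears as 28800.
 </para>
 <programlisting>-- Seconds an idle non-interactive connection may stay open
SHOW VARIABLES LIKE 'wait_timeout';</programlisting>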
3315 </answer>
3316 </qandaentry>
3317 <qandaentry>
3318 <question>
3319 <para>
3320 Sometimes a single VirtualHost gets logged to two
3321 different tables (e.g. access_foo_com,
3322 access_www_foo_com). Or, accesses to an unqualified
3323 hostname (e.g. "http://intranet/index.html") get logged in
3324 separate tables.
3325 </para>
3326 </question>
3327 <answer>
3328 <para>
3329 Proper usage of the Apache runtime ServerName directive
3330 and the directive UseCanonicalName On (or DNS) are
3331 necessary to prevent this problem. "On" is the default for
3332 UseCanonicalName, and specifies that self-referential URLs
3333 are generated from the ServerName part of your
3334 VirtualHost:
3335 </para>
3336 <para>
3337 With UseCanonicalName on (and in all versions prior to
3338 1.3) Apache will use the ServerName and Port directives to
3339 construct the canonical name for the server. With
3340 UseCanonicalName off Apache will form self-referential
3341 URLs using the hostname and port supplied by the client if
3342 any are supplied (otherwise it will use the canonical
3343 name, as defined above). [From
3344 <ulink
3345 url="http://httpd.apache.org/docs/mod/core.html#usecanonicalname">
3346 the Apache documentation
3347 </ulink>
3348 ]
3349 </para>
3350 <para>
3351 The module inherits Apache's "knowledge" about the server
3352 name being accessed. As long as those two directives are
3353 properly configured, mod_log_sql will log to only one
3354 table per virtual host while using
3355 LogSQLMassVirtualHosting.
3356 </para>
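 <para>
 One informal way to confirm the fix is to list the tables that have
 been created and look for duplicates of the same host. This sketch
 assumes your per-host tables share the "access" prefix seen in the
 question above; adjust the pattern if yours differ.
 </para>
 <programlisting>-- Show every table whose name starts with "access"
SHOW TABLES LIKE 'access%';</programlisting>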
3357 </answer>
3358 </qandaentry>
3359 </qandadiv>
3360 <qandadiv>
3361 <title>Performance and Tuning</title>
3362 <qandaentry>
3363 <question>
3364 <para>How well does it perform?</para>
3365 </question>
3366 <answer>
3367 <para>
3368 mod_log_sql scales to very high loads. Apache 1.3.22 +
3369 mod_log_sql was benchmarked using the "ab" (Apache Bench)
3370 program that comes with the Apache distribution; here are
3371 the results.
3372 </para>
3373 <itemizedlist>
3374 <title>Overall configuration</title>
3375 <listitem>
3376 <simpara>Machine A: Apache webserver</simpara>
3377 </listitem>
3378 <listitem>
3379 <simpara>Machine B: MySQL server</simpara>
3380 </listitem>
3381 <listitem>
3382 <simpara>
3383 Machines A and B connected with 100Mbps Ethernet
3384 </simpara>
3385 </listitem>
3386 <listitem>
3387 <simpara>
3388 Webserver: Celeron 400, 128MB RAM, IDE storage
3389 </simpara>
3390 </listitem>
3391 </itemizedlist>
3392 <example>
3393 <title>Apache configuration</title>
3394 <programlisting>Timeout 300
3395KeepAlive On
3396MaxKeepAliveRequests 100
3397KeepAliveTimeout 15
3398MinSpareServers 5
3399StartServers 10
3400MaxSpareServers 15
3401MaxClients 256
3402MaxRequestsPerChild 5000
3403LogSQLTransferLogFormat AbHhmRSsTUuvc
3404LogSQLWhichCookie Clicks
3405CookieTracking on
3406CookieName Clicks</programlisting>
3407 </example>
3408 <example>
3409 <title>"ab" commandline</title>
3410 <programlisting>./ab -c 10 -t 20 -v 2 -C Clicks=ab_run http://www.hostname.com/target</programlisting>
3411 </example>
3412 <para>
3413 (10 concurrent requests; 20 second test; setting a cookie
3414 "Clicks=ab_run"; target = the mod_log_sql homepage.)
3415 </para>
3416 <para>
3417 Ten total ab runs were conducted: five with MySQL logging
3418 enabled, and five with all MySQL directives commented out
3419 of httpd.conf. Then each five were averaged. The results:
3420 </para>
3421 <itemizedlist>
3422 <listitem>
3423 <simpara>
3424 Average of five runs employing MySQL and standard text
3425 logging:
3426 <emphasis>
3427 139.01 requests per second, zero errors.
3428 </emphasis>
3429 </simpara>
3430 </listitem>
3431 <listitem>
3432 <simpara>
3433 Average of five runs employing only standard text
3434 logging:
3435 <emphasis>
3436 139.96 requests per second, zero errors.
3437 </emphasis>
3438 </simpara>
3439 </listitem>
3440 </itemizedlist>
3441 <para>
3442 In other words, any rate-limiting effects on this
3443 particular hardware setup are not caused by MySQL. Note
3444 that although this very simple webserver setup is hardly
3445 cutting-edge (it is, after all, a fairly small machine),
3446 139 requests per second still equals over twelve million hits
3447 per day.
3448 </para>
3449 <orderedlist>
3450 <title>
3451 If you run this benchmark yourself, take note of three
3452 things:
3453 </title>
3454 <listitem>
3455 <simpara>
3456 Use a target URL that is on your own webserver :-).
3457 </simpara>
3458 </listitem>
3459 <listitem>
3460 <simpara>
3461 Wait until all your connections are closed out between
3462 runs; after several thousand requests your TCP/IP
3463 stack will be filled with hundreds of connections in
3464 TIME_WAIT that need to close. Do a "netstat -t|wc -l"
3465 on the webserver to see. If you don't wait, you can
3466 expect to see a lot of messages like "ip_conntrack:
3467 table full, dropping packet" in your logs. (This has
3468 nothing to do with mod_log_sql, this is simply the
3469 nature of the TCP/IP stack in the Linux kernel.)
3470 </simpara>
3471 </listitem>
3472 <listitem>
3473 <simpara>
3474 When done with your runs, clean these many thousands
3475 of requests out of your database:
3476 </simpara>
3477 <programlisting>mysql&gt; delete from access_log where agent like 'ApacheBench%';
3478mysql&gt; optimize table access_log;</programlisting>
3479 </listitem>
3480 </orderedlist>
3481 </answer>
3482 </qandaentry>
3483 <qandaentry>
3484 <question>
3485 <para>
3486 Do I need to be worried about all the running MySQL
3487 children? Will holding open n Apache-to-MySQL connections
3488 consume a lot of memory?
3489 </para>
3490 </question>
3491 <answer>
3492 <para>Short answer: you shouldn't be worried.</para>
3493 </answer>
3494 <answer>
3495 <para>
3496 Long answer: you might be looking at the output of "ps
3497 -aufxw" and becoming alarmed at all the 7MB httpd
3498 processes or 22MB mysqld children that you see. Don't be
3499 alarmed. It's true that mod_log_sql opens and holds open
3500 many MySQL connections: each httpd child maintains one
3501 open database connection (and holds it open for
3502 performance reasons). Four webservers, each running 20
3503 Apache children, will hold open 80 MySQL connections,
3504 which means that your MySQL server needs to handle 80
3505 simultaneous connections. In truth, your MySQL server
3506 needs to handle far more than that if traffic to your
3507 website spikes and the Apache webservers spawn off an
3508 additional 30 children each...
3509 </para>
3510 <para>
3511 Fortunately the cost reported by 'ps -aufxw' is deceptive.
3512 This is due to an OS memory-management feature called
3513 "copy-on-write." When you have a number of identical child
3514 processes (e.g. Apache, MySQL), it would appear in "ps" as
3515 though each one occupies a great deal of RAM -- as much as
3516 7MB per httpd child! In actuality each additional child
3517 only occupies a small bit of extra memory -- most of the
3518 memory pages are common to each child and therefore shared
3519 in a "read-only" fashion. The OS can get away with this
3520 because the majority of memory pages for one child are
3521 identical across all children. Instead of thinking of each
3522 child as a rubber stamp of the others, think of each child
3523 as a basket of links to a common memory area.
3524 </para>
3525 <para>
3526 A memory page is only duplicated when it needs to be
3527 written to, hence "copy-on-write." The result is
3528 efficiency and decreased memory consumption. "ps" may
3529 report 7MB per child, but it might really only "cost" 900K
3530 of extra memory to add one more child. It is not correct
3531 to assume that 20 Apache children with a VSZ of 7MB each
3532 equals (20 x 7MB) of memory consumption -- the real answer
3533 is much, much lower. The same "copy-on-write" rules apply
3534 to all your MySQL children: 40 mysqld children @ 22MB each
3535 do not occupy 880MB of RAM.
3536 </para>
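 <para>
 To put rough numbers on that, here is the arithmetic for 20 httpd
 children as a throwaway query. It assumes the first child costs the
 full 7MB and each further child about 0.9MB (the "900K" figure
 above); both are illustrative guesses, not measurements.
 </para>
 <programlisting>-- naive_mb: what "ps" seems to suggest
-- cow_estimate_mb: first child in full, plus ~0.9MB for each of the other 19
SELECT 20 * 7 AS naive_mb,
       7 + 19 * 0.9 AS cow_estimate_mb;</programlisting>
 <para>
 That works out to 140 versus 24.1, which is the gap "ps" hides.
 </para>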
3537 <para>
3538 The bottom line: although there is a cost to spawn extra
3539 httpd or mysqld children, that cost is not as great as
3540 "ps" would lead you to believe.
3541 </para>
3542 </answer>
3543 </qandaentry>
3544 <qandaentry>
3545 <question>
3546 <para>
3547 My webserver cannot handle all the traffic that my site
3548 receives. Is there anything I can do?
3549 </para>
3550 </question>
3551 <answer>
3552 <para>
3553 If you have exhausted all the tuning possibilities on your
3554 existing server, it is probably time you evaluated the
3555 benefits of clustering two or more webservers together in
3556 a load-balanced fashion. In fact, users of such a setup
3557 are mod_log_sql's target audience!
3558 </para>
3559 </answer>
3560 </qandaentry>
3561 <qandaentry id="FAQ.DelayedInsert">
3562 <question>
3563 <para>
3564 What is the issue with activating delayed inserts?
3565 </para>
3566 </question>
3567 <answer>
3568 <para>
3569 INSERT DELAYED is syntax specific to MySQL and is not
3570 supported by any other database. Ergo, why is it needed,
3571 and what MySQL deficiency is it working around? INSERT
3572 DELAYED is a kluge.
3573 </para>
3574 </answer>
3575 <answer>
3576 <para>
3577 The MySQL documentation is unclear whether INSERT DELAYED
3578 is even necessary for an optimized database. It says, "The
3579 DELAYED option for the INSERT statement is a
3580 MySQL-specific option that is very useful if you have
3581 clients that can't wait for the INSERT to complete." But
3582 then it goes on to say, "Note that as MyISAM tables
3583 supports concurrent SELECT and INSERT, if there is no free
3584 blocks in the middle of the data file, you very seldom
3585 need to use INSERT DELAYED with MyISAM."
3586 </para>
3587 </answer>
3588 <answer>
3589 <para>
3590 Because INSERT DELAYED returns without waiting for the
3591 data to be written, a hard kill of your MySQL database at
3592 the right (wrong?) moment could lose those logfile
3593 entries.
3594 </para>
3595 </answer>
3596 <answer>
3597 <para>
3598 As of MySQL version 3.23.52, the error return functions
3599 disagree after a failed INSERT DELAYED: mysql_errno()
3600 always returns 0, even if mysql_error() returns a textual
3601 error. I have reported this bug to the MySQL folks.
3602 However, we have no way of knowing what solution they will
3603 adopt to fix this, and with the worst case solution
3604 mod_log_sql would not be able to tell if anything went
3605 wrong with a delayed insert.
3606 </para>
3607 </answer>
3608 <answer>
3609 <para>
3610 Instead of delayed inserts, you may wish to utilize InnoDB
3611 tables (instead of the standard MyISAM tables). InnoDB
3612 tables support row-level locking and are recommended for
3613 high-volume databases.
3614 </para>
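 <para>
 Converting an existing table might look like the sketch below. It
 assumes your log table is named access_log and that your MySQL
 build includes InnoDB support; older MySQL releases spell the
 option TYPE= rather than ENGINE=.
 </para>
 <programlisting>-- Rebuild the log table using the InnoDB storage engine
ALTER TABLE access_log ENGINE=InnoDB;
-- Confirm which engine the table now uses
SHOW TABLE STATUS LIKE 'access_log';</programlisting>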
3615 </answer>
3616 <answer>
3617 <para>
3618 If after understanding these problems you still wish to
3619 enable delayed inserts, section
3620 <xref endterm="Sect.DelayedInsert.title"
3621 linkend="Sect.DelayedInsert" />
3622 discusses how.
3623 </para>
3624 </answer>
3625 </qandaentry>
3626 </qandadiv>
3627 <qandadiv>
3628 <title>"How do I...?" -- accomplishing certain tasks</title>
3629 <qandaentry>
3630 <question>
3631 <para>
3632 How do I extract the data in a format that my analysis
3633 tool can understand?
3634 </para>
3635 </question>
3636 <answer>
3637 <para>
3638 mod_log_sql would be virtually useless if there weren't a
3639 way for you to extract the data from your database in a
3640 somewhat meaningful fashion. To that end there's a Perl
3641 script enclosed with the distribution. That script
3642 (make_combined_log.pl) is designed to extract N-many days
3643 worth of access logs and provide them in a Combined Log
3644 Format output. You can use this very tool right in
3645 /etc/crontab to extract logs on a regular basis so that
3646 your favorite web analysis tool can read them. Or you can
3647 examine the Perl code to construct your own custom tool.
3648 </para>
3649 <para>
3650 For example, let's say that you want your web statistics
3651 updated once per day in the wee hours of the morning. A
3652 good way to accomplish that could be the following entries
3653 in /etc/crontab:
3654 </para>
3655 <programlisting># Generate the temporary apache logs from the MySQL database (for webalizer)
365605 04 * * * root make_combined_log.pl 1 www.grubbybaby.com &gt; /var/log/temp01
3657# Run webalizer on httpd log
365830 04 * * * root webalizer -c /etc/webalizer.conf; rm -f /var/log/temp01</programlisting>
3659 <para>
3660 Or if you have a newer system that puts files in
3661 /etc/cron.daily etc., create a file called "webalizer" in
3662 the cron.daily subdirectory. Use the following as the
3663 contents of your file, and make sure to chmod 755 it when
3664 done.
3665 </para>
3666 <programlisting>#!/bin/sh
3667/usr/local/sbin/make_combined_log.pl 1 www.yourdomain.com &gt; /var/log/httpd/templog
3668/usr/local/bin/webalizer -q -c /etc/webalizer.conf
3669rm -f /var/log/httpd/templog</programlisting>
3670 <para>See? Easy.</para>
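 <para>
 If you would rather construct your own tool, the heart of it is a
 time-bounded SELECT. The following is only a sketch and not the
 query that make_combined_log.pl actually runs; it assumes a table
 named access_log and the column names used elsewhere in this FAQ.
 </para>
 <programlisting>-- Fetch the last day's requests, oldest first
SELECT remote_host, FROM_UNIXTIME(time_stamp) AS logged_at,
       request_uri, status, bytes_sent, referer, agent
FROM access_log
WHERE time_stamp &gt;= UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
ORDER BY time_stamp;</programlisting>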
3671 </answer>
3672 </qandaentry>
3673 <qandaentry id="FAQ.Cookie">
3674 <question>
3675 <para>How can I log mod_usertrack cookies?</para>
3676 </question>
3677 <answer>
3678 <para>
3679 A number of people like to log mod_usertrack cookies in
3680 their Apache TransferLog to aid in understanding their
3681 visitors' clickstreams. This is accomplished, for example,
3682 with a statement as follows:
3683 </para>
3684 <programlisting>LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{cookie}n\""</programlisting>
3685 <para>
3686 Naturally it would be nice for mod_log_sql to permit the
3687 admin to log the cookie data as well, so as of version
3688 1.10 you can do this. You need to have already compiled
3689 mod_usertrack into httpd -- it's one of the standard
3690 Apache modules.
3691 </para>
3692 <para>
3693 First make sure you have a column called "cookie" in the
3694 MySQL database to hold the cookies, which can be done as
3695 follows if you already have a working database:
3696 </para>
3697 <programlisting>mysql&gt; alter table acc_log_tbl add column cookie varchar(255);</programlisting>
3698 <para>
3699 Next configure your server to set usertracking cookies as
3700 follows, and make sure you include the new 'c' directive
3701 in your LogSQLTransferLogFormat, which activates cookie
3702 logging. Here's an example:
3703 </para>
3704 <programlisting>&lt;VirtualHost 1.2.3.4&gt;
3705 CookieTracking on
3706 CookieStyle Cookie
3707 CookieName Foobar
3708 LogSQLTransferLogFormat huSUsbTvRAc
3709 LogSQLWhichCookie Foobar
3710&lt;/VirtualHost&gt;</programlisting>
3711 <para>
3712 The first three lines configure mod_usertrack to create a
3713 COOKIE (RFC 2109) format cookie called Foobar. The last
3714 two lines tell mod_log_sql to log cookies named Foobar.
3715 You have to choose which cookie to log because more than
3716 one cookie can/will be sent to the server by the client.
3717 </para>
3718 <para>
3719 Recap: the 'c' character activates cookie logging, and the
3720 LogSQLWhichCookie directive chooses which cookie to log.
3721 </para>
3722 <para>
3723 FYI, you are advised NOT to use CookieStyle Cookie2 -- it
3724 seems that even newer browsers (IE 5.5, etc.) have trouble
3725 with the new COOKIE2 (RFC 2965) format. Just stick with
3726 the standard COOKIE format and you'll be fine.
3727 </para>
3728 <para>
3729 Perform some hits on your server, then run a SELECT:
3730 </para>
3731 <programlisting>SELECT request_uri,cookie
3732FROM access_log
3733WHERE cookie IS NOT NULL;</programlisting>
3734 <table>
3735 <title></title>
3736 <tgroup cols="2">
3737 <colspec colname="1" />
3738 <colspec colname="2" />
3739 <thead>
3740 <row>
3741 <entry colname="1">request_uri</entry>
3742 <entry colname="2">cookie</entry>
3743 </row>
3744 </thead>
3745 <tbody>
3746 <row>
3747 <entry colname="1">/mod_log_sql/</entry>
3748 <entry colname="2">
3749 ool-18e4.dyn.optonline.net.130051007102700823
3750 </entry>
3751 </row>
3752 <row>
3753 <entry colname="1">/mod_log_sql/usa.gif</entry>
3754 <entry colname="2">
3755 ool-18e4.dyn.optonline.net.130051007102700823
3756 </entry>
3757 </row>
3758 <row>
3759 <entry colname="1">/mod_log_sql/style_1.css</entry>
3760 <entry colname="2">
3761 ool-18e4.dyn.optonline.net.130051007102700823
3762 </entry>
3763 </row>
3764 </tbody>
3765 </tgroup>
3766 </table>
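 <para>
 Because every request from the same client carries the same
 cookie value, the cookie column can also be used to group
 requests by visitor. The following is only a minimal sketch;
 it assumes the access_log table and the cookie column used
 in the examples above, and the alias hits is purely
 illustrative:
 </para>
 <programlisting>-- Count logged requests per tracking cookie (one cookie per visitor).
-- Assumes the table is named access_log and has a cookie column.
SELECT cookie, COUNT(*) AS hits
FROM access_log
WHERE cookie IS NOT NULL
GROUP BY cookie
ORDER BY hits DESC;</programlisting>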
3767 </answer>
3768 </qandaentry>
3769 <qandaentry>
3770 <question>
3771 <para>
3772 What if I want to log more than one cookie? What is the
3773 difference between LogSQLWhichCookie and
3774 LogSQLWhichCookies?
3775 </para>
3776 </question>
3777 <answer>
3778 <para>
3779 As of version 1.17, you have a choice in how you want
3780 cookie logging handled.
3781 </para>
3782 <para>
3783 If you are interested in logging only one cookie per
3784 request, follow the instructions in FAQ entry
3785 <xref linkend="FAQ.Cookie" />
3786 above. That cookie will be logged to a column in the
3787 regular access_log table, and the actual cookie you want
3788 to log is specified with LogSQLWhichCookie. Don't forget
3789 to specify the 'c' character in LogSQLTransferLogFormat.
3790 </para>
3791 <para>
3792 If, however, you need to log multiple cookies per request,
3793 you must employ the LogSQLWhichCookies (note the plural)
3794 directive. The cookies you specify will be logged to a
3795 separate table (as discussed in section
3796 <xref endterm="Sect.MultiTable.title"
3797 linkend="Sect.MultiTable" />
3798 ), and entries in that table will be linked to the regular
3799 access_log entries via the unique ID that is supplied by
3800 mod_unique_id. Without mod_unique_id the information will
3801 still be logged but you will be unable to correlate which
3802 cookies go with which access requests. Furthermore, with
3803 LogSQLWhichCookies, you do not need to include the 'c'
3804 character in LogSQLTransferLogFormat.
3805 </para>
3806 <para>
3807 LogSQLWhichCookie and LogSQLWhichCookies can coexist
3808 without conflict because they operate on entirely
3809 different tables, but you're better off choosing the one
3810 you need.
3811 </para>
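 <para>
 Once multiple cookies are being logged, the two tables can
 be joined on the unique ID to see which cookies accompanied
 which request. The query below is a sketch under stated
 assumptions, not a definitive schema: it assumes the cookie
 table is named cookies with columns id, item and val, and
 that the unique ID is logged to a column named id in
 access_log. Check these names against your own table
 definitions before using it.
 </para>
 <programlisting>-- Assumed names: cookies(id, item, val) and access_log.id.
-- Adjust them to match the tables you actually created.
SELECT a.request_uri, c.item, c.val
FROM access_log a
JOIN cookies c ON c.id = a.id
ORDER BY a.request_uri;</programlisting>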
3812 </answer>
3813 </qandaentry>
3814 <qandaentry>
3815 <question>
3816 <para>
3817 What are the SSL logging features, and how do I activate
3818 them?
3819 </para>
3820 </question>
3821 <answer>
3822 <note>
3823 <para>
3824 You do not need to compile SSL support into mod_log_sql
3825 in order to simply use it with a secure site. You only
3826 need to compile SSL support into mod_log_sql if you want
3827 to log SSL-specific data such as the cipher type used,
3828 or the keysize that was negotiated. If that information
3829 is unimportant to you, you can ignore this FAQ.
3830 </para>
3831 </note>
3832 <para>
3833 By adding certain characters to your
3834 LogSQLTransferLogFormat string you can tell mod_log_sql to
3835 log the SSL cipher, the SSL keysize of the connection, and
3836 the maximum keysize that was available. This would let you
3837 tell, for example, which clients were using only
3838 export-grade security to access your secure software area.
3839 </para>
3840 <para>
3841 You can compile mod_log_sql with SSL logging support if
3842 you have the right packages installed. If you already have
3843 an SSL-enabled Apache, then by definition you already have
3844 the correct packages installed: OpenSSL and mod_ssl.
3845 </para>
3846 <para>
3847 You need to ensure that your database is set up to log the
3848 SSL data. Issue the following commands to MySQL if your
3849 access table does not already have these columns:
3850 </para>
3851 <programlisting>mysql&gt; alter table access_log add column ssl_cipher varchar(25);
3852mysql&gt; alter table access_log add column ssl_keysize smallint unsigned;
3853mysql&gt; alter table access_log add column ssl_maxkeysize smallint unsigned;</programlisting>
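 <para>
 To check whether the SSL columns are present, either before
 or after running the commands above, you can ask MySQL to
 list them. This is a minimal sketch that assumes the table
 is named access_log, as in the other examples:
 </para>
 <programlisting>mysql&gt; SHOW COLUMNS FROM access_log LIKE 'ssl%';</programlisting>
 <para>
 An empty result suggests that the columns have not been
 added yet.
 </para>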
3854 <para>
3855 Finally, configure httpd.conf to activate the SSL fields.
3856 Note that this is only meaningful in a VirtualHost that is
3857 set up for SSL.
3858 </para>
3859 <programlisting>&lt;VirtualHost 1.2.3.4:443&gt;
3860 LogSQLTransferLogFormat AbHhmRSsTUuvcQqz
3861&lt;/VirtualHost&gt;</programlisting>
3862 <para>
3863 You also need to make sure that the mod_log_sql_ssl
3864 module is loaded.
3865 </para>
3866 <para>
3867 The last three characters (Qqz) in the directive are the
3868 SSL ones; see section
3869 <xref linkend="Conf.LogSQLTransferLogFormat" />
3870 in the directives documentation for details of the
3871 LogSQLTransferLogFormat directive.
3872 </para>
3873 <para>
3874 Restart Apache, then perform some hits on your server.
3875 Then run the following select statement:
3876 </para>
3877 <programlisting>SELECT remote_host,request_uri,ssl_cipher,ssl_keysize,ssl_maxkeysize
3878FROM access_log
3879WHERE ssl_cipher IS NOT NULL;</programlisting>
3880 <table>
3881 <title></title>
3882 <tgroup cols="5">
3883 <colspec colname="1" />
3884 <colspec colname="2" />
3885 <colspec colname="3" />
3886 <colspec colname="4" />
3887 <colspec colname="5" />
3888 <thead>
3889 <row>
3890 <entry colname="1">remote_host</entry>
3891 <entry colname="2">request_uri</entry>
3892 <entry colname="3">ssl_cipher</entry>
3893 <entry colname="4">ssl_keysize</entry>
3894 <entry colname="5">ssl_maxkeysize</entry>
3895 </row>
3896 </thead>
3897 <tbody>
3898 <row>
3899 <entry colname="1">216.192.52.4</entry>
3900 <entry colname="2">/dir/somefile.html</entry>
3901 <entry colname="3">RC4-MD5</entry>
3902 <entry colname="4">128</entry>
3903 <entry colname="5">128</entry>
3904 </row>
3905 <row>
3906 <entry colname="1">216.192.52.4</entry>
3907 <entry colname="2">/dir/somefile.gif</entry>
3908 <entry colname="3">RC4-MD5</entry>
3909 <entry colname="4">128</entry>
3910 <entry colname="5">128</entry>
3911 </row>
3912 <row>
3913 <entry colname="1">216.192.52.4</entry>
3914 <entry colname="2">/dir/somefile.jpg</entry>
3915 <entry colname="3">RC4-MD5</entry>
3916 <entry colname="4">128</entry>
3917 <entry colname="5">128</entry>
3918 </row>
3919 </tbody>
3920 </tgroup>
3921 </table>
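 <para>
 The same columns can be used to find clients that negotiated
 a weak key, as mentioned at the start of this answer. The
 query below is only a sketch: the 128-bit threshold is an
 assumption chosen for illustration, so pick the value that
 matches your own security policy.
 </para>
 <programlisting>-- List clients whose negotiated key was smaller than 128 bits.
-- The threshold of 128 is an illustrative assumption.
SELECT remote_host, ssl_cipher, ssl_keysize, COUNT(*) AS hits
FROM access_log
WHERE ssl_keysize IS NOT NULL
  AND ssl_keysize &lt; 128
GROUP BY remote_host, ssl_cipher, ssl_keysize
ORDER BY hits DESC;</programlisting>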
3922 </answer>
3923 </qandaentry>
3924 </qandadiv>
3925 </qandaset>
3926 </section>
3927</article>