From b5ac0bac5a4c6f70302b14691933e9c742cd4db9 Mon Sep 17 00:00:00 2001 From: Edward Rudd Date: Tue, 11 Jan 2005 21:01:56 +0000 Subject: updated docs using new OOO Docbook DTD --- docs/manual.xml | 390 +++++++++++++++++++++++++++++--------------------------- 1 file changed, 203 insertions(+), 187 deletions(-) diff --git a/docs/manual.xml b/docs/manual.xml index 4178dde..5ecaeda 100644 --- a/docs/manual.xml +++ b/docs/manual.xml @@ -1,6 +1,6 @@ -urkle <at> outoforder <dot> cc"> ]>
@@ -11,18 +11,22 @@ Rudd Conversion from Lyx to DocBook Current Maintainer - -
&EmailContact;
-
+ + + &EmailContact; + + Christopher B. Powell Original documentation author. - -
chris <at> grubbybaby <dot> com
-
+ + + chris <at> grubbybaby <dot> com + +
2001 @@ -52,25 +56,25 @@ - +
Introduction - +
Summary This Apache module will permit you to log to a SQL database; it can log each access request as well as data associated with each request: cookies, notes, and inbound/outbound headers. Unlike logging to a flat text file -- which is standard in Apache -- a SQL-based log exhibits tremendous flexibility and power of data extraction. (See FAQ entry for further discussion and examples of the advantages to SQL.) This module can either replace or happily coexist with mod_log_config, Apache's text file logging facility. In addition to being more configurable than the standard module, mod_log_sql is much more flexible. - - +
+
Approach This project was formerly known as "mod_log_mysql." It was renamed "mod_log_sql" in order to reflect the project goal of database in-specificity. The module currently supports MySQL, but support for other database back-ends is underway. In order to save speed and overhead, links are kept alive in between queries. This module uses one dedicated SQL link per httpd child, opened by each child process when it is born. Among other things, this means that this module supports logging into only one MySQL server, and for now, also, only one SQL database. But that's a small tradeoff compared to the blinding speed of this module. Error reporting is robust throughout the module and will inform the administrator of database issues in the Apache ErrorLog for the server/virtual server. Virtual hosts are supported in the same manner they are in the regular logging modules. The administrator defines some basic 'global' directives in the main server config, then defines more specific 'local' directives inside each VirtualHost stanza. A robust "preserve" capability has now been implemented. This permits the module to preserve any failed INSERT commands to a local file on its machine. In any situation that the database is unavailable -- e.g. the network fails or the database host is rebooted -- mod_log_sql will note this in the error log and begin appending its log entries to the preserve file (which is created with the user and group ID of the running Apache process, e.g. "nobody/nobody" on many Linux installations). When database availability returns, mod_log_sql seamlessly resumes logging to it. When convenient for the sysadmin, he/she can easily import the preserve file into the database because it is simply a series of SQL insert statements. - - +
+
What gets logged by default? All the data that would be contained in the "Combined Log Format" is logged by default, plus a little extra. Your best bet is to begin by accepting this default, then later customize the log configuration based on your needs. The documentation of the run-time directives includes a full explanation of what you can log, including examples -- see section . - - +
+
Miscellaneous Notes @@ -79,7 +83,7 @@ The 'time_stamp' field is stored in an UNSIGNED INTEGER format, in the standard unix "seconds since the epoch" format. This is superior to storing the access time as a string due to size requirements: an UNSIGNED INT requires 4 bytes, whereas an Apache date string (e.g. "18/Nov/2001:13:59:52 -0800") requires 26 bytes: those extra 22 bytes become significant when multiplied by thousands of accesses on a busy server. Besides, an INT type is far more flexible for comparisons, etc. In MySQL 3.21 and above you can easily convert this to a human readable format using from_unixtime(), e.g.: - select remote_host,request_uri,from_unixtime(time_stamp) from access_log; + select remote_host,request_uri,from_unixtime(time_stamp) from access_log; The enclosed perl program "make_combined_log.pl" extracts your access log in a format that is completely compatible with the Combined Log Format. You can then feed this to your favorite web log analysis tool. @@ -92,23 +96,23 @@ When Apache logs a numeric field, it uses a '-' character to mean "not applicable," e.g. the number of bytes returned on a 304 (unchanged) request. Since '-' is an illegal character in an SQL numeric field, such fields are assigned the value 0 instead of '-' which, of course, makes perfect sense anyway. - - +
+
Author / Maintainer The actual logging code was taken from the already existing flat file text modules, so all that credit goes to the Apache Software Foundation. The MySQL routines and directives were added by Zeev Suraski <bourbon@netvision.net.il>. All changes from 1.06+ and the new documentation were added by Chris Powell chris <at> grubbybaby <dot> com. It seems that the module had fallen into the "un-maintained" category -- it had not been updated since 1998 -- so Chris adopted it as the new maintainer. In December of 2003, Edward Rudd &EmailContact; porting the module to Apache 2.0, cleaning up the code, converting the documentation to DocBook, optimizing the main logging loop, and added the much anticipated database abstraction layer. As of February 2004, Chris Powell handed over maintenance of the module over to Edward Rudd. So you should contact Edward Rudd about the module from now on. - - +
+
Mailing Lists A general discussion and support mailing list is provided for mod_log_sq at lists.outoforder.cc. To subscribe to the mailing list send a blank e-mail to mod_log_sql-subscribe@lists.outoforder.cc. The list archives can be accessed via Gmane.org's mailng list gateway via any new reader news://news.gmane.org/gmane.comp.apache.mod-log-sql, or via a web browser at http://news.gmane.org/gmane.comp.apache.mod-log-sql. - - - +
+
+
Installation - +
Requirements @@ -127,18 +131,18 @@ Optionally, if you want to be able to log SSL information such as keysize or cipher, you need OpenSSL and mod_ssl installed. - - +
+
Compiling and Installing Unpack the archive into a working directory. - $ tar -xzf mod_log_sql-1.94.tar.gz -$ cd mod_log_sql-1.9 + $ tar -xzf mod_log_sql-1.94.tar.gz +$ cd mod_log_sql-1.9 run configure to configure the source directory. - $ ./configure + $ ./configure The configure script should automatically detect all the required libraries and program if the are installed in standard locations.. If it returns an error, here is a description of the arguments you can specify when you run configure. @@ -176,9 +180,9 @@ $ cd mod_log_sql-1.9 --with-db-inc=/usr/include/db1 This argument is only needed when compiling SSL support for Apache 1.3, and needs to be the directory which contains the ndbm.h header file. You can find this by using - $ locate ndbm.h + $ locate ndbm.h /usr/include/db1/ndbm.h -/usr/include/gdbm/ndbm.h +/usr/include/gdbm/ndbm.h As far as I can tell, there is no difference as to which you specify, but it should be the one that you compiled mod_ssl with. The default is /usr/include/db1, which should work on most systems. @@ -199,18 +203,28 @@ $ cd mod_log_sql-1.9 Now compile the module with GNU make. You may have to specify gmake on some systems like FreeBSD. - $ gmake + $ gmake If there were no errors, you can now install the module(s). If you compiled as a non-root user you may need to switch users with su or sudo. - $ su -c "gmake install" -Password: + $ su -c "gmake install" +Password: Now edit your Apache configuration and load the modules. - If you are loading the SSL logging module, you need to make sure it is loaded after mod_ssl and mod_log_sql. - If you have previously used mod_log_sql version 1.18, the name of the module has changed from sql_log_module to log_sql_module (the first parameter to LoadModule) - If you are upgrading from any release earlier than 1.97 you need to add an extra LoadModule directive to load the database driver (ie mysql). + + + + If you are loading the SSL logging module, you need to make sure it is loaded after mod_ssl and mod_log_sql. + + + If you have previously used mod_log_sql version 1.18, the name of the module has changed from sql_log_module to log_sql_module (the first parameter to LoadModule) + + + If you are upgrading from any release earlier than 1.97 you need to add an extra LoadModule directive to load the database driver (ie mysql). + + + Insert these lines to either the main httpd.conf or a file included via an include directive. @@ -219,7 +233,7 @@ LoadModule log_sql_mysql_module modules/mod_log_sql_mysql.so <IfModule mod_ssl.c> LoadModule log_sql_ssl_module moduels/mod_log_sql_ssl.so </IfModule> - If you did not compile SSL support in mod_log_sql, do not include the lines between the <IfModule> directives. + If you did not compile SSL support in mod_log_sql, do not include the lines between the <IfModule> directives. If you are using Apache 1.3 you may need add these lines later in the configuration. @@ -228,16 +242,16 @@ AddModule mod_log_sql_mysql.c <IfModule mod_ssl.c> AddModule mod_log_sql_ssl.c </IfModule> - If you did not compile SSL support in mod_log_sql, do not include the lines between the <IfModule> directives. + If you did not compile SSL support in mod_log_sql, do not include the lines between the <IfModule> directives. - - - +
+
+
Configuration - +
Preparing MySQL for logging You have to prepare the database to receive data from mod_log_sql, and set up run-time directives in httpd.conf to control how and what mod_log_sql logs. This section will discuss how to get started with a basic configuration. Full documentation of all available run-time directives is available in section . @@ -247,33 +261,33 @@ AddModule mod_log_sql_ssl.c We still need to have a logging database created and ready, so run the MySQL command line client and create a database: - # mysql -uadmin -pmypassword + # mysql -uadmin -pmypassword Enter password: -mysql> create database apachelogs; +mysql> create database apachelogs; If you want to hand-create the tables, run the enclosed 'create-tables' SQL script as follows ("create_tables.sql" needs to be in your current working directory). - mysql> use apachelogs + mysql> use apachelogs Database changed -mysql> source create_tables.sql +mysql> source create_tables.sql Create a specific MySQL userid that httpd will use to authenticate and enter data. This userid need not be an actual Unix user. It is a userid internal to MySQL with specific privileges. In the following example command, "apachelogs" is the database, "loguser" is the userid to create, "my.apachemachine.com" is the name of the Apache machine, and "l0gger" is the password to assign. Choose values that are different from these examples. - mysql> grant insert,create on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger'; + mysql> grant insert,create on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger'; You may be especially security-paranoid and want "loguser" to not have "create" capability within the "apachelogs" database. You can disable that privilege, but the cost is that you will not be able to use the module's on-the-fly table creation feature. If that cost is acceptable, hand-create the tables as described in step and use the following GRANT statement instead of the one above: - mysql> grant insert on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger'; + mysql> grant insert on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger'; Enable full logging of your MySQL daemon (at least temporarily for debugging purposes) if you don't do this already. Edit /etc/my.cnf and add the following line to your [mysqld] section: log=/var/log/mysql-messages Then restart MySQL - # /etc/rc.d/init.d/mysql restart + # /etc/rc.d/init.d/mysql restart - - +
+
A very basic logging setup in Apache @@ -300,28 +314,28 @@ LogSQLCreateTables on Restart apache. - # /etc/rc.d/init.d/httpd stop -# /etc/rc.d/init.d/httpd start + # /etc/rc.d/init.d/httpd stop +# /etc/rc.d/init.d/httpd start - - +
+
Testing the basic setup Visit your web site in a browser to trigger some hits, then confirm that the entries are being successfully logged: - # mysql -hdbmachine.foo.com -umysqladmin -p -e "select * from access_log" apachelogs -Enter password: + # mysql -hdbmachine.foo.com -umysqladmin -p -e "select * from access_log" apachelogs +Enter password: Several lines of output should follow, corresponding to your hits on the site. You now have basic functionality. Don't disable your regular Apache logs until you feel comfortable that the database is behaving as you'd like and that things are going well. If you do not see any entries in the access_log, please consult section of the FAQ on how to debug and fix the situation. You can now activate the advanced features of mod_log_sql, which are described in the next section. - - +
+
How to tune logging with run-time directives - +
Instructing the module what to log The most basic directive for the module is LogSQLTransferLogFormat, which tells the module which information to send to the database; logging to the database will not take place without it. Place a LogSQLTransferLogFormat directive in the VirtualHost stanza of each virtual host that you want to activate. After LogSQLTransferLogFormat you supply a string of characters that tell the module what information to log. In the configuration directive reference (section ) there is a table which clearly defines all the possible things to log. Let's say you want to log only the "request time," the "remote host," and the "request"; you'd use: @@ -339,9 +353,9 @@ Enter password: If you specify 'M' to indicate that you want to log the machine ID, you must also tell the module this machine's identity using the LogSQLMachineID directive. Fail to specify LogSQLMachineID, and a simple '-' character will be logged in the machine_id column. - - - Instructing the module what NOT to log using filtering directives +
+
+ Instructing the module what NOT to log using filtering directives One "accept" and two "ignore" directives allow you to fine-tune what the module should not log. These are very handy for keeping your database as uncluttered as possible and keeping your statistics free of unneeded numbers. Think of each one as a gatekeeper. It is important to remember that each of these three directives is purely optional. mod_log_sql's default is to log everything. When a request comes in, the contents of LogSQLRequestAccept are evaluated first. This optional, "blanket" directive lets you specify that only certain things are to be accepted for logging, and everything else discarded. Because it is evaluated before LogSQLRequestIgnore and LogSQLRemhostIgnore it can halt logging before those two filtering directives "get their chance." @@ -366,23 +380,27 @@ Enter password: LogSQLRequestAccept .html .gif .jpg LogSQLRequestIgnore statistics.html bluedot.jpg So the first line instructs the module to only log files with html, gif and jpg suffixes; requests for "formail.cgi" and "shopping-cart.pl" will never be considered for logging. ("LeftArrow.JPG" will also never be considered for logging -- remember, the comparison is case sensitive.) The second line prunes the list further -- you never want to log requests for those two objects. - - If you want to match all the hosts in your domain such as "host1.corp.foo.com" and "server.dmz.foo.com", simply specify: - LogSQLRemhostIgnore foo.com - - - A great way to catch the vast majority of worm-attack requests and prevent them from being logged is to specify: - LogSQLRequestIgnore root.exe cmd.exe default.ida - - - To prevent the logging of requests for common graphic types, make sure to put a '.' before the suffix to avoid matches that you didn't intend: - LogSQLRequestIgnore .gif .jpg - - - - + + + + If you want to match all the hosts in your domain such as "host1.corp.foo.com" and "server.dmz.foo.com", simply specify: + LogSQLRemhostIgnore foo.com + + + A great way to catch the vast majority of worm-attack requests and prevent them from being logged is to specify: + LogSQLRequestIgnore root.exe cmd.exe default.ida + + + To prevent the logging of requests for common graphic types, make sure to put a '.' before the suffix to avoid matches that you didn't intend: + LogSQLRequestIgnore .gif .jpg + + + +
+
+
Advanced logging scenarios - +
Using the module in an ISP environment mod_log_sql has three basic tiers of operation: @@ -407,16 +425,16 @@ LogSQLRequestIgnore statistics.html bluedot.jpg There are numerous benefits. The admin will not need to create new tables for every new VirtualHost. (Although the admin will still need to drop the tables of virtual hosts that are removed.) The admin will not need to set LogSQLTransferLogTable for each virtual host -- it will be configured automatically based on the host's name. Because each virtual host will log to its own segregated table, data about one virtual server will segregate from others; an admin can grant users access to the tables they need, and they will be unable to view data about another user's virtual host. In an ISP scenario the admin is likely to have a cluster of many front-end webservers logging to a back-end database. mod_log_sql has a feature that permits analysis of how well the web servers are loadbalancing: the LogSQLMachineID directive. The administrator uses this directive to assign a unique identifier to each machine in the web cluster, e.g. "LogSQLMachineID web01," "LogSQLMachineID web02," etc. Used in conjunction with the 'M' character in LogSQLTransferLogFormat, each entry in the SQL log will include the machine ID of the machine that created the entry. This permits the administrator to count the entries made by each particular machine and thereby analyze the front-end loadbalancing algorithm. - - +
+
Logging many-to-one data in separate tables A given HTTP request can have a one-to-many relationship with certain kinds of data. For example, a single HTTP request can have 4 cookies, 3 headers and 5 "mod_gzip" notes associated with it. mod_log_sql is capable of logging these relationships due to the elegance of SQL relational data. You already have a single table containing access requests. One of the columns in that table is 'id' which is intended to contain the unique request ID supplied by the standard Apache module mod_unique_id -- all you need to do is compile in that module and employ the LogSQLTransferLogFormat character 'I'. Thereafter, each request gets a unique ID that can be thought of as a primary key within the database, useful for joining multiple tables. So let's envision several new tables: a notes table, a cookies table, and a table for inbound and outbound headers. <tblAcc>access_log - - + + @@ -508,10 +526,10 @@ LogSQLRequestIgnore statistics.html bluedot.jpg
We have a certain request, and its unique ID is "PPIDskBRH30AAGPtAsg". Within each separate table will be multiple entries with that request ID: several cookie entries, several header entries, etc. As you can see in tables [tblAcc], [tblNotes] and [tblHdr], you have a one-to-many relationship for request PPIDskBRH30AAGPtAsg: that one access has two associated notes and four associated headers. You can extract this data easily using the power of SQL's "select" statement and table joins. To see the notes associated with a particular request: - select a.remote_host, a.request_uri, n.item, n.val from access_log a, notes_log n -where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; + select a.remote_host, a.request_uri, n.item, n.val from access_log a, notes_log n +where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; - + access_log joined to notes_log @@ -566,14 +584,14 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; </VirtualHost> - - + +
Using the same database for production and test Although sub-optimal, it is not uncommon to use the same back-end database for the "production" webservers as well as the "test" webservers (budgetary constraints, rack-space limits, etc.). Furthermore, an administrator in this situation may be unable to use LogSQLRemhostIgnore to exclude requests from the test servers -- perhaps the generated entries are genuinely useful for analytical or QA purposes, but their value after analysis is minimal. It is wasteful and potentially confusing to permit this internal test data to clutter the database, and a solution to the problem is the proper use of the LogSQLMachineID directive. Assume a scenario where the production webservers have IDs like "web01," "web02," and so on -- and the test webservers have IDs like "test01," "test02," etc. Because entries in the log database are distinguished by their source machine, an administrator may purge unneeded test data from the access log as follows: delete from access_log where machine_id like 'test%'; - - +
+
Optimizing for a busy database A busy MySQL database will have SELECT statements running concurrently with INSERT and UPDATE statements. A long-running SELECT can in certain circumstances block INSERTs and therefore block mod_log_sql. A workaround is to enable mod_log_sql for "delayed inserts," which are described as follows in the MySQL documentation. The DELAYED option for the INSERT statement is a MySQL-specific option that is very useful if you have clients that can't wait for the INSERT to complete. This is a common problem when you use MySQL for logging and you also periodically run SELECT and UPDATE statements that take a long time to complete. DELAYED was introduced in MySQL Version 3.22.15. It is a MySQL extension to ANSI SQL92. @@ -589,16 +607,16 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; There is additional overhead for the server to handle a separate thread for each table on which you use INSERT DELAYED. - + The MySQL documentation concludes, "This means that you should only use INSERT DELAYED when you are really sure you need it!" Furthermore, the current state of error return from a failed INSERT DELAYED seems to be in flux, and may behave in unpredictable ways between different MySQL versions. See FAQ entry -- you have been warned. - + If you are experiencing issues which could be solved by delayed inserts, then set LogSqlDelayedInserts On in the httpd.conf. All regular INSERT statements are now INSERT DELAYED, and you should see no more blocking of the module. - - - +
+ +
Configuration Directive Reference It is imperative that you understand which directives are used only once in the main server config, and which are used inside VirtualHost stanzas and therefore multiple times within httpd.conf. The "context" listed with each entry informs you of this. - +
DataBase Configuration @@ -651,7 +669,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. This directive Must be defined for logging to be enabled. @@ -660,9 +678,9 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLDBParam - LogSQLDBParam - parameter-name - value + LogSQLDBParam + parameter-name + value Example: LogSQLDBParam socketfile /var/lib/mysql/mysql.socket Context: main server config @@ -676,8 +694,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLCreateTables - LogSQLCreateTables - flag + LogSQLCreateTables + flag Example: LogSQLCreateTables On Default: Off @@ -685,7 +703,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; mod_log_sql has the ability to create its tables on-the-fly. The advantage to this is convenience: you don't have to execute any SQL by hand to prepare the table. This is especially helpful for people with lots of virtual hosts (who should also see the LogSQLMassVirtualHosting directive). There is a slight disadvantage: if you wish to activate this feature, then the userid specified in LogSQLLoginInfo must have CREATE privileges on the database. In an absolutely paranoid, locked-down situation you may only want to grant your mod_log_sql user INSERT privileges on the database; in that situation you are unable to take advantage of LogSQLCreateTables. But most people -- even the very security-conscious -- will find that granting CREATE on the logging database is reasonable. - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. @@ -693,8 +711,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLForcePreserve - LogSQLForcePreserve - flag + LogSQLForcePreserve + flag Example: LogForcePreserve On Default: Off @@ -702,7 +720,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; You may need to perform debugging on your database and specifically want mod_log_sql to make no attempts to log to it. This directive instructs the module to send all its log entries directly to the preserve file and to make no database INSERT attempts. This is presumably a directive for temporary use only; it could be dangerous if you set it and forget it, as all your entries will simply pile up in the preserve file. - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. @@ -719,7 +737,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; This option can be enabled to completely disable the preserve file fail back. This may be useful for servers where the file-system is read-only. If the database is not available those log entries will be lost. - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. @@ -727,14 +745,14 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLMachineID - LogSQLMachineID - machineID + LogSQLMachineID + machineID Example: LogSQLMachineID web01 Context: main server config If you have a farm of webservers then you may wish to know which particular machine made each entry; this is useful for analyzing your load-balancing methodology. LogSQLMachineID permits you to distinguish each machine's entries if you assign each machine its own LogSQLMachineID: for example, the first webserver gets ``LogSQLMachineID web01,'' the second gets ``LogSQLMachineID web02,'' etc. - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. @@ -742,30 +760,30 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQlPreserveFile - LogSQLPreserveFile - filename + LogSQLPreserveFile + filename Example: LogSQLPreserveFile offline-preserve Default: /tmp/sql-preserve Context: virtual host mod_log_sql writes queries to this local preserve file in the event that it cannot reach the database, and thus ensures that your high-availability web frontend does not lose logs during a temporary database outage. This could happen for a number of reasons: the database goes offline, the network breaks, etc. You will not lose entries since the module has this backup. The file consists of a series of SQL statements that can be imported into your database at your convenience; furthermore, because the SQL queries contain the access timestamps you do not need to worry about out-of-order data after the import, which is done in a simple manner: - # mysql -uadminuser -p mydbname < /tmp/sql-preserve - If you do not define LogSQLPreserveFile then all virtual servers will log to the same default preserve file (/tmp/sql-preserve). You can redefine this on a virtual-host basis in order to segregate your preserve files if you desire. Note that segregation is not usually necessary, as the SQL statements that are written to the preserve file already distinguish between different virtual hosts if you include the 'v' character in your LogSQLTransferLogFormat directive. It is only necessary to segregate preserve-files by virualhost if you also segregate access logs by virtualhost. + # mysql -uadminuser -p mydbname < /tmp/sql-preserve + If you do not define LogSQLPreserveFile then all virtual servers will log to the same default preserve file (/tmp/sql-preserve). You can redefine this on a virtual-host basis in order to segregate your preserve files if you desire. Note that segregation is not usually necessary, as the SQL statements that are written to the preserve file already distinguish between different virtual hosts if you include the 'v' character in your LogSQLTransferLogFormat directive. It is only necessary to segregate preserve-files by virualhost if you also segregate access logs by virtualhost. The module will log to Apache's ErrorLog when it notices a database outage, and upon database return. You will therefore know when the preserve file is being used, although it is your responsibility to import the file. The file does not need to be created in advance. It is safe to remove or rename the file without interrupting Apache, as the module closes the filehandle immediately after completing the write. The file is created with the user & group ID of the running Apache process (e.g. 'nobody' on many Linux distributions). - - +
+
Table Names LogSQLTransferLogTable - LogSQLTransferLogTable - table-name + LogSQLTransferLogTable + table-name Example: LogSQLTransferLogTable access_log_table Context: virtual host @@ -780,8 +798,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLCookieLogTable - LogSQLCookieLogTable - table-name + LogSQLCookieLogTable + table-name Example: LogSQLCookieLogTable cookie_log Default: cookies @@ -796,8 +814,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLHeadersInLogTable - LogSQLHeadersInLogTable - table-name + LogSQLHeadersInLogTable + table-name Example: LogSQLHeadersInLogTable headers Default: headers_in @@ -812,8 +830,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLHeadersOutLogTable - LogSQLHeadersOutLogTable - table-name + LogSQLHeadersOutLogTable + table-name Example: LogSQLHeadersOutLogTable headers Default: headers_out @@ -828,8 +846,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLNotesLogTable - LogSQLNotesLogTable - table-name + LogSQLNotesLogTable + table-name Example: LogSQLNotesLogTable notes-log Default: notes @@ -844,8 +862,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLMassVirtualHosting - LogSQLMassVirtualHosting - flag + LogSQLMassVirtualHosting + flag Example: LogSQLMassVirtualHosting On Default: Off @@ -864,21 +882,21 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; This is a huge boost in convenience for sites with many virtual servers. Activating LogSQLMassVirtualHosting obviates the need to create every virtual server's table and provides more granular security possibilities. - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. - - +
+
Configuring What Is logged LogSQLTransferLogFormat - LogSQLTransferLogFormat - format-string + LogSQLTransferLogFormat + format-string Example: LogSQLTransferLogFormat huSUTv Default: AbHhmRSsTUuv @@ -924,7 +942,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; 32561 - c + c Text of cookie cookie varchar(255) @@ -967,7 +985,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; M - Machine ID + Machine ID machine_id varchar(25) web01 @@ -1059,11 +1077,9 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg';
- - [1] You must also specify LogSQLWhichCookie for this to take effect. - - - [2] You must also specify LogSQLmachineID for this to take effect. + + [1] You must also specify LogSQLWhichCookie for this to take effect. + [2] You must also specify LogSQLmachineID for this to take effect. SSL LogFormat Parameters @@ -1113,12 +1129,12 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg';LogSQLRemhostIgnore - LogSQLRemhostIgnore + LogSQLRemhostIgnore hostname Example: LogSQLRemhostIgnore localnet.com Context: virtual host - Lists a series of strings that, if present in the REMOTE_HOST, will cause that request to not be logged. This directive is useful for cutting down on log clutter when you are certain that you want to ignore requests from certain hosts, such as your own internal network machines. See section for some tips for using this directive. + Lists a series of smortrings that, if present in the REMOTE_HOST, will cause that request to not be logged. This directive is useful for cutting down on log clutter when you are certain that you want to ignore requests from certain hosts, such as your own internal network machines. See section for some tips for using this directive. Each string may contain a + or - prefix in a <VirtualHost> context and will cause those strings to be added (+) or removed (-) from the global configuration. Otherwise the global is completely ignored and overridden if defined in a <VirtualHost> Each string is separated by a space, and no regular expressions or globbing are allowed. Each string is evaluated as a substring of the REMOTE_HOST using strstr(). The comparison is case sensitive. @@ -1127,7 +1143,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg';LogSQLRequestAccept - LogSQLRequestAccept + LogSQLRequestAccept substring Example: LogSQLRequestAccept .html .php .jpg @@ -1144,7 +1160,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLRequestIgnore - LogSQLRequestIgnore + LogSQLRequestIgnore substring Example: LogSQLRequestIgnore root.exe cmd.exe default.ida favicon.ico @@ -1158,8 +1174,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLWhichCookie - LogSQLWhichCookie - cookiename + LogSQLWhichCookie + cookiename Example; LogSQLWhichCookie Clicks Context: virtual host @@ -1175,7 +1191,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLWhichCookies - LogSQLWhichCookies + LogSQLWhichCookies cookie-name Example: logSQLWhichCookies userlogin cookie1 cookie2 @@ -1191,7 +1207,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLWhichHeadersIn - LogSQLWhichHeadersIn + LogSQLWhichHeadersIn header-name Example: LogSQLWhichHeadersIn UserAgent Accept-Encodeing Host @@ -1207,7 +1223,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLWhichHeadersOut - LogSQLWhichHeadersOut + LogSQLWhichHeadersOut header-name Example: LogSQLWhichHeadersOut Expires Content-Type Cache-Control @@ -1223,7 +1239,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLWhichNotes - LogSQLWhichNotes + LogSQLWhichNotes note-name Example: LogSQLWhichNotes mod_gzip_result mod_gzip_ompression_ratio @@ -1236,16 +1252,16 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; - - - Deprecated Command + +
+ Deprecated Commands LogSQLSocketFile [Deprecated] - LogSQLSocketFile - filename + LogSQLSocketFile + filename Example: LogSQLSocketFile /tmp/mysql.sock Default: (database specific) @@ -1255,7 +1271,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; mod_log_sql will automatically employ the socket for db communications if the database resides on the local host. If the db resides on a separate host the module will automatically use TCP/IP. This is a function of the MySQL API and is not user-configurable. This directive is deprecated in favor of LogSQLDBParam socketfile [socketfilename] - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. @@ -1263,8 +1279,8 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLTCPPort [Deprecated] - LogSQLTCPPort - port-number + LogSQLTCPPort + port-number Example: LogSQLTCPPort 3309 Default: (database specific) @@ -1273,7 +1289,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; Your database may listen on a different port than the default. If so, use this directive to instruct the module which port to use. This directive only applies if the database is on a different machine connected via TCP/IP. This directive is deprecated in favor of LogSQLDBParam tcpport [port-number] - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. @@ -1281,23 +1297,23 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; LogSQLDatabase [Deprecated] - LogSQLDatabase - database + LogSQLDatabase + database Example: LogSQLDatabase loggingdb Context: main server config Defines the database that is used for logging. "database" must be a valid db on the MySQL host defined in LogSQLLoginInfo This directive is deprecated in favor of the URI form of LogSQLLoginInfo. - This is defined only once in the httpd.conf file. + This is defined only once in the httpd.conf file. - - - - +
+ + +
FAQ @@ -1320,7 +1336,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; For example, do you want to see all your 404's? Do this: - select remote_host,status,request_uri,bytes_sent,from_unixtime(time_stamp) from acc_log_tbl where status=404 order by time_stamp; + select remote_host,status,request_uri,bytes_sent,from_unixtime(time_stamp) from acc_log_tbl where status=404 order by time_stamp;
@@ -1371,7 +1387,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg';
Or do you want to see how many bytes you've sent within a certain directory or site? Do this: - select request_uri,sum(bytes_sent) as bytes,count(request_uri) as howmany from acc_log_tbl where request_uri like '%mod_log_sql%' group by request_uri order by howmany desc; + select request_uri,sum(bytes_sent) as bytes,count(request_uri) as howmany from acc_log_tbl where request_uri like '%mod_log_sql%' group by request_uri order by howmany desc; @@ -1410,7 +1426,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg';
Or maybe you want to see who's linking to you? Do this: - select count(referer) as num,referer from acc_log_tbl where request_uri='/mod_log_sql/' group by referer order by num desc; + select count(referer) as num,referer from acc_log_tbl where request_uri='/mod_log_sql/' group by referer order by num desc; @@ -1451,10 +1467,10 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; MySQL is a robust, free, and very powerful production-quality database engine. It is well supported and comes with detailed documentation. Many 3rd-party software pacakges (e.g. Slashcode, the engine that powers Slashdot) run exclusively with MySQL. In other words, you will belong to a very robust and well-supported community by choosing MySQL. - That being said, there are alternatives. PostgreSQL is probably MySQL's leading "competitor" in the free database world. There is also an excellent module available for Apache to permit logging to a PostgreSQL database, calledpgLOGd + That being said, there are alternatives. PostgreSQL is probably MySQL's leading "competitor" in the free database world. There is also an excellent module available for Apache to permit logging to a PostgreSQL database, called pgLOGd - Currently a database abstraction system is in the works to allow any database to be used with mod_log_sql. + Currently a database abstraction system is in the works to allow any database to be used with mod_log_sql. @@ -1555,7 +1571,7 @@ where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; Another factor is preventing a connection to the database - It is improper to ask for help before you have followed these steps. + It is improper to ask for help before you have followed these steps. First examine the MySQL log that you established in step of section . Ensure that the INSERT statements are not being rejected because of a malformed table name or other typographical error. By enabling that log, you instructed MySQL to log every connection and command it receives -- if you see no INSERT attempts in the log, the module isn't successfully connecting to the database. If you see nothing at all in the log -- not even a record of your administrative connection attempts, then you did not enable the log correctly. If you do see INSERT attempts but they are failing, the log should tell you why. Second, confirm that your LogSQL* directives are all correct. Third, examine the Apache error logs for messages from mod_log_sql; the module will offer hints as to why it cannot connect, etc. @@ -1586,10 +1602,10 @@ ErrorLog /var/log/httpd/server-messages Jeremy Zawodny, a highly respected MySQL user and contributor to Linux Magazine, has this very helpful and highly appropriate article on tuning MySQL: MySQL, Linux, and Thread Caching - Please remember that mod_log_sql's overriding principle is performance -- that is what the target audience demands and expects. Other database logging solutions do not open and maintain many database connections, but their performance suffers drastically. For example, pgLOGd funnels all log connections through a separate daemon that connects to the database, but that bottlenecks the entire process. mod_log_sql achieves performance numbers an order of magnitude greater than the alternatives because it dispenses with the overhead associated with rapid connection cycling, and it does not attempt to shoehorn all the database traffic through a single extra daemon or proxy process. + Please remember that mod_log_sql's overriding principle is performance -- that is what the target audience demands and expects. Other database logging solutions do not open and maintain many database connections, but their performance suffers drastically. For example, pgLOGd funnels all log connections through a separate daemon that connects to the database, but that bottlenecks the entire process. mod_log_sql achieves performance numbers an order of magnitude greater than the alternatives because it dispenses with the overhead associated with rapid connection cycling, and it does not attempt to shoehorn all the database traffic through a single extra daemon or proxy process. - Currently connection pooling is being implemented as part of the Database Abstraction layer to allow multiple httpd processes to share connections. + Currently connection pooling is being implemented as part of the Database Abstraction layer to allow multiple httpd processes to share connections. @@ -1598,10 +1614,10 @@ ErrorLog /var/log/httpd/server-messages This message may appear every now and then in your Apache error log, especially on very lightly loaded servers. This does not mean that anything is necessarily wrong. Within each httpd child process, mod_log_sql will open (and keep open) a connection to the MySQL server. MySQL, however, will close connections that have not been used in a while; the default timeout is 8 hours. When this occurs, mod_log_sql will notice and re-open the connection. That event is what is being logged, and looks like this: - [Tue Nov 12 19:04:10 2002] [error] mod_log_sql: first attempt failed, + [Tue Nov 12 19:04:10 2002] [error] mod_log_sql: first attempt failed, API said: error 2013, Lost connection to MySQL server during query [Tue Nov 12 19:04:10 2002] [error] mod_log_sql: reconnect successful -[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: second attempt successful +[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: second attempt successful Reference: MySQL documentation @@ -1657,8 +1673,8 @@ CookieName Clicks "ab" commandline - ./ab -c 10 -t 20 -v 2 -C Clicks=ab_run -http://www.hostname.com/target + ./ab -c 10 -t 20 -v 2 -C Clicks=ab_run +http://www.hostname.com/target ( 10 concurrent requests; 20 second test; setting a cookie "Clicks=ab_run"; target = the mod_log_sql homepage. ) Ten total ab runs were conducted: five with MySQL logging enabled, and five with all MySQL directives commented out of httpd.conf. Then each five were averaged. The results: @@ -1681,8 +1697,8 @@ http://www.hostname.com/target When done with your runs, clean these many thousands of requests out of your database: - mysql> delete from access_log where agent like 'ApacheBench%'; -mysql> optimize table access_log; + mysql> delete from access_log where agent like 'ApacheBench%'; +mysql> optimize table access_log; @@ -1763,7 +1779,7 @@ rm -f /var/log/httpd/templog LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"" \"%{cookie}n\"" Naturally it would be nice for mod_log_sql to permit the admin to log the cookie data as well, so as of version 1.10 you can do this. You need to have already compiled mod_usertrack into httpd -- it's one of the standard Apache modules. First make sure you have a column called "cookie" in the MySQL database to hold the cookies, which can be done as follows if you already have a working database: - mysql> alter table acc_log_tbl add column cookie varchar(255); + mysql> alter table acc_log_tbl add column cookie varchar(255); Next configure your server to set usertracking cookies as follows, and make sure you include the new 'c' directive in your LogSQLTransferLogFormat, which activates cookie logging. Here's an example: <VirtualHost 1.2.3.4> CookieTracking on @@ -1776,7 +1792,7 @@ rm -f /var/log/httpd/templog Recap: the 'c' character activates cookie logging, and the LogSQLWhichCookie directive chooses which cookie to log. FYI, you are advised NOT to use CookieStyle Cookie2 -- it seems that even newer browsers (IE 5.5, etc.) have trouble with the new COOKIE2 (RFC 2965) format. Just stick with the standard COOKIE format and you'll be fine. Perform some hits on your server and run a select - mysql> select request_uri,cookie from access_log where cookie is not null; + mysql> select request_uri,cookie from access_log where cookie is not null;
@@ -1828,9 +1844,9 @@ rm -f /var/log/httpd/templog By adding certain characters to your LogSQLTransferLogFormat string you can tell mod_log_sql to log the SSL cipher, the SSL keysize of the connection, and the maximum keysize that was available. This would let you tell, for example, which clients were using only export-grade security to access your secure software area. You can compile mod_log_sql with SSL logging support if you have the right packages installed. If you already have an SSL-enabled Apache then you by definition have the correct packages already installed: OpenSSL and mod_ssl. You need to ensure that your database is set up to log the SSL data. Issue the following commands to MySQL if your access table does not already have them: - mysql> alter table access_log add column ssl_cipher varchar(25); + mysql> alter table access_log add column ssl_cipher varchar(25); mysql> alter table access_log add column ssl_keysize smallint unsigned; -mysql> alter table access_log add column ssl_maxkeysize smallint unsigned; +mysql> alter table access_log add column ssl_maxkeysize smallint unsigned; Finally configure httpd.conf to activate the SSL fields. Note that this is only meaningful in a VirtualHost that is set up for SSL. <VirtualHost 1.2.3.4:443> LogSQLTransferLogFormat AbHhmRSsTUuvcQqz @@ -1838,7 +1854,7 @@ mysql> alter table access_log add column ssl_maxkeysize smallint unsigned;You also need to make sure you have the mod_log_sql_ssl module loaded as well. The last three characters (Qqz) in the directive are the SSL ones; see section in the directives documentation for details of the LogSQLTransferLogFormat directive. Restart Apache, then perform some hits on your server. Then run the following select statement: - mysql> select remote_host,request_uri,ssl_cipher,ssl_keysize,ssl_maxkeysize from access_log where ssl_cipher is not null; + mysql> select remote_host,request_uri,ssl_cipher,ssl_keysize,ssl_maxkeysize from access_log where ssl_cipher is not null;
@@ -1885,5 +1901,5 @@ mysql> alter table access_log add column ssl_maxkeysize smallint unsigned; - + -- cgit