From f1fd1c9714256bb9b212462dd31ca6dc56ea31ef Mon Sep 17 00:00:00 2001 From: Edward Rudd Date: Thu, 21 Jul 2011 19:10:20 -0400 Subject: add in project web page --- docs-2.0/index.xml | 11 + docs-2.0/manual.xml | 3927 +++++++++++++++++++++++++++++++++++++++++++++++ docs/contents.png | Bin 0 -> 278 bytes docs/crossref.png | Bin 0 -> 147 bytes docs/documentation.css | 33 + docs/documentation.html | 269 ++++ docs/img1.png | Bin 0 -> 154 bytes docs/img2.png | Bin 0 -> 214 bytes docs/img3.png | Bin 0 -> 318 bytes docs/img4.png | Bin 0 -> 564 bytes docs/index.html | 269 ++++ docs/next.png | Bin 0 -> 245 bytes docs/next_g.png | Bin 0 -> 272 bytes docs/node1.html | 128 ++ docs/node2.html | 276 ++++ docs/node3.html | 664 ++++++++ docs/node4.html | 2014 ++++++++++++++++++++++++ docs/node5.html | 1270 +++++++++++++++ docs/node6.html | 74 + docs/prev.png | Bin 0 -> 279 bytes docs/prev_g.png | Bin 0 -> 327 bytes docs/style_1.css | 33 + docs/up.png | Bin 0 -> 211 bytes docs/up_g.png | Bin 0 -> 231 bytes index.xml | 260 ++++ 25 files changed, 9228 insertions(+) create mode 100644 docs-2.0/index.xml create mode 100644 docs-2.0/manual.xml create mode 100644 docs/contents.png create mode 100644 docs/crossref.png create mode 100644 docs/documentation.css create mode 100644 docs/documentation.html create mode 100644 docs/img1.png create mode 100644 docs/img2.png create mode 100644 docs/img3.png create mode 100644 docs/img4.png create mode 100644 docs/index.html create mode 100644 docs/next.png create mode 100644 docs/next_g.png create mode 100644 docs/node1.html create mode 100644 docs/node2.html create mode 100644 docs/node3.html create mode 100644 docs/node4.html create mode 100644 docs/node5.html create mode 100644 docs/node6.html create mode 100644 docs/prev.png create mode 100644 docs/prev_g.png create mode 100644 docs/style_1.css create mode 100644 docs/up.png create mode 100644 docs/up_g.png create mode 100644 index.xml diff --git a/docs-2.0/index.xml b/docs-2.0/index.xml new file mode 100644 index 0000000..e04d194 --- /dev/null +++ b/docs-2.0/index.xml @@ -0,0 +1,11 @@ + + + + + + + Error loading XML Documentation + + + + diff --git a/docs-2.0/manual.xml b/docs-2.0/manual.xml new file mode 100644 index 0000000..9019e80 --- /dev/null +++ b/docs-2.0/manual.xml @@ -0,0 +1,3927 @@ + + +urkle <at> outoforder <dot> cc"> +]> +
+ + mod_log_sql Manual + + Edward + Rudd + Conversion from Lyx to DocBook + Current Maintainer + + + &EmailContact; + + + + + Christopher + B. + Powell + Original documentation author. + + + chris <at> grubbybaby <dot> com + + + + + 2001 + 2002 + 2003 + Christopher B. Powell + + + 2004 + 2005 + 2006 + Edward Rudd + + + + 1.5 + 2006-11-04 + Added documentation about logio parameters and added DBParam Mysql driver parameters (including tabletype) + + + 1.4 + 2006-02-13 + Added missing logformat types, switched to simplified docbook 1.1 + + + 1.3 + 2005-01-11 + Updated for mod_log_sql v1.100 + + + 1.2 + 2004-04-08 + Updated for mod_log_sql v1.97 + + + 1.1 + 2004-03-02 + Updated for mod_log_sql v1.96 + + + 1.0 + 2004-01-22 + Initial Conversion from Lyx to Docbook + + + +
+ Introduction +
+ Summary + + This Apache module will permit you to log to a SQL database; it + can log each access request as well as data associated with each + request: cookies, notes, and inbound/outbound headers. Unlike + logging to a flat text file -- which is standard in Apache -- a + SQL-based log exhibits tremendous flexibility and power of data + extraction. (See FAQ entry + + for further discussion and examples of the advantages to SQL.) + + + This module can either replace or happily coexist with + mod_log_config, Apache's text file logging facility. In addition + to being more configurable than the standard module, mod_log_sql + is much more flexible. + +
+
+ Approach + + This project was formerly known as "mod_log_mysql." It was + renamed "mod_log_sql" in order to reflect the project goal of + database in-specificity. The module currently supports MySQL, + but support for other database back-ends is underway. + + + In order to save speed and overhead, links are kept alive in + between queries. This module uses one dedicated SQL link per + httpd child, opened by each child process when it is born. Among + other things, this means that this module supports logging into + only one MySQL server, and for now, also, only one SQL database. + But that's a small tradeoff compared to the blinding speed of + this module. Error reporting is robust throughout the module and + will inform the administrator of database issues in the Apache + ErrorLog for the server/virtual server. + + + Virtual hosts are supported in the same manner they are in the + regular logging modules. The administrator defines some basic + 'global' directives in the main server config, then defines more + specific 'local' directives inside each VirtualHost stanza. + + + A robust "preserve" capability has now been implemented. This + permits the module to preserve any failed INSERT commands to a + local file on its machine. In any situation that the database is + unavailable -- e.g. the network fails or the database host is + rebooted -- mod_log_sql will note this in the error log and + begin appending its log entries to the preserve file (which is + created with the user and group ID of the running Apache + process, e.g. "nobody/nobody" on many Linux installations). When + database availability returns, mod_log_sql seamlessly resumes + logging to it. When convenient for the sysadmin, he/she can + easily import the preserve file into the database because it is + simply a series of SQL insert statements. + +
+
+ What gets logged by default? + + All the data that would be contained in the "Combined Log + Format" is logged by default, plus a little extra. Your best bet + is to begin by accepting this default, then later customize the + log configuration based on your needs. The documentation of the + run-time directives includes a full explanation of what you can + log, including examples -- see section + + . + +
+
+ Miscellaneous Notes + + + + Note which directives go in the 'main server config' and + which directives apply to the 'virtual host config'. This is + made clear in the directive documentation. + + + + + The 'time_stamp' field is stored in an UNSIGNED INTEGER + format, in the standard unix "seconds since the epoch" + format. This is superior to storing the access time as a + string due to size requirements: an UNSIGNED INT requires 4 + bytes, whereas an Apache date string (e.g. + "18/Nov/2001:13:59:52 -0800") requires 26 bytes: those extra + 22 bytes become significant when multiplied by thousands of + accesses on a busy server. Besides, an INT type is far more + flexible for comparisons, etc. + + + In MySQL 3.21 and above you can easily convert this to a + human readable format using from_unixtime(), e.g.: + + SELECT remote_host,request_uri,from_unixtime(time_stamp) +FROM access_log; + + The enclosed perl program "make_combined_log.pl" extracts + your access log in a format that is completely compatible + with the Combined Log Format. You can then feed this to your + favorite web log analysis tool. + + + + + The table's string values can be CHAR or VARCHAR, at a + length of your choice. VARCHAR is superior because it + truncates long strings; CHAR types are fixed-length and will + be padded with spaces, resulting in waste. Just like the + time_stamp issue described above, that kind of space waste + multiplies over thousands of records. + + + + + Be careful not to go overboard setting fields to NOT NULL. + If a field is marked NOT NULL then it must contain data in + the INSERT statement, or the INSERT will fail. These + mysterious failures can be quite frustrating and difficult + to debug. + + + + + When Apache logs a numeric field, it uses a '-' character to + mean "not applicable," e.g. the number of bytes returned on + a 304 (unchanged) request. Since '-' is an illegal character + in an SQL numeric field, such fields are assigned the value + 0 instead of '-' which, of course, makes perfect sense + anyway. + + + +
+
+ Author / Maintainer + + The actual logging code was taken from the already existing flat + file text modules, so all that credit goes to the Apache + Software Foundation. + + + The MySQL routines and directives were added by Zeev Suraski + <bourbon@netvision.net.il>. + + + All changes from 1.06+ and the new documentation were added by + Chris Powell + chris <at> grubbybaby <dot> com + . It seems that the module had fallen into the "un-maintained" + category -- it had not been updated since 1998 -- so Chris + adopted it as the new maintainer. + + + In December of 2003, Edward Rudd + &EmailContact; + porting the module to Apache 2.0, cleaning up the code, + converting the documentation to DocBook, optimizing the main + logging loop, and added the much anticipated database + abstraction layer. + + + As of February 2004, Chris Powell handed over maintenance of the + module over to Edward Rudd. So you should contact Edward Rudd + about the module from now on. + +
+
+ Mailing Lists + + A general discussion and support mailing list is provided for + mod_log_sq at lists.outoforder.cc. To subscribe to the mailing + list send a blank e-mail to + mod_log_sql-subscribe@lists.outoforder.cc. The list archives can + be accessed via Gmane.org's mailng list gateway via any new + reader + + news://news.gmane.org/gmane.comp.apache.mod-log-sql + + , or via a web browser at + + http://news.gmane.org/gmane.comp.apache.mod-log-sql + + . + +
+
+
+ Installation +
+ Requirements + + + + A compatible system. mod_log_sql was authored and tested on + systems based on Red Hat Linux (Red Hat, Mandrake), but the + module should easily adapt to any modern distribution. + mod_log_sql has also been ported successfully to Solaris and + FreeBSD. + + + + + Apache 1.3 or 2.0, 1.2 is no longer supported, but may still + compile. Ideally you should already have successfully + compiled Apache and understand the process, but this + document tries to make it simple for beginners. + + + + + The MySQL development headers. This package is called + different things on different distributions. For example, + Red Hat 6.x calls this RPM "MySQL-devel" whereas Mandrake + calls it "libmysql10-devel." Both MySQL 3.23.x and 4.x are + supported. + + + + + MySQL >= 3.23.15 configured, installed and running on + either localhost or an accessible networked machine. You + should already have a basic understanding of MySQL and how + it functions. + + + + + Optionally, if you want to be able to log SSL information + such as keysize or cipher, you need OpenSSL and mod_ssl + installed. + + + +
+
+ Compiling and Installing + + + Unpack the archive into a working directory. + $ tar -xzf mod_log_sql-1.94.tar.gz +$ cd mod_log_sql-1.94 + + + run configure to configure the source directory. + $ ./configure + + The + configure + script should automatically detect all the required + libraries and program if the are installed in standard + locations.. If it returns an error, here is a description of + the arguments you can specify when you run + configure + . + + + + --with-apxs=/usr/sbin/apxs + + + This is the full path to the apxs binary, or the + directory which contains the program. This program is + part of the Apache 1.3 and 2.0 installation. + + + The default is to search + /usr/bin/apxs + and + /usr/sbin/apxs + . + + + Specifying a directory here will search + $directory/apxs, $directory/bin/apxs, and + $directory/sbin/apxs + + + If you have more than one version of Apache installed, + you need to specify the correct apxs binary for the + one you wish to compile for. + + + + + --with-mysql=/path/to/mysql + + + This is the directory to search for the + libmysqlclient + library and the + MySQL + headers. + + + The default is to search + /usr/include + , + /usr/include/mysql + , + /usr/local/include + , and + /usr/local/include/mysql + for + MySQL + headers.. And + /usr/lib + . + /usr/lib/mysql + , + /usr/local/lib + , and + /usr/local/lin/mysql + for the + MySQL + libraries. + + + Specifying this testargument will search + $directory/include and $directory/mysql for + MySQL + headers. And $directory/lib and $directory/lib/mysql + for + MySQL + libraries. + + + + + --enable-ssl + + + Specifying this argument will enable the search for + mod_ssl and SSL headers, and if found will enable + compilation of SSL support into mod_log_sql. SSL + support is compiled into a separate module that can be + loaded after the main mod_log_sql. + + + + + --with-ssl-inc=/usr/include/openssl + + + This is the path to the SSL toolkit header files that + were used to compile mod_ssl. If you want SSL support + you most likely need to specify this. + + + The default is to search + /usr/include + and + /usr/include/openssl + . + + + Specifying this argument will search that directory + for the SSL headers. + + + + + --with-db-inc=/usr/include/db1 + + + This argument is only needed when compiling SSL + support for Apache 1.3, and needs to be the directory + which contains the ndbm.h header file. You can find + this by using + + $ locate ndbm.h +/usr/include/db1/ndbm.h +/usr/include/gdbm/ndbm.h + + As far as I can tell, there is no difference as to + which you specify, but it should be the one that you + compiled mod_ssl with. + + + The default is + /usr/include/db1 + , which should work on most systems. + + + + + --disable-apachetest + + + This will disable the apache version test. However + there is a side affect if you specify this where I + will not be able to determine which version of Apache + you are compiling for. So don't specify this.. If you + are having troubles with the script detecting your + Apache version, then send a bug report along with your + system OS version and versions of related packages. + + + + + --disable-mysqltest + + + This will disable the MySQL compile test. Specify this + if for some reason the test fail but you know you have + specified the correct directories. If mod_los_sql also + fails to compile report a bug along with your system + OS version and versions of related packages. + + + + + + + + Now compile the module with GNU make. You may have to + specify gmake on some systems like FreeBSD. + + $ gmake + + + + If there were no errors, you can now install the module(s). + If you compiled as a non-root user you may need to switch + users with + su + or + sudo + . + + $ su -c "gmake install" +Password: + + + + Now edit your Apache configuration and load the modules. + + + + + + If you are loading the SSL logging module, you need to + make sure it is loaded after mod_ssl and mod_log_sql. + + + + + If you have previously used mod_log_sql version 1.18, + the name of the module has changed from sql_log_module + to log_sql_module (the first parameter to LoadModule) + + + + + If you are upgrading from any release earlier than + 1.97 you need to add an extra LoadModule directive to + load the database driver (ie mysql). + + + + + + + + Insert these lines to either the main + httpd.conf + or a file included via an include directive. + + LoadModule log_sql_module modules/mod_log_sql.so +LoadModule log_sql_mysql_module modules/mod_log_sql_mysql.so +<IfModule mod_ssl.c> +LoadModule log_sql_ssl_module moduels/mod_log_sql_ssl.so +</IfModule> + + + If you did not compile SSL support in mod_log_sql, do + not include the lines between the <IfModule> + directives. + + + + + + If you are using Apache 1.3 you may need add these lines + later in the configuration. + + AddModule mod_log_sql.c +AddModule mod_log_sql_mysql.c +<IfModule mod_ssl.c> +AddModule mod_log_sql_ssl.c +</IfModule> + + + If you did not compile SSL support in mod_log_sql, do + not include the lines between the <IfModule> + directives. + + + + + + +
+
+
+ Configuration +
+ + Preparing MySQL for logging + + + You have to prepare the database to receive data from + mod_log_sql + , and set up run-time directives in + httpd.conf + to control how and what + mod_log_sql + logs. + + + This section will discuss how to get started with a basic + configuration. Full documentation of all available run-time + directives is available in section + + . + + + + + mod_log_sql can make its own tables on-the-fly, or you can + pre-make the tables by hand. The advantage of letting the + module make the tables is ease-of-use, but for raw + performance you will want to pre-make the tables in order to + save some overhead. In this basic setup we'll just let the + module create tables for us. + + + + + We still need to have a logging database created and ready, + so run the MySQL command line client and create a database: + + # mysql -uadmin -p +Enter password: +mysql> create database apachelogs; + + + + If you want to hand-create the tables, run the enclosed + 'create-tables' SQL script as follows ("create_tables.sql" + needs to be in your current working directory). + + mysql> use apachelogs +Database changed +mysql> source create_tables.sql + + + + Create a specific + MySQL + userid that + httpd + will use to authenticate and enter data. This userid need + not be an actual Unix user. It is a userid internal to + MySQL + with specific privileges. In the following example command, + "apachelogs" is the database, "loguser" is the userid to + create, "my.apachemachine.com" is the name of the Apache + machine, and "l0gger" is the password to assign. Choose + values that are different from these examples. + + mysql> grant insert,create on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger'; + + + + You may be especially security-paranoid and want "loguser" + to not have "create" capability within the "apachelogs" + database. You can disable that privilege, but the cost is + that you will not be able to use the module's on-the-fly + table creation feature. If that cost is acceptable, + hand-create the tables as described in step + + and use the following GRANT statement instead of the one + above: + + mysql> grant insert on apachelogs.* to loguser@my.apachemachine.com identified by 'l0gger'; + + + + Enable full logging of your + MySQL + daemon (at least temporarily for debugging purposes) if you + don't do this already. Edit /etc/my.cnf and add the + following line to your [mysqld] section: + + log=/var/log/mysql-messages + + Then restart + MySQL + + # /etc/rc.d/init.d/mysql restart + + +
+
+ A very basic logging setup in Apache + + + + Tell the module what database to use and the appropriate + authentication information. + + + So, edit httpd.conf and insert the following lines somewhere + after any LoadModule / AddModule statements. Make sure these + statements are "global," i.e. not inside any VirtualHost + stanza. You will also note that you are embedding a password + in the file. Therefore you are advised to "chmod 660 + httpd.conf" to prevent unauthorized regular users from + viewing your database user and password. + + + Use the + MySQL + database called "apachelogs" running on "dbmachine.foo.com". + Use username "loguser" and password "l0gg3r" to authenticate + to the database. Permit the module create tables for us. + + + Basic Example + LogSQLLoginInfo mysql://loguser:l0gg3r@dbmachine.foo.com/apachelogs +LogSQLCreateTables on + + + If your database resides on localhost instead of another + host, specify the MySQL server's socket file as follows: + + LogSQLDBParam socketfile /your/path/to/mysql.sock + + If your database is listening on a port other than 3306, + specify the correct TCP port as follows: + + LogSQLDBParam port 1234 + + + + The actual logging is set up on a virtual-host-by-host + basis. So, skip down to the virtual host you want to set up. + Instruct this virtual host to log entries to the table + "access_log" by inserting a LogSQLTransferLogTable + directive. (The LogSQLTransferLogTable directive is the + minimum required to log -- other directives that you will + learn about later simply tune the module's behavior.) + + <VirtualHost 1.2.3.4> + [snip] + LogSQLTransferLogTable access_log + [snip] +</VirtualHost> + + + Restart apache. + # /etc/rc.d/init.d/httpd stop +# /etc/rc.d/init.d/httpd start + + +
+
+ Testing the basic setup + + + + Visit your web site in a browser to trigger some hits, then + confirm that the entries are being successfully logged: + + # mysql -hdbmachine.foo.com -umysqladmin -p -e "SELECT * FROM access_log" apachelogs +Enter password: + + Several lines of output should follow, corresponding to your + hits on the site. You now have basic functionality. Don't + disable your regular Apache logs until you feel comfortable + that the database is behaving as you'd like and that things + are going well. If you do not see any entries in the + access_log, please consult section + + of the FAQ on how to debug and fix the situation. + + + + + You can now activate the advanced features of mod_log_sql, + which are described in the next section. + + + +
+
+ How to tune logging with run-time directives +
+ Instructing the module what to log + + The most basic directive for the module is + LogSQLTransferLogFormat, which tells the module which + information to send to the database; logging to the database + will not take place without it. Place a + LogSQLTransferLogFormat directive in the VirtualHost stanza of + each virtual host that you want to activate. + + + After LogSQLTransferLogFormat you supply a string of + characters that tell the module what information to log. In + the configuration directive reference (section + + ) there is a table which clearly defines all the possible + things to log. Let's say you want to log only the "request + time," the "remote host," and the "request"; you'd use: + + LogSQLTransferLogFormat hUS + But a more appropriate string to use is + LogSQLTransferLogFormat AbHhmRSsTUuv + + which logs all the information required to be compatible with + the Combined Log Format (CLF). + + + If you don't choose to log everything that is available, + that's fine. Fields in the unused columns in your table will + simply contain NULL. + + + Some of the LogSQLTransferLogFormat characters require a + little extra configuration: + + + + + If you specify 'c' to indicate that you want to log the + cookie value, you must also tell the module which cookie + you mean by using LogSQLWhichCookie -- after all, there + could be many cookies associated with a given request. + Fail to specify LogSQLWhichCookie, and no cookie + information at all will be logged. + + + + + If you specify 'M' to indicate that you want to log the + machine ID, you must also tell the module this machine's + identity using the LogSQLMachineID directive. Fail to + specify LogSQLMachineID, and a simple '-' character will + be logged in the machine_id column. + + + +
+
+ + Instructing the module what NOT to log using filtering + directives + + + One "accept" and two "ignore" directives allow you to + fine-tune what the module should not log. These are very handy + for keeping your database as uncluttered as possible and + keeping your statistics free of unneeded numbers. Think of + each one as a gatekeeper. + + + + It is important to remember that each of these three + directives is purely optional. mod_log_sql's default is to + log everything. + + + + When a request comes in, the contents of LogSQLRequestAccept + are evaluated first. This optional, "blanket" directive lets + you specify that only certain things are to be accepted for + logging, and everything else discarded. Because it is + evaluated before LogSQLRequestIgnore and LogSQLRemhostIgnore + it can halt logging before those two filtering directives "get + their chance." + + + Once a request makes it past LogSQLRequestAccept, it still can + be excluded based on LogSQLRemhostIgnore and + LogSQLRequestIgnore. A good way to use LogSQLRemhostIgnore is + to prevent the module from logging the traffic that your + internal hosts generate. LogSQLRequestIgnore is great for + preventing things like requests for "favicon.ico" from + cluttering up your database, as well as excluding the various + requests that worms make, etc. + + + You can specify a series of strings after each directive. Do + not use any type of globbing or regular-expression syntax -- + each string is considered a match + + if it is a substring of the larger request or remote-host; + the comarison is case-sensitive + + . This means that "LogSQLRemhostIgnore micro" will ignore + requests from "microsoft.com," "microworld.net," + "mymicroscope.org," etc. "LogSQLRequestIgnore gif" will + instruct the module to ignore requests for "leftbar.gif," + "bluedot.gif" and even "giftwrap.jpg" -- but "RED.GIF" and + "Tree.Gif" would still get logged because of case sensitivity. + + A summary of the decision flow: + + + + If LogSQLRequestAccept exists and a request does not match + anything in that list, it is discarded. + + + + + If a request matches anything in the LogSQLRequestIgnore + list, it is discarded. + + + + + If a reqiest matches anything in the LogSQLRemhostIgnore + list, it is discarded. + + + + Otherwise the request is logged. + + + + This means that you can have a series of directives similar to + the following: + + LogSQLRequestAccept .html .gif .jpg +LogSQLRequestIgnore statistics.html bluedot.jpg + + So the first line instructs the module to only log files with + html, gif and jpg suffixes; requests for "formail.cgi" and + "shopping-cart.pl" will never be considered for logging. + ("LeftArrow.JPG" will also never be considered for logging -- + remember, the comparison is case sensitive.) The second line + prunes the list further -- you never want to log requests for + those two objects. + + + + + + If you want to match all the hosts in your domain such + as "host1.corp.foo.com" and "server.dmz.foo.com", simply + specify: + + LogSQLRemhostIgnore foo.com + + + + A great way to catch the vast majority of worm-attack + requests and prevent them from being logged is to + specify: + + LogSQLRequestIgnore root.exe cmd.exe default.ida + + + + To prevent the logging of requests for common graphic + types, make sure to put a '.' before the suffix to avoid + matches that you didn't intend: + + LogSQLRequestIgnore .gif .jpg + + + +
+
+
+ Advanced logging scenarios +
+ Using the module in an ISP environment + mod_log_sql has three basic tiers of operation: + + + + The administrator creates all necessary tables by hand and + configures each Apache VirtualHost by hand. + (LogSQLCreateTables Off) + + + + + The module is permitted to create necessary tables + on-the-fly, but the administrator configures each Apache + VirtualHost by hand. (LogSQLCreateTables On) + + + + + The module is permitted to create all necessary tables and + to make intelligent, on-the-fly configuration of each + VirtualHost. (LogSQLMassVirtualHosting On) + + + + + Many users are happy to use the module in its most minimal + form: they hand-create any necessary tables (using + "create_tables.sql"), and they configure each VirtualHost by + hand to suit their needs. However, some administrators need + extra features due to a large and growing number of + VirtualHosts. The LogSQLMassVirtualHosting directive activates + module capabilities that make it far easier to manage an ISP + environment, or any situation characterized by a large and + varying number of virtual servers. + + + + + the on-the-fly table creation feature is activated + automatically + + + + + the transfer log table name is dynamically set from the + virtual host's name (example: a virtual host + "www.grubbybaby.com" gets logged to table + "access_www_grubbybaby_com") + + + + + There are numerous benefits. The admin will not need to create + new tables for every new VirtualHost. (Although the admin will + still need to drop the tables of virtual hosts that are + removed.) The admin will not need to set + LogSQLTransferLogTable for each virtual host -- it will be + configured automatically based on the host's name. Because + each virtual host will log to its own segregated table, data + about one virtual server will segregate from others; an admin + can grant users access to the tables they need, and they will + be unable to view data about another user's virtual host. + + + In an ISP scenario the admin is likely to have a cluster of + many front-end webservers logging to a back-end database. + mod_log_sql has a feature that permits analysis of how well + the web servers are loadbalancing: the LogSQLMachineID + directive. The administrator uses this directive to assign a + unique identifier to each machine in the web cluster, e.g. + "LogSQLMachineID web01," "LogSQLMachineID web02," etc. Used in + conjunction with the 'M' character in LogSQLTransferLogFormat, + each entry in the SQL log will include the machine ID of the + machine that created the entry. This permits the administrator + to count the entries made by each particular machine and + thereby analyze the front-end loadbalancing algorithm. + +
+
+ + Logging many-to-one data in separate tables + + + A given HTTP request can have a one-to-many relationship with + certain kinds of data. For example, a single HTTP request can + have 4 cookies, 3 headers and 5 "mod_gzip" notes associated + with it. mod_log_sql is capable of logging these relationships + due to the elegance of SQL relational data. + + + You already have a single table containing access requests. + One of the columns in that table is 'id' which is intended to + contain the unique request ID supplied by the standard Apache + module mod_unique_id -- all you need to do is compile in that + module and employ the LogSQLTransferLogFormat character 'I'. + Thereafter, each request gets a unique ID that can be thought + of as a primary key within the database, useful for joining + multiple tables. So let's envision several new tables: a notes + table, a cookies table, and a table for inbound and outbound + headers. + + + <tblAcc>access_log + + + + + + + + + + id + remote_host + request_uri + time_stamp + status + bytes_sent + + + + + PPIDskBRH30AAGPtAsg + zerberus.aiacs.net + /mod_log_sql/index.html + 1022493617 + 200 + 2215 + + + +
+ + <tblNotes>notes_log + + + + + + + id + item + val + + + + + PPIDskBRH30AAGPtAsg + mod_gzip_result + OK + + + PPIDskBRH30AAGPtAsg + mod_gzip_compression_ratio + 69 + + + +
+ + <tblHdr>headers_log + + + + + + + id + item + val + + + + + PPIDskBRH30AAGPtAsg + Content-Type + text/html + + + PPIDskBRH30AAGPtAsg + Accept-Encoding + gzip, deflate + + + PPIDskBRH30AAGPtAsg + Expires + Tue, 28 May 2002 10:00:18 GMT + + + PPIDskBRH30AAGPtAsg + Cache-Control + max-age=86400 + + + +
+ + We have a certain request, and its unique ID is + "PPIDskBRH30AAGPtAsg". Within each separate table will be + multiple entries with that request ID: several cookie entries, + several header entries, etc. As you can see in tables + [tblAcc], [tblNotes] and [tblHdr], you have a one-to-many + relationship for request PPIDskBRH30AAGPtAsg: that one access + has two associated notes and four associated headers. You can + extract this data easily using the power of SQL's "select" + statement and table joins. To see the notes associated with a + particular request: + + SELECT a.remote_host, a.request_uri, n.item, n.val +FROM access_log a JOIN notes_log n ON a.id=n.id +WHERE a.id='PPIDskBRH30AAGPtAsg'; + + access_log joined to notes_log + + + + + + + + remote_host + request_uri + item + val + + + + + zerberus.aiacs.net + /mod_log_sql/index.html + mod_gzip_result + OK + + + zerberus.aiacs.net + /mod_log_sql/index.html + mod_gzip_compression_ratio + 69 + + + +
+ + Naturally you can craft similar statements for the outboud + headers, inbound headers and cookies, all of which can live in + separate tables. Your statements are limited in power only by + your skill with SQL. + + + In order to use this capability of mod_log_sql, you must do + several things. + + + + + Compile mod_unique_id into Apache (statically or as a + DSO). mod_log_sql employs the unique request ID that + mod_unique_id provides in order to key between the + separate tables. You can still log the data without + mod_unqiue_id, but it will be completely uncorrelated and + you will have no way to discern any meaning. + + + + + Create the appropriate tables. This will be done for you + if you permit mod_log_sql to create its own tables using + LogSQLCreateTables On, or if you use the enclosed + "create_tables.sql" script. + + + + + Create a SQL index on the "id" column. Without this index, + table joins will be deathly slow. I recommend you consult + the MySQL documentation on the proper way to create a + column index if you are not familiar with this operation. + + + + + Within each appropriate VirtualHost stanza, use the + LogSQLWhich* and LogSQL*LogTable directives to tell the + module what and where to log the data. In the following + example, I have overridden the name for the notes table + whereas I have left the other table names at their + defaults. I have then specified the cookies, headers and + notes that interest me. (And as you can see, these + directives do not require me to add any characters to + LogSQLTransferLogTable.) + + <VirtualHost 216.231.36.128> + (snip) + LogSQLNotesLogTable notestable + LogSQLWhichCookies bluecookie redcookie greencookie + LogSQLWhichNotes mod_gzip_result mod_gzip_compression_ratio + LogSQLWhichHeadersOut Expires Content-Type Cache-Control + LogSQLWhichHeadersIn User-Agent Accept-Encoding Host + (snip) +</VirtualHost> + + +
+
+ Using the same database for production and test + + Although sub-optimal, it is not uncommon to use the same + back-end database for the "production" webservers as well as + the "test" webservers (budgetary constraints, rack-space + limits, etc.). Furthermore, an administrator in this situation + may be unable to use LogSQLRemhostIgnore to exclude requests + from the test servers -- perhaps the generated entries are + genuinely useful for analytical or QA purposes, but their + value after analysis is minimal. + + + It is wasteful and potentially confusing to permit this + internal test data to clutter the database, and a solution to + the problem is the proper use of the LogSQLMachineID + directive. Assume a scenario where the production webservers + have IDs like "web01," "web02," and so on -- and the test + webservers have IDs like "test01," "test02," etc. Because + entries in the log database are distinguished by their source + machine, an administrator may purge unneeded test data from + the access log as follows: + + DELETE FROM access_log WHERE machine_id like 'test%'; +
+
+ + Optimizing for a busy database + + + A busy MySQL database will have SELECT statements running + concurrently with INSERT and UPDATE statements. A long-running + SELECT can in certain circumstances block INSERTs and + therefore block mod_log_sql. A workaround is to enable + mod_log_sql for "delayed inserts," which are described as + follows in the MySQL documentation. + + + The DELAYED option for the INSERT statement is a + MySQL-specific option that is very useful if you have clients + that can't wait for the INSERT to complete. This is a common + problem when you use MySQL for logging and you also + periodically run SELECT and UPDATE statements that take a long + time to complete. DELAYED was introduced in MySQL Version + 3.22.15. It is a MySQL extension to ANSI SQL92. + + + INSERT DELAYED only works with ISAM and MyISAM tables. Note + that as MyISAM tables supports concurrent SELECT and INSERT, + if there is no free blocks in the middle of the data file, you + very seldom need to use INSERT DELAYED with MyISAM. + + + When you use INSERT DELAYED, the client will get an OK at once + and the row will be inserted when the table is not in use by + any other thread. + + + Another major benefit of using INSERT DELAYED is that inserts + from many clients are bundled together and written in one + block. This is much faster than doing many separate inserts. + + The general disadvantages of delayed inserts are + + + + The queued rows are only stored in memory until they are + inserted into the table. If mysqld dies unexpectedly, any + queued rows that were not written to disk are lost. + + + + + There is additional overhead for the server to handle a + separate thread for each table on which you use INSERT + DELAYED. + + + + + + The MySQL documentation concludes, "This means that you + should only use INSERT DELAYED when you are really sure you + need it!" Furthermore, the current state of error return + from a failed INSERT DELAYED seems to be in flux, and may + behave in unpredictable ways between different MySQL + versions. See FAQ entry + + -- you have been warned. + + + + If you are experiencing issues which could be solved by + delayed inserts, then set LogSqlDelayedInserts On in the + httpd.conf + . All regular INSERT statements are now INSERT DELAYED, and + you should see no more blocking of the module. + +
+
+
+ + Configuration Directive Reference + + + It is imperative that you understand which directives are used + only once in the main server config, and which are used inside + VirtualHost stanzas and therefore multiple times within + httpd.conf. The "context" listed with each entry informs you of + this. + +
+ DataBase Configuration + + + LogSQLLoginInfo + + + LogSQLLoginInfo + + connection URI + + + + Example: LogSQLLoginInfo + mysql://logwriter:passw0rd@foobar.baz.com/Apache_log + + Context: main server config + + Defines the basic connection URI to connect to the + database with. The format of the connection URI is + + + driver://username[:password]@hostname[:port]/database + + + + driver + + + The database driver to use (mysql, pgsql, etc..) + + + + + username + + + The database username to login with INSERT + privileges on the logging table defined in + LogSQLtransferLogTable. + + + + + password + + + The password to use for username, and can be + omitted if there is no password. + + + + + hostname + + + The hostname or Ip address of the Database + machine, ans is simple "localhost" if the database + lives on the same machine as Apache. + + + + + port + + + Port on hostname to connect to the Database, if + not specified use the default port for the + database. + + + + + database + + + The database to connect to on the server. + + + + + + + This is defined only once in the + httpd.conf + file. + + + This directive Must be defined for logging to be + enabled. + + + + + + LogSQLDBParam + + + LogSQLDBParam + + parameter-name + + + value + + + + Example: LogSQLDBParam socketfile + /var/lib/mysql/mysql.socket + + Context: main server config + + This is the new method of specifying Database connection + credentials and settings. This is used to define + database driver specific options. For a list of options + read the documentation for each specific database + driver. + + + MySQL Driver parameters + + + + + + + Parameter + Meaning + Default + + + + + hostname + MySQL Server hostname + none (use LogSQLLoginInfo to set) + + + username + The username to log in with + none (use LogSQLLoginInfo to set) + + + password + The password to use + none (use LogSQLLoginInfo to set) + + + database + Which database to connect to + none (use LogSQLLoginInfo to set) + + + port + The TCP port to connect to the MySQL server over + 3306 (use LogSQLLoginInfo to set) + + + socketfile + The MySQL Unix socket file to use + none + + + tabletype + MySQL Table Engine to use + MySQL server default + + + +
+ + + Each parameter-name may only be defined once. + + +
+
+ + LogSQLCreateTables + + + LogSQLCreateTables + flag + + Example: LogSQLCreateTables On + Default: Off + Context: main server config + + mod_log_sql has the ability to create its tables + on-the-fly. The advantage to this is convenience: you + don't have to execute any SQL by hand to prepare the + table. This is especially helpful for people with lots + of virtual hosts (who should also see the + LogSQLMassVirtualHosting directive). + + + There is a slight disadvantage: if you wish to activate + this feature, then the userid specified in + LogSQLLoginInfo must have CREATE privileges on the + database. In an absolutely paranoid, locked-down + situation you may only want to grant your mod_log_sql + user INSERT privileges on the database; in that + situation you are unable to take advantage of + LogSQLCreateTables. But most people -- even the very + security-conscious -- will find that granting CREATE on + the logging database is reasonable. + + + + This is defined only once in the + httpd.conf + file. + + + + + + LogSQLForcePreserve + + + LogSQLForcePreserve + flag + + Example: LogForcePreserve On + Default: Off + Context: main server config + + You may need to perform debugging on your database and + specifically want mod_log_sql to make no attempts to log + to it. This directive instructs the module to send all + its log entries directly to the preserve file and to + make no database INSERT attempts. + + + This is presumably a directive for temporary use only; + it could be dangerous if you set it and forget it, as + all your entries will simply pile up in the preserve + file. + + + + This is defined only once in the + httpd.conf + file. + + + + + + LogSQLDisablePreserve + + + LogSQLDisablePreserve + flag + + Example: LogDisablePreserve On + Default: Off + Context; main server config + + This option can be enabled to completely disable the + preserve file fail back. This may be useful for servers + where the file-system is read-only. + + + If the database is not available those log entries will + be lost. + + + + This is defined only once in the + httpd.conf + file. + + + + + + LogSQLMachineID + + + LogSQLMachineID + machineID + + Example: LogSQLMachineID web01 + Context: main server config + + If you have a farm of webservers then you may wish to + know which particular machine made each entry; this is + useful for analyzing your load-balancing methodology. + LogSQLMachineID permits you to distinguish each + machine's entries if you assign each machine its own + LogSQLMachineID: for example, the first webserver gets + ``LogSQLMachineID web01,'' the second gets + ``LogSQLMachineID web02,'' etc. + + + + This is defined only once in the + httpd.conf + file. + + + + + + LogSQlPreserveFile + + + LogSQLPreserveFile + + filename + + + + Example: LogSQLPreserveFile offline-preserve + + Default: /tmp/sql-preserve + Context: virtual host + + mod_log_sql writes queries to this local preserve file + in the event that it cannot reach the database, and thus + ensures that your high-availability web frontend does + not lose logs during a temporary database outage. This + could happen for a number of reasons: the database goes + offline, the network breaks, etc. You will not lose + entries since the module has this backup. The file + consists of a series of SQL statements that can be + imported into your database at your convenience; + furthermore, because the SQL queries contain the access + timestamps you do not need to worry about out-of-order + data after the import, which is done in a simple manner: + + # mysql -uadminuser -p mydbname < /tmp/sql-preserve + + If you do not define LogSQLPreserveFile then all virtual + servers will log to the same default preserve file ( + /tmp/sql-preserve + ). You can redefine this on a virtual-host basis in + order to segregate your preserve files if you desire. + Note that segregation is not usually necessary, as the + SQL statements that are written to the preserve file + already distinguish between different virtual hosts if + you include the 'v' character in your + LogSQLTransferLogFormat directive. It is only necessary + to segregate preserve-files by virualhost if you also + segregate access logs by virtualhost. + + + The module will log to Apache's ErrorLog when it notices + a database outage, and upon database return. You will + therefore know when the preserve file is being used, + although it is your responsibility to import the file. + + + The file does not need to be created in advance. It is + safe to remove or rename the file without interrupting + Apache, as the module closes the filehandle immediately + after completing the write. The file is created with the + user & group ID of the running Apache process (e.g. + 'nobody' on many Linux distributions). + + + +
+
+
+ Table Names + + + LogSQLTransferLogTable + + + LogSQLTransferLogTable + + table-name + + + + Example: LogSQLTransferLogTable access_log_table + + Context: virtual host + + Defines which table is used for logging of Apache's + transfers; this is analogous to Apache's TransferLog + directive. table-name must be a valid table within the + database defined in the LogSQLLoginInfo connection URI. + + + This directive is + not + necessary if you declare LogSQLMassVirtualHosting On, + since that directive activates dynamically-named tables. + If you attempt to use LogSqlTransferlogTable at the same + time a warning will be logged and it will be ignored, + since LogSQLMassVirtualHosting takes priority. + + + + Requires unless LogSQLMassVirtualHosting is set to On + + + + + + LogSQLCookieLogTable + + + LogSQLCookieLogTable + + + table-name + + + + Example: LogSQLCookieLogTable cookie_log + + Default: cookies + Context: virtual host + + Defines which table is used for logging of cookies. + Working in conjunction with LogSQLWhichCookies, you can + log many of each request's associated cookies to a + separate table. For meaningful data retrieval the cookie + table is keyed to the access table by the unique request + ID supplied by the standard Apache module mod_unique_id. + + + + You must create the table (see create-tables.sql, + included in the package), or LogSQLCreateTables must + be set to "on". + + + + + + LogSQLHeadersInLogTable + + + LogSQLHeadersInLogTable + + table-name + + + + Example: LogSQLHeadersInLogTable headers + + Default: headers_in + Context: virtual host + + Defines which table is used for logging of inbound + headers. Working in conjunction with + LogSQLWhichHeadersIn, you can log many of each request's + associated headers to a separate table. For meaningful + data retrieval the headers table is keyed to the access + table by the unique request ID supplied by the standard + Apache module mod_unique_id. + + + + Note that you must create the table (see + create-tables.sql, included in the package), or + LogSQLCreateTables must be set to "on". + + + + + + LogSQLHeadersOutLogTable + + + LogSQLHeadersOutLogTable + + table-name + + + + Example: LogSQLHeadersOutLogTable headers + + Default: headers_out + Context: virtual host + + Defines which table is used for logging of outbound + headers. Working in conjunction with + LogSQLWhichHeadersOut, you can log many of each + request's associated headers to a separate table. For + meaningful data retrieval the headers table is keyed to + the access table by the unique request ID supplied by + the standard Apache module mod_unique_id. + + + + Note that you must create the table (see + create-tables.sql, included in the package), or + LogSQLCreateTables must be set to "on". + + + + + + LogSQLNotesLogTable + + + LogSQLNotesLogTable + + table-name + + + Example: LogSQLNotesLogTable notes-log + Default: notes + Context: virtual_host + + Defines which table is used for logging of notes. + Working in conjunction with LogSQLWhichNotes, you can + log many of each request's associated notes to a + separate table. For meaningful data retrieval the notes + table is keyed to the access table by the unique request + ID supplied by the standard Apache module mod_unique_id. + + + + This table must be created (see create-tables.sql + included in the package), or LogSQLCreateTables must + be set to 'On'. + + + + + + LogSQLMassVirtualHosting + + + LogSQLMassVirtualHosting + flag + + Example: LogSQLMassVirtualHosting On + Default: Off + Context: main server config + + If you administer a site hosting many, many virtual + hosts then this option will appeal to you. If you turn + on LogSQLMassVirtualHosting then several things happen: + + + + + the on-the-fly table creation feature is activated + automatically + + + + + the transfer log table name is dynamically set from + the virtual host's name after stripping out + SQL-unfriendly characters (example: a virtual host + www.grubbybaby.com gets logged to table + access_www_grubbybaby_com) + + + + + which, in turn, means that each virtual host logs to + its own segregated table. Because there is no data + shared between virtual servers you can grant your + users access to the tables they need; they will be + unable to view others' data. + + + + + This is a huge boost in convenience for sites with many + virtual servers. Activating LogSQLMassVirtualHosting + obviates the need to create every virtual server's table + and provides more granular security possibilities. + + + + This is defined only once in the + httpd.conf + file. + + + + + +
+
+ Configuring What Is logged + + + LogSQLTransferLogFormat + + + LogSQLTransferLogFormat + + format-string + + + Example: LogSQLTransferLogFormat huSUTv + Default: AbHhmRSsTUuv + Context: virtual host + + Each character in the format-string defines an attribute + of the request that you wish to log. The default logs + the information required to create Combined Log Format + logs, plus several extras. Here is the full list of + allowable keys, which sometimes resemble their Apache + counterparts, but do not always: + + + Core LogFormat parameters + + + + + + + + + Symbol + Meaning + DB Field + Data Type + Example + + + + + A + User Agent + agent + varchar(255) + + Mozilla/4.0 (compat; MSIE 6.0; Windows) + + + + a + CGi request arguments + request_args + varchar(255) + + user=Smith&cart=1231&item=532 + + + + b + Bytes transfered + bytes_sent + int unsigned + 32561 + + + + c + + + Text of cookie + cookie + varchar(255) + + Apache=sdyn.fooonline.net 1300102700823 + + + + f + Local filename requested + request_file + varchar(255) + /var/www/html/books-cycroad.html + + + H + HTTP request_protocol + request_protocol + varchar(10) + HTTP/1.1 + + + h + Name of remote host + remote_host + varchar(50) + blah.foobar.com + + + I + Request ID (from modd_unique_id) + id + char(19) + POlFcUBRH30AAALdBG8 + + + l + Ident user info + remote_logname + varcgar(50) + bobby + + + M + + Machine ID + + + machine_id + varchar(25) + web01 + + + m + HTTP request method + request_method + varchar(10) + GET + + + P + httpd cchild PID + child_pid + smallint unsigned + 3215 + + + p + http port + server_port + smallint unsigned + 80 + + + R + Referer + referer + varchar(255) + + http://www.biglinks4u.com/linkpage.html + + + + r + Request in full form + request_line + varchar(255) + GET /books-cycroad.html HTTP/1.1 + + + S + + Time of request in UNIX time_t format + + time_stamp + int unsigned + 1005598029 + + + s + HTTP Response Code Status + status + smallint + 200 + + + T + Seconds to service request + request_duration + smallint unsigned + 2 + + + t + Time of request in human format + request_time + char(28) + [02/Dec/2001:15:01:26 -0800] + + + U + Request in simple form + request_uri + varchar(255) + /books-cycroad.html + + + u + User info from HTTP auth + remote_user + varchar(50) + bobby + + + v + Virtual host servicing the request + virtual_host + varchar(255) + www.foobar.com + + + V + + requested Virtual host name (mass + virtualhosting) + + virtual_host + varchar(255) + www.foobar.org + + + +
+ + + [1] You must also specify LogSQLWhichCookie for this + to take effect. + + + [2] You must also specify LogSQLmachineID for this to + take effect. + + + + SSL LogFormat Parameters + + + + + + + + + Symbol + Meaning + DB Field + Data Type + Example + + + + + z + SSL cipher used + ssl_cipher + varchar(25) + RC4-MD5 + + + q + + Keysize of the SSL connection + + ssl_keysize + smallint unsigned + 56 + + + Q + + maximum keysize supported + + ssl_maxkeysize + smallint unsigned + 128 + + + +
+ + LogIO LogFormat Parameters + + + + + + + + + Symbol + Meaning + DB Field + Data Type + Example + + + + + i + Number of actual Bytes transfered in with the request + bytes_in + int unsigned + 505 + + + o + Number of actual Bytes transfered out with the request + bytes_out + int unsigned + 4168 + + + +
+
+
+ + LogSQLRemhostIgnore + + + LogSQLRemhostIgnore + + hostname + + + + Example: LogSQLRemhostIgnore localnet.com + + Context: virtual host + + Lists a series of smortrings that, if present in the + REMOTE_HOST, will cause that request to + not + be logged. This directive is useful for cutting down on + log clutter when you are certain that you want to ignore + requests from certain hosts, such as your own internal + network machines. See section + + for some tips for using this directive. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + Each string is separated by a space, and no regular + expressions or globbing are allowed. Each string is + evaluated as a substring of the REMOTE_HOST using + strstr(). The comparison is case sensitive. + + + + + LogSQLRequestAccept + + + LogSQLRequestAccept + + substring + + + + Example: LogSQLRequestAccept .html .php .jpg + + + Default: if not specified, all requests are 'accepted' + + Context: virtual host + + Lists a series of strings that, if present in the URI, + will permit that request to be considered for logging + (depending on additional filtering by the "ignore" + directives). Any request that fails to match one of the + LogSQLRequestAccept entries will be discarded. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + This directive is useful for cutting down on log clutter + when you are certain that you only want to log certain + kinds of requests, and just blanket-ignore everything + else. See section + + for some tips for using this directive. + + + Each string is separated by a space, and no regular + expressions or globbing are allowed. Each string is + evaluated as a substring of the URI using strstr(). The + comparison is case sensitive. + + + This directive is completely optional. It is more + general than LogSQLRequestIgnore and is evaluated before + LogSQLRequestIgnore . If this directive is not used, + all + requests are accepted and passed on to the other + filtering directives. Therefore, only use this directive + if you have a specific reason to do so. + + + + + LogSQLRequestIgnore + + + LogSQLRequestIgnore + + substring + + + + Example: LogSQLRequestIgnore root.exe cmd.exe + default.ida favicon.ico + + Context: virtual host + + Lists a series of strings that, if present in the URI, + will cause that request to + NOT + be logged. This directive is useful for cutting down on + log clutter when you are certain that you want to ignore + requests for certain objects. See section + + for some tips for using this directive. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + Each string is separated by a space, and no regular + expressions or globbing are allowed. Each string is + evaluated as a substring of the URI using strstr(). The + comparison is case sensitive. + + + + + LogSQLWhichCookie + + + LogSQLWhichCookie + + cookiename + + + Example; LogSQLWhichCookie Clicks + Context: virtual host + + In HTTP, cookies have names to distinguish them from + each other. Using mod_usertrack, for example, you can + give your user-tracking cookies a name with the + CookieName directive. + + + mod_log_sql allows you to log cookie information. + LogSQL_WhichCookie tells mod_log_sql which cookie to + log. This is necessary because you will usually be + setting and receiving more than one cookie from a + client. + + + + You must include a 'c' character in + LogSQLTransferLogFormat for this directive to take + effect. + + + although this was origintally intended for people + using mod_usertrack to create user-tracking cookies, + you are not restricted in any way. You can choose + which cookie you wish to log to the database - any + cookie at all - and it does not necessarily have to + have anything to do with mod_usertrack. + + + + + + LogSQLWhichCookies + + + LogSQLWhichCookies + + cookie-name + + + + Example: logSQLWhichCookies userlogin cookie1 cookie2 + + Context: virtual host + + Defines the list of cookies you would like logged. This + works in conjunction with LogSQLCookieLogTable. This + directive does + not + require any additional characters to be added to the + LogSQLTransferLogFormat string. The feature is activated + simply by including this directive, upon which you will + begin populating the separate cookie table with data. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + + The table must be created (see create-tables.sql, + included in the package), or LogSQLCreateTables must + be set to 'On'. + + + + + + LogSQLWhichHeadersIn + + + LogSQLWhichHeadersIn + + header-name + + + + Example: LogSQLWhichHeadersIn User-Agent Accept-Encoding + Host + + Context: virtual host + + Defines the list of inbound headers you would like + logged. This works in conjunction with + LogSQLHeadersInLogTable. This directive does not require + any additional characters to be added to the + LogSQLTransferLogFormat string. The feature is activated + simply by including this directive, upon which you will + begin populating the separate inbound-headers table with + data. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + + The table must be created (see create-tables.sql, + included in the package), or LogSQLCreateTables must + be set to 'On'. + + + + + + LogSQLWhichHeadersOut + + + LogSQLWhichHeadersOut + + header-name + + + + Example: LogSQLWhichHeadersOut Expires Content-Type + Cache-Control + + Context: virtual host + + Defines the list of outbound headers you would like + logged. This works in conjunction with + LogSQLHeadersOutLogTable. This directive does not + require any additional characters to be added to the + LogSQLTransferLogFormat string. The feature is activated + simply by including this directive, upon which you will + begin populating the separate outbound-headers table + with data. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + + The table must be created (see create-tables.sql, + included in the package), or LogSQLCreateTables must + be set to 'On'. + + + + + + LogSQLWhichNotes + + + LogSQLWhichNotes + + note-name + + + + Example: LogSQLWhichNotes mod_gzip_result + mod_gzip_ompression_ratio + + Context: virtual host + + Defines the list of notes you would like logged. This + works in conjunction with LogSQLNotesLogTable. This + directive does not require any additional characters to + be added to the LogSQLTransferLogFormat string. The + feature is activated simply by including this directive, + upon which you will begin populating the separate notes + table with data. + + + Each string may contain a + or - prefix in a + <VirtualHost> context and will cause those strings + to be added (+) or removed (-) from the global + configuration. Otherwise the global is completely + ignored and overridden if defined in a + <VirtualHost> + + + + The table must be created (see create-tables.sql, + included in the package), or LogSQLCreateTables must + be set to 'On'. + + + + +
+
+
+ Deprecated Commands + + + LogSQLSocketFile [Deprecated] + + + LogSQLSocketFile + + filename + + + + Example: LogSQLSocketFile /tmp/mysql.sock + + Default: (database specific) + + Default (MySQL): /var/lib/mysql/mysql.sock + + Context: main server config + + At Apache runtime you can specify the MySQL socket file + to use. Set this once in your main server config to + override the default value. This value is irrelevant if + your database resides on a separate machine. + + + mod_log_sql will automatically employ the socket for db + communications if the database resides on the local + host. If the db resides on a separate host the module + will automatically use TCP/IP. This is a function of the + MySQL API and is not user-configurable. + + + + This directive is deprecated in favor of LogSQLDBParam + socketfile [socketfilename] + + + This is defined only once in the + httpd.conf + file. + + + + + + LogSQLTCPPort [Deprecated] + + + LogSQLTCPPort + + port-number + + + Example: LogSQLTCPPort 3309 + Default: (database specific) + Default (MySQL): 3306 + Context: main server config + + Your database may listen on a different port than the + default. If so, use this directive to instruct the + module which port to use. This directive only applies if + the database is on a different machine connected via + TCP/IP. + + + + This directive is deprecated in favor of LogSQLDBParam + tcpport [port-number] + + + This is defined only once in the + httpd.conf + file. + + + + + + LogSQLDatabase [Deprecated] + + + LogSQLDatabase + + database + + + Example: LogSQLDatabase loggingdb + Context: main server config + + Defines the database that is used for logging. + "database" must be a valid db on the MySQL host defined + in LogSQLLoginInfo + + + + This directive is deprecated in favor of the URI form + of LogSQLLoginInfo. + + + This is defined only once in the + httpd.conf + file. + + + + + +
+
+
+
+ FAQ + + + General module questions + + + Why log to an SQL database? + + + + To begin with, let's get it out of the way: logging to a + database is not a panacea. But while there are + complexities with this solution, the benefit can be + substantial for certain classes of administrator or people + with advanced requirements: + + + + + Chores like log rotation go away, as you can DELETE + records from the SQL database once they are no longer + useful. For example, the excellent and popular + log-analysis tool Webalizer (http://www.webalizer.com) + does not need historic logs after it has processed + them, enabling you to delete older logs. + + + + + People with clusters of web servers (for high + availability) will benefit the most - all their + webservers can log to a single SQL database. This + obviates the need to collate/interleave the many + separate logfiles, which can be / highly/ problematic. + + + + + People acquainted with the power of SQL SELECT + statements will know the flexibility of the extraction + possibilities at their fingertips. + + + + + For example, do you want to see all your 404's? Do this: + + SELECT remote_host, status, request_uri, bytes_sent, from_unixtime(time_stamp) +FROM acc_log_tbl WHERE status=404 ORDER BY time_stamp; + + + + + + + + + + + remote_host + status + request_uri + bytes_sent + from_unixtime(time_stamp) + + + + + marge.mmm.co.uk + 404 + /favicon.ico + 321 + 2001-11-20 02:30:56 + + + 62.180.239.251 + 404 + /favicon.ico + 333 + 2001-11-20 02:45:25 + + + 212.234.12.66 + 404 + /favicon.ico + 321 + 2001-11-20 03:01:00 + + + 212.210.78.254 + 404 + /favicon.ico + 333 + 2001-11-20 03:26:05 + + + +
+ + Or do you want to see how many bytes you've sent within a + certain directory or site? Do this: + + SELECT request_uri,sum(bytes_sent) AS bytes, count(request_uri) AS howmany +FROM acc_log_tbl +WHERE request_uri LIKE '%mod_log_sql%' +GROUP BY request_uri ORDER BY howmany DESC; + + + + + + + + + request_uri + bytes + howmany + + + + + /mod_log_sql/style_1.css + 157396 + 1288 + + + /mod_log_sql/ + 2514337 + 801 + + + + /mod_log_sql/mod_log_sql.tar.gz + + 9769312 + 456 + + + /mod_log_sql/faq.html + 5038728 + 436 + + + +
+ + Or maybe you want to see who's linking to you? Do this: + + SELECT count(referer) AS num,referer +FROM acc_log_tbl +WHERE request_uri='/mod_log_sql/' +GROUP BY referer ORDER BY num DESC; + + + + + + + + num + referer + + + + + 271 + + http://freshmeat.net/projects/mod_log_sql/ + + + + 96 + + http://modules.apache.org/search?id=339 + + + + 48 + http://freshmeat.net/ + + + 8 + http://freshmeat.net + + + +
+ + As you can see, there are myriad possibilities that can be + constructed with the wonderful SQL SELECT statement. + Logging to an SQL database can be really quite useful! + +
+
+ + + Why use MySQL? Are there alternatives? + + + + MySQL is a robust, free, and very powerful + production-quality database engine. It is well supported + and comes with detailed documentation. Many 3rd-party + software pacakges (e.g. Slashcode, the engine that powers + Slashdot) run exclusively with MySQL. In other words, you + will belong to a very robust and well-supported community + by choosing MySQL. + + + That being said, there are alternatives. PostgreSQL is + probably MySQL's leading "competitor" in the free database + world. There is also an excellent module available for + Apache to permit logging to a PostgreSQL database, called + + pgLOGd + + + + + + + Currently a database abstraction system is in the works + to allow any database to be used with mod_log_sql. + + + + + + + Is this code production-ready? + + + + By all accounts it is. It is known to work without a + problem on many-thousands-of-hits-per-day webservers. Does + that mean it is 100% bug free? Well, no software is, but + it is well-tested and believed to be fully compatible with + production environments. (The usual disclaimers apply. + This software is provided without warranty of any kind.) + + + + + + Who's using mod_log_sql? + + + + Good question! It would be great to find out! If you are a + production-level mod_log_sql user, please contact eddie at + &EmailContact; + so that you can be mentioned here. + + + + + + + Why doesn't the module also replace the Apache ErrorLog? + + + + + There are circumstances when that would be quite unwise -- + for example, if Apache could not reach the MySQL server + for some reason and needed to log that fact. Without a + text-based error log you'd never know anything was wrong, + because Apache would be trying to log a database + connection error to the database... you get the point. + + + + + Error logs are usually not very high-traffic and are + really best left as text files on a web server machine. + + + + + The Error log is free format text.. (no specified + formatting what, so ever) which is rather difficult to + nicely format for storing in a database. + + + + + + Does mod_log_sql work with Apache 2.x? + + + + Yes. A port of mod_log_sql is available for Apache 2.x as + of mod_log_sql 1.90 + + + + + + + Does mod_log_sql connect to MySQL via TCP/IP or a socket? + + + + Quick answer, Yes. + + + + It depends! This is not determined by mod_log_sql. + mod_log_sql relies on a connection command that is + supplied in the MySQL API, and that command is somewhat + intelligent. How it works: + + + + + if the specified MySQL database is on the same + machine, the connection command uses a socket to + communicate with MySQL + + + + + if the specified MySQL database is on a different + machine, mod_log_sql connects using TCP/IP. + + + + + You don't have any control of which methodology is used. + You can fine-tune some of the configuration, however. The + LogSQLSocketFile runtime configuration directive overrides + the default of "/var/lib/mysql/mysql.sock" for + socket-based connections, whereas the LogSQLTCPPort + command allows to you override the default TCP port of + 3306 for TCP/IP connections. + + + + + + I have discovered a bug. Who can I contact? + + + + Please contact Edward Rudd at + &EmailContact; + , or post a message to the mod_log_sql + + . Your comments, suggestions, bugfixes, bug catches, and + usage testimonials are always welcome. As free software, + mod_log_sql is intended to be a community effort -- any + code contributions or other ideas will be fully and openly + credited, of course. + + + +
+ + Problems + + + + Apache segfaults or has other problems when using PHP and + mod_log_sql + + + + + This occurs if you compiled PHP with MySQL database + support. PHP utilizes its internal, bundled MySQL + libraries by default. These conflict with the "real" MySQL + libraries linked by mod_log_sql, causing the segmentation + fault. + + + PHP and mod_log_sql can be configured to happily coexist. + The solution is to configure PHP to link against the real + MySQL libraries: recompile PHP using + --with-mysql=/your/path. Apache will run properly once the + modules are all using the same version of the MySQL + libraries. + + + + + + + Apache appears to start up fine, but nothing is getting + logged in the database + + + + + If you do not see any entries in the access_log, then + something is preventing the inserts from happening. This + could be caused by several things: + + + + + Improper privileges set up in the MySQL database + + + + + You are not hitting a VirtualHost that has a + LogSQLTransferLogTable entry + + + + + You did not specify the right database host or login + information + + + + + Another factor is preventing a connection to the + database + + + + + + It is improper to ask for help before you have followed + these steps. + + + + First examine the MySQL log that you established in step + + of section + + . Ensure that the INSERT statements are not being rejected + because of a malformed table name or other typographical + error. By enabling that log, you instructed MySQL to log + every connection and command it receives -- if you see no + INSERT attempts in the log, the module isn't successfully + connecting to the database. If you see nothing at all in + the log -- not even a record of your administrative + connection attempts, then you did not enable the log + correctly. If you do see INSERT attempts but they are + failing, the log should tell you why. + + + Second, confirm that your LogSQL* directives are all + correct. + + + Third, examine the Apache error logs for messages from + mod_log_sql; the module will offer hints as to why it + cannot connect, etc. + + + The next thing to do is to change the LogLevel directive + + in the main server config as well as in each VirtualHost + config: + + + LogLevel debug +ErrorLog /var/log/httpd/server-messages + + + + + + Why do I get the message "insufficient configuration info + to establish database link" in my Apache error log? + + + + + At a minimum, LogSQLLoginInfo in the URl form and either + LogSQLTableName or LogSQLMassVirtualHosting must be + defined in order for the module to be able to establish a + database link. If these are not defined or are incomplete + you will receive this error message. + + + + + + + My database cannot handle all the open connections from + mod_log_sql, is there anything I can do? + + + + + The rule of thumb: if you have n webservers each + configured to support y MaxClients, then your database + must be able to handle n times y simultaneous connections + in the worst case. Certainly you must use common sense, + consider reasonable traffic expectations and structure + things accordingly. + + + + + Tweaking my.cnf to scale to high connection loads is + imperative. But if hardware limitations prevent your MySQL + server from gracefully handling the number of incoming + connections, it would be beneficial to upgrade the memory + or CPU on that server in order to handle the load. + + + + + Jeremy Zawodny, a highly respected MySQL user and + contributor to Linux Magazine, has this very helpful and + highly appropriate article on tuning MySQL: + + MySQL, Linux, and Thread Caching + + + + + + Please remember that mod_log_sql's overriding principle is + performance -- that is what the target audience demands + and expects. Other database logging solutions do not open + and maintain many database connections, but their + performance suffers drastically. For example, pgLOGd + funnels all log connections through a separate daemon that + connects to the database, but that bottlenecks the entire + process. mod_log_sql achieves performance numbers an order + of magnitude greater than the alternatives because it + dispenses with the overhead associated with rapid + connection cycling, and it does not attempt to shoehorn + all the database traffic through a single extra daemon or + proxy process. + + + + + + Currently connection pooling is being implemented as + part of the Database Abstraction layer to allow multiple + httpd processes to share connections. + + + + + + + + Why do I occasionally see a "lost connection to MySQL + server" message in my Apache error log? + + + + + This message may appear every now and then in your Apache + error log, especially on very lightly loaded servers. This + does not mean that anything is necessarily wrong. Within + each httpd child process, mod_log_sql will open (and keep + open) a connection to the MySQL server. MySQL, however, + will close connections that have not been used in a while; + the default timeout is 8 hours. When this occurs, + mod_log_sql will notice and re-open the connection. That + event is what is being logged, and looks like this: + + [Tue Nov 12 19:04:10 2002] [error] mod_log_sql: first attempt failed, + API said: error 2013, Lost connection to MySQL server during query +[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: reconnect successful +[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: second attempt successful + + Reference: + + MySQL documentation + + + + + + + + Sometimes a single VirtualHost gets logged to two + different tables (e.g. access_foo_com, + access_www_foo_com). Or, accesses to an unqualified + hostname (e.g. "http://intranet/index.html") get logged in + separate tables. + + + + + Proper usage of the Apache runtime ServerName directive + and the directive UseCanonicalName On (or DNS) are + necessary to prevent this problem. "On" is the default for + UseCanonicalName, and specifies that self-referential URLs + are generated from the ServerName part of your + VirtualHost: + + + With UseCanonicalName on (and in all versions prior to + 1.3) Apache will use the ServerName and Port directives to + construct the canonical name for the server. With + UseCanonicalName off Apache will form self-referential + URLs using the hostname and port supplied by the client if + any are supplied (otherwise it will use the canonical + name, as defined above). [From + + the Apache documentation + + ] + + + The module inherits Apache's "knowledge" about the server + name being accessed. As long as those two directives are + properly configured, mod_log_sql will log to only one + table per virtual host while using + LogSQLMassVirtualHosting. + + + + + + Performance and Tuning + + + How well does it perform? + + + + mod_log_sql scales to very high loads. Apache 1.3.22 + + mod_log_sql was benchmarked using the "ab" (Apache Bench) + program that comes with the Apache distribution; here are + the results. + + + Overall configuration + + Machine A: Apache webserver + + + Machine B: MySQL server + + + + Machines A and B connected with 100Mbps Ethernet + + + + + Webserver: Celeron 400, 128MB RAM, IDE storage + + + + + Apache configuration + Timeout 300 +KeepAlive On +MaxKeepAliveRequests 100 +KeepAliveTimeout 15 +MinSpareServers 5 +StartServers 10 +MaxSpareServers 15 +MaxClients 256 +MaxRequestsPerChild 5000 +LogSQLTransferLogFormat AbHhmRSsTUuvc +LogSQLWhichCookie Clicks +CookieTracking on +CookieName Clicks + + + "ab" commandline + ./ab -c 10 -t 20 -v 2 -C Clicks=ab_run http://www.hostname.com/target + + + ( 10 concurrent requests; 20 second test; setting a cookie + "Clicks=ab_run"; target = the mod_log_sql homepage. ) + + + Ten total ab runs were conducted: five with MySQL logging + enabled, and five with all MySQL directives commented out + of httpd.conf. Then each five were averaged. The results: + + + + + Average of five runs employing MySQL and standard text + logging: + + 139.01 requests per second, zero errors. + + + + + + Average of five runs employing only standard text + logging: + + 139.96 requests per second, zero errors. + + + + + + In other words, any rate-limiting effects on this + particular hardware setup are not caused by MySQL. Note + that although this very simple webserver setup is hardly + cutting-edge -- it is, after all, a fairly small machine + -- 139 requests per second equal over twelve million hits + per day. + + + + If you run this benchmark yourself, take note of three + things: + + + + Use a target URL that is on your own webserver :-). + + + + + Wait until all your connections are closed out between + runs; after several thousand requests your TCP/IP + stack will be filled with hundreds of connections in + TIME_WAIT that need to close. Do a "netstat -t|wc -l" + on the webserver to see. If you don't wait, you can + expect to see a lot of messages like "ip_conntrack: + table full, dropping packet" in your logs. (This has + nothing to do with mod_log_sql, this is simply the + nature of the TCP/IP stack in the Linux kernel.) + + + + + When done with your runs, clean these many thousands + of requests out of your database: + + mysql> delete from access_log where agent like 'ApacheBench%'; +mysql> optimize table access_log; + + + + + + + + Do I need to be worried about all the running MySQL + children? Will holding open n Apache-to-MySQL connections + consume a lot of memory? + + + + Short answer: you shouldn't be worried. + + + + Long answer: you might be evaluating at the output of "ps + -aufxw" and becoming alarmed at all the 7MB httpd + processes or 22MB mysqld children that you see. Don't be + alarmed. It's true that mod_log_sql opens and holds open + many MySQL connections: each httpd child maintains one + open database connection (and holds it open for + performance reasons). Four webservers, each running 20 + Apache children, will hold open 80 MySQL connections, + which means that your MySQL server needs to handle 80 + simultaneous connections. In truth, your MySQL server + needs to handle far more than that if traffic to your + website spikes and the Apache webservers spawn off an + additional 30 children each... + + + Fortunately the cost reported by 'ps -aufxw' is deceptive. + This is due to an OS memory-management feature called + "copy-on-write." When you have a number of identical child + processes (e.g. Apache, MySQL), it would appear in "ps" as + though each one occupies a great deal of RAM -- as much as + 7MB per httpd child! In actuality each additional child + only occupies a small bit of extra memory -- most of the + memory pages are common to each child and therefore shared + in a "read-only" fashion. The OS can get away with this + because the majority of memory pages for one child are + identical across all children. Instead of thinking of each + child as a rubber stamp of the others, think of each child + as a basket of links to a common memory area. + + + A memory page is only duplicated when it needs to be + written to, hence "copy-on-write." The result is + efficiency and decreased memory consumption. "ps" may + report 7MB per child, but it might really only "cost" 900K + of extra memory to add one more child. It is not correct + to assume that 20 Apache children with a VSZ of 7MB each + equals (2 x 7MB) of memory consumption -- the real answer + is much, much lower. The same "copy-on-write" rules apply + to all your MySQL children: 40 mysqld children @ 22MB each + do not occupy 880MB of RAM. + + + The bottom line: although there is a cost to spawn extra + httpd or mysqld children, that cost is not as great as + "ps" would lead you to believe. + + + + + + + My webserver cannot handle all the traffic that my site + receives, is there anything I can do? + + + + + If you have exhausted all the tuning possibilities on your + existing server, it is probably time you evaluated the + benefits of clustering two or more webservers together in + a load-balanced fashion. In fact, users of such a setup + are mod_log_sql's target audience! + + + + + + + What is the issue with activating delayed inserts? + + + + + INSERT DELAYED is a specific syntax to MySQL and is not + supported by any other database. Ergo, why is it needed, + and what MySQL deficiency is it working around? INSERT + DELAYED is a kluge. + + + + + The MySQL documentation is unclear whether INSERT DELAYED + is even necessary for an optimized database. It says, "The + DELAYED option for the INSERT statement is a + MySQL-specific option that is very useful if you have + clients that can't wait for the INSERT to complete." But + then it goes on to say, "Note that as MyISAM tables + supports concurrent SELECT and INSERT, if there is no free + blocks in the middle of the data file, you very seldom + need to use INSERT DELAYED with MyISAM." + + + + + Because INSERT DELAYED returns without waiting for the + data to be written, a hard kill of your MySQL database at + the right (wrong?) moment could lose those logfile + entries. + + + + + As of MySQL version 3.23.52, the error return functions + disagree after a failed INSERT DELAYED: mysql_errno() + always returns 0, even if mysql_error() returns a textual + error. I have reported this bug to the MySQL folks. + However, we have no way of knowing what solution they will + adopt to fix this, and with the worst case solution + mod_log_sql would not be able to tell if anything went + wrong with a delayed insert. + + + + + Instead of delayed inserts, you may wish to utilize InnoDB + tables (instead of the standard MyISAM tables). InnoDB + tables suppot row-level locking and are recommended for + high-volume databases. + + + + + If after understanding these problems you still wish to + enable delayed inserts, section + + discusses how. + + + + + + "How do I...?" -- accomplishing certain tasks + + + + How do I extract the data in a format that my analysis + tool can understand? + + + + + mod_log_sql would be virtually useless if there weren't a + way for you to extract the data from your database in a + somewhat meaningful fashion. To that end there's a Perl + script enclosed with the distribution. That script + (make_combined_log.pl) is designed to extract N-many days + worth of access logs and provide them in a Combined Log + Format output. You can use this very tool right in + /etc/crontab to extract logs on a regular basis so that + your favorite web analysis tool can read them. Or you can + examine the Perl code to construct your own custom tool. + + + For example, let's say that you want your web statistics + updated once per day in the wee hours of the morning. A + good way to accomplish that could be the following entries + in /etc/crontab: + + # Generate the temporary apache logs from the MySQL database (for webalizer) +05 04 * * * root make_combined_log.pl 1 www.grubbybaby.com > /var/log/temp01 +# Run webalizer on httpd log +30 04 * * * root webalizer -c /etc/webalizer.conf; rm -f /var/log/temp01 + + Or if you have a newer system that puts files in + /etc/cron.daily etc., create a file called "webalizer" in + the cron.daily subdirectory. Use the following as the + contents of your file, and make sure to chmod 755 it when + done. + + #!/bin/sh +/usr/local/sbin/make_combined_log.pl 1 www.yourdomain.com > /var/log/httpd/templog +/usr/local/bin/webalizer -q -c /etc/webalizer.conf +rm -f /var/log/httpd/templog + See? Easy. + + + + + How can I log mod_usertrack cookies? + + + + A number of people like to log mod_usertrack cookies in + their Apache TransferLog to aid in understanding their + visitors' clickstreams. This is accomplished, for example, + with a statement as follows: + + LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"" \"%{cookie}n\"" + + Naturally it would be nice for mod_log_sql to permit the + admin to log the cookie data as well, so as of version + 1.10 you can do this. You need to have already compiled + mod_usertrack into httpd -- it's one of the standard + Apache modules. + + + First make sure you have a column called "cookie" in the + MySQL database to hold the cookies, which can be done as + follows if you already have a working database: + + mysql> alter table acc_log_tbl add column cookie varchar(255); + + Next configure your server to set usertracking cookies as + follows, and make sure you include the new 'c' directive + in your LogSQLTransferLogFormat, which activates cookie + logging. Here's an example: + + <VirtualHost 1.2.3.4> + CookieTracking on + CookieStyle Cookie + CookieName Foobar + LogSQLTransferLogFormat huSUsbTvRAc + LogSQLWhichCookie Foobar +</VirtualHost> + + The first three lines configure mod_usertrack to create a + COOKIE (RFC 2109) format cookie called Foobar. The last + two lines tell mod_log_sql to log cookies named Foobar. + You have to choose which cookie to log because more than + one cookie can/will be sent to the server by the client. + + + Recap: the 'c' character activates cookie logging, and the + LogSQLWhichCookie directive chooses which cookie to log. + + + FYI, you are advised NOT to use CookieStyle Cookie2 -- it + seems that even newer browsers (IE 5.5, etc.) have trouble + with the new COOKIE2 (RFC 2965) format. Just stick with + the standard COOKIE format and you'll be fine. + + + Perform some hits on your server and run a select + + SELECT request_uri,cookie +FROM access_log +WHERE cookie IS NOT NULL; + + + + + + + + request_uri + cookie + + + + + /mod_log_sql/ + + ool-18e4.dyn.optonline.net.130051007102700823 + + + + /mod_log_sql/usa.gif + + ool-18e4.dyn.optonline.net.130051007102700823 + + + + /mod_log_sql/style_1.css + + ool-18e4.dyn.optonline.net.130051007102700823 + + + + +
+
+
+ + + + What if I want to log more than one cookie? What is the + difference between LogSQLWhichCookie and + LogSQLWhichCookies? + + + + + As of version 1.17, you have a choice in how you want + cookie logging handled. + + + If you are interested in logging only one cookie per + request, follow the instructions in FAQ entry + + above. That cookie will be logged to a column in the + regular access_log table, and the actual cookie you want + to log is specified with LogSQLWhichCookie. Don't forget + to specify the 'c' character in LogSQLTransferLogFormat. + + + If, however, you need to log multiple cookies per request, + you must employ the LogSQLWhichCookies (note the plural) + directive. The cookies you specify will be logged to a + separate table (as discussed in section + + ), and entries in that table will be linked to the regular + access_log entries via the unique ID that is supplied by + mod_unique_id. Without mod_unique_id the information will + still be logged but you will be unable to correlate which + cookies go with which access-requests. Furthermore, with + LogSQLWhichCookies, you do not need to include the 'c' + character in LogSQLTransferLogFormat. + + + LogSQLWhichCookie and LogSQLWhichCookies can coexist + without conflict because they operate on entireley + different tables, but you're better off choosing the one + you need. + + + + + + + What are the SSL logging features, and how do I activate + them? + + + + + + You do not need to compile SSL support into mod_log_sql + in order to simply use it with a secure site. You only + need to compile SSL support into mod_log_sql if you want + to log SSL-specific data such as the cipher type used, + or the keysize that was negotiated. If that information + is unimportant to you, you can ignore this FAQ. + + + + By adding certain characters to your + LogSQLTransferLogFormat string you can tell mod_log_sql to + log the SSL cipher, the SSL keysize of the connection, and + the maximum keysize that was available. This would let you + tell, for example, which clients were using only + export-grade security to access your secure software area. + + + You can compile mod_log_sql with SSL logging support if + you have the right packages installed. If you already have + an SSL-enabled Apache then you by definition have the + correct packages already installed: OpenSSL and mod_ssl. + + + You need to ensure that your database is set up to log the + SSL data. Issue the following commands to MySQL if your + access table does not already have them: + + mysql> alter table access_log add column ssl_cipher varchar(25); +mysql> alter table access_log add column ssl_keysize smallint unsigned; +mysql> alter table access_log add column ssl_maxkeysize smallint unsigned; + + Finally configure httpd.conf to activate the SSL fields. + Note that this is only meaningful in a VirtualHost that is + set up for SSL. + + <VirtualHost 1.2.3.4:443> + LogSQLTransferLogFormat AbHhmRSsTUuvcQqz +</VirtualHost> + + You also need to make sure you have the mod_log_sql_ssl + module loaded as well. + + + The last three characters (Qqz) in the directive are the + SSL ones; see section + + in the directives documentation for details of the + LogSQLTransferLogFormat directive. + + + Restart Apache, then perform some hits on your server. + Then run the following select statement: + + SELECT remote_host,request_uri,ssl_cipher,ssl_keysize,ssl_maxkeysize +FROM access_log +WHERE ssl_cipher IS NOT NULL; + + + + + + + + + + + remote_host + request_uri + ssl_cipher + ssl_keysize + ssl_maxkeysize + + + + + 216.192.52.4 + /dir/somefile.html + RC4-MD5 + 128 + 128 + + + 216.192.52.4 + /dir/somefile.gif + RC4-MD5 + 128 + 128 + + + 216.192.52.4 + /dir/somefile.jpg + RC4-MD5 + 128 + 128 + + + +
+
+
+
+
+
+
diff --git a/docs/contents.png b/docs/contents.png new file mode 100644 index 0000000..0c752c6 Binary files /dev/null and b/docs/contents.png differ diff --git a/docs/crossref.png b/docs/crossref.png new file mode 100644 index 0000000..7dd2ddd Binary files /dev/null and b/docs/crossref.png differ diff --git a/docs/documentation.css b/docs/documentation.css new file mode 100644 index 0000000..f12a3af --- /dev/null +++ b/docs/documentation.css @@ -0,0 +1,33 @@ +body { + margin-left: 3%; + background-color: #ccccFF; + font-family: sans-serif; +} + +h1 { + text-align: center; +} + +h2 { + margin-left: -1%; + font-size: 150%; +} + +h3 { + margin-left: -1%; +} + +h4 { + text-align: center; +} + +p { +} + +pre { + font-family: monospace; +} + +DD { + font-family: monospace; +} \ No newline at end of file diff --git a/docs/documentation.html b/docs/documentation.html new file mode 100644 index 0000000..d63b31f --- /dev/null +++ b/docs/documentation.html @@ -0,0 +1,269 @@ + + + + + +Installing and Running mod_log_sql + + + + + + + + + + + + + + + + + +next +up +previous + +contents +
+ Next: Contents +   Contents +
+
+ + +

+ + + + +

+ +

+

Installing and Running mod_log_sql

+

Christopher Powell, <chris@grubbybaby.com>

+

+ + + + + +

+
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/img1.png b/docs/img1.png new file mode 100644 index 0000000..d34217e Binary files /dev/null and b/docs/img1.png differ diff --git a/docs/img2.png b/docs/img2.png new file mode 100644 index 0000000..df51a7c Binary files /dev/null and b/docs/img2.png differ diff --git a/docs/img3.png b/docs/img3.png new file mode 100644 index 0000000..1cd68cf Binary files /dev/null and b/docs/img3.png differ diff --git a/docs/img4.png b/docs/img4.png new file mode 100644 index 0000000..1e2543f Binary files /dev/null and b/docs/img4.png differ diff --git a/docs/index.html b/docs/index.html new file mode 100644 index 0000000..d63b31f --- /dev/null +++ b/docs/index.html @@ -0,0 +1,269 @@ + + + + + +Installing and Running mod_log_sql + + + + + + + + + + + + + + + + + +next +up +previous + +contents +
+ Next: Contents +   Contents +
+
+ + +

+ + + + +

+ +

+

Installing and Running mod_log_sql

+

Christopher Powell, <chris@grubbybaby.com>

+

+ + + + + +

+
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/next.png b/docs/next.png new file mode 100644 index 0000000..1628652 Binary files /dev/null and b/docs/next.png differ diff --git a/docs/next_g.png b/docs/next_g.png new file mode 100644 index 0000000..9d3f591 Binary files /dev/null and b/docs/next_g.png differ diff --git a/docs/node1.html b/docs/node1.html new file mode 100644 index 0000000..0238180 --- /dev/null +++ b/docs/node1.html @@ -0,0 +1,128 @@ + + + + + +Contents + + + + + + + + + + + + + + + + + + + + +next + +up + +previous +
+ Next: 1 Introduction + Up: Installing and Running mod_log_sql + Previous: Installing and Running mod_log_sql +
+
+ +
+ +

+Contents +

+ + + + + +

+


+
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/node2.html b/docs/node2.html new file mode 100644 index 0000000..33a2849 --- /dev/null +++ b/docs/node2.html @@ -0,0 +1,276 @@ + + + + + +1 Introduction + + + + + + + + + + + + + + + + + + + + +next + +up + +previous + +contents +
+ Next: 2 Installation + Up: Installing and Running mod_log_sql + Previous: Contents +   Contents +
+
+ + +Subsections + + + +
+ +

+1 Introduction +

+ +

+ +

+1.1 Homepage +

+ +

+ +

+
+
http://www.grubbybaby.com/mod_log_sql/ +
+
+

+ +

+1.2 Summary +

+ +

+This Apache module will permit you to log to a SQL database; it can +log each access request as well as data associated with each request: +cookies, notes, and inbound/outbound headers. Unlike logging to a +flat text file - which is standard in Apache - a SQL-based log exhibits +tremendous flexibility and power of data extraction. (See section +4.1.1 in the FAQ for further discussion and examples of the +advantages to SQL.) + +

+This module can either replace or happily coexist with mod_log_config, +Apache's text file logging facility. In addition to being more configurable +than the standard module, mod_log_sql is much more flexible. + +

+ +

+1.3 Approach +

+ +

+This project was formerly known as ``mod_log_mysql.'' It was +renamed ``mod_log_sql'' in order to reflect the project goal +of database-inspecificity. The module currently supports MySQL, but +support for other database backends is underway. + +

+In order to save speed and overhead, links are kept alive in between +queries. This module uses one dedicated SQL link per httpd child, +opened by each child process when it is born. Among other things, +this means that this module supports logging into only one MySQL server, +and for now, also, only one SQL database. But that's a small tradeoff +compared to the blinding speed of this module. Error reporting is +robust throughout the module and will inform the administrator of +database issues in the Apache ERRORLOG for the server/virtual +server. + +

+Virtual hosts are supported in the same manner they are in the regular +logging modules. The administrator defines some basic 'global' directives +in the main server config, then defines more specific 'local' directives +inside each VirtualHost stanza. + +

+A robust "preserve" capability has now been implemented. +This permits the module to preserve any failed INSERT commands to +a local file on its machine. In any situation that the database is +unavailable - e.g. the network fails or the database host is rebooted +- mod_log_sql will note this in the error log and begin appending +its log entries to the preserve file (which is created with the user +& group ID of the running Apache process, e.g. "nobody/nobody" +on many Linux installations). When database availablity returns, mod_log_sql +seamlessly resumes logging to it. When convenient for the sysadmin, +he/she can easily import the preserve file into the database because +it is simply a series of SQL insert statements. + +

+ +

+1.4 What gets logged by default? +

+ +

+All the data that would be contained in the "Combined Log +Format" is logged by default, plus a little extra. Your best +bet is to begin by accepting this default, then later customize the +log configuration based on your needs. + +

+The documentation of the run-time directives includes a full explanation +of what you can log, including examples - see section 3.6. + +

+ +

+1.5 Miscellaneous Notes +

+ +

+ +

+ +

+ +

+1.6 Author / Maintainer +

+ +

+The actual logging code was taken from the already existing flat file +text modules, so all that credit goes to the Apache Server group. + +

+The MySQL routines and directives were added by Zeev Suraski <bourbon@netvision.net.il>. + +

+All changes from 1.06+ and the new documentation were added by Chris +Powell <chris@grubbybaby.com>. It seems that the module had fallen +into the "unmaintained" category - it hadn't been +updated since 1998 - so Chris adopted it as the new maintainer. + +

+


+ + +next + +up + +previous + +contents +
+ Next: 2 Installation + Up: Installing and Running mod_log_sql + Previous: Contents +   Contents + +
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/node3.html b/docs/node3.html new file mode 100644 index 0000000..e832759 --- /dev/null +++ b/docs/node3.html @@ -0,0 +1,664 @@ + + + + + +2 Installation + + + + + + + + + + + + + + + + + + + + +next + +up + +previous + +contents +
+ Next: 3 Configuration + Up: Installing and Running mod_log_sql + Previous: 1 Introduction +   Contents +
+
+ + +Subsections + + + +
+ +

+2 Installation +

+ +

+ +

+2.1 Requirements +

+ +

+ +

+ +

+ +

+2.2 Platform-specific notes +

+ +

+These installation documents assume a relatively modern GNU/Linux +scenario. mod_log_sql has been ported to other platforms; following +are notes on compiling the module for those platforms. + +

+ +

+2.2.1 Solaris +

+ +

+The nanosleep() function used in mod_log_sql relies on linking aginst +the librt library. Make the following alterations before proceeding: + +

+ +

    +
  1. In Makefile, search for the string ``-lmysqlclient -lz'' and change +it to read ``-lmysqlclient -lz -lrt'' +
  2. +
  3. In part 8a of section 2.5 below, change +``-lmysqlclient -lm -lz'' to read ``-lmysqlclient -lm -lz -lrt'' +
  4. +
+ +

+ +

+2.2.2 BSD +

+ +

+No notes are available at present, but they are desired. If you have +successfully ported mod_log_sql to BSD, please contact the maintaniner, Chris Powell (chris@grubbybaby.com) +and help fill in this section. + +

+ +

+2.2.3 Win32 +

+ +

+No notes are available at present, but they are desired. If you have +successfully ported mod_log_sql to Win32, please contact +the maintaniner, Chris Powell (chris@grubbybaby.com) and help +fill in this section. + +

+ +

+2.3 Do I want a DSO or a static module? +

+ +

+You need to know the answer to this question before you proceed. The +answer is pretty straightforward: what have you done in the past? +If you like all your Apache modules to be dynamic, then you should +keep doing that. If you're more of an old-school type and prefer to +compile the modules right into apache, do that. Both methods work +equally well. + +

+FWIW, the DSO method is more modern and increasing in popularity because +apxs takes care of a lot of dirty little details for you. As you'll +see below, the static-module method is a little more complex. + +

+ +

+2.4 Installation as an Apache DSO (Preferred) +

+ +

+ +

    +
  1. Perform all the following steps as root so that you have install privs, +etc. Unpack the archive into a working directory. + +

    + +

    +
    +
    # tar zxf mod_log_sql.tar.gz -C /usr/local/src  + +

    +# cd /usr/local/src/mod_log_sql +

    +
    +
  2. +
  3. Edit Makefile and change the values of the variables in the first +section. + +

    + +

      +
    1. These paths are necessary: + +

      +

      +
      APACHEINSTALLED:
      +
      the location where you installed Apache - usually +/usr/local/apache, 'locate apxs' can help you find it. +
      +
      APACHEHEADERS:
      +
      The location of your Apache header files, find using +'locate httpd.h' +
      +
      MYSQLLIBRARIES:
      +
      The location of your MySQL libraries, find using +'locate libmysqlclient.so' +
      +
      MYSQLHEADERS:
      +
      The location of your MySQL header files, find using +'locate mysql.h' +
      +
      +
    2. +
    3. Optional: if you compiled mod_ssl for Apache and want to +log SSL data such as 'keysize' and 'cipher type': + +

      +

      +
      MODSSLHEADERS:
      +
      the location of your mod_ssl header files, find +using 'locate mod_ssl.h' +
      +
      DB1HEADERS:
      +
      the location of your db1 header files, find using 'locate +ndbm.h' +
      +
      +
    4. +
    +You do not need to compile SSL support into mod_log_sql +in order to simply use it with a secure site. You only need to compile +SSL support into mod_log_sql if you want to log SSL-specific +data such as the cipher type. + +

    +

  4. +
  5. IMPORTANT: If you are not logging SSL info, comment out MODSSLHDRS +by putting a # character in front of it: + +

    + +

    +
    +
    #MODSSLHDRS=/usr/include/... +
    +
    +
  6. +
  7. Instruct apxs to compile the module as a DSO. + +

    + +

    +
    +
    # make dso +
    +
    You should see output similar to the following: + +

    + +

    +
    +
    /usr/local/Apache/bin/apxs -Wc,-O2 -Wc,-Wall -Wc,-DEAPI -c -I/usr/... + +

    +gcc -DLINUX=22 -DNO_DBM_REWRITEMAP -DMOD_SSL=208111 -DUSE_HS...  + +

    +gcc -shared -o mod_log_sql.so mod_log_sql.o -Wc,-O2 -Wc,-Wall -Wc... +

    +
    You should see no errors and have a new file called "mod_log_sql.so" +in your directory. + +

    +

  8. +
  9. Instruct apxs to install the DSO. + +

    + +

    +
    +
    # make dsoinstall +
    +
    You should see output similar to the following: + +

    + +

    +
    +
    /usr/local/Apache/bin/apxs -i mod_log_sql.so  + +

    +cp mod_log_sql.so /usr/local/Apache/libexec/mod_log_sql.so  + +

    +chmod 755 /usr/local/Apache/libexec/mod_log_sql.so  +

    +
    +
  10. +
  11. Load and activate the module in httpd.conf: + +

    + +

      +
    1. Insert this line in the same area as other logging modules, e.g. near +``LoadModule config_log_module'': + +

      + +

      +
      +
      LoadModule sql_log_module libexec/mod_log_sql.so +
      +
      +
    2. +
    3. Insert this line in the same area as other logging modules, e.g. near +``AddModule mod_log_config.c'': + +

      + +

      +
      +
      AddModule mod_log_sql.c +
      +
      +
    4. +
    +
  12. +
  13. Module ordering within httpd.conf is important if you are logging +SSL information. Please ensure that + +

    + +

    +
    +
    LoadModule ssl_module libexec/libssl.so +
    +
    comes before + +

    + +

    +
    +
    LoadModule sql_log_module libexec/mod_log_sql.so +
    +
    in your httpd.conf file. If they are out of order, simply cut-and-paste +the ``ssl_module'' section so that it is at the top. If you do +not, you will get this error when you start Apache: + +

    + +

    +
    +
    /usr/local/apache/libexec/mod_log_mysql.so: undefined symbol: ssl_var_lookup + +

    +/usr/local/apache/bin/apachectl startssl: httpd could not be started +

    +
    (mod_log_sql has a dependency on mod_ssl for SSL symbols. If the +statements are out of order, mod_log_sql cannot recognize those +symbols.) + +

    +Now skip below to section 3, Configuration. + +

    +

  14. +
+ +

+ +

+
+2.5 Installation as a static module compiled into +httpd +

+ +

+ +

    +
  1. Perform all the following steps as root so that you have install privs, +etc. +
  2. +
  3. Unpack the archive into a working directory. + +

    + +

    +
    +
    # tar zxf mod_log_sql.tar.gz -C /usr/local/src  + +

    +# cd /usr/local/src/mod_log_sql +

    +
    +
  4. +
  5. Edit Makefile and change the values of the variables +in the first section. + +

    + +

      +
    1. These are necessary: + +

      +

      +
      APACHEINSTALLED:
      +
      the location where you installed Apache - usually +/usr/local/apache, 'locate apxs' can help you find it. +
      +
      APACHESOURCE:
      +
      the location of your Apache sources, find +using 'locate ABOUT_APACHE' +
      +
      APACHEHEADERS:
      +
      the location of your Apache header files, find using +'locate httpd.h' +
      +
      MYSQLLIBRARIES:
      +
      the location of your MySQL libraries, find using +'locate libmysqlclient.so' +
      +
      MYSQLHEADERS:
      +
      the location of your MySQL header files, find using +'locate mysql.h' +
      +
      +
    2. +
    3. Optional: if you compiled mod_ssl for Apache and want to +log SSL data such as 'keysize' and 'cipher type': + +

      +

      +
      MODSSLHEADERS:
      +
      the location of your mod_ssl header files, find +using 'locate mod_ssl.h' +
      +
      DB1HEADERS:
      +
      the location of your db1 header files, find using 'locate +ndbm.h' +
      +
      +
    4. +
    +You do not need to compile SSL support into mod_log_sql +in order to simply use it with a secure site. You only need to compile +SSL support into mod_log_sql if you want to log SSL-specific +data such as the cipher type. + +

    +

  6. +
  7. IMPORTANT: If you are not logging SSL info, comment out MODSSLHDRS +by putting a # character in front of it: + +

    + +

    +
    +
    #MODSSLHDRS=/usr/include/... +
    +
    +
  8. +
  9. Compile the module. + +

    + +

    +
    +
    # make static +
    +
    You should see output similar to the following: + +

    + +

    +
    +
    gcc -fpic -O2 -Wall -I/usr/local/Apache/include -I/usr/include/mysql -I/usr/lo... +
    +
    You should see no errors and have a new file called "mod_log_sql.o" +in your directory. + +

    +

  10. +
  11. Install the module. + +

    + +

    +
    +
    # make statinstall +
    +
    +
  12. +
  13. Change to your Apache source directory. + +

    + +

    +
    +
    # cd /usr/local/src/apache-1.3.22/src +
    +
    +
  14. +
  15. Re-compile your httpd binary as follows. + +

    + +

      +
    1. Make these changes to Configuration.apaci: + +

      + +

        +
      • Append the following string to the EXTRA_LIBS= line. ("/usr/lib/mysql" +is from step 3, and is where your MySQL libraries +live): +
      • +
      + +
      +
      +
      -L/usr/lib/mysql -lmysqlclient -lm -lz +
      +
      +
        +
      • Find the mod_log_config.o line, and insert this line immediately +after it: +
      • +
      + +
      +
      +
      AddModule modules/sql/mod_log_sql.o +
      +
      +
    2. +
    3. # cp Configuration.apaci Configuration +
    4. +
    5. # ./Configure +
    6. +
    7. # make +
    8. +
    9. # strip httpd +
    10. +
    +
  16. +
  17. Test your new apache binary: + +

    + +

    +
    +
    # ./httpd -l +
    +
    You should see something like: + +

    + +

    +
    +
    Compiled-in modules:  + +

    +http_core.c + +

    +mod_log_sql.c       <- That's the line you're looking for. + +

    +mod_env.c  + +

    +mod_log_config.c  + +

    +mod_mime.c  + +

    +mod_negotiation.c + +

    +etc... +

    +
    +
  18. +
  19. Install your httpd binary. Copy it over your old httpd binary, wherever +it lives. You can and should rename your old httpd first so that you +can easily revert to that working version in case of bugs with the +new version. + +

    + +

    +
    +
    # /etc/rc.d/init.d/httpd stop  + +

    +# mv /usr/local/Apache/bin/httpd ~/httpd-save  + +

    +# cp -f ./httpd /usr/local/Apache/bin/ +

    +
    +
  20. +
+ +

+


+ + +next + +up + +previous + +contents +
+ Next: 3 Configuration + Up: Installing and Running mod_log_sql + Previous: 1 Introduction +   Contents + +
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/node4.html b/docs/node4.html new file mode 100644 index 0000000..287333c --- /dev/null +++ b/docs/node4.html @@ -0,0 +1,2014 @@ + + + + + +3 Configuration + + + + + + + + + + + + + + + + + + + + +next + +up + +previous + +contents +
+ Next: 4 FAQ + Up: Installing and Running mod_log_sql + Previous: 2 Installation +   Contents +
+
+ + +Subsections + + + +
+ +

+
+3 Configuration +

+ +

+ +

+
+3.1 Preparing MySQL for logging +

+ +

+You have to prepare the database to receive data from mod_log_sql, +and set up run-time directives in httpd.conf to control how and what +mod_log_sql logs. + +

+This section will discuss how to get started with a basic config. +Full documentation of all available run-time directives is available +in section 3.6. + +

+ +

    +
  1. mod_log_sql can make its own tables on-the-fly, or you can pre-make +the tables by hand. The advantage of letting the module make the tables +is ease-of-use, but for raw performance you will want to pre-make +the tables in order to save some overhead. In this basic setup we'll +just let the module create tables for us. +
  2. +
  3. We still need to have a logging database created and ready, so run +the MySQL command line client and create a database: + +

    + +

    +
    +
    # mysql -uadmin -pmypassword  + +

    +Enter password: + +

    +mysql> create database apachelogs; +

    +
    +
  4. +
  5. If you want to hand-create the tables, run the +enclosed 'create-tables' SQL script as follows: + +

    + +

    +
    +
    mysql> source create_tables.sql +
    +
    +
  6. +
  7. Create a specific MySQL userid that httpd will use to authenticate +and enter data. This userid need not be an actual Unix user. It is +a userid internal to MySQL with specific privileges. In the following +example command, "apachelogs" is the database, "loguser" +is the userid to create, "my.apachemachine.com" +is the name of the Apache machine, and "l0gger" +is the password to assign. Choose values that are different from these +examples. + +

    + +

    +
    +
    mysql> grant insert,create on apachelogs.* to loguser@my.apachemachine.com + +

    +identified by 'l0gger'; +

    +
    +
  8. +
  9. You may be especially security-paranoid and want "loguser" +to not have "create" capability within the +"apachelogs" database. You can disable that privilege, +but the cost is that you will not be able to use the module's on-the-fly +table creation feature. If that cost is acceptable, hand-create the +tables as described in step 3 and use the following +GRANT statement instead of the one above: + +

    + +

    +
    +
    mysql> grant insert on apachelogs.* to loguser@my.apachemachine.com + +

    +identified by 'l0gger'; +

    +
    +
  10. +
  11. Enable full logging of your MySQL daemon (at least +temporarily for debugging purposes) if you don't do this already. +Edit /etc/my.cnf and add the following line to your [mysqld] section: + +

    + +

    +
    +
    log=/var/log/mysql-messages +
    +
    Then restart MySQL. + +

    + +

    +
    +
    # /etc/rc.d/init.d/mysql restart +
    +
    +
  12. +
+ +

+ +

+3.2 A very basic logging setup in Apache +

+ +

+ +

    +
  1. Tell the module what database to use and the appropriate authentication +information. + +

    +So, edit httpd.conf and insert the following lines somewhere after +any LoadModule / AddModule statements. Make sure these statements +are ``global,'' i.e. not inside any VirtualHost stanza. You will +also note that you are embedding a password in the file. Therefore +you are advised to ``chmod 660 httpd.conf'' to prevent unauthorized +regular users from viewing your database user and password. + +

    +Example: Use the MySQL database called "apachelogs" +running on "dbmachine.foo.com". Use username "loguser" +and password "l0gg3r" to authenticate to the database. +Permit the module create tables for us. + +

    + +

    +
    +
    LogSQLLoginInfo dbmachine.foo.com loguser l0gg3r  + +

    +LogSQLDatabase apachelogs + +

    +LogSQLCreateTables on +

    +
    If your database resides on localhost instead of another host, specify +the MySQL server's socket file as follows: + +

    + +

    +
    +
    LogSQLSocketFile /your/path/to/mysql.sock +
    +
    If your database is listening on a port other than 3306, specify the +correct TCP port as follows: + +

    + +

    +
    +
    LogSQLTCPPort 1234 +
    +
    +
  2. +
  3. The actual logging is set up on a virtual-host-by-host basis. So, +skip down to the virtual host you want to set up. Instruct this virtual +host to log entries to the table ``access_log'' by inserting +a LOGSQLTRANSFERLOGTABLE directive. (The LOGSQLTRANSFERLOGTABLE +directive is the minimum required to log - other directives that +you'll learn about later simply tune the module's behavior.) + +

    + +

    +
    +
    <VirtualHost 1.2.3.4> + +

    + [snip] + +

    + LogSQLTransferLogTable access_log + +

    + [snip] + +

    +</VirtualHost> +

    +
    +
  4. +
  5. Restart apache. + +

    + +

    +
    +
    # /etc/rc.d/init.d/httpd stop + +

    +# /etc/rc.d/init.d/httpd start +

    +
    +
  6. +
+ +

+ +

+3.3 Testing the basic setup +

+ +

+ +

    +
  1. Visit your web site in a browser to trigger some hits, then confirm +that the entries are being successfully logged: + +

    + +

    +
    +
    # mysql -hdbmachine.foo.com -umysqladmin -p -e "select * from access_log" apachelogs  + +

    +Enter password: +

    +
    Several lines of output should follow, corresponding to your hits +on the site. You now have basic functionality. Don't disable your +regular Apache logs until you feel comfortable that the database is +behaving as you'd like and that things are going well. If you do not +see any entries in the access_log, please consult section 4.2.2 +of the FAQ on how to debug and fix the situation. + +

    +

  2. +
  3. You can now activate the advanced features of mod_log_sql, which +are described in the next section. +
  4. +
+ +

+ +

+3.4 How to tune logging with run-time directives +

+ +

+ +

+3.4.1 Instructing the module what to log +

+ +

+The most basic directive for the module is LOGSQLTRANSFERLOGFORMAT, +which tells the module which information to send to the database; +logging to the database will not take place without it. Place a LOGSQLTRANSFERLOGFORMAT +directive in the VirtualHost stanza of each virtual host that you +want to activate. + +

+After LOGSQLTRANSFERLOGFORMAT you supply a string of characters +that tell the module what information to log. In the configuration +directive reference (section 3.6.17) there is a table which +clearly defines all the possible things to log. Let's say you want +to log only the ``request time,'' the ``remote host,'' and +the ``request''; you'd use: + +

+ +

+
+
LogSQLTransferLogFormat hUS +
+
But a more appropriate string to use is + +

+ +

+
+
LogSQLTransferLogFormat AbHhmRSsTUuv +
+
which logs all the information required to be compatible with the +Combined Log Format (CLF). + +

+If you don't choose to log everything that is available, that's fine. +Fields in the unused columns in your table will simply contain NULL. + +

+Some of the LOGSQLTRANSFERLOGFORMAT characters require a +little extra configuration: + +

+ +

+ +

+ +

+
+3.4.2 Instructing the module what NOT to log using filtering +directives +

+ +

+One ``accept'' and two ``ignore'' directives allow you to +fine-tune what the module should not log. These are very handy for +keeping your database as uncluttered as possible and keeping your +statistics free of unneeded numbers. Think of each one as a gatekeeper. + +

+It is important to remember that each of these three directives +is purely optional. mod_log_sql's default is to log everything. + +

+When a request comes in, the contents of LOGSQLREQUESTACCEPT +are evaluated first. This optional, ``blanket'' directive lets +you specify that only certain things are to be accepted for logging, +and everything else discarded. Because it is evaluated before LOGSQLREQUESTIGNORE +and LOGSQLREMHOSTIGNORE it can halt logging before those +two filtering directives ``get their chance.'' + +

+Once a request makes it past LOGSQLREQUESTACCEPT, it still +can be excluded based on LOGSQLREMHOSTIGNORE and LOGSQLREQUESTIGNORE. +A good way to use LOGSQLREMHOSTIGNORE is to prevent the module +from logging the traffic that your internal hosts generate. LOGSQLREQUESTIGNORE +is great for preventing things like requests for ``favicon.ico'' +from cluttering up your database, as well as excluding the various +requests that worms make, etc. + +

+You can specify a series of strings after each directive. Do not use +any type of globbing or regular-expression syntax - each string is +considered a match if it is a substring of the larger request +or remote-host; the comarison is case-sensitive. This means that +``LOGSQLREMHOSTIGNORE micro'' will ignore requests from +``microsoft.com,'' ``microworld.net,'' ``mymicroscope.org,'' +etc. ``LOGSQLREQUESTIGNORE gif'' will instruct the module +to ignore requests for ``leftbar.gif,'' ``bluedot.gif'' and +even ``giftwrap.jpg'' - but ``RED.GIF'' and ``Tree.Gif'' +would still get logged because of case sensitivity. + +

+A summary of the decision flow: + +

+ +

    +
  1. If LOGSQLREQUESTACCEPT exists and a request does not match +anything in that list, it is discarded. +
  2. +
  3. If a request matches anything in the LOGSQLREQUESTIGNORE +list, it is discarded. +
  4. +
  5. If a reqiest matches anything in the LOGSQLREMHOSTIGNORE +list, it is discarded. +
  6. +
  7. Otherwise the request is logged. +
  8. +
+This means that you can have a series of directives similar to the +following: + +

+ +

+
+
LogSQLRequestAccept *.html *.gif *.jpg + +

+LogSQLRequestIgnore statistics.html bluedot.jpg +

+
So the first line instructs the module to only log files +with html, gif and jpg suffixes; requests for ``formail.cgi'' +and ``shopping-cart.pl'' will never be considered for logging. +(``LeftArrow.JPG'' will also never be considered for logging - +remember, the comparison is case sensitive.) The second line +prunes the list further - you never want to log requests for those +two objects. + +

+Tip: if you want to match all the hosts in your domain such as ``host1.corp.foo.com'' +and ``server.dmz.foo.com'', simply specify: + +

+ +

+
+
LogSQLRemhostIgnore foo.com +
+
Tip: a great way to catch the vast majority of worm-attack requests +and prevent them from being logged is to specify: + +

+ +

+
+
LogSQLRequestIgnore root.exe cmd.exe default.ida +
+
Tip: to prevent the logging of requests for common graphic types, +make sure to put a '.' before the suffix to avoid matches that you +didn't intend: + +

+ +

+
+
LogSQLRequestIgnore .gif .jpg +
+
+

+ +

+3.5 Advanced logging scenarios +

+ +

+ +

+3.5.1 Using the module in an ISP environment +

+ +

+mod_log_sql has three basic tiers of operation: + +

+ +

    +
  1. The administrator creates all necessary tables by hand and configures +each Apache VirtualHost by hand. (LOGSQLCREATETABLES OFF) +
  2. +
  3. The module is permitted to create necessary tables on-the-fly, but +the administrator configures each Apache VirtualHost by hand. (LOGSQLCREATETABLES +ON) +
  4. +
  5. The module is permitted to create all necessary tables and to make +intelligent, on-the-fly configuration of each VirtualHost. (LOGSQLMASSVIRTUALHOSTING +ON) +
  6. +
+Many users are happy to use the module in its most minimal form: they +hand-create any necessary tables (using ``create_tables.sql''), +and they configure each VirtualHost by hand to suit their needs. However, +some administrators need extra features due to a large and growing +number of VirtualHosts. The LOGSQLMASSVIRTUALHOSTING directive +activates module capabilities that make it far easier to manage an +ISP environment, or any situation characterized by a large and varying +number of virtual servers: + +

+ +

+There are numerous benefits. The admin will not need to create new +tables for every new VirtualHost. (Although the admin will still need +to drop the tables of virtual hosts that are removed.) The admin will +not need to set LOGSQLTRANSFERLOGTABLE for each virtual host +- it will be configured automatically based on the host's name. Because +each virtual host will log to its own segregated table, data about +one virtual server will segregate from others; an admin can grant +users access to the tables they need, and they will be unable to view +data about another user's virtual host. + +

+In an ISP scenario the admin is likely to have a cluster of many front-end +webservers logging to a back-end database. mod_log_sql has a feature +that permits analysis of how well the web servers are loadbalancing: +the LOGSQLMACHINEID directive. The administrator uses this +directive to assign a unique identifier to each machine in the web +cluster, e.g. ``LOGSQLMACHINEID web01,'' ``LOGSQLMACHINEID +web02,'' etc. Used in conjunction with the 'M' character in LOGSQLTRANSFERLOGFORMAT, +each entry in the SQL log will include the machine ID of the machine +that created the entry. This permits the administrator to count the +entries made by each particular machine and thereby analyze the front-end +loadbalancing algorithm. + +

+ +

+
+3.5.2 Logging many-to-one data in separate tables +

+ +

+A given HTTP request can have a one-to-many relationship with certain +kinds of data. For example, a single HTTP request can have 4 cookies, +3 headers and 5 ``mod_gzip'' notes associated with it. mod_log_sql +is capable of logging these relationships due to the elegance of SQL +relational data. + +

+You already have a single table containing access requests. One of +the columns in that table is 'id' which is intended to contain the +unique request ID supplied by the standard Apache module mod_unique_id +- all you need to do is compile in that module and employ the LOGSQLTRANSFERLOGFORMAT +character 'I'. Thereafter, each request gets a unique ID that can +be thought of as a primary key within the database, useful for joining +multiple tables. So let's envision several new tables: a notes table, +a cookies table, and a table for inbound and outbound headers. + +

+

+
+ +

+ +

+

+ + + + + + + + + + + + + + + + + +
Table 1: +access_log
idremote_hostrequest_uritime_stampstatusbytes_sent
PPIDskBRH30AAGPtAsgzerberus.aiacs.net/mod_log_sql/index.html10224936172002215
+
+
+
+

+
+ +

+ +

+

+ + + + + + + + + + + + + + + +
Table 2: +notes_log
iditemval
PPIDskBRH30AAGPtAsgmod_gzip_resultOK
PPIDskBRH30AAGPtAsgmod_gzip_compression_ratio69
+
+
+
+ +

+

+
+ +

+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + +
Table 3: +headers_log
iditemval
PPIDskBRH30AAGPtAsgContent-Typetext/html
PPIDskBRH30AAGPtAsgAccept-Encodinggzip, deflate
PPIDskBRH30AAGPtAsgExpiresTue, 28 May 2002 10:00:18 GMT
PPIDskBRH30AAGPtAsgCache-Controlmax-age=86400
+
+
+
+ +

+We have a certain request, and its unique ID is ``PPIDskBRH30AAGPtAsg''. +Within each separate table will be multiple entries with that request +ID: several cookie entries, several header entries, etc. As you can +see in tables 1, 2 and 3, you +have a one-to-many relationship for request PPIDskBRH30AAGPtAsg: that +one access has two associated notes and four associated headers. You +can extract this data easily using the power of SQL's ``select'' +statement and table joins. To see the notes associated with a particular +request: + +

+ +

+
+
select a.remote_host, a.request_uri, n.item, n.val from access_log a, notes_log n + +

+where a.id=n.id and a.id='PPIDskBRH30AAGPtAsg'; + +

+

+
+
+ + + + + + + + + + + + + + + + +
remote_hostrequest_uriitemval
zerberus.aiacs.net/mod_log_sql/index.htmlmod_gzip_resultOK
zerberus.aiacs.net/mod_log_sql/index.htmlmod_gzip_compression_ratio69
+
+ +

+ +

+
+

+

+
Naturally you can craft similar statements for the outboud headers, +inbound headers and cookies, all of which can live in separate tables. +Your statements are limited in power only by your skill with SQL. + +

+In order to use this capability of mod_log_sql, you must do several +things: + +

+ +

+ +
+
+
<VirtualHost 216.231.36.128> + +

+  (snip) + +

+  LogSQLNotesLogTable notestable + +

+  LogSQLWhichCookies bluecookie redcookie greencookie  + +

+  LogSQLWhichNotes mod_gzip_result mod_gzip_compression_ratio + +

+  LogSQLWhichHeadersOut Expires Content-Type Cache-Control  + +

+  LogSQLWhichHeadersIn UserAgent Accept-Encoding Host + +

+  (snip) + +

+</VirtualHost> +

+
+

+ +

+3.5.3 Using the same database for production and test +

+ +

+Although suboptimal, it is not uncommon to use the same backend database +for the ``production'' webservers as well as the ``test'' +webservers (budgetary constraints, rackspace limits, etc.). Furthermore, +an administrator in this situation may be unable to use LOGSQLREMHOSTIGNORE +to exclude requests from the test servers - perhaps the generated +entries are genuinely useful for analytical or QA purposes, but their +value after analysis is minimal. + +

+It is wasteful and potentially confusing to permit this internal test +data to clutter the database, and a solution to the problem is the +proper use of the LOGSQLMACHINEID directive. Assume a scenario +where the production webservers have IDs like ``web01,'' ``web02,'' +and so on - and the test webservers have IDs like ``test01,'' +``test02,'' etc. Because entries in the log database are distinguished +by their source machine, an administrator may purge unneeded test +data from the access log as follows: + +

+ +

+
+
delete from access_log where machine_id like 'test%'; +
+
+

+ +

+
+3.5.4 Optimizing for a busy database +

+ +

+A busy MySQL database will have SELECT statements running concurrently +with INSERT and UPDATE statements. A long-running SELECT can in certain +circumstances block INSERTs and therefore block mod_log_sql. A workaround +is to compile mod_log_sql for ``delayed inserts,'' which are +described as follows in the MySQL documentation: + +

+

+The DELAYED option for the INSERT statement is a MySQL-specific option +that is very useful if you have clients that can't wait for the INSERT +to complete. This is a common problem when you use MySQL for logging +and you also periodically run SELECT and UPDATE statements that take +a long time to complete. DELAYED was introduced in MySQL Version 3.22.15. +It is a MySQL extension to ANSI SQL92. +
+

+

INSERT DELAYED only works with ISAM and MyISAM tables. Note that as +MyISAM tables supports concurrent SELECT and INSERT, if there is no +free blocks in the middle of the data file, you very seldom need to +use INSERT DELAYED with MyISAM. +
+

+

When you use INSERT DELAYED, the client will get an OK at once and +the row will be inserted when the table is not in use by any other +thread. +
+

+

Another major benefit of using INSERT DELAYED is that inserts from +many clients are bundled together and written in one block. This is +much faster than doing many separate inserts. + +
+The general disadvantages of delayed inserts are: + +

+ +

    +
  1. The queued rows are only stored in memory until they are inserted +into the table. If mysqld dies unexpectedly, any queued rows that +weren't written to disk are lost. +
  2. +
  3. There is additional overhead for the server to handle a separate thread +for each table on which you use INSERT DELAYED. +
  4. +
+The MySQL documentation concludes, ``This means that you +should only use INSERT DELAYED when you are really sure you need it!'' +Furthermore, the current state of error return from a failed INSERT +DELAYED seems to be in flux, and may behave in unpredictable ways +between different MySQL versions. See section 4.3.4 +in the FAQ - you have been warned. + +

+If you are experiencing issues which could be solved by delayed inserts, +uncomment the #MYSQLDELAYED line in the Makefile by removing the +# that is in front of it. Recompile and reinstall your module. All +regular INSERT statements are now INSERT DELAYED, and you should see +no more blocking of the module. + +

+ +

+
+3.6 Configuration directive reference +

+ +

+It is imperative that you understand which directives are used only +once in the main server config, and which are used inside VirtualHost +stanzas and therefore multiple times within httpd.conf. The ``context'' +listed with each entry informs you of this. + +

+ +

+3.6.1 LogSQLCookieLogTable +

+ +

+ +

+
+
Syntax: LogSQLCookieLogTable table-name  + +

+Example: LogSQLCookieLogTable cookie_log + +

+Default: cookies + +

+Context: virtual host +

+
Defines which table is used for logging of cookies. Working in conjunction +with LOGSQLWHICHCOOKIES, you can log many of each request's +associated cookies to a separate table. For meaningful data retrieval +the cookie table is keyed to the access table by the unique request +ID supplied by the standard Apache module mod_unique_id. + +

+Note that you must create the table (see create-tables.sql, included +in the package), or LOGSQLCREATETABLES must be set to ``on''. + +

+ +

+3.6.2 LogSQLCreateTables +

+ +

+ +

+
+
Syntax: LogSQLCreateTables flag + +

+Example: LogSQLCreateTables On  + +

+Default: Off  + +

+Context: main server config +

+
mod_log_sql has the ability to create its tables on-the-fly. The +advantage to this is convenience: you don't have to execute any SQL +by hand to prepare the table. This is especially helpful for people +with lots of virtual hosts (who should also see the LOGSQLMASSVIRTUALHOSTING +directive). + +

+There is a slight disadvantage: if you wish to activate this feature, +then the userid specified in LOGSQLLOGININFO must have CREATE +privileges on the database. In an absolutely paranoid, locked-down +situation you may only want to grant your mod_log_sql user INSERT +privileges on the database; in that situation you are unable to take +advantage of LOGSQLCREATETABLES. But most people - even +the very security-conscious - will find that granting CREATE on the +logging database is reasonable. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.3 LogSQLDatabase +

+ +

+ +

+
+
MANDATORY + +

+Syntax: LogSQLDatabase database  + +

+Example: LogSQLDatabase loggingdb  + +

+Context: main server config +

+
Defines the database that is used for logging. ``database'' must +be a valid db on the MySQL host defined in LOGSQLLOGININFO. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.4 LogSQLForcePreserve +

+ +

+ +

+
+
Syntax: LogSQLForcePreserve Flag + +

+Example: LogSQLPreserveFile on + +

+Default: off + +

+Context: main server config +

+
You may need to perform debugging on your database and specifically +want mod_log_sql to make no attempts to log to it. This directive +instructs the module to send all its log entries directly to the preserve +file and to make no database INSERT attempts. + +

+This is presumably a directive for temporary use only; it could be +dangerous if you set it and forget it, as all your entries will simply +pile up in the preserve file. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.5 LogSQLHeadersInLogTable +

+ +

+ +

+
+
Syntax: LogSQLHeadersInLogTable table-name  + +

+Example: LogSQLHeadersInLogTable headers + +

+Default: headers_in + +

+Context: virtual host +

+
Defines which table is used for logging of inbound headers. Working +in conjunction with LOGSQLWHICHHEADERSIN, you can log many +of each request's associated headers to a separate table. For meaningful +data retrieval the headers table is keyed to the access table by the +unique request ID supplied by the standard Apache module mod_unique_id. + +

+Note that you must create the table (see create-tables.sql, included +in the package), or LOGSQLCREATETABLES must be set to ``on''. + +

+ +

+3.6.6 LogSQLHeadersOutLogTable +

+ +

+ +

+
+
Syntax: LogSQLHeadersOutLogTable table-name  + +

+Example: LogSQLHeadersOutLogTable headers + +

+Default: headers_out + +

+Context: virtual host +

+
Defines which table is used for logging of outbound headers. Working +in conjunction with LOGSQLWHICHHEADERSOUT, you can log many +of each request's associated headers to a separate table. For meaningful +data retrieval the headers table is keyed to the access table by the +unique request ID supplied by the standard Apache module mod_unique_id. + +

+Note that you must create the table (see create-tables.sql, included +in the package), or LOGSQLCREATETABLES must be set to ``on''. + +

+ +

+3.6.7 LogSQLLoginInfo +

+ +

+ +

+
+
MANDATORY  + +

+Syntax: LogSQLLoginInfo host user password + +

+Example: LogSQLLoginInfo foobar.baz.com logwriter passw0rd  + +

+Context: main server config +

+
Defines the general parameters of the MySQL host to which you will +be logging. ``host'' is the hostname or IP address of the MySQL +machine, and is simply ``localhost'' if the database lives on +the same machine as Apache. ``user'' is the MySQL userid (not +a Unix userid!) with INSERT privileges on the table defined in LOGSQLTRANSFERLOGTABLE. +``password'' is that user's password. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.8 LogSQLMachineID +

+ +

+ +

+
+
Syntax: LogSQLMachineID somename + +

+Example: LogSQLMachineID web01 + +

+Context: main server config +

+
If you have a farm of webservers then you may wish to know which particular +machine made each entry; this is useful for analyzing your loadbalancing +methodology. LOGSQLMACHINEID permits you to distinguish each +machine's entries if you assign each machine its own LOGSQLMACHINEID: +for example, the first webserver gets ``LOGSQLMACHINEID +web01,'' the second gets ``LOGSQLMACHINEID web02,'' +etc. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.9 LogSQLMassVirtualHosting +

+ +

+ +

+
+
Syntax: LogSQLMassVirtualHosting flag  + +

+Example: LogSQLMassVirtualHosting On  + +

+Default: Off  + +

+Context: main server config +

+
If you administer a site hosting many, many virtual hosts then this +option will appeal to you. If you turn on LOGSQLMASSVIRTUALHOSTING +then several things happen: + +

+ +

+This is a huge boost in convenience for sites with many virtual servers. +Activating LOGSQLMASSVIRTUALHOSTING obviates the need to +create every virtual server's table and provides more granular security +possibilities. + +

+You are advised to investigate the use of Apache's USECANONICALNAME +ON directive with this directive in order to ensure that each virtual +host maps to one table namespace. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.10 LogSQLNotesLogTable +

+ +

+ +

+
+
Syntax: LogSQLNotesLogTable table-name  + +

+Example: LogSQLNotesLogTable notes_log + +

+Default: notes + +

+Context: virtual host  +

+
Defines which table is used for logging of notes. Working in conjunction +with LOGSQLWHICHNOTES, you can log many of each request's +associated notes to a separate table. For meaningful data retrieval +the notes table is keyed to the access table by the unique request +ID supplied by the standard Apache module mod_unique_id. + +

+Note that you must create the table (see create-tables.sql, included +in the package), or LOGSQLCREATETABLES must be set to ``on''. + +

+ +

+3.6.11 LogSQLPreserveFile +

+ +

+ +

+
+
Syntax: LogSQLPreserveFile filename  + +

+Example: LogSQLPreserveFile offline-preserve  + +

+Default: /tmp/sql-preserve + +

+Context: virtual host +

+
mod_log_sql writes queries to this local preserve file in the event +that it cannot reach the database, and thus ensures that your high-availability +web frontend does not lose logs during a temporary database outage. +This could happen for a number of reasons: the database goes offline, +the network breaks, etc. You will not lose entries since the module +has this backup. The file consists of a series of SQL statements that +can be imported into your database at your convenience; furthermore, +because the SQL queries contain the access timestamps you do not need +to worry about out-of-order data after the import, which is done in +a simple manner: + +

+ +

+
+
# mysql -uadminuser -p mydbname < /tmp/sql-preserve +
+
If you do not define LOGSQLPRESERVEFILE then all virtual +servers will log to the same default preserve file (/tmp/sql-preserve). +You can redefine this on a virtual-host basis in order to segregate +your preserve files if you desire. Note that segregation is not usually +necessary, as the SQL statements that are written to the preserve +file already distinguish between different virtual hosts if you include +the 'v' character in your LOGSQLTRANSFERLOGFORMAT directive. +It is only necessary to segregate preserve-files by virualhost if +you also segregate access logs by virtualhost. + +

+The module will log to Apache's ERRORLOG when it notices +a database outage, and upon database return. You will therefore know +when the preserve file is being used, although it is your responsibility +to import the file. + +

+The file does not need to be created in advance. It is safe to remove +or rename the file without interrupting Apache, as the module closes +the filehandle immediately after completing the write. The file is +created with the user & group ID of the running Apache process (e.g. +'nobody' on many Linux distributions). + +

+ +

+3.6.12 LogSQLRemhostIgnore +

+ +

+ +

+
+
Syntax: LogSQLRemhostIgnore host1 host2 host3 ... hostN  + +

+Example: LogSQLRemhostIgnore localnet.com  + +

+Context: virtual host +

+
Lists a series of strings that, if present in the REMOTE_HOST, will +cause that request to not be logged. This directive is useful +for cutting down on log clutter when you are certain that you want +to ignore requests from certain hosts, such as your own internal network +machines. See section 3.4.2 for some tips for using this +directive. + +

+Each string is separated by a space, and no regular expressions or +globbing are allowed. Each string is evaluated as a substring of the +REMOTE_HOST using strstr(). The comparison is case sensitive. + +

+ +

+3.6.13 LogSQLRequestAccept +

+ +

+ +

+
+
Syntax: LogSQLRequestAccept req1 req2 req3 ... reqN  + +

+Example: LogSQLRequestAccept .html .php .jpg + +

+Default: if not specified, all requests are ``accepted'' + +

+Context: virtual host +

+
Lists a series of strings that, if present in the URI, will permit +that request to be considered for logging (depending on additional +filtering by the ``ignore'' directives). Any request that fails +to match one of the LOGSQLREQUESTACCEPT entries will be discarded. + +

+This directive is useful for cutting down on log clutter when you +are certain that you only want to log certain kinds of requests, and +just blanket-ignore everything else. See section 3.4.2 +for some tips for using this directive. + +

+Each string is separated by a space, and no regular expressions or +globbing are allowed. Each string is evaluated as a substring of the +URI using strstr(). The comparison is case sensitive. + +

+This directive is completely optional. It is more general than LOGSQLREQUESTIGNORE +and is evaluated before LOGSQLREQUESTIGNORE. If +this directive is not used, all requests are accepted and +passed on to the other filtering directives. Therefore, only use this +directive if you have a specific reason to do so. + +

+ +

+3.6.14 LogSQLRequestIgnore +

+ +

+ +

+
+
Syntax: LogSQLRequestIgnore req1 req2 req3 ... reqN  + +

+Example: LogSQLRequestIgnore root.exe cmd.exe default.ida favicon.ico  + +

+Context: virtual host +

+
Lists a series of strings that, if present in the URI, will cause +that request to NOT be logged. This directive is +useful for cutting down on log clutter when you are certain that you +want to ignore requests for certain objects. See section 3.4.2 +for some tips for using this directive. + +

+Each string is separated by a space, and no regular expressions or +globbing are allowed. Each string is evaluated as a substring of the +URI using strstr(). The comparison is case sensitive. + +

+ +

+3.6.15 LogSQLSocketFile +

+ +

+ +

+
+
Syntax: LogSQLSocketFile filename  + +

+Example: LogSQLSocketFile /tmp/mysql.sock  + +

+Default: /var/lib/mysql/mysql.sock  + +

+Context: main server config +

+
At Apache runtime you can specify the MySQL socket file to use. Set +this once in your main server config to override the default value. +This value is irrelevant if your database resides on a separate machine. + +

+mod_log_sql will automatically employ the socket for db communications +if the database resides on the local host. If the db resides on a +separate host the module will automatically use TCP/IP. This is a +function of the MySQL API and is not user-configurable. + +

+This is defined only once in the httpd.conf file. + +

+ +

+3.6.16 LogSQLTCPPort +

+ +

+ +

+
+
Syntax: LogSQLTCPPort portnumber + +

+Example: LogSQLTCPPort 3309 + +

+Default: 3306 + +

+Context: main server config +

+
Your database may listen on a different port than the default. If +so, use this directive to instruct the module which port to use. This +directive only applies if the database is on a different machine connected +via TCP/IP. + +

+This is defined only once in the httpd.conf file. + +

+ +

+
+3.6.17 LogSQLTransferLogFormat +

+ +

+ +

+
+
Syntax: LogSQLTransferLogFormat format-string  + +

+Example: LogSQLTransferLogFormat huSUTv  + +

+Default: AbHhmRSsTUuv  + +

+Context: virtual host +

+
Each character in the format-string defines an attribute of the request +that you wish to log. The default logs the information required to +create Combined Log Format logs, plus several extras. Here is the +full list of allowable keys, which sometimes resemble their Apache +counterparts, but do not always: + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
 What is this?Data fieldColumn typeExample
AUser agentagentvarchar(255)Mozilla/4.0 (compat; MSIE 6.0; Windows)
aCGI request argumentsrequest_argsvarchar(255)user=Smith&cart=1231&item=532
bBytes transferedbytes_sentint unsigned32561
cText of cookie $^{\textrm{1}}$cookievarchar(255)Apache=sdyn.fooonline.net.1300102700823
HHTTP request protocolrequest_protocolvarchar(10)HTTP/1.1
hName of remote hostremote_hostvarchar(50)blah.foobar.com
IRequest ID (from mod_unique_id)idchar(19)POlFcUBRH30AAALdBG8
lIdent user inforemote_lognamevarchar(50)bobby
MMachine ID $^{\textrm{2}}$machine_idvarchar(25)web01
mHTTP request methodrequest_methodvarchar(6)GET
Phttpd child PIDchild_pidsmallint unsigned3215
phttpd portserver_portsmallint unsigned80
RRefererreferervarchar(255)http://www.biglinks4u.com/linkpage.html
rRequest in full formrequest_linevarchar(255)GET /books-cycroad.html HTTP/1.1
STime of request in UNIX formattime_stampint unsigned1005598029
sHTTP status of requeststatussmallint unsigned404
TSeconds to service requestrequest_durationsmallint unsigned2
tTime of request in human formatrequest_timechar(28)[02/Dec/2001:15:01:26 -0800]
URequest in simple formrequest_urivarchar(255)/books-cycroad.html
uUser info from HTTP authremote_uservarchar(50)bobby
vVirtual host servicing the requestvirtual_hostvarchar(50)www.foobar.com
+

+

+
+

+

$^{\textrm{1}}$ You must also specify LOGSQLWHICHCOOKIE +for this to take effect. +
+

+

$^{\textrm{2}}$ You must also specify LOGSQLMACHINEID for +this to take effect. + +
+If you have compiled mod_log_sql with SSL logging capability, you +also can use these: + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + +
 What is this?Data fieldColumn TypeExample
zSSL cipher usedssl_ciphervarchar(25)RC4-MD5
qKeysize of the SSL connectionssl_keysizesmallint unsigned56
QMaximum keysize supportedssl_maxkeysizesmallint unsigned128
+
+ +

+ +

+3.6.18 LogSQLTransferLogTable +

+ +

+ +

+
+
MANDATORY (unless LOGSQLMASSVIRTUALHOSTING is ``on'') + +

+Syntax: LogSQLTransferLogTable table-name  + +

+Example: LogSQLTransferLogTable access_log_table  + +

+Context: virtual host +

+
Defines which table is used for logging of Apache's transfers; this +is analogous to Apache's TransferLog directive. table-name must be +a valid table within the database defined in LOGSQLDATABASE. + +

+This directive is not necessary if you declare LOGSQLMASSVIRTUALHOSTING +ON, since that directive activates dynamically-named tables. If you +attempt to use LOGSQLTRANSFERLOGTABLE at the same time a +warning will be logged and it will be ignored, since LOGSQLMASSVIRTUALHOSTING +takes priority. + +

+ +

+3.6.19 LogSQLWhichCookie +

+ +

+ +

+
+
Syntax: LogSQLWhichCookie cookiename  + +

+Example: LogSQLWhichCookie Clicks + +

+Default: None + +

+Context: virtual host +

+
In HTTP, cookies have names to distinguish them from each other. Using +mod_usertrack, for example, you can give your user-tracking cookies +a name with the CookieName directive. + +

+You must include a 'c' character in LOGSQLTRANSFERLOGFORMAT +for this directive to take effect; once you specify 'c', LOGSQLWHICHCOOKIE +tells mod_log_sql which cookie to log. This is necessary because +you will usually be setting and receiving more than one cookie from +a client; this cookie designates which one to log. + +

+Note: although this was intended for people who are using mod_usertrack +to set user-tracking cookies, you aren't restricted in any way. You +can choose which cookie you wish to log to the database -any cookie +at all - and it doesn't necessarily have to have anything to do with +mod_usertrack. + +

+ +

+3.6.20 LogSQLWhichCookies +

+ +

+ +

+
+
Syntax: LogSQLWhichCookies cookie1 cookie2 ... cookieN + +

+Example: LogSQLWhichCookies userlogin foobar foobaz + +

+Default: None + +

+Context: virtual host +

+
Defines the list of cookies you would like logged. This works in conjunction +with LOGSQLCOOKIELOGTABLE. This directive does not require +any additional characters to be added to the LOGSQLTRANSFERLOGFORMAT +string. The feature is activated simply by including this directive, +upon which you will begin populating the separate cookie table with +data. + +

+Note that you must have already created the table (see create-tables.sql, +included in the package), or LOGSQLCREATETABLES must be set +to ``on''. + +

+ +

+3.6.21 LogSQLWhichHeadersIn +

+ +

+ +

+
+
Syntax: LogSQLWhichHeadersIn item1 item2 ... itemN + +

+Example: LogSQLWhichHeadersIn UserAgent Accept-Encoding Host + +

+Default: None + +

+Context: virtual host +

+
Defines the list of inbound headers you would like logged. This works +in conjunction with LOGSQLHEADERSINLOGTABLE. This directive +does not require any additional characters to be added to the LOGSQLTRANSFERLOGFORMAT +string. The feature is activated simply by including this directive, +upon which you will begin populating the separate inbound-headers +table with data. + +

+Note that you must have already created the table (see create-tables.sql, +included in the package), or LOGSQLCREATETABLES must be set +to ``on''. + +

+ +

+3.6.22 LogSQLWhichHeadersOut +

+ +

+ +

+
+
Syntax: LogSQLWhichHeadersOut item1 item2 ... itemN + +

+Example: LogSQLWhichHeadersOut Expires Content-Type Cache-Control + +

+Default: None + +

+Context: virtual host +

+
Defines the list of outbound headers you would like logged. This works +in conjunction with LOGSQLHEADERSOUTLOGTABLE. This directive +does not require any additional characters to be added to the LOGSQLTRANSFERLOGFORMAT +string. The feature is activated simply by including this directive, +upon which you will begin populating the separate outbound-headers +table with data. + +

+Note that you must have already created the table (see create-tables.sql, +included in the package), or LOGSQLCREATETABLES must be set +to ``on''. + +

+ +

+3.6.23 LogSQLWhichNotes +

+ +

+ +

+
+
Syntax: LogSQLWhichNotes item1 item2 ... itemN + +

+Example: LogSQLWhichNotes mod_gzip_result mod_gzip_compression_ratio + +

+Default: None + +

+Context: virtual host +

+
Defines the list of notes you would like logged. This works in conjunction +with LOGSQLNOTESLOGTABLE. This directive does not require +any additional characters to be added to the LOGSQLTRANSFERLOGFORMAT +string. The feature is activated simply by including this directive, +upon which you will begin populating the separate notes table with +data. + +

+Note that you must have already created the table (see create-tables.sql, +included in the package), or LOGSQLCREATETABLES must be set +to ``on''. + +

+


+ + +next + +up + +previous + +contents +
+ Next: 4 FAQ + Up: Installing and Running mod_log_sql + Previous: 2 Installation +   Contents + +
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/node5.html b/docs/node5.html new file mode 100644 index 0000000..e9a593c --- /dev/null +++ b/docs/node5.html @@ -0,0 +1,1270 @@ + + + + + +4 FAQ + + + + + + + + + + + + + + + + + + + + +next + +up + +previous + +contents +
+ Next: About this document ... + Up: Installing and Running mod_log_sql + Previous: 3 Configuration +   Contents +
+
+ + +Subsections + + + +
+ +

+4 FAQ +

+ +

+ +

+4.1 General module questions +

+ +

+ +

+
+4.1.1 Why log to an SQL database? +

+ +

+To begin with, let's get it out of the way: logging to a database +is not a panacea. But while there are complexities with this solution, +the benefit can be substantial for certain classes of administrator +or people with advanced requirements: + +

+ +

+For example, do you want to see all your 404's? Do this: + +

+ +

+
+
select remote_host,status,request_uri,bytes_sent,from_unixtime(time_stamp) + +

+from acc_log_tbl where status=404 order by time_stamp; + +

+

+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
remote_hoststatusrequest_uribytes_sentfrom_unixtime(time_stamp)
marge.mmm.co.uk404/favicon.ico3212001-11-20 02:30:56
62.180.239.251404/favicon.ico3332001-11-20 02:45:25
212.234.12.66404/favicon.ico3212001-11-20 03:01:00
212.210.78.254404/favicon.ico3332001-11-20 03:26:05
+
+ +

+ +

+
+

+

+
Or do you want to see how many bytes you've sent within a certain +directory or site? Do this: + +

+ +

+
+
select request_uri,sum(bytes_sent) as bytes,count(request_uri) as howmany from + +

+acc_log_tbl where request_uri like '%mod_log_sql%' group by request_uri order + +

+by howmany desc;  + +

+

+
+
+ + + + + + + + + + + + + + + + + + + + + +
request_uribyteshowmany
/mod_log_sql/style_1.css1573961288
/mod_log_sql/2514337801
/mod_log_sql/mod_log_sql.tar.gz9769312456
/mod_log_sql/faq.html5038728436
+
+ +

+ +

+
+

+

+
Or maybe you want to see who's linking to you? Do this: + +

+ +

+
+
select count(referer) as num,referer from acc_log_tbl where + +

+request_uri='/mod_log_sql/' group by referer order by num desc; + +

+

+
+
+ + + + + + + + + + + + + + + + +
numreferer
271http://freshmeat.net/projects/mod_log_sql/
96http://modules.apache.org/search?id=339
48http://freshmeat.net/
8http://freshmeat.net
+
+ +

+ +

+
+

+

+
As you can see, there are myriad possibilities that can be constructed +with the wonderful SQL SELECT statement. Logging to an SQL database +can be really quite useful! + +

+ +

+4.1.2 Why use MySQL? Are there alternatives? +

+ +

+MySQL is a robust, free, and very powerful production-quality database +engine. It is well supported and comes with detailed documentation. +Many 3rd-party software pacakges (e.g. Slashcode, the engine that +powers Slashdot) run exclusively with MySQL. In other words, you will +belong to a very robust and well-supported community by choosing MySQL. + +

+That being said, there are alternatives. PostgreSQL is probably MySQL's +leading "competitor" in the free database world. +There is also an excellent module available for Apache to permit logging +to a PostgreSQL database, called pgLOGd (http://www.digitalstratum.com/pglogd/). + +

+ +

+4.1.3 Is this code production-ready? +

+ +

+By all accounts it is. It is known to work without a problem on many-thousands-of-hits-per-day +webservers. Does that mean it is 100% bug free? Well, no software +is. But it is well-tested and believed to be fully compatible with +production environments. (The usual disclaimers apply. This software +is provided without warranty of any kind.) + +

+ +

+4.1.4 Who's using mod_log_sql? +

+ +

+Good question! It would be great to find out! If you are a production-level +mod_log_sql user, please contact the maintainer, Chris Powell (chris@grubbybaby.com) +so that you can be mentioned here. + +

+ +

+4.1.5 Why doesn't the module also replace the Apache ErrorLog? +

+ +

+There are circumstances when that would be quite unwise - for example, +if Apache could not reach the MySQL server for some reason and needed +to log that fact. Without a text-based error log you'd never know +anything was wrong, because Apache would be trying to log a database +connection error to the database... you get the point. + +

+Error logs are usually not very high-traffic and are really best left +as text files on a web server machine. + +

+ +

+4.1.6 Does mod_log_sql work with Apache 2.x? +

+ +

+As of this writing, no. The Apache Group significantly altered the +module API with the release of Apache 2.0. All modules written for +1.3, including mod_log_sql, will not work with 2.0. + +

+mod_log_sql will eventually be ported to Apache 2.x, but not immediately. +It is going to take some time, and there are other features that have +higher priority. Please sign up for the announcements list (on the +main website) or monitor the website for updates to learn when the +port (and other releases) are available. + +

+<OPINION>If you're a *NIX user, stick with Apache 1.3.x for now. +Major modules like mod_ssl and PHP are not even ready for 2.0 yet, +and the main benefits in 2.0 are for Win32 users anyway. Apache 1.3.x +is rock-stable and performs equally well on *NIX as 2.0.</OPINION> + +

+ +

+4.1.7 Does mod_log_sql connect to MySQL via TCP/IP or a socket? +

+ +

+It depends! This is not determined by mod_log_sql. mod_log_sql +relies on a connection command that is supplied in the MySQL API, +and that command is somewhat intelligent. How it works: + +

+ +

+You don't have any control of which methodology is used. You can fine-tune +some of the configuration, however. The LOGSQLSOCKETFILE +runtime configuration directive overrides the default of ``/var/lib/mysql/mysql.sock'' +for socket-based connections, whereas the LOGSQLTCPPORT command +allows to you override the default TCP port of 3306 for TCP/IP connections. + +

+ +

+4.1.8 I have discovered a bug. Who can I contact? +

+ +

+Please contact the maintainer (chris@grubbybaby.com)! Your +comments, suggestions, bugfixes, bug catches, and usage testimonials +are always welcome. As free software, mod_log_sql is intended to +be a community effort - any code contributions or other ideas will +be fully and openly credited, of course. + +

+ +

+4.2 Problems +

+ +

+ +

+4.2.1 Apache segfaults when using PHP and mod_log_sql +

+ +

+This occurs if you compiled PHP with MySQL database support. PHP utilizes +its internal, bundled MySQL libraries by default. These conflict with +the ``real'' MySQL libraries linked by mod_log_sql, causing +the segmentation fault. + +

+The solution is to configure PHP to link against the real MySQL libraries +and recompile mod_php. Apache will run properly once the modules +are all using the same version of the MySQL libraries. + +

+ +

+
+4.2.2 Apache appears to start up fine, but nothing +is getting logged in the database +

+ +

+If you do not see any entries in the access_log, then something is +preventing the inserts from happening. This could be caused by several +things: + +

+ +

+Important: it is improper to ask for help before you have followed +these steps. + +

+First examine the MySQL log that you established in step 6 +of section 3.1. Ensure that the INSERT statements are +not being rejected because of a malformed table name or other typographical +error. By enabling that log, you instructed MySQL to log every connection +and command it receives - if you see no INSERT attempts in the log, +the module isn't successfully connecting to the database. If you see +nothing at all in the log - not even a record of your administrative +connection attempts, then you did not enable the log correctly. If +you do see INSERT attempts but they are failing, the log should tell +you why. + +

+Second, confirm that your LOGSQL* directives are all correct. + +

+Third, examine the Apache error logs for messages from mod_log_sql; +the module will offer hints as to why it cannot connect, etc. + +

+The next thing to do is recompile the module with debugging output +activated. change the "#undef DEBUG" on line 8 +of mod_log_sql.c to "#define DEBUG" and recompile/reinstall. +The module will now output copious notes about what it is doing, and +this will help you (and the maintainer) solve the problem. In order +to see the debugging messages, ensure that you make them visible using +the LOGLEVEL directive in the main server config +as well as in each VIRTUALHOST config: + +

+ +

+
+
LogLevel debug + +

+ErrorLog /var/log/httpd/server-messages  +

+
+

+ +

+4.2.3 Why do I get the message ``insufficient configuration info to +establish database link'' in my Apache error log? +

+ +

+At a minimum, LOGSQLDATABASE and LOGSQLLOGININFO +must be defined in order for the module to be able to establish a +database link. If these are not defined or are incomplete you will +receive this error message. + +

+ +

+4.2.4 My database cannot handle all the open connections from mod_log_sql, +is there anything I can do? +

+ +

+The rule of thumb: if you have n webservers each configured +to support y MAXCLIENTS, then your database must be +able to handle $n\times y$ simultenous connections in the worst +case. Certainly you must use common sense, consider reasonable traffic +expectations and structure things accordingly. + +

+Tweaking my.cnf to scale to high connection loads is imperative. But +if hardware limitations prevent your MySQL server from gracefully +handling the number of incoming connections, it would be beneficial +to upgrade the memory or CPU on that server in order to handle the +load. + +

+Jeremy Zawodny, a highly respected MySQL user and contributor to Linux +Magazine, has this very helpful and highly appropriate article on +tuning MySQL: http://jeremy.zawodny.com/blog/archives/000173.html + +

+Please remember that mod_log_sql's overriding principle is performance +- that is what the target audience demands and expects. Other database +logging solutions do not open and maintain many database connections, +but their performance suffers drastically. For example, pgLOGd funnels +all log connections through a separate daemon that connects to the +database, but that bottlenecks the entire process. mod_log_sql achieves +performance numbers an order of magnitude greater than the alternatives +because it dispenses with the overhead associated with rapid connection +cycling, and it doesn't attempt to shoehorn all the database traffic +through a single extra daemon or proxy process. + +

+ +

+4.2.5 Why do I occasionally see a ``lost connection to MySQL server'' +message in my Apache error log? +

+ +

+This message may appear every now and then in your Apache error log, +especially on very lightly loaded servers. This doesn't mean that +anything is necessarily wrong. Within each httpd child process, mod_log_sql +will open (and keep open) a connection to the MySQL server. MySQL, +however, will close connections that haven't been used in a while; +the default timeout is 8 hours. When this occurs, mod_log_sql will +notice and re-open the connection. That event is what is being logged, +and looks like this: + +

+ +

+
+
[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: first attempt failed,  + +

+  API said: error 2013, Lost connection to MySQL server during query  + +

+[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: reconnect successful + +

+[Tue Nov 12 19:04:10 2002] [error] mod_log_sql: second attempt successful +

+
Reference: MySQL documentation (http://www.mysql.com/documentation/mysql/bychapter/manual_Problems.html#Gone_away) + +

+ +

+4.3 Performance and Tuning +

+ +

+ +

+4.3.1 How well does it perform? +

+ +

+mod_log_sql scales to very high loads. Apache 1.3.22 + mod_log_sql +was benchmarked using the "ab" (Apache Bench) program +that comes with the Apache distribution; here are the results. + +

+Overall configuration: + +

+ +

+Apache configuration: + +

+ +

+
+
Timeout 300  + +

+KeepAlive On  + +

+MaxKeepAliveRequests 100  + +

+KeepAliveTimeout 15  + +

+MinSpareServers 5  + +

+StartServers 10  + +

+MaxSpareServers 15  + +

+MaxClients 256  + +

+MaxRequestsPerChild 5000  + +

+LogSQLTransferLogFormat AbHhmRSsTUuvc  + +

+LogSQLWhichCookie Clicks  + +

+CookieTracking on  + +

+CookieName Clicks +

+
"ab" commandline: + +

+ +

+
+
./ab -c 10 -t 20 -v 2 -C Clicks=ab_run http://www.hostname.com/target  +
+
( 10 concurrent requests; 20 second test; setting a cookie "Clicks=ab_run"; +target = the mod_log_sql homepage. ) + +

+Ten total ab runs were conducted: five with MySQL logging enabled, +and five with all MySQL directives commented out of httpd.conf. Then +each five were averaged. The results: + +

+ +

+In other words, any rate-limiting effects on this particular hardware +setup are not caused by MySQL. Note that although this very simple +webserver setup is hardly cutting-edge - it is, after all, a fairly +small machine - 139 requests per second equal over twelve million +hits per day. + +

+If you run this benchmark yourself, take note of three things: + +

+ +

    +
  1. Use a target URL that is on your own webserver :-). +
  2. +
  3. Wait until all your connections are closed out between runs; after +several thousand requests your TCP/IP stack will be filled with hundreds +of connections in TIME_WAIT that need to close. Do a "netstat +-t|wc -l" on the webserver to see. If you don't wait, you +can expect to see a lot of messages like "ip_conntrack: +table full, dropping packet" in your logs. (This has nothing +to do with mod_log_sql, this is simply the nature of the TCP/IP +stack in the Linux kernel.) +
  4. +
  5. When done with your runs, clean these many thousands of requests out +of your database: +
  6. +
+ +
+
+
mysql> delete from access_log where agent like 'ApacheBench%';  + +

+mysql> optimize table access_log;  +

+
+

+ +

+4.3.2 Do I need to be worried about all the running MySQL children? Will +holding open n Apache-to-MySQL connections consume a lot of +memory? +

+ +

+Short answer: you shouldn't be worried. + +

+Long answer: you might be evaluating at the output of ``ps -aufxw'' +and becoming alarmed at all the 7MB httpd processes or 22MB mysqld +children that you see. Don't be alarmed. It's true that mod_log_sql +opens and holds open many MySQL connections: each httpd child maintains +one open database connection (and holds it open for performance reasons). +Four webservers, each running 20 Apache children, will hold open 80 +MySQL connections, which means that your MySQL server needs to handle +80 simultaneous connections. In truth, your MySQL server needs to +handle far more than that if traffic to your website spikes and the +Apache webservers spawn off an additional 30 children each... + +

+Fortunately the cost reported by 'ps -aufxw' is deceptive. This is +due to an OS memory-management feature called ``copy-on-write.'' +When you have a number of identical child processes (e.g. Apache, +MySQL), it would appear in ``ps'' as though each one occupies +a great deal of RAM - as much as 7MB per httpd child! In actuality +each additional child only occupies a small bit of extra memory - +most of the memory pages are common to each child and therefore shared +in a ``read-only'' fashion. The OS can get away with this because +the majority of memory pages for one child are identical across all +children. Instead of thinking of each child as a rubber stamp of the +others, think of each child as a basket of links to a common memory +area. + +

+A memory page is only duplicated when it needs to be written to, hence +``copy-on-write.'' The result is efficiency and decreased memory +consumption. ``ps'' may report 7MB per child, but it might really +only ``cost'' 900K of extra memory to add one more child. It is +not correct to assume that 20 Apache +children with a VSZ of 7MB each equals +$(20\times 7MB)$ of memory +consumption - the real answer is much, much lower. The same ``copy-on-write'' +rules apply to all your MySQL children: 40 mysqld children @ 22MB +each do not occupy 880MB of RAM. + +

+The bottom line: although there is a cost to spawn extra httpd or +mysqld children, that cost is not as great as ``ps'' would lead +you to believe. + +

+ +

+4.3.3 My webserver cannot handle all the traffic that my site receives, +is there anything I can do? +

+ +

+If you have exhausted all the tuning possibilities on your existing +server, it is probably time you evaluated the benefits of clustering +two or more webservers together in a load-balanced fashion. In fact, +users of such a setup are mod_log_sql's target audience! + +

+ +

+
+4.3.4 What is the issue with activating delayed +inserts? +

+ +

+There are several. + +

+ +

    +
  1. INSERT DELAYED is a specific syntax to MySQL and is not supported +by any other database. Ergo, why is it needed, and what MySQL deficiency +is it working around? INSERT DELAYED is a kluge. +
  2. +
  3. The MySQL documentation is unclear whether INSERT DELAYED is even +necessary for an optimized database. It says, ``The DELAYED option +for the INSERT statement is a MySQL-specific option that is very useful +if you have clients that can't wait for the INSERT to complete.'' +But then it goes on to say, ``Note that as MyISAM tables supports +concurrent SELECT and INSERT, if there is no free blocks in the middle +of the data file, you very seldom need to use INSERT DELAYED with +MyISAM.'' +
  4. +
  5. Because INSERT DELAYED returns without waiting for the data to be +written, a hard kill of your MySQL database at the right (wrong?) +moment could lose those logfile entries. +
  6. +
  7. As of MySQL version 3.23.52, the error return functions disagree after +a failed INSERT DELAYED: mysql_errno() always returns 0, even if +mysql_error() returns a textual error. I have reported this bug to +the MySQL folks. However, we have no way of knowing what solution +they will adopt to fix this, and with the worst case solution mod_log_sql +would not be able to tell if anything went wrong with a delayed insert. +
  8. +
+Instead of delayed inserts, you may wish to utilize InnoDB tables +(instead of the standard MyISAM tables). InnoDB tables suppot row-level +locking and are recommended for high-volume databases. + +

+If after understanding these problems you still wish to enable delayed +inserts, section 3.5.4 discusses how. + +

+ +

+4.4 ``How do I...?'' - accomplishing certain tasks +

+ +

+ +

+4.4.1 I am using LogSQLMassVirtualHosting, and sometimes a single VirtualHost +gets logged to two different tables. How do I prevent that? +

+ +

+Proper usage of the Apache runtime SERVERNAME directive and +the directive USECANONICALNAME ON (or DNS) are necessary +to prevent this problem. ``On'' is the default for USECANONICALNAME, +and specifies that self-referential URLs are generated from the SERVERNAME +part of your VirtualHost: + +

+

+With UseCanonicalName on (and in all versions prior to 1.3) Apache +will use the ServerName and Port directives to construct the canonical +name for the server. With UseCanonicalName off Apache will form self-referential +URLs using the hostname and port supplied by the client if any are +supplied (otherwise it will use the canonical name, as defined above). +[From the Apache documentation http://httpd.apache.org/docs/mod/core.html#usecanonicalname] + +
+The module inherits Apache's ``knowledge'' about the server name +being accessed. As long as those two directives are properly configured, +mod_log_sql will log to only one table per virtual host while using +LOGSQLMASSVIRTUALHOSTING. + +

+ +

+4.4.2 How do I extract the data in a format that my analysis tool can understand? +

+ +

+mod_log_sql would be virtually useless if there weren't a way for +you to extract the data from your database in a somewhat meaningful +fashion. To that end there's a Perl script enclosed with the distribution. +That script (make_combined_log.pl) is designed to extract N-many +days worth of access logs and provide them in a Combined Log Format +output. You can use this very tool right in /etc/crontab to extract +logs on a regular basis so that your favorite web analysis tool can +read them. Or you can examine the Perl code to construct your own +custom tool. + +

+For example, let's say that you want your web statistics updated once +per day in the wee hours of the morning. A good way to accomplish +that could be the following entries in /etc/crontab: + +

+ +

+
+
# Generate the temporary apache logs from the MySQL database (for webalizer)  + +

+05 04 * * * root make_combined_log.pl 1 www.grubbybaby.com > /var/log/temp01 + +

+# Run webalizer on httpd log  + +

+30 04 * * * root webalizer -c /etc/webalizer.conf; rm -f /var/log/temp01 +

+
Or if you have a newer system that puts files in /etc/cron.daily etc., +create a file called ``webalizer'' in the cron.daily subdirectory. +Use the following as the contents of your file, and make sure to chmod +755 it when done. + +

+ +

+
+
#!/bin/sh + +

+/usr/local/sbin/make_combined_log.pl 1 www.yourdomain.com > /var/log/httpd/templog + +

+/usr/local/bin/webalizer -q -c /etc/webalizer.conf  + +

+rm -f /var/log/httpd/templog +

+
See? Easy. + +

+ +

+
+4.4.3 How can I log mod_usertrack cookies? +

+ +

+A number of people like to log mod_usertrack cookies in their Apache +TransferLog to aid in understanding their visitors' clickstreams. +This is accomplished, for example, with a statement as follows: + +

+ +

+
+
LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"" \"%{cookie}n\"" +
+
Naturally it would be nice for mod_log_sql to permit the admin to +log the cookie data as well, so as of version 1.10 you can do this. +You need to have already compiled mod_usertrack into httpd - it's +one of the standard Apache modules. + +

+First make sure you have a column called "cookie" +in the MySQL database to hold the cookies, which can be done as follows +if you already have a working database: + +

+ +

+
+
alter table acc_log_tbl add column cookie varchar(255); +
+
Next configure your server to set usertracking cookies as follows, +and make sure you include the new 'c' directive in your LOGSQLTRANSFERLOGFORMAT, +which activates cookie logging. Here's an example: + +

+ +

+
+
<VirtualHost 1.2.3.4>  + +

+ CookieTracking on  + +

+ CookieStyle Cookie  + +

+ CookieName Foobar  + +

+ LogSQLTransferLogFormat huSUsbTvRAc  + +

+ LogSQLWhichCookie Foobar  + +

+</VirtualHost> +

+
The first three lines configure mod_usertrack to create a COOKIE +(RFC 2109) format cookie called Foobar. The last two lines tell mod_log_sql +to log cookies named Foobar. You have to choose which cookie to log +because more than one cookie can/will be sent to the server by the +client. + +

+Recap: the 'c' character activates cookie logging, and the +LOGSQLWHICHCOOKIE directive chooses which cookie to +log. + +

+FYI, you are advised NOT to use COOKIESTYLE COOKIE2 - it +seems that even newer browsers (IE 5.5, etc.) have trouble with the +new COOKIE2 (RFC 2965) format. Just stick with the standard COOKIE +format and you'll be fine. + +

+Perform some hits on your server and run a select: + +

+ +

+
+
mysql> select request_uri,cookie from access_log where cookie is not null; + +

+

+
+
+ + + + + + + + + + + + + +
request_uricookie
/mod_log_sql/ool-18e4.dyn.optonline.net.130051007102700823
/mod_log_sql/usa.gifool-18e4.dyn.optonline.net.130051007102700823
/mod_log_sql/style_1.cssool-18e4.dyn.optonline.net.130051007102700823
+
+ +

+ +

+
+

+

+
+

+ +

+4.4.4 What if I want to log more than one cookie? What is the difference +between LogSQLWhichCookie and LogSQLWhichCookies? +

+ +

+As of version 1.17, you have a choice in how you want cookie logging +handled. + +

+If you are interested in logging only one cookie per request, follow +the instructions in section 4.4.3 above. That cookie will +be logged to a column in the regular access_log table, and the actual +cookie you want to log is specified with LOGSQLWHICHCOOKIE. +Don't forget to specify the 'c' character in LOGSQLTRANSFERLOGFORMAT. + +

+If, however, you need to log multiple cookies per request, you must +employ the LOGSQLWHICHCOOKIES (note the plural) directive. +The cookies you specify will be logged to a separate table (as discussed +in section 3.5.2), and entries in that table will be +linked to the regular access_log entries via the unique ID that is +supplied by mod_unique_id. Without mod_unique_id the information +will still be logged but you will be unable to correlate which cookies +go with which access-requests. Furthermore, with LOGSQLWHICHCOOKIES, +you do not need to include the 'c' character in LOGSQLTRANSFERLOGFORMAT. + +

+LOGSQLWHICHCOOKIE and LOGSQLWHICHCOOKIES can coexist +without conflict because they operate on entireley different tables, +but you're better off choosing the one you need. + +

+ +

+4.4.5 What are the SSL logging features, and how do I activate them? +

+ +

+Note: you do not need to compile SSL support into mod_log_sql +in order to simply use it with a secure site. You only need to compile +SSL support into mod_log_sql if you want to log SSL-specific data +such as the cipher type used, or the keysize that was negotiated. +If that information is unimportant to you, you can ignore this FAQ. + +

+By adding certain characters to your LOGSQLTRANSFERLOGFORMAT +string you can tell mod_log_sql to log the SSL cipher, the SSL keysize +of the connection, and the maximum keysize that was available. This +would let you tell, for example, which clients were using only export-grade +security to access your secure software area. + +

+You can compile mod_log_sql with SSL logging support if you have +the right packages installed. If you already have an SSL-enabled Apache +then you by definition have the correct packages already installed: +OpenSSL and mod_ssl. + +

+You need to ensure that your database is set up to log the SSL data. +Issue the following commands to MySQL if your access table does not +already have them: + +

+ +

+
+
alter table access_log add column ssl_cipher varchar(25); + +

+alter table access_log add column ssl_keysize smallint unsigned; + +

+alter table access_log add column ssl_maxkeysize smallint unsigned; +

+
Finally configure httpd.conf to activate the SSL fields. Note that +this is only meaningful in a VirtualHost that is set up for SSL. + +

+ +

+
+
<VirtualHost 1.2.3.4:443>  + +

+ LogSQLTransferLogFormat AbHhmRSsTUuvcQqz  + +

+</VirtualHost> +

+
The last three characters (Qqz) in the directive are the SSL ones; +see section 3.6.17 in the directives documentation for details +of the LOGSQLTRANSFERLOGFORMAT directive. + +

+Restart Apache, then perform some hits on your server. Then run the +following select statement: + +

+ +

+
+
mysql> select remote_host,request_uri,ssl_cipher,ssl_keysize,ssl_maxkeysize + +

+from access_log where ssl_cipher is not null; + +

+

+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
remote_hostrequest_urissl_cipherssl_keysizessl_maxkeysize
216.190.52.4/dir/somefile.htmlRC4-MD5128128
216.190.52.4/dir/somefile.gifRC4-MD5128128
216.190.52.4/dir/somefile.jpgRC4-MD5128128
+
+ +

+ +

+
+
+
+
+

+


+ + +next + +up + +previous + +contents +
+ Next: About this document ... + Up: Installing and Running mod_log_sql + Previous: 3 Configuration +   Contents + +
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/node6.html b/docs/node6.html new file mode 100644 index 0000000..cefac28 --- /dev/null +++ b/docs/node6.html @@ -0,0 +1,74 @@ + + + + + +About this document ... + + + + + + + + + + + + + + + + + +next + +up + +previous + +contents +
+ Up: Installing and Running mod_log_sql + Previous: 4 FAQ +   Contents +
+
+ + +

+About this document ... +

+ Installing and Running mod_log_sql

+This document was generated using the +LaTeX2HTML translator Version 2002-1 (1.68) +

+Copyright © 1993, 1994, 1995, 1996, +Nikos Drakos, +Computer Based Learning Unit, University of Leeds. +
+Copyright © 1997, 1998, 1999, +Ross Moore, +Mathematics Department, Macquarie University, Sydney. +

+The command line arguments were:
+ latex2html -local_icons -show_section_numbers -split 4 -navigation -noindex_in_navigation -contents_in_navigation -dir Documentation/HTML Documentation/documentation.tex +

+The translation was initiated by Chris Powell on 2002-12-18 +


+
+Chris Powell +2002-12-18 +
+ + diff --git a/docs/prev.png b/docs/prev.png new file mode 100644 index 0000000..e60b8b4 Binary files /dev/null and b/docs/prev.png differ diff --git a/docs/prev_g.png b/docs/prev_g.png new file mode 100644 index 0000000..476d956 Binary files /dev/null and b/docs/prev_g.png differ diff --git a/docs/style_1.css b/docs/style_1.css new file mode 100644 index 0000000..f12a3af --- /dev/null +++ b/docs/style_1.css @@ -0,0 +1,33 @@ +body { + margin-left: 3%; + background-color: #ccccFF; + font-family: sans-serif; +} + +h1 { + text-align: center; +} + +h2 { + margin-left: -1%; + font-size: 150%; +} + +h3 { + margin-left: -1%; +} + +h4 { + text-align: center; +} + +p { +} + +pre { + font-family: monospace; +} + +DD { + font-family: monospace; +} \ No newline at end of file diff --git a/docs/up.png b/docs/up.png new file mode 100644 index 0000000..3937e16 Binary files /dev/null and b/docs/up.png differ diff --git a/docs/up_g.png b/docs/up_g.png new file mode 100644 index 0000000..54ceb68 Binary files /dev/null and b/docs/up_g.png differ diff --git a/index.xml b/index.xml new file mode 100644 index 0000000..1db68be --- /dev/null +++ b/index.xml @@ -0,0 +1,260 @@ + + + +
+ +
+

mod_log_sql is a logging module for Apache 1.3 and 2.0 which logs all requests to a database. This began a port of the Apache 1.3 version of the module by Chris Powell, and as of February 6th, 2004 Chris Powell and I have decided to switch maintainer-ship of the module over to me.
+This module now compiles under Apache 1.3 and Apache 2.0 from the same source.

+

The 1.9x versions are to be considered beta quality, as they contain new features + and some major code cleanups. If you are using Apache 1.3 it is recommended that you use + mod_log_sql version 1.18. Only use the 1.9x releases if you need the new features they provide.

+
+
+
+ + + + + + This release adds more documentation of the all the log formats and also adds documentation + of the tabletype DBParam for the mysql driver. Using LogSQLDBParam tabletype ARCHIVE will + set the tabletype for autocreated tables to ARCHIVE (and save TONS of space too) + + + Segfaulting due to not loading a driver module no longer occurs and an error message is + logged to the log file stating that you didn't load the driver module. + + + This release also adds another sub-module that will log incoming and outgoing bytes (logio) + This modules is EXCLUSIVE to the mod_logio module. Meaning if you use the mod_log_sql version + you must NOT load the standard apache version. Currenlty due to how logging of outgoing bytes + is done in the apache core there is NO workaround for this, you either log outgoing bytes + in the text access logs OR mod_log_sql. + + + To actually USE the new logging fields you just need to add the "i" and "o" LogSQLLogFormat options. + Please note that these are lowercase instead of uppercase like they are in mod_logio. + + + + Due to the addition of the logio two new fields are added to the log tables. If you wish to use + logging if IO then you MUST alter your existing tables adding fields bytes_in and bytes out of type int unsigned. + + ALTER TABLE mylogingtable ADD COLUMN bytes_in INT UNSIGNED, ADD COLUMN bytes_out INT UNSIGNED + + + + + + + This release adds a new "V" log format which logs the requested hostname into + virtual_host instead of the ServerName. This is exclusive to "v", Only one or the other + can be used. + + + There are several fixes in the configure system to not error out if an optional library + or file is not found. And there are several fixes to the SQL generation code fixing + escaping, table names, and NULL values. + + + + + + + This release fixes segmentation faults of apache that occured when the preserve file + was used. The errors printed to the error log are also more informative. And several + autoconf detection routines were fixed. + + + It also adds the dbi provider which works for adding SQL content to postgresql and mysql servers, + but does not support the auto creation of tables in either database. + The pgsql driver stub is also included, which is no where near completed. + + + For win32 users, the build scripts are included in the distribution now, read the + changelog for the 1.98 release for more information. If anyone wishes to contribute a + MSVC project file they will be welcomed. + + + + + + + Contains a new configuration directives to control the Preserve file, + LogSQLDisablePreserve. This option completely disables the preserve + file, so if the Database cannot be contacted, those records will be + lost and not logged. Also the LogSQLPreserveFile is now relative, if + it does not begin with a "/", to the ServerRoot and defaults + to logs/mod_log_sql-preserve. It will appear with your log files now. + + + Also in this release is the much requested Win32 support. I have added 2 + batch scripts to compile mod_log_sql for apache 1.3 and apache 2.0. You + will need to edit the files to setup the include and library paths for + compiling. I compiled this with the free Command line compiler provided + by Microsoft, Apache 2.0.49 and 1.3.29, Mysql 4.0, and the MS SDK from + Microsoft's website. I had to obtain the msvcrt.lib file from a MS VS + installation to get the dlls to compile, however. If anybody can + contribute a MSVC 6 project or nmake Makefiles I would greatly + appreciate it. + + + + Unfortunately due to the license that the free compiler from Microsoft + has I can not distribute binaries for Win32. + + + + + + + + This release has several changes so be sure to read the + documentation and Changelog. Changes are a new LogSQLLoginInfo syntax which + better allows configuring mod_log_sql with multiple databases.. basic example. + + mysql://loguser:Mypass@dbhost/apache_log + + Also the mysql code is not in a separate module that must be loaded after + the core mod_log_sql.so, so you will NEED to add this to your httpd.conf + file. + + LoadModule log_sql_mysql_module modules/mod_log_sql_mysql.so + + + + + + This release offers a new configuration directive to configure Database + configuration options. LogSQLDBParam should be used over LogSQLTcpPort, + LogSQLDatabase, and LogSQLSocketFile. as they are deprecated, and will + most likely be removed in future versions. + + + + + +
+ +
+ mod_log_sql 1.18 documentation
+ mod_log_sql 2.0 documentation +
+
+
+ + + Prerequisites + + + + + + + + + + + + + + + + + + + This patch stops configure from erroring out if libdbi is not detected + on the system. + + + This patch fixes loggin request_args(a) and fixes the MySQL reconnect segfault. + + + + + + This patch fixes a compile issues with certain builds of Apache 2 which + generate the "PACKAGE_NAME" redefined error. + + + + + + + + + + + + Historic version 1.18 (Last stable release for Apache 1.3 only) + + + + Repositories + + + + +
+ +
+

Debian

+

+ Thanks to Thomas Goirand, Debian users can add the following repository to + their /etc/apt/sources.list if using stable branch of Debian (mod_log_sql 1.18 only):

+ +

deb ftp://ftp.gplhost.com/debian stable main

+

This repository also includes the following packages: dtc, qmail, ucspi-tcp,mysqmail + and checklocalpasswd.

+

The packages can also be downloaded separately for version 1.18 and 1.9x on the + GPLHost website. +

NOTE: Please direct any issues with these binary releases to Thomas Goirand (links on + the his website), or to the mod_log_sql mailing list.

+

+
+
+

Win32

+

There are no win32 binaries provided by anyone at this time. If you wish to build them + there is a build.bat provided in the source distribution to build with MySQL and Apache 1.3.x and + Apache 2.0.x. However, remember that win32 is an unsupported platform for running mod_log_sql.

+
+
+
+ + + + + There are two mailing lists for mod_log_sql. The first is the generic announcement mailing list which + provides announcements for all software releases on OutOfOrder.cc, but can be filtered by + choosing topics in the mailing list options page. The second is the user mailing list specific to + mod_log_sql only. Release announcements will be cross posted to both lists. + + + + + + + + +
+ Contact & Help + + + E-Mail me, Edward Rudd, about mod_log_sql. + + + Send an e-mail to the mod_log_sql mailing list. + + + Bugs should be reported to the OutOfOrder.cc Bug Tracker. + + +
+
-- cgit