track2sql.php #4

//guest/lester_cheung/track2sql/track2sql.php
Commits
# Change User Description Committed
#4 9732 Lester Cheung Renamed directory "track2sql" to "log_analyzer".
#3 8061 Lester Cheung Swapping NULL characters with slashes instead of spaces, as it turns out
that the problem was caused by a bug in P4V sending version strings
containing NULL chars.
#2 8059 Lester Cheung Replacing NULL characters with spaces.

   Some log files contain NULL characters and track2sql would
   happily convert them into SQL statements. This may cause problems
   when we insert the generated statements into the database. For
   example:

     $ xzcat log.gz | php track2sql.php - - | mysql mydb
     ERROR 1064 (42000) at line 73164: You have an error in your SQL
     syntax; check the manual that corresponds to your MySQL server version
     for the right syntax to use near
     '07d4e0a7-1b46-d94f-662d-a69a1ee2266f','trigger',54,0,53,1,0,0,1,1448,0,0,0,1,0,0'
     at line 1.

   ... which is not very helpful as it's not referring to the SQL
   statement which contains the NULL chars.
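
The NULL-character clean-up described in changes 8059 and 8061 can be sketched as follows. This is a minimal illustration in Python, not the script's actual PHP code; the function name `replace_nul` and the sample string are hypothetical.

```python
def replace_nul(line: str, replacement: str = "/") -> str:
    """Return `line` with every NUL character replaced by `replacement`."""
    return line.replace("\x00", replacement)


# Hypothetical sample text with an embedded NUL character.
sample = "abc\x00def"

with_slash = replace_nul(sample)       # slash, as in change 8061
with_space = replace_nul(sample, " ")  # space, as in change 8059
print(repr(with_slash))
print(repr(with_space))
```

Applying the replacement to each log line before any SQL is generated keeps NUL characters out of the statements that are later piped to the database client.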
#1 8058 Lester Cheung Branching Stewart's track2sql locally
//guest/stewart_lord/track2sql/track2sql.php
#16 7621 Stewart Lord Updated create table statements to use the IF NOT EXISTS
condition. This allows SQL to be fed into an existing database
without errors/warnings.
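
The effect of IF NOT EXISTS can be checked with a small sketch. It uses Python's built-in sqlite3 module purely for illustration; the one-column schema is a hypothetical reduction that borrows only the 'process' table and 'processKey' column names mentioned elsewhere in this history.

```python
import sqlite3

# Hypothetical, cut-down schema: the real track2sql tables have more columns.
CREATE_PROCESS = (
    "CREATE TABLE IF NOT EXISTS process ("
    "processKey VARCHAR(36) NOT NULL)"
)

conn = sqlite3.connect(":memory:")
conn.execute(CREATE_PROCESS)
conn.execute("INSERT INTO process VALUES ('existing-row')")

# Feeding the same statement again does nothing: no error is raised and
# the existing data is left untouched.
conn.execute(CREATE_PROCESS)

row_count = conn.execute("SELECT COUNT(*) FROM process").fetchone()[0]
print(row_count)
```

Without the IF NOT EXISTS condition the second CREATE TABLE would fail because the table already exists.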
#15 7620 Stewart Lord Minor update to Track2SQL to avoid timezone unset issue
and split() function deprecated in PHP 5.3.
#14 7338 Stewart Lord Rolled back change 7209. This removes the endTime column
from the process table. Track2SQL no longer attempts to analyze
'completed' entries.

This change was influenced by three factors:
 - Analyzing 'completed' entries significantly degrades
   performance (about 2.75x slower in my tests).
 - In some versions of PHP (5.2.8) the strtotime() function
   suffers from a memory leak.
 - The 'completed' entries are not part of Vtrack output.
#13 7209 Stewart Lord Integrating an enhancement from Michael Shield's guest
branch. Track2SQL now records the end time of each process
(when it is reported). This information is reported for every
completed process when -vserver=2|3 logging is enabled.

If verbose server logging is enabled this is more reliable than
start 'time' + 'lapse' because (by default) lapse is only reported
when it exceeds a certain threshold. If, however, vtrack=1 is
set then lapse time will be reported for every command.

Note: this change brings a schema change. It adds an
'endTime' column to the process table.
#12 7198 Stewart Lord It is now possible to accumulate output from multiple
invocations of track2sql in a single database. Previously,
the processKey was an incrementing value that always started
at zero. Therefore, processKeys from separate runs of
track2sql would collide if inserted into the same database.

Now, track2sql uses a 36 character universally unique
identifier (UUID) for each process. UUIDs give us reasonable
confidence that the process keys will never collide.

One consequence of this change is that the schema is
slightly different. The type of the processKey column is
now a 36 character varchar instead of an integer.

Another side-effect of this change is that the output of
track2sql cannot be predicted and will always be different
even when processing the same log file.

Track2sql performance is largely unaffected; however, insert
and select performance may degrade somewhat due to the
larger key size.
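
A minimal sketch of UUID-based process keys, written in Python for illustration. The changelog does not say which UUID variant track2sql generates, so the random (version 4) UUID and the function name `new_process_key` are assumptions.

```python
import uuid


def new_process_key() -> str:
    """Return a random UUID in its 36-character text form."""
    return str(uuid.uuid4())


key_a = new_process_key()
key_b = new_process_key()

print(len(key_a))      # 32 hex digits plus 4 hyphens
print(key_a != key_b)  # keys from separate calls differ
```

Because each key is generated randomly, two runs over the same log file produce different keys, which is why the output can no longer be predicted.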
#11 7197 Stewart Lord Fixed a bug where incomplete process lines, such as
'journal rotation', produced invalid SQL output.

Log entries with process lines that fail to pass a basic
sanity check are now ignored. The added check is rather
crude and therefore performance is largely unaffected.
#10 7193 Stewart Lord Fixed a bug where 'pages reordered' data could be
mistaken for 'pages i/o' data, if the 'pages reordered'
line was not preceded by a 'pages i/o' line.
#9 7104 Stewart Lord Follow-on to 7098. The quote() function now wraps and
escapes the given string.
#8 7098 Stewart Lord Fixed a sqlite incompatibility. Backslash is not a valid way
to escape quotes in sqlite. Quote-quote ('') is valid in both
mysql and sqlite.

Replaced use of addslashes with a quote() function that
replaces occurrences of single-quotes with two single quotes.
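
The quoting approach from changes 7098 and 7104 can be sketched like this. It is an illustrative Python version, not the script's PHP quote() function; only the behaviour described above (wrap in single quotes, double any embedded single quotes) is taken from the changelog.

```python
import sqlite3


def quote(value: str) -> str:
    """Wrap `value` in single quotes, doubling any single quotes inside it."""
    return "'" + value.replace("'", "''") + "'"


literal = quote("it's a test")

# Splice the literal into SQL text, as a generated script would, and let
# SQLite parse it back into the original string.
conn = sqlite3.connect(":memory:")
stored = conn.execute("SELECT " + literal).fetchone()[0]

print(literal)
print(stored)
```

SQLite accepts the doubled quote and returns the original string, whereas a backslash-escaped quote would not be parsed as an escape.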
#7 6424 Stewart Lord Updated track2sql disclaimer. Added a link to the
readme file from the script itself.
#6 6379 Stewart Lord Fixed a bug where track2sql failed to properly extract
some table usage data if 'pages reordered' were reported.
#5 6289 Stewart Lord Minor update to track2sql.
 - Added version and usage information. Can be viewed
   with -v, -V or -h.
 - Added error handling for the case of a non-existent
   input file or an empty input file.
 - Removed 'drop table if exists' statements from the
   table creation SQL.
#4 6010 Stewart Lord Fixed a bug where track2sql failed to properly parse
lock times in 2007.2 log files. This was due to a small
change in the log file format.
#3 5889 Stewart Lord Modified create table statements to use signed columns instead
of unsigned columns. This avoids subtraction problems that can
occur in some versions of MySQL when SQL_MODE is not set to
NO_UNSIGNED_SUBTRACTION.

Main() now sets error_reporting to E_ALL & ~E_NOTICE to
suppress notices.
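
The unsigned subtraction problem mentioned in change 5889 can be illustrated with plain integer arithmetic. This Python sketch only models 64-bit wrap-around; the 64-bit width and the function names are assumptions, and the exact behaviour (a wrapped value or an out-of-range error) depends on the MySQL version.

```python
UNSIGNED_64_MODULUS = 2 ** 64  # number of values a 64-bit unsigned integer can hold


def unsigned_subtract(a: int, b: int) -> int:
    """Subtract b from a with 64-bit unsigned wrap-around."""
    return (a - b) % UNSIGNED_64_MODULUS


def signed_subtract(a: int, b: int) -> int:
    """Subtract b from a as ordinary signed integers."""
    return a - b


print(unsigned_subtract(5, 10))  # wraps to a huge positive number
print(signed_subtract(5, 10))    # the expected negative result
```

With signed columns a difference such as 5 - 10 stays negative, so queries that subtract one column from another give the intended result regardless of SQL_MODE.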
#2 5858 Stewart Lord Added disclaimer to script.
#1 5857 Stewart Lord Initial add of track2sql to the public depot.