Track2SQL

From Perforce Workshop
Jump to: navigation, search


{{{2}}}

{{{2}}}


                   T  R  A  C  K  2  S  Q  L
                 -----------------------------
                   a server log analysis tool


  Copyright (c) 2008, Perforce Software, Inc.  All rights reserved.

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are met:

  1.  Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.

  2.  Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.

  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  ARE DISCLAIMED. IN NO EVENT SHALL PERFORCE SOFTWARE, INC. BE LIABLE FOR ANY
  DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
  (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
  ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


 Description

   Track2sql takes a VTRACK log file (server version 2005.2 or
   greater) as input, and produces an SQL file as output. It
   requires a PHP command-line interpreter and an SQL database.
   It has been tested with PHP version 5, but might be compatible
   with earlier versions. It has been tested with MySQL version 5
   but should be compatible with other SQL databases.


 Usage

   php track2sql.php [ logFile | - [ sqlFile | - ] ] [ -d dbName ]

    Options:
      logFile     name of input file or '-' for stdin
      sqlFile     name of output file or '-' for stdout
      -d          name of database to create
      -v|V|h      view version and usage information

    Examples:
      $ track2sql.php log track.sql -d track
      $ cat log | track2sql.php | mysql
      $ tail -F log | track2sql.php | mysql


 Application

   The power of SQL allows us to analyze the server log data in
   many different ways. In particular, track2sql is an effective
   tool for identifying performance problems. If you are
   experiencing poor performance, try the following steps to
   illuminate the culprits.

    1. Convert your log file to sql.

        track2sql.php logFile -d dbName | mysql

    2. Launch your SQL client

        mysql dbName

    3. Identify commands with long compute-phases.

        mysql> SELECT
                process.processKey,user,cmd,
                MAX(readHeld+writeHeld)-MAX(readWait+writeWait)
                AS compute
               FROM tableUse JOIN process USING (processKey)
               GROUP BY tableUse.processKey
               ORDER BY compute DESC LIMIT 25;

       This will produce a list of 25 processes that held locks
       on one or more database tables for an extended period
       of time. During these periods of time, it is possible
       that the offending processes blocked other processes,
       thereby degrading performance.

          +---------------------------+------+-----------+---------+
          | processKey                | user | cmd       | compute |
          +---------------------------+------+-----------+---------+
          | 2cd5c5a5-...-f57f5ea18795 | jdoe | user-sync |   98765 |
          |        ...                |  ... |       ... |     ... |
          +---------------------------+------+-----------+---------+

       FOR EACH OFFENSIVE PROCESS:

         a. Get process information.

             mysql> SELECT * FROM process
                    WHERE processKey='2cd5c5a5-...-f57f5ea18795';

         b. Get table usage information.

             mysql> SELECT * FROM tableUse
                    WHERE processKey='2cd5c5a5-...-f57f5ea18795';

         c. Identify bottlenecks and take action.
            This last step can be difficult. Keep in mind that
            in general, performance can be improved three ways:

              -> By improving hardware (memory, disks, cpu)
              -> By upgrading software (perforce server/clients, OS)
              -> By adjusting usage (reducing scope of commands)


 Schema

    +------------+---------------+------+-----+
    | P R O C E S S                           |
    +------------+---------------+------+-----+
    | Field      | Type          | Null | Key |
    +------------+---------------+------+-----+
    | processKey | varchar(36)   | NO   | PRI |
    | time       | int(11)       | NO   |     |
    | endTime    | int(11)       | YES  |     |
    | pid        | int(11)       | NO   |     |
    | user       | varchar(255)  | NO   |     |
    | client     | varchar(255)  | NO   |     |
    | ip         | varchar(255)  | NO   |     |
    | app        | varchar(255)  | NO   |     |
    | cmd        | varchar(255)  | NO   |     |
    | args       | text          | YES  |     |
    | lapse      | decimal(10,3) | YES  |     |
    | uCpu       | int(11)       | YES  |     |
    | sCpu       | int(11)       | YES  |     |
    | diskIn     | int(11)       | YES  |     |
    | diskOut    | int(11)       | YES  |     |
    | ipcIn      | int(11)       | YES  |     |
    | ipcOut     | int(11)       | YES  |     |
    | maxRss     | int(11)       | YES  |     |
    | pageFaults | int(11)       | YES  |     |
    | rpcMsgsIn  | int(11)       | YES  |     |
    | rpcMsgsOut | int(11)       | YES  |     |
    | rpcSizeIn  | int(11)       | YES  |     |
    | rpcSizeOut | int(11)       | YES  |     |
    +------------+---------------+------+-----+

    +-------------+--------------+------+-----+
    | T A B L E   U S E                       |
    +-------------+--------------+------+-----+
    | Field       | Type         | Null | Key |
    +-------------+--------------+------+-----+
    | processKey  | varchar(36)  | NO   | PRI |
    | tableName   | varchar(255) | NO   | PRI |
    | pagesIn     | int(11)      | YES  |     |
    | pagesOut    | int(11)      | YES  |     |
    | pagesCached | int(11)      | YES  |     |
    | readLocks   | int(11)      | YES  |     |
    | writeLocks  | int(11)      | YES  |     |
    | getRows     | int(11)      | YES  |     |
    | posRows     | int(11)      | YES  |     |
    | scanRows    | int(11)      | YES  |     |
    | putRows     | int(11)      | YES  |     |
    | delRows     | int(11)      | YES  |     |
    | readWait    | int(11)      | YES  |     |
    | readHeld    | int(11)      | YES  |     |
    | writeWait   | int(11)      | YES  |     |
    | writeHeld   | int(11)      | YES  |     |
    +-------------+--------------+------+-----+


Recent Changes RssIcon.gif

  • Change 7621 stewart_lord@stewart_lord_mac_pro
      Updated create table statements to use the IF NOT EXISTS
      condition. This allows SQL to be fed into an existing database
      without errors/warnings.
  • Change 7620 stewart_lord@stewart_lord_mbp
      Minor update to Track2SQL to avoid timezone unset issue
      and split() function deprecated in PHP 5.3.
  • Change 7338 stewart_lord@stewart_lord_mac_pro
      Rolled-back change 7209. This removes the endTime column
      from the process table. Track2SQL no longer attempts to analyze
      'completed' entries.
      This change was influenced by three factors:
      - Analyzing 'completed' entries significantly degrades
      performance (about 2.75x slower in my tests).
      - In some versions of PHP (5.2.8) the strtotime() function
      suffers from a memory leak.
      - The 'completed' entries are not part of Vtrack output.
  • Change 7209 stewart_lord@stewart_lord_mac_pro
      Integrating an enhancement from Michael Shield's guest
      branch. Track2SQL now records the end time of each process
      (when it is reported). This information is reported for every
      completed process when -vserver=2|3 logging is enabled.
      If verbose server logging is enabled this is more reliable than
      start 'time' + 'lapse' because (by default) lapse is only reported
      when it exceeds a certain threshold. If, however, vtrack=1 is
      set then lapse time will be reported for every command.
      Note: this change brings a schema change. It adds a
      'endTime' column to the process table.
  • Change 7199 stewart_lord@stewart_lord_mac_pro
      Updated Track2SQL readme file to reflect schema changes.