SQL Server Performance Monitoring Through PerfMon

This is a list of Windows performance counters that my company captures to monitor, in real-time, our SQL infrastructure.  The performance counters are captured every 5 [OLTP] or 15 [non-OLTP] seconds, written to a SQL database table(s) in real-time and later viewed through automated reports [SSRS].  We create this method to monitor our SQL infrastructure because: 1)  We didn’t want to spend a lot of $$; 2) This tools comes with Windows and is available for free; and 3) No tool out there gave us the granularity of <1 minute capture and analysis.  We also setup a job to aggregate data older than 14 days to another database for historical analysis.

We capture these performance counters from a trusted Windows box which has the necessary firewall rights the SQL instances [virtual or physical].  I will go into detail, at a later time, on how to create the SQL database table(s) for capturing the data and configuring performance counters to write to the tables automatically.

LogicalDisk (Root Drive Letter or Mount Point {R:\Data1})

  • Avg. Disk Queue Length — Average number of I/O requests [read + write] waiting for disk access.
  • Avg. Disk sec/Read — Average time, in seconds, of a read of data from disk.  Measurement of read latency.
  • Avg. Disk sec/Transfer — Average time, in seconds, of a trasnfers onto the disk.  Measurement of read & write latency.
  • Avg. Disk sec/Write — Average time, in seconds, of a write of data to disk.  Measurement of write latency.
  • Disk Reads/sec — Number of reads from the disk, per second.
  • Disk Transfers/sec — Number of transfers [reads & writes] to disk, per second.
  • Disk Writes/sec — Number of writes to the disk, per second.

Memory

  • Available MBytes — Available physical memory, in MBytes, available for a given process or system use.
  • Pages/sec — Rate at which pages are read from or written to disk to resolve hard page faults.

SQLServer *or* MSSQL$<INSTANCE>

Access Method

  • Page Splits/sec — Number of page splits per second that occur as a result of overflowing index pages.

Buffer Manager

  • Buffer cache hit ratio — Percentage of pages that were found in the buffer pool without having to perform a read from disk.
  • Page life expectancy — Number of seconds a page will stay in the buffer pool without references.

Database Mirroring(_Total)

  • Log Send Queue KB — Total number of KB of log that have not been sent to the mirror server.
  • Redo Queue KB — Total number of KB that redo on the mirror database is behind the hardened log.
  • Send/Receive Ack Time — Milliseconds messages waited for acknowledgement from the partner per second.

Database(_Total) & Database(<DB_NAME>)

  • Data File(s) Size KB — Cumulative size of all the data files in the database.
  • Transactions/sec — Number of transactions started for the database.
  • Write Transactions/sec — Number of transactions which wrote to the database in the last second. <– SQL 2008 ONLY!

General Statistics

  • User Connections — Number of users connected to the SQL instance.

Latches

  • Latch Waits/sec — Number of latch requests that could not be granted immediately and had to wait before being granted.

Locks(_Total)

  • Lock Waits/sec — Number of lock requests that could not be satisfied immediately and required the caller to wait before being granted the lock.

Memory Manager

  • Memory Grants Pending — Current number of processes waiting for a workspace memory grant.

SQL Statistics

  • Batch Requests/sec— Number of batch requests received by the SQL instance, per second.
  • SQL Re-Compilation/sec — Number of SQL recompiles that occurred, per second.

SQL 2008 ONLY — Workload Group Stats(default)

  • Active requests — Number of active requests in the default workgroup.
  • Blocked tasks — Number of blocked tasks in the default workgroup.
  • CPU usage % — A method through which you can measure the CPU consumed by this SQL instance.  Helpful when stacking instances on 1 box.

Network Interface(<NIC Name>)

  • Bytes Total/sec — Rate at which bytes are sent and received over each network adapter, including framing characters.
  • Current Bandwidth — Estimate of the current bandwidth of the network interface in bits per second (BPS).

Process(sqlserver)

  • % Processor Time — Percentage of elapsed time that all of process threads used the processor to execution instructions.  The value is cumulative over all logical processors.
  • % User Time — Percentage of elapsed time that all of process threads spent executing code in user mode.  The value is cumulative over all logical processors.
  • Working Set — Set of memory pages touched recently by the threads in the process.

Processor(%Total)

  • % Processor Time — Percentage of elapsed time that the processor spends to execute a non-Idle thread.  A measurement of overall CPU pressure [up to 100%].

System

  • Context Switches/sec — Combined rate at which all processors on the computer are switched from one thread to another.
  • Processor Queue Length — Number of threads in the processor queue.  Need to divide by the number of logical processors to get actual value.

NOTE:  The firewall ports that you need opened to capture data from the trusted host are the same as those for SMB [TCP & UDP ports 135-139, 445].

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s