Server Monitoring - Intergraph Smart 3D - Installation & Upgrade

The details on how to monitor system counters and log the results to a file can be found in the operating system documentation. Hexagon Asset Lifecycle Intelligence assumes that the reader is already familiar with that topic.

During monitoring at Hexagon Asset Lifecycle Intelligence, we started with a very broad selection of counters. Analyzing these system parameters against the number of users on the system led us to focus on the subset of counters found most relevant to scalability and dimensioning analysis. The counters highlighted here are not a definitive list of what should be monitored; they are the counters that deserve the closest attention. We recommend starting with a wide selection of counters and later discarding those found not to be pertinent.

Testing has demonstrated that these counters can be sampled every second to analyze a specific workflow, or every 10 seconds if you plan to monitor the system over a longer period.
Keep a log file of user activity so that server activity can be related to the actions of the user.
You can also use the System and Configuration Analyzer (SCA) tool to monitor these parameters. This tool can be downloaded from Smart Community.
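
As an illustration of the two sampling intervals mentioned above, the following minimal sketch builds command lines for the Windows typeperf utility, which logs performance counters to a file. The counter selection, sample counts, and output file names are assumptions chosen for the example; they are not prescribed by this guide.

```python
from typing import List

# Counter paths use Performance Monitor syntax: \Object(Instance)\Counter.
# This short selection is illustrative, not the full list from this guide.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
]


def typeperf_command(counters: List[str], interval_s: int,
                     samples: int, out_file: str) -> List[str]:
    """Build a typeperf argument list that logs counters to a CSV file."""
    if interval_s < 1 or samples < 1:
        raise ValueError("interval_s and samples must be positive")
    return [
        "typeperf", *counters,
        "-si", str(interval_s),  # seconds between samples
        "-sc", str(samples),     # number of samples to collect
        "-f", "CSV",             # output file format
        "-o", out_file,          # output file path (hypothetical name)
        "-y",                    # answer yes to prompts, such as overwrite
    ]


# 1-second sampling for a 5-minute workflow (300 samples).
workflow_cmd = typeperf_command(COUNTERS, 1, 300, "workflow.csv")
# 10-second sampling for an 8-hour session (2880 samples).
longrun_cmd = typeperf_command(COUNTERS, 10, 2880, "longrun.csv")
```

On a Windows server, either list could be passed to `subprocess.run(workflow_cmd, check=True)` to start logging.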

Processor

Processor average usage should be kept under 80% for each processor. Isolated spikes over 80% are acceptable.

We recommend monitoring the following counters:

% Processor time
% Privileged time
% User time
% Interrupt time
Interrupts per second
Processor queue length
Context switches per second

Any significant discrepancy between % Processor time and % User time indicates that CPU time is not available to SQL Server and needs to be investigated. This problem did not occur during testing.
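
The processor guidance above (average below 80%, isolated spikes tolerated, little difference between % Processor time and % User time) can be checked against logged samples. The sketch below is one possible way to do that. The `max_run` and `max_gap` parameters are hypothetical tuning values introduced for the example, because this guide does not define how long a spike may last or how large a discrepancy is significant.

```python
from statistics import mean
from typing import Dict, List


def summarize_cpu(processor_pct: List[float], user_pct: List[float],
                  limit: float = 80.0, max_run: int = 5,
                  max_gap: float = 20.0) -> Dict[str, object]:
    """Summarize % Processor time and % User time samples for one processor.

    limit   -- average usage ceiling taken from the guide (80%)
    max_run -- hypothetical: longest run of samples over limit still
               treated as an isolated spike
    max_gap -- hypothetical: largest acceptable difference between the
               average % Processor time and the average % User time
    """
    if not processor_pct or len(processor_pct) != len(user_pct):
        raise ValueError("need two non-empty series of equal length")

    # Find the longest run of consecutive samples above the limit.
    longest = run = 0
    for value in processor_pct:
        run = run + 1 if value > limit else 0
        longest = max(longest, run)

    average = mean(processor_pct)
    gap = average - mean(user_pct)
    return {
        "average": average,
        "average_ok": average < limit,
        "longest_run_over_limit": longest,
        "spikes_isolated": longest <= max_run,
        "non_user_gap": gap,
        "gap_ok": gap <= max_gap,
    }


# Illustrative samples, not measured data.
report = summarize_cpu([50, 90, 50, 60], [45, 80, 45, 55])
```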

Logical Disk and Physical Disks

The recommendation is to monitor the following counters for each physical drive:

% Disk read time
% Disk write time
% Disk time
% Idle time
Average disk queue length
Average disk read queue length
Average disk write queue length
Disk seconds per read
Disk seconds per write
Disk seconds per transfer
Disk reads per second
Disk writes per second
Disk transfers per second
Disk write bytes per second
Disk read bytes per second

Memory

Available MBytes
Page faults per second
Page reads per second
Page writes per second
Pages per second

Page faults per second needs to be monitored only to make sure the system is not overloaded.

Network

Bytes received per second
Bytes sent per second
Current bandwidth
Output queue length

Record PING times between client and server to verify that the software is not being affected by other network traffic.
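
Recorded ping output can be reduced to a list of round-trip times for later comparison with the server counters. The following sketch assumes English-language output in the usual `time=12ms` or `time<1ms` form; the sample text and the address in it are illustrative, not captured from a real system.

```python
import re
from typing import List

# Matches "time=12ms", "time<1ms", and "time=0.045 ms".
_TIME_RE = re.compile(r"time[=<]([\d.]+)\s*ms", re.IGNORECASE)


def ping_times_ms(ping_output: str) -> List[float]:
    """Extract round-trip times in milliseconds from ping output.

    A "time<1ms" reply is recorded as 1.0, which is an upper bound.
    Lines without a time, such as timeouts, are skipped.
    """
    return [float(m.group(1)) for m in _TIME_RE.finditer(ping_output)]


# Illustrative output in the style of the Windows ping command.
SAMPLE = """\
Reply from 192.0.2.10: bytes=32 time=12ms TTL=128
Reply from 192.0.2.10: bytes=32 time<1ms TTL=128
Request timed out.
Reply from 192.0.2.10: bytes=32 time=48ms TTL=128
"""

times = ping_times_ms(SAMPLE)
worst = max(times)
```

Storing each result with a timestamp makes it possible to line up slow replies with entries in the user activity log.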

Database (Microsoft SQL Server)

Buffer Manager Object

Counters to monitor:

Buffer cache hit ratio
Procedure cache pages
Free pages
Page reads per second
Stolen pages
Page writes per second
Free list stalls per second
Total pages
Page life expectancy

The buffer cache hit ratio should remain above 90%. During testing, the ratio always stayed in the 95-99% range.
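
Besides Performance Monitor, the buffer cache hit ratio can also be derived from the SQL Server `sys.dm_os_performance_counters` view, where the raw value must be divided by its matching base counter. The sketch below shows that arithmetic and the 90% check. The query is held as a string only; running it requires a database driver, which is not shown, and the row values are illustrative.

```python
from typing import Iterable, Tuple

# Query for the raw ratio value and its base counter.
# Executing it needs a database connection, which this sketch omits.
QUERY = """
SELECT RTRIM(counter_name) AS counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Buffer cache hit ratio',
                       'Buffer cache hit ratio base')
"""


def buffer_cache_hit_ratio(rows: Iterable[Tuple[str, int]]) -> float:
    """Return the hit ratio in percent from (counter_name, value) rows."""
    values = {name.strip(): value for name, value in rows}
    base = values["Buffer cache hit ratio base"]
    if base == 0:
        raise ValueError("base counter is zero; ratio is undefined")
    return 100.0 * values["Buffer cache hit ratio"] / base


# Illustrative rows, not measured data.
ratio = buffer_cache_hit_ratio([
    ("Buffer cache hit ratio", 9700),
    ("Buffer cache hit ratio base", 10000),
])
ratio_ok = ratio > 90.0  # threshold stated in this guide
```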

SQL Databases Object

We recommend monitoring the following counters for at least the Tempdb, Catalog, and Model databases. You can monitor the other Smart 3D databases as well. For performance testing, the Tempdb database should be monitored.

Data file size
Log file size
% Log used
Transactions per second
Active transactions

Transactions per second was not found to be very relevant for Smart 3D because the software executes very few transactions per second (normally one transaction per command). Also note that two different commands, that is, two transactions, can have very different impacts on the database. Testing has shown that how hard the software "hits" the database server is best measured by the number of batch requests per second.

SQL Statistics Object

Batch requests per second
SQL compilations per second
SQL re-compilations per second
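
Performance Monitor reports batch requests per second as a rate. If the values are instead read from the SQL Server `sys.dm_os_performance_counters` view, per-second counters are cumulative, so the rate has to be computed from two successive readings. The sketch below shows that calculation with illustrative sample values.

```python
from typing import List, Tuple


def rates_per_second(samples: List[Tuple[float, int]]) -> List[float]:
    """Convert cumulative counter readings into per-second rates.

    samples -- (timestamp_in_seconds, cumulative_value) pairs in
               chronological order
    Returns one rate for each pair of consecutive readings.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((v1 - v0) / elapsed)
    return rates


# Illustrative readings taken 10 seconds apart, not measured data.
batch_rates = rates_per_second([(0.0, 1000), (10.0, 1500), (20.0, 1800)])
```

The same function applies to the SQL compilation and re-compilation counters when they are read the same way.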

SQL Locks

Average wait time (ms)
Lock timeouts/second
Lock waits/second
Number of deadlocks per second

A certain amount of locking is to be expected because of the way SQL Server manages data integrity. Excessive locking, however, can lead to blocking and needs to be analyzed in order to correct the software. Any deadlock situation needs to be analyzed.

SQL Latches

Average latch wait time (ms)
Latch waits per second

Database (Oracle)

Oracle performance counters can be monitored using the web UI that installs with Oracle or by using the System and Configuration Analyzer (SCA) tool available from Smart Community. Consult the Oracle documentation for details.

Oracle Database Counters:

Dictionary Cache Hit Ratio – Should be > 90%
Library Cache Hit Ratio – Should be >= 99%
DB Block Buffer Cache Hit Ratio – Should be > 90%
Log Switch Interval – Should be greater than 30 minutes
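
The thresholds listed above can be applied to collected values with a small helper. The sketch below assumes the hit ratios have already been obtained as percentages and that log switch times are available as timestamps; how those values are gathered is outside its scope, and the function name and sample figures are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def check_oracle_metrics(dictionary_hit: float, library_hit: float,
                         buffer_hit: float,
                         log_switch_times: List[datetime]) -> Dict[str, bool]:
    """Compare Oracle metrics against the thresholds in this guide.

    Hit ratios are percentages. log_switch_times must be in
    chronological order; with fewer than two entries there is no
    interval to judge, so the interval check passes.
    """
    intervals = [later - earlier for earlier, later
                 in zip(log_switch_times, log_switch_times[1:])]
    return {
        "dictionary_cache": dictionary_hit > 90.0,
        "library_cache": library_hit >= 99.0,
        "buffer_cache": buffer_hit > 90.0,
        "log_switch_interval": all(
            gap > timedelta(minutes=30) for gap in intervals),
    }


# Illustrative values, not measured data.
switches = [datetime(2024, 1, 1, 8, 0),
            datetime(2024, 1, 1, 8, 45),
            datetime(2024, 1, 1, 9, 5)]  # second gap is only 20 minutes
result = check_oracle_metrics(95.2, 99.4, 88.0, switches)
```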

Oracle Reports:

Automatic Database Diagnostic Monitor (ADDM) report – This report can be generated from the Oracle Database Console or by using the SCA tool available on Smart Community.