Database design for a SNMP network monitor
                          Draft version 0.2.1
                              19Mar1998

1. Motivation

SNMP collectors are usually used to collect performance related data in
an IP network. These data have to be later examined by the management
station to get an information about the state of the network. To get an
idea of the current state, the station should allow to plot data for
different time scales, compare data of different times and compare data
of different nodes or interfaces.

2. Collection proccess

To get to the needed data, typically SNMP collectors are used to poll
the nodes at fixed time intervals (typically 15 minutes intervals). This
collection process may lead to huge amount of data (i.e. about 200
MB/day for a network of about 1000 interfaces. A network with such many
interfaces isn´t such uncommon in the age of Access Servers and Port
Masters). To lower the bandwidth needed by the SNMP collectors and to
get redundancy, it should be possible to run multiple collectors which
are distributed in the network and later transfer the collected data to
the management station at fixed time intervals (i.e. once every night).

3. Database consideration

The database needed to get the desired information aout of the collected
data merely consists of two parts:
- a database containing structural data (topology, type of node,
  function of node, node class, link class, location of node, etc)
- a database containing the collected performance data

The requirements for the two databases are as following:

structural data:
- unfrequent changes
- model the current structure
- provide a history
- automatic update desired, but manual editing should be possible, as well
- open API to let other tools be able to access this as needed or desired

performance data:
- huge amounts of data to be processed
- data may arrive from different distributed collectors
- redundant data may arrive and has to be detected and removed
- there might be missing data (i.e. a node not responding during a
  certain time period)
- data should be provided in different reduction levels over time (i.e.
  unreduced data with an interval defined by the collector with 1 week of data, 
  1h interval with 1 month and 1d interval with 1 year of data)
- data are usually only accessed using pre-defined keys (i.e.
  node/interface/time)
- analysis variables are always numeric
- certain statistics should be performed when reducing data (i.e. sum,
  avarage, max, min, count, or a 90% tracking to show where the top 90 % 
  values fall in)

4. Implementation details

4.1 Structural data 

The strucutal data can be implemeted using a relational database. A
possible solution could consist of the following tables:

- node:
  - nodename (CHAR UNIQUE)
  - display name (CHAR)
  - address (CHAR)
  - location (CHAR)
  - type (CHAR)
  - description (CHAR)
  - responsible person (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- snmp:
  - nodename (CHAR)
  - snmp version (NUM)
  - snmp port (NUM)
  - snmp retries (NUM)
  - snmp timeout (NUM)
  - snmp write community (CHAR)
  - snmp read community (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- link:
  - key (NUM UNIQUE)
  - nodename 1 (CHAR)
  - nodename 2 (CHAR)
  - interface1 (CHAR)
  - interface2 (CHAR)
  - type (CHAR)
  - description (CHAR)

- history:
  - key (NUM)
  - speed1 (NUM)
  - speed2 (NUM)
  - start (DATE)
  - stop (DATE)
  - IP1 (CHAR)
  - IP2 (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- nodegroup:
  - nodegrpname (CHAR UNIQUE)
  - description (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- nodegrplist:
  - nodename (CHAR)
  - nodegrpname (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- linkgroup:
  - linkgrpname (CHAR UNIQUE)
  - description (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- linkgrplist:
  - key (NUM)
  - linkgrpname (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- capability
  - capname (CHAR UNIQUE)
  - captype (CHAR)
  - description (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- caplist
  - nodename (CHAR)
  - capname (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - modified by (CHAR)
  - modified at (DATE)

- nodetext (no delete/modify)
  - nodename (CHAR)
  - added by (CHAR)
  - added at (DATE)
  - notes (CHAR)

- linktext (no delete/modify)
  - key (NUM)
  - added by (CHAR)
  - added at (DATE)
  - notes (CHAR)

4.2 Performance data

4.2.1 Performance data implemented using flat files

The performance data could be implemeted using a set of binary "flat
files", sorted by the pre-defined keys. We need two different set of
keys, one for the router data (i.e. CPU, memory, etc) and one for
interface data (i.e. ifOutDiscards, ifInOctets, frCircuitReceivedFECNs,
atmInterfaceOCDEvents). It should be possible to configure the column
number and names of the performace database and to specify the
statistics to be performed on data reduction. The transformation of OIDs
to the performance database names should be done in the load scripts, as
sometimes the translation is not a one-to-one relationship (i.e. for the
CPU each vendor uses a different OID). The load script takes care of
splitting the data into the different tables, does the reduction for the
different levels, expires the old data and finally sorts the tables by
the pre-defined keys. There is only one load process which adds the various
data to the database sequentially, so there is no need for file locking or
shared access methods. The data has to be transferred to the machine running
the database before the load script run on this data. 

The database of MRTG 3.0 might be used for this purpose.

4.2.2 Performance data implemented using SQL

Using a standard SQL database would probably be slower but it would also offer
more flexiblity in the reporting on the data, allow other tools to access 
this data. (web servers with php stuff, server push to your orgination on 
problems in the network.. etc..) 
Flat files on the other hand would be faster but would require more tools
to be written if other access to the data is desired.

4.2.3 Sample structure of performace database

A sample table could look like this:

detail_interface:
  - node (CHAR)
  - interface (CHAR)
  - timestamp (DATETIME)

  - ifOutOctets
  - ifInOctets
  - ifOutDiscards

hour_interface:
  - node (CHAR)
  - interface (CHAR)
  - timestamp (DATETIME)

  - ifOutOctets_avg
  - ifOutOctets_max
  - ifInOctets_avg
  - ifInOctets_max
  - ifOutDiscards_max

day_interface:
  - node (CHAR)
  - interface (CHAR)
  - timestamp (DATETIME)

  - ifOutOctets_avg
  - ifOutOctets_max
  - ifInOctets_avg
  - ifInOctets_max
  - ifOutDiscards_max

detail_node:
  - node (CHAR)
  - timestamp (DATETIME)

  - CPUBusy5
  - FreeMem
  - ResponseTime

hour_node:
  - node (CHAR)
  - timestamp (DATETIME)

  - CPUBusy5_max
  - FreeMem_min
  - ResponseTime_max

day_node:
  - node (CHAR)
  - timestamp (DATETIME)

  - CPUBusy5_max
  - FreeMem_min
  - ResponseTime_max

5. Feedback

Feel free to send any feedback to the GXSNMP mailing list
<gxsnmp@randomc.com>.

6. Authors:

This draft contains contributions from:

Jochen Friedrich <jochen@scram.de>
Gregory A. McLean <gregm@randomc.com>
Gus Estrella <gus@norstar.net>