Database design for a SNMP network monitor Draft version 0.2.1 19Mar1998 1. Motivation SNMP collectors are usually used to collect performance related data in an IP network. These data have to be later examined by the management station to get an information about the state of the network. To get an idea of the current state, the station should allow to plot data for different time scales, compare data of different times and compare data of different nodes or interfaces. 2. Collection proccess To get to the needed data, typically SNMP collectors are used to poll the nodes at fixed time intervals (typically 15 minutes intervals). This collection process may lead to huge amount of data (i.e. about 200 MB/day for a network of about 1000 interfaces. A network with such many interfaces isnīt such uncommon in the age of Access Servers and Port Masters). To lower the bandwidth needed by the SNMP collectors and to get redundancy, it should be possible to run multiple collectors which are distributed in the network and later transfer the collected data to the management station at fixed time intervals (i.e. once every night). 3. Database consideration The database needed to get the desired information aout of the collected data merely consists of two parts: - a database containing structural data (topology, type of node, function of node, node class, link class, location of node, etc) - a database containing the collected performance data The requirements for the two databases are as following: structural data: - unfrequent changes - model the current structure - provide a history - automatic update desired, but manual editing should be possible, as well - open API to let other tools be able to access this as needed or desired performance data: - huge amounts of data to be processed - data may arrive from different distributed collectors - redundant data may arrive and has to be detected and removed - there might be missing data (i.e. a node not responding during a certain time period) - data should be provided in different reduction levels over time (i.e. unreduced data with an interval defined by the collector with 1 week of data, 1h interval with 1 month and 1d interval with 1 year of data) - data are usually only accessed using pre-defined keys (i.e. node/interface/time) - analysis variables are always numeric - certain statistics should be performed when reducing data (i.e. sum, avarage, max, min, count, or a 90% tracking to show where the top 90 % values fall in) 4. Implementation details 4.1 Structural data The strucutal data can be implemeted using a relational database. A possible solution could consist of the following tables: - node: - nodename (CHAR UNIQUE) - display name (CHAR) - address (CHAR) - location (CHAR) - type (CHAR) - description (CHAR) - responsible person (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - snmp: - nodename (CHAR) - snmp version (NUM) - snmp port (NUM) - snmp retries (NUM) - snmp timeout (NUM) - snmp write community (CHAR) - snmp read community (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - link: - key (NUM UNIQUE) - nodename 1 (CHAR) - nodename 2 (CHAR) - interface1 (CHAR) - interface2 (CHAR) - type (CHAR) - description (CHAR) - history: - key (NUM) - speed1 (NUM) - speed2 (NUM) - start (DATE) - stop (DATE) - IP1 (CHAR) - IP2 (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - nodegroup: - nodegrpname (CHAR UNIQUE) - description (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - nodegrplist: - nodename (CHAR) - nodegrpname (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - linkgroup: - linkgrpname (CHAR UNIQUE) - description (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - linkgrplist: - key (NUM) - linkgrpname (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - capability - capname (CHAR UNIQUE) - captype (CHAR) - description (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - caplist - nodename (CHAR) - capname (CHAR) - added by (CHAR) - added at (DATE) - modified by (CHAR) - modified at (DATE) - nodetext (no delete/modify) - nodename (CHAR) - added by (CHAR) - added at (DATE) - notes (CHAR) - linktext (no delete/modify) - key (NUM) - added by (CHAR) - added at (DATE) - notes (CHAR) 4.2 Performance data 4.2.1 Performance data implemented using flat files The performance data could be implemeted using a set of binary "flat files", sorted by the pre-defined keys. We need two different set of keys, one for the router data (i.e. CPU, memory, etc) and one for interface data (i.e. ifOutDiscards, ifInOctets, frCircuitReceivedFECNs, atmInterfaceOCDEvents). It should be possible to configure the column number and names of the performace database and to specify the statistics to be performed on data reduction. The transformation of OIDs to the performance database names should be done in the load scripts, as sometimes the translation is not a one-to-one relationship (i.e. for the CPU each vendor uses a different OID). The load script takes care of splitting the data into the different tables, does the reduction for the different levels, expires the old data and finally sorts the tables by the pre-defined keys. There is only one load process which adds the various data to the database sequentially, so there is no need for file locking or shared access methods. The data has to be transferred to the machine running the database before the load script run on this data. The database of MRTG 3.0 might be used for this purpose. 4.2.2 Performance data implemented using SQL Using a standard SQL database would probably be slower but it would also offer more flexiblity in the reporting on the data, allow other tools to access this data. (web servers with php stuff, server push to your orgination on problems in the network.. etc..) Flat files on the other hand would be faster but would require more tools to be written if other access to the data is desired. 4.2.3 Sample structure of performace database A sample table could look like this: detail_interface: - node (CHAR) - interface (CHAR) - timestamp (DATETIME) - ifOutOctets - ifInOctets - ifOutDiscards hour_interface: - node (CHAR) - interface (CHAR) - timestamp (DATETIME) - ifOutOctets_avg - ifOutOctets_max - ifInOctets_avg - ifInOctets_max - ifOutDiscards_max day_interface: - node (CHAR) - interface (CHAR) - timestamp (DATETIME) - ifOutOctets_avg - ifOutOctets_max - ifInOctets_avg - ifInOctets_max - ifOutDiscards_max detail_node: - node (CHAR) - timestamp (DATETIME) - CPUBusy5 - FreeMem - ResponseTime hour_node: - node (CHAR) - timestamp (DATETIME) - CPUBusy5_max - FreeMem_min - ResponseTime_max day_node: - node (CHAR) - timestamp (DATETIME) - CPUBusy5_max - FreeMem_min - ResponseTime_max 5. Feedback Feel free to send any feedback to the GXSNMP mailing list . 6. Authors: This draft contains contributions from: Jochen Friedrich Gregory A. McLean Gus Estrella