STEP Event Protocol Specification

Following is the specification for the "Simple Text Event Protocol" ("STEP") family of SENSE Event Protocols.

Overview

STEP provides a very simple mechanism to capture the basic status of a target entity. It is designed to be used in conjunction with programs that write simple text to standard output. Each output line is of the form:

       <code><space><reason>

That is, each line consists of three fields:

<code> A numeric string (consisting of only decimal digits) that represents the event code
<space> A single space character, used to separate the other two fields
<reason> Arbitrary display text that describes the event reason

A line of this form is called an event line in the STEP documentation. Here is an example event line describing an event in which a printer has run out of paper:

       242 Printer out of paper

Here is an example of a sequence of event lines displayed by a program conforming to the STEP event line format; this example shows a laser printer going from a healthy "ready" state to a problem state, then back to a healthy state:

       111 Ready
       112 Processing
       112 Printing
       342 Printer jam
       242 Cover open
       122 Cover closed
       122 Warming up
       112 Printing

If an event line is displayed such that it does not start with a decimal numeric string, or does not contain a space character immediately after the numeric string, then the text of the line is assumed to be a separate event, but the code for the event is assumed to be the last valid code seen. For example:

       111 Ready
       342 Printer jam
       Cover open
       Cover closed
       Warming up
       112 Printing

In this example, the three lines that do not contain an event code are interpreted such that they have an implicit event code value of "342", since the last event line seen with a valid event code was the line displaying "342 Printer jam".
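
The carry-forward rule above can be sketched in Python as follows. This is only an illustration, not the sfpstrm implementation: the name parse_event_lines is hypothetical, the sketch assumes event lines are not indented, and it reports a code of None for uncoded lines that appear before any valid code has been seen (a case this specification does not define).

```python
import re

# A valid event line: one or more decimal digits, a single space, then the reason.
_EVENT_LINE = re.compile(r"^([0-9]+) (.*)$")


def parse_event_lines(lines):
    """Yield (code, reason) pairs, reusing the last valid code when a line has none."""
    last_code = None  # assumption: no code is known until a valid event line is seen
    for raw in lines:
        line = raw.rstrip("\r\n")
        match = _EVENT_LINE.match(line)
        if match:
            last_code, reason = match.group(1), match.group(2)
        else:
            # Not a well-formed event line: the whole line is the reason text.
            reason = line
        yield last_code, reason


if __name__ == "__main__":
    sample = ["111 Ready", "342 Printer jam", "Cover open",
              "Cover closed", "Warming up", "112 Printing"]
    for code, reason in parse_event_lines(sample):
        print(code, reason)
```

Run on the example above, the three uncoded lines are each reported with the code "342".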

The standard SENSE publisher sfpstrm may be used as a "back-end" to an existing Unix program or shell script to capture displayed lines and use the contents of those lines to publish the current condition of the entity. The sfpstrm program reads lines from stdin and parses each line to extract the code and reason fields, then sends formatted STEP event messages to the Server on behalf of the Publication specified on the sfpstrm command line.

Event Codes

The <code> field in an event line is a specially structured decimal integer in which certain dimensions of the entity's current status are encoded. This value is structured using the following format:

       <vendor-subcode><condition-subcode>

where

<vendor-subcode> A vendor-specific event code.
<condition-subcode> A sequence of one to three digits that expresses the overall condition of the entity.

Each of these subcodes is a sequence of decimal digits; the two subcodes are concatenated such that the <condition-subcode> is in the least significant position of the resulting decimal number.

The <vendor-subcode> is optional and may be omitted entirely; when present, it consists of at most seven digits.

The <condition-subcode> consists of a sequence of between one and three digits. Each digit represents the value for one of three different dimensions that express the current overall status of the entity; the definitions of these dimensions are described in the following table:

Condition Dimensions

Dimension  Definition
Activity   Current level of activity, from "idle" to "extremely busy".
Health     Overall operational status, indicating the relative problem state.
Support    Level of technical support required to resolve the indicated problem level.

The value for each dimension is a single decimal digit, where the sequence of dimension digits follows the form:

       <support><health><activity>
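
As a minimal sketch of this layout, the following Python function splits an event code string into its vendor subcode and its three dimension digits. The names EventCode and decode_event_code are hypothetical. The sketch assumes that omitted leading condition digits are zeros (so "0" is read as "000") and limits the code to ten digits, that is, seven vendor digits plus three condition digits.

```python
import re
from typing import NamedTuple


class EventCode(NamedTuple):
    vendor: str    # vendor subcode digits, "" when absent
    support: int   # most significant condition digit
    health: int
    activity: int  # least significant condition digit


def decode_event_code(code: str) -> EventCode:
    """Split a STEP event code into vendor subcode and dimension digits."""
    if not re.fullmatch(r"[0-9]{1,10}", code):
        raise ValueError(f"not a valid STEP event code: {code!r}")
    padded = code.zfill(3)  # assumption: missing leading digits are zeros
    vendor, condition = padded[:-3], padded[-3:]
    support, health, activity = (int(digit) for digit in condition)
    return EventCode(vendor, support, health, activity)
```

For example, decode_event_code("242") yields support 2, health 4 and activity 2 with an empty vendor subcode.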

The following table lists the meanings of each decimal digit value for each dimension; the table columns are aligned in the order the digits would appear in a complete <condition-subcode> string:

Summary of Condition Dimension Values

Value  Support        Health               Activity
0      Unknown        Unknown              Unknown
1      None           Healthy              Idle
2      User           Warning, transient   Not idle
3      Operator       Warning, persistent  Lightly busy
4      Technician     Alert                Busier (see below)
5      Administrator  (reserved)           Busier (see below)
6      (reserved)     (reserved)           Moderately busy
7      (reserved)     (reserved)           Busier (see below)
8      (reserved)     (reserved)           Busier (see below)
9      (reserved)     (reserved)           Extremely busy
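
One way a subscriber might turn a condition subcode into display text is a set of lookup tables mirroring the summary above. The sketch below is illustrative only; the table names and describe_condition are hypothetical, and short subcodes are assumed to have had leading zeros omitted.

```python
# Lookup tables transcribed from the summary table; digits not listed are reserved.
SUPPORT = {0: "Unknown", 1: "None", 2: "User", 3: "Operator",
           4: "Technician", 5: "Administrator"}
HEALTH = {0: "Unknown", 1: "Healthy", 2: "Warning, transient",
          3: "Warning, persistent", 4: "Alert"}
ACTIVITY = {0: "Unknown", 1: "Idle", 2: "Not idle", 3: "Lightly busy",
            4: "Busier", 5: "Busier", 6: "Moderately busy",
            7: "Busier", 8: "Busier", 9: "Extremely busy"}


def describe_condition(condition: str) -> dict:
    """Map a one- to three-digit condition subcode to dimension labels."""
    padded = condition.zfill(3)  # assumption: missing leading digits are zeros
    if not condition or len(padded) != 3 or any(c not in "0123456789" for c in padded):
        raise ValueError(f"not a condition subcode: {condition!r}")
    support, health, activity = (int(c) for c in padded)
    return {
        "support": SUPPORT.get(support, "(reserved)"),
        "health": HEALTH.get(health, "(reserved)"),
        "activity": ACTIVITY[activity],
    }
```

With these tables, the subcode "342" is described as Operator support, Alert health, and Not idle activity.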

The Support and Health dimensions are similar in that each is represented by a small set of enumerated choices, and the choices within each set are distinct from one another. The Activity dimension is a bit more complicated: its range of values is partitioned into a few fixed choices and a small set of ranges.

The choices for the Support dimension represent a subset of the categories of "Role Models" as defined in Appendix D of the SNMP Printer MIB (RFC 1759). These categories are defined in an order that indicates an increasing level of operational knowledge about the entity:

Value  Category       Meaning
0      Unknown        Unable to determine the required support
1      None           No support required (entity should be healthy)
2      User           Very low knowledge required
3      Operator       Some knowledge required, not necessarily technical
4      Technician     Special training required, usually on the physical aspects of the entity; for example, Field Service personnel
5      Administrator  Special training required, usually dealing with configuration, compatibility or interoperability; for example, System Manager
6-9    (reserved)     Reserved for future definitions

The Health choices of Warning, transient and Warning, persistent are similar, but different in an important way. A "transient" Warning describes a state in which the Warning condition is only temporary and should clear itself within a relatively short period of time. An example of this condition would be the time during which a laser printer is "warming up" after being idle for some period of time; such a condition can occur if the printer had gone into "Energy Saver" mode, and a print job has just arrived on one of its interfaces.

A "persistent" Warning, on the other hand, indicates that the Warning condition will continue until some kind of human intervention takes place. Using a laser printer as an example, this condition would exist when the printer recognizes that its toner supply is low; if the toner is not refilled soon enough, then the printer will transition to a Health value of Alert.

The Activity dimension is a range of values expressing the level of work in which the entity is engaged. The values defined for this dimension have been assigned so as to accommodate varying abilities to resolve the level of activity within a given entity. The values are defined so that an implementor has some clear alternatives in selecting which values to issue for a target entity. The ability for an implementor to resolve the relative level of activity for any given entity is typically one of the following:

To be able to unambiguously express these situations, the set of Activity values was partitioned to accommodate these various degrees of capability:

Value  Meaning
0      Unknown; unable to determine activity level
1      Idle; definitely no activity
2      Not idle; some activity is present, but unable to determine any kind of level
3      Low level of activity; within the spectrum of ability to determine an activity level, this implies the least level of activity
4, 5   More than lightly busy
6      Average level of activity; if the level of activity can be measured in some way, then this value represents the mid-point of the range
7, 8   Busier than average
9      As busy as the entity can possibly be without exploding, going on strike, or otherwise punting off into oblivion
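
As an illustration only, here is one hypothetical way a Publisher might choose an Activity digit depending on what it is able to measure. The function name, its parameters, and the linear scaling of a utilization fraction onto the digits 3 through 9 are all assumptions made for this sketch; the specification defines only the meanings of the digits.

```python
def activity_digit(utilization=None, busy=None):
    """Pick an Activity digit from whatever information is available.

    utilization: measured load as a fraction from 0.0 to 1.0, or None if unmeasurable.
    busy: True or False if only idle/not-idle can be detected, or None if unknown.
    """
    if utilization is not None:
        if not 0.0 <= utilization <= 1.0:
            raise ValueError("utilization must be between 0.0 and 1.0")
        if utilization == 0.0:
            return 1  # idle: definitely no activity
        # Assumed linear scale: just above 0.0 -> 3, 0.5 -> 6 (mid-point), 1.0 -> 9.
        return 3 + round(utilization * 6)
    if busy is None:
        return 0  # unknown: unable to determine activity level
    return 2 if busy else 1  # not idle (no level available) versus idle
```

A Publisher that can only detect whether a job is in progress would use the busy argument and emit 1 or 2; one with a load measurement would emit values from 3 to 9.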

Given these definitions for expressing an entity's condition at the time of an event, the following table presents some examples of STEP event code values; each example includes a printer-related problem scenario in which the event code can express useful information to certain types of persons:

Examples of Event Code Values

Value: 000
Meaning: Unknown; absolutely nothing can be said about the entity because the associated Publisher is not currently bound to the Publication.
Problem Scenario: A person watching a graphical monitor application would see this printer as having a rendition representing an undefined condition; as such, the person would not be able to make any judgements about the printer, including whether it even exists.

Value: 0
Meaning: Unknown, same as the previous example.
Problem Scenario: This example illustrates that leading 0 digits may or may not be specified in a STEP event code; if a non-zero <vendor-subcode> value is specified in the event code, then all zero digits in the <condition-subcode> must be specified.

Value: 111
Meaning: Healthy, idle (no support required).
Problem Scenario: A user in need of rapid turnaround for a print job would be happy to either see this event occur, or observe that the printer's current condition is such that no one is currently using the device.

Value: 011
Meaning: Healthy, idle.
Problem Scenario: Same as the previous example, except this value indicates that the Publisher for the entity is unable to ever declare the level of Support needed should a problem condition arise.

Value: 010
Meaning: Healthy.
Problem Scenario: Same as the previous example, except this value indicates that the Publisher is not only unable to declare a Support level, but that it is unable to say whether the device is busy or not (a truly clueless Publisher implementation).

Value: 013
Meaning: Healthy, only slightly busy.
Problem Scenario: A user not requiring rapid turnaround might select this printer for a non-critical print job.

Value: 121
Meaning: Short-term warning, idle, no support required.
Problem Scenario: A user looking for quick turnaround might select this device, knowing that by the time the print job arrives at the printer, the printer will be ready to handle the job.

Value: 246
Meaning: Alert, moderately busy, User required.
Problem Scenario: The output tray of the printer is full; users with a vested interest in the printer might be moved to address this problem. For example, the printer is clicking off many copies of their resumes...and their manager is about to use that printer to make copies of a critical presentation that begins in 10 minutes.

Value: 331
Meaning: Persistent warning, idle, Operator required.
Problem Scenario: The toner is low in the printer; since it is currently idle, an Operator may not immediately rush to the aid of the device.

Value: 338
Meaning: Persistent warning, very busy, Operator required.
Problem Scenario: Toner is low, but there is immediate, significant demand for the printer. If an Operator experiences this condition with one printer and simultaneously experiences the previous condition on another printer, then the Operator is likely to set the proper priorities in deciding which problem to address first.

Value: 449
Meaning: Red Alert; the entity cannot perform its intended function.
Problem Scenario: It's 5:05 pm and the printer has caught on fire, but there are 376 print jobs queued up to that device, including a dinner invitation for the CEO; clearly a condition of serious interest to a Field Service representative (not to mention the remaining elements of the food chain).

Value: 523
Meaning: Short-term warning, lightly busy, Administrator required.
Problem Scenario: The printer has processed a print job that requires a resource not currently available in the printer, such as a particular font; since the printer is indicating only a slight amount of usage, the responsible Administrator (Ed: oxymoron?) might choose to put off resolution of the problem until after lunch.

Value: 6651907523
Meaning: Short-term warning, lightly busy, Administrator required.
Problem Scenario: Same as the previous example, but the event code also includes a vendor subcode; the subcode could provide information as to exactly which resource is required, or perhaps the type of resource.

Value: 391342
Meaning: Alert, busy (but not clear how much), Operator required.
Problem Scenario: The printer has jammed; the event code includes a vendor subcode that informs the Operator of the location of the jam within the printer.

Value: 2349
Meaning: Alert, extremely busy, Operator required.
Problem Scenario: The printer has run out of paper; normally this is categorized as requiring only User-level support, but the Publisher for this printer was configured to indicate Operator-level support is required due to the types of paper used in the printer. The event code includes a vendor subcode that informs the Operator which input tray has gone empty; for a printer located at a distance from the paper supply, this information can save the Operator valuable time in that the Operator can know which paper to fetch from the supply area before heading off to the printer.
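
Going in the other direction, a program that emits event lines needs to build codes like those above from its dimension values. The following Python sketch shows one way to do so; encode_event_code and format_event_line are hypothetical names, and the sketch always writes all three condition digits, which keeps the code valid whether or not a vendor subcode is present.

```python
import re


def encode_event_code(support, health, activity, vendor=""):
    """Build a STEP event code string from dimension digits and an optional vendor subcode."""
    for name, digit in (("support", support), ("health", health), ("activity", activity)):
        if not isinstance(digit, int) or not 0 <= digit <= 9:
            raise ValueError(f"{name} must be a single decimal digit, got {digit!r}")
    if not re.fullmatch(r"[0-9]{0,7}", vendor):
        raise ValueError(f"vendor subcode must be at most seven digits, got {vendor!r}")
    # The condition digits occupy the least significant positions.
    return f"{vendor}{support}{health}{activity}"


def format_event_line(code, reason):
    """Join a code and its reason text with the single separating space."""
    return f"{code} {reason}"


if __name__ == "__main__":
    print(format_event_line(encode_event_code(3, 4, 2, vendor="391"), "Printer jam"))
```

Here the Operator/Alert/Not idle condition with vendor subcode 391 produces the code 391342 from the table above.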

Event Messages

tbs


Copyright (C) 1995,1996 by Jay Martin, Underscore, Inc. All rights reserved.
Comments to the author, J. K. Martin (jkm@underscore.com). Last modified 96/05/11.