Creativyst Software Stability Ratings
A proposed separation of software stability from process maturity

Work in Progress:
I'm just getting started on this. There will eventually be an article here. This page is currently linked to give others opportunities to comment on, and contribute to, the work in progress. Thank you for your patience.

    John Repici

Author: Dominic John Repici

The relationship between software stability and version is tenuous at best. Yet we software designers and engineers, people who should know better, have accepted software versioning as the primary indicator of, and feedback mechanism about, software stability since people first started writing it.

Software designers know, as a matter of training, that it is unacceptable to assign multiple meanings to a single variable, no matter how close in concept those meanings may be. Yet we happily go on overloading our software version numbers to also connote the stability of the underlying package. Again, because there is some (not quite quantifiable) relationship there, and, well, that's the way it's always been done.

We've also always known at some level that there was something wrong, somewhere, with the whole notion. We've filled many tomes with attempts to understand and document the elusive relationship between a software project's longevity and the resulting software's quality. We've approached it with mathematical precision, religious zeal, and edgy rebellion, but have we ever really been able to square it?

This paper proposes completely separating the concepts of software stability (or "quality") from version maturity.

There are at least two advantages to separating software version numbers from the stability rating:

  1. An unambiguous indication of a given piece of software's quality (stability) can be made available to the people responsible for using, supporting, upgrading, insuring, and valuing it (within any context).

  2. With two separate values to denote version and stability, the subtle relationship between these two concepts can be better studied and understood simply by analyzing historical records of both data sets.

    At the same time, the relationship will become less important, because a value based on something more germane than a software product's maturity can be used to estimate how likely the product is to work or fail within a given context or system.

Design Philosophy & Goals
The overriding philosophy of this project is simplicity. If we have to tell people to defocus their eyes and concentrate on a point somewhere behind the page (or some similar-sounding instructions) in order to understand this system then we have failed.

It is important to the usefulness of this rating system that educated people in any field be able to grasp the connection between the rating value and the likelihood that the rated product will work or fail within their given situation. It must also be explainable in such a way that, if they wish, people can grasp the reasons and mechanisms that the system claims will ensure it reflects and predicts real-world probabilities.

Design Goals:

  1. The rating system shall be simple, unambiguous, and concise without being too simple to be useful.
  2. The rating system shall be based on a small, fixed set of self-evident criteria that all who use it can comprehend and validate to their own satisfaction.
  3. The rating system shall be independent of the code, the distribution files, or the deliverables that make up the software. A rating shall never be embedded within software distribution files or code because it is free to change without any changes to the underlying software or package.
  4. Primarily, the rating system shall be a trailing indicator. It will be based on the software's real-world, historically observable behavior with regard to its stability and, generally not on other behaviors or attributes of the software.
  5. Where little history is available to produce a rating, that fact should be easily discernible through observation of the rating alone.
  6. When special processes or practices have been employed during the design and/or development of the software to ensure safety and reliability, such special measures may be reflected within the rating so long as they do not interfere with the rating's primary purpose of relating the historically observed stability of the underlying software.
  7. For preliminary determinations of a new project's rating (before sufficient historical data has accumulated), the rating system MAY consider the software stability record of the individuals who perform the new development, if a clear and quantifiable connection between the stability of those individuals' past "first release" software and the stability of the current software can be made and statistically verified.
  8. To the extent, if any, that the rating system is linked to the people who design and develop the software being rated (as provided above), the only credential that shall be considered regarding the contributing individuals is the complete documented stability history of first-release software they've produced in the past.
  9. [TBD] Should the consideration put forward above for individuals be extended to include organizations?

Some Definitions
  • Function or Method
    For the sake of this paper, these two names are used interchangeably. A function is a callable block of code, which may return a value, produce side-effects, or both. Functions may include parameters to be referenced by the code that is within the block. Parameters may be altered as a side effect of calling the code in the function (such as when they are passed by reference). Functions may have identifiers or may be anonymous.

  • Independence
    Because we've lived with version-stability as an overloaded variable for so long, this paper will often reinforce the concept of independence. Changes to the stability rating of a piece of software are not necessarily accompanied by changes to the code or the distribution files. While a stability rating can change based on software changes it will often change based on events that are completely external to the code and distribution files.

  • Library Grade Function (LGF)
    LGFs are the embodiment of the notion of "re-usable code". An LGF has no non-standard or non-LGF function, identifier, or storage dependencies. That is, it may call outside functions or reference identifiers that are part of a formally standardized library within its own language (e.g. the standard C library), or it may call other functions or reference identifiers packaged in the same library that it is itself a part of and linked from. All other dependencies are disallowed. Should any other function calls or identifier dependencies exist and be used within a function, directly or as a result of calling another function, the function is NOT an LGF.
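
    The LGF rule can be stated mechanically: every dependency must resolve either to a formally standardized library or to the function's own library. A minimal Python sketch of that check follows; the dependency names and the idea of a declared dependency list are illustrative, not part of the SSR specification.

    ```python
    # Hypothetical sketch: deciding whether a function is "library grade" (LGF)
    # from a declared list of its dependencies. All names here are illustrative.

    STANDARD_LIBRARY = {"strlen", "memcpy", "printf"}  # formally standardized calls

    def is_lgf(dependencies, own_library):
        """A function is an LGF only if every dependency is either part of a
        formally standardized library or packaged in the same library the
        function ships in. Any other dependency disqualifies it."""
        return all(dep in STANDARD_LIBRARY or dep in own_library
                   for dep in dependencies)

    own_lib = {"csv_parse", "csv_quote"}
    print(is_lgf({"strlen", "csv_quote"}, own_lib))  # allowed dependencies only
    print(is_lgf({"strlen", "app_log"}, own_lib))    # outside dependency: not an LGF
    ```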

  • Libraries
    Functions ("methods") within a function or class library MUST each be rated separately.
    • For a function ("method") within a library to be rated, it must be an LGF. Otherwise it is un-rated, which should be thought of as equivalent to a rating lower than Alpha (A).
    • The rating given for a function or class library as a whole MUST be the rating of the lowest rated function ("method") within the library.
    • If a library includes un-rated functions ("methods"), these are considered the lowest rating, and so the entire library must be un-rated.
    • Functions ("methods") that are part of function or class libraries may have additional rating requirements, including:
      • unit testing (min, max, min-max exceeded, and typical inputs);
      • thread behavior (unit tests with multiple calls on the stack simultaneously);
      • external re-entrancy behavior (unit tests of multiple stack depths from within a re-entrant caller);
      • self re-entrant behavior (if the function or method being rated is re-entrant, test under a multitude of stack conditions);
      • statement lines per function (average, percentage above threshold, max).
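
    The whole-library rule above (lowest-rated function wins, and an un-rated function sinks the entire library) can be sketched in a few lines of Python. The rating names follow this paper; the representation of "un-rated" as `None` is an assumption of the sketch.

    ```python
    # Hypothetical sketch of the library-rating rule: every function is rated
    # separately, un-rated functions (None) count below Alpha, and the library
    # as a whole takes the lowest rating found among its functions.

    RANK = {None: -1, "A": 0, "B": 1, "C": 2, "E": 3}  # None = un-rated

    def library_rating(function_ratings):
        """Return the rating of the whole library: the lowest-rated function.
        Any un-rated function (None) makes the entire library un-rated."""
        return min(function_ratings, key=lambda r: RANK[r])

    print(library_rating(["C", "B", "E"]))   # lowest rated function wins
    print(library_rating(["C", None, "E"]))  # un-rated present: library un-rated
    ```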

  • Release vs. Version
    Warning: These definitions for release and version are not currently compatible with IEEE definitions. If anyone has ideas for how to maintain compatibility while allowing for independence, please contact me.

    A single release of a given piece of software may involve multiple versions.

    • Version
      A version denotes that changes, however small, have been made to the software package since the last time it was published outside of the developer.
    • Release
      A release is a version of the software that has been functionally changed and/or functionally improved in a significant way from the preceding version (or from inception).

  • SSR
    SSR is an abbreviation for Software Stability Ratings as they are defined in this paper.

  • User, Developer, & Tester
    These are the three entities who may have a direct effect on a software stability rating at various levels.
    • Developer
      The entity that is actually developing the software. In the case of multiple independent developers being sub-contracted, this is the top-level developer responsible for the overall unit to which a given rating applies.
    • Tester
      Testing agencies may perform assurance and insurance functions as well. They are generally not brought in at levels below Hi-rel, but may be.
    • User
      The most important effector. Experience and observations by this entity have the power to downgrade (or upgrade) a rating at all levels.

Publishing Rating Changes
Publication is a foundational purpose of any rating system. This specification prescribes a way to publish downgrades to a software version's stability rating.

When the reliability rating of a version of software is changed, the new rating must be published to an easily accessible web page where users of the software have been clearly told such ratings can be found.

A reliability rating that is downgraded to any rating other than "Unsecure" must be published in this way within ten business days from the day it is downgraded.

A reliability rating that is changed to "Unsecure" must be published in this way within three business days of the day it is changed. Also, in the case of a rating that is changed to "Unsecure", users who have purchased the downgraded version in the past year and have not purchased or received more recent versions must be sent an email stating that the version was downgraded. If there is no email address on file, or an attempt to send mail to the email address on file has failed, no further emails need be sent to that user.

When a previously unreported, non-security-related defect that can affect a software version's current reliability rating is reported, the software must be tested within twenty (20) business days of the report to verify whether the defect exists. If the defect is found to exist, or the verification test is not performed, the reliability rating must be changed to reflect the defect.

When a previously unreported defect is reported that can cause a software version's current reliability rating to be changed to "Unsecure", the software must be tested within three (3) business days of the report to verify whether the defect exists. If the defect is found to exist, or the verification test is not performed, the reliability rating must be changed to "Unsecure" to reflect the defect.

The existence or non-existence of software defects MUST be verified in writing, and the individual making the determination must sign a declaration attesting to the veracity of the reported result.
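
The publication and testing windows above (ten business days for ordinary downgrades, three for "Unsecure") can be computed mechanically. A sketch follows, assuming business days exclude weekends only; holidays are ignored for brevity, and the function names are illustrative, not prescribed by this specification.

```python
from datetime import date, timedelta

def add_business_days(start, n):
    """Return the date n business days after start, skipping weekends.
    (Holidays are ignored in this sketch.)"""
    d = start
    while n > 0:
        d += timedelta(days=1)
        if d.weekday() < 5:          # Monday..Friday
            n -= 1
    return d

def publication_deadline(downgrade_date, new_rating):
    """Deadline for publishing a downgrade per the rules above:
    3 business days if downgraded to Unsecure, otherwise 10."""
    window = 3 if new_rating == "Unsecure" else 10
    return add_business_days(downgrade_date, window)

# A downgrade to Unsecure on Friday 2004-01-02 must be published by
# Wednesday 2004-01-07 (three business days later).
print(publication_deadline(date(2004, 1, 2), "Unsecure"))
```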

Generally, stability ratings SHOULD NOT be used in product advertisements. If a stability rating is used in an advertisement in a dynamic medium, it must adhere to the same time restrictions documented here for publicizing rating downgrades, along with a link to the web page where the downgrade ratings for the product are published in accordance with the above recommendations. If the stability rating is used in a fixed medium, such as a magazine, the date on which it was valid MUST be displayed along with the URL of the web page where downgrades are published.

Color is optional, but if rating values are portrayed in such a way that each rating value is assigned a different color or set of colors, the color scheme defined here MUST be used.

An optional-use color scheme is defined for the recommendation so that if colors are associated with rating values they will be consistent. This will eliminate color differences as a possible source of confusion when viewing stability ratings.

Software Ratings
The rating values currently defined, along with the criteria software must meet to be assigned them, are documented in this table.

Each rating below is listed with its name and value, its color (if one is used), and the criteria that apply to it.

Stability rating: Unsecure - 0
Color, if used: black with red lettering
  A security vulnerability has been found in this version. Users should upgrade to a more recent version immediately. Use of this version should be avoided in the future. Current media containing this version should be destroyed or prominently labeled as Unsecure software.

Software versions that have achieved a Beta (B) rating or better and subsequently been found to have security vulnerabilities MUST be downgraded to this rating value.

Stability rating: Downgraded - / (optionally /n, where n = the SSR downgraded from)
  Downgraded (/ or /n, where n is the rating it has been downgraded from) software is software which has been downgraded because defects have been discovered that do not affect security.

Software at Beta [B] or higher that has been found to contain defects MUST be downgraded.

Software at Alpha [A] with defects may optionally be reworked rather than downgraded depending on the processes and policies of the developer regarding control of [A] level software (see below).

When downgrading software to [/] (downgraded), the developer may optionally show the rating it was downgraded from with a lower-case letter code (e.g. [/b], [/c], etc).

Software versions that have been downgraded to this level from Beta (B) or higher levels are destined to remain downgraded forever, since once downgraded, a new version must be produced.

Stability rating: Alpha - A
  Alpha (A) rated software that has never had a higher rating can be thought of as software that is generally in late development and in-house testing. It may have known defects, including security defects.

Software versions that are primarily fixes of versions that were rated B or better must be started in Alpha and remain in Alpha with no defect reports for at least five days before promotion to B.

Alpha (A) rated software MAY be tested outside if done in tightly controlled environments that are strictly managed and monitored by the developer. Alpha rated software MUST NOT be published to the general user base except under tightly controlled "person to person" conditions.

Software in Alpha MUST be thoroughly reviewed and tested and free of known defects before it is promoted to B.

While a software version is in Alpha for the first time, the developer has the option, based on their own internal processes, to make changes to the code without changing the version number.

When the stability rating of a version of software is changed from Alpha to Beta however, the version MUST be frozen. That is, further changes to the files that constitute the software MUST NOT be made. Instead, if defects are found once a software release has been changed to Beta, the defective version MUST be downgraded to Alpha (where it shall remain) and a new version must be produced to fix the defects. The reason for this is so users who have obtained the defective Beta version may be informed that it is no longer considered safe or stable.

Software in Alpha should be considered Unsecure.
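
The Alpha/Beta lifecycle above amounts to a small state machine: promotion to Beta freezes the version's files, and any defect found after that point sends the version back down permanently, with the fix going into a brand-new version. A hypothetical Python sketch (the class and method names are illustrative only):

```python
# Hypothetical sketch of the version lifecycle described above. Once a
# version is promoted to Beta it is frozen: defects are no longer fixed
# in place; they force a downgrade and a new version for the fix.

class Version:
    def __init__(self, number):
        self.number = number
        self.rating = "A"        # new versions start in Alpha
        self.frozen = False

    def promote_to_beta(self):
        self.rating = "B"
        self.frozen = True       # no further changes to the files allowed

    def report_defect(self, security=False):
        """Return the replacement version when a fix now requires one."""
        if security and self.rating != "A":
            self.rating = "Unsecure"   # security defect at Beta or above
            return Version(self.number + 1)
        if self.frozen:
            self.rating = "A"          # downgraded, where it shall remain
            return Version(self.number + 1)
        return None                    # still in first Alpha: fix in place

v1 = Version(1)
v1.promote_to_beta()
v2 = v1.report_defect()          # defect found at Beta
print(v1.rating, v2.number)      # v1 is back in Alpha; the fix goes into v2
```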

Stability rating: Beta - B
  Early outside testing:

For defects that do not affect security: No known defects more than twenty (20) business days old. Defect reports must be tested and dismissed within twenty (20) business days. If a reported defect is not tested for more than twenty (20) business days, or a defect is reproduced in testing, the status must drop to '/' (optionally '/b'), where it will remain.

For security defects: Reports of security defects must be tested within three (3) business days. If reports are not tested within three (3) business days, or if defects are found during the tests, the software MUST be downgraded to Unsecure (0).

Stability rating: Commercial - C
  No known defects greater than 20 business days old. No untested defect reports more than 20 business days old. No unsubstantiated security defect reports more than three (3) business days old.

Software versions in C with non-security-related defects, or defect reports that have not been tested for more than twenty (20) business days, must be downgraded (/ or /c). Software versions in C that have security defects, or reports of security-related defects that have not been tested for more than three (3) business days, MUST be downgraded to 0 (Unsecure).

Stability rating: Exceptional - E
  Criteria for rating a software version exceptional (E) are [TBD] (to be determined) at this time. Until criteria have been established, no software may carry a stability rating of E.

[TBD] Notes: Has been in C for more than 75 business days.

A higher release that is no more than four versions removed from this version has been the highest version in C for at least 40 business days and is currently in C.

Special note: Software that is custom produced and released in a single phase with no upgrade releases planned is not required to meet the last criterion in order to be promoted to E. Should new project phases or releases be made subsequently, versions that were promoted to E must be downgraded until the criteria for more recent releases are met.

The above SSRs are trailing indicators, based on experienced stability.

The following two SSRs are predictive and must be based on observed, statistically valid correlations between the software development practices used and the trailing SSRs achieved in the past.

They are [TBD] at this time.

Stability rating: Hi-Rel - H
  Criteria for rating a software version hi-rel (H) are [TBD] (to be determined) at this time. Until criteria have been established, no software may carry a stability rating of H.

Software developed for this level must move through the lower SSRs. While in lower SSRs, this rating intention must be denoted by following the SSR with a lower-case 'h' ('Ah', 'Bh', 'Ch', 'Eh').

Likewise software developed for this level must maintain a complete and public record of downgrades. Downgrades MUST use the (otherwise optional) trailing letter after the slash to show what level was achieved before downgrade ('/ah', '/bh', '/ch', '/eh', '/h').

[TBD] Notes: Is this classification level even required? LGFs all at E or higher. All functions that make up the package (not just LGFs) have unit input/output specifications and test criteria and are tested at typical, minimum, maximum, and beyond-maximum values.

Stability rating: Safe - S
  Criteria for rating a software version safe (S) are [TBD] (to be determined) at this time. Until criteria have been established, no software may carry a stability rating of S.

Software developed for this level must move through the lower SSRs. While in lower SSRs, this rating intention must be denoted by following the SSR with a lower-case 's' ('As', 'Bs', 'Cs', 'Es').

Likewise software developed for this level must maintain a complete and public record of downgrades. Downgrades MUST use the (otherwise optional) trailing letter after the slash to show what level was achieved before downgrade ('/as', '/bs', '/cs', '/es', '/s').

[TBD] Notes: All requirements of Hi-rel, and: Tested and guaranteed by a third party to run without damage-causing or life-threatening defects in the context where it is designed to run. Guarantee must include a specific amount of financial compensation for any such failure or a set of compensations for specific types of failure within the context of the software (damage of specific type, death, loss of use, etc). Testing organization must openly publish its record of failure in all cases except where doing so would violate government classified information. Where government classified information is required to disclose a failure, testing organization must make those failure records accessible to all individuals who have proper clearances upon request.

[To Be Determined] Permissions will be chosen so that this recommendation may be used and adapted without cost by others for profit and non-profit endeavors. There will be some restrictions to be determined at a later date. For the time being though, all rights are reserved. If you have an interest in using this material or suggestions for the best way to set up permissions please contact me or start a topic in the forums (in the Developers category).

If you're interested in improving or promoting this project please contact me. If used, your code and contributions will be attributed to you with a link to your site.


© Copyright 2002 - 2004 Creativyst, Inc.

Written by John Repici

With contributions by: