Classification - creating class hierarchies

Keywords:  category

To provide aggregated information about objects that belong to different categories, individual objects can be assigned to categories by means of classifications. A classification is a concept that provides a means of dividing an object collection into distinct subsets. It consists of a set of categories, which create subsets (classified collections) of the object collection by associating each individual object in the collection with exactly one category.
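The idea of dividing a collection into distinct subsets can be sketched in a few lines of Python. This is only an illustration: the function `classify`, the company data and the size thresholds are assumptions made for the example and do not come from any particular product or API.

```python
from collections import defaultdict

def classify(objects, categorize):
    """Divide a collection into subsets, one per category.

    `categorize` returns exactly one category name per object, so every
    object lands in exactly one subset.
    """
    subsets = defaultdict(list)
    for obj in objects:
        subsets[categorize(obj)].append(obj)
    return dict(subsets)

# Hypothetical object collection: (name, number of employees).
companies = [("Acme", 12), ("Globex", 480), ("Initech", 75)]

def size_category(company):
    """Hypothetical size classification with three categories."""
    _, employees = company
    if employees < 50:
        return "small"
    if employees < 250:
        return "medium"
    return "large"

by_size = classify(companies, size_category)
```

Because `size_category` returns a single value for every object, the resulting subsets do not overlap and together cover the whole collection.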

In general, it is also possible to classify individual objects by other individual objects (ad hoc classifications). Whether to use a classification or object references depends on the specific view and the subject area.

Categories

Categories describe the rules for dividing an object collection into subsets. Usually, categories are defined as general categories that do not refer to a particular set of objects. In principle, it is also possible to define individual categories that refer to a set of individual objects. This happens when defining (hierarchical) classifications that explicitly refer to related objects. For example, a continent classification may list, for each category, the countries belonging to that continent.
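A category that refers to individual objects can be pictured as an explicit member list. The following Python sketch assumes a hypothetical `continents` mapping and a helper `category_of`; the country names are sample data only.

```python
# Hypothetical continent classification: each category lists its members explicitly.
continents = {
    "Europe": {"France", "Germany"},
    "Asia": {"Japan", "India"},
}

def category_of(country, classification):
    """Return the single category that lists `country`, or None if it is unclassified."""
    matches = [name for name, members in classification.items() if country in members]
    if len(matches) > 1:
        # An object may be associated with one category only.
        raise ValueError(f"{country} is listed in more than one category: {matches}")
    return matches[0] if matches else None
```

Here the categories are defined by enumeration rather than by a general rule, which is the difference between individual and general categories described above.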

Notes:

There has been much discussion about whether categories are metadata (i.e. defined on the model level) or data stored in the database. In geographical classifications, a category refers to a certain country or geographical area, which is indeed an individual object and hence defined on the data level. On the other hand, many statistics do not care about the individual object aspect of geographical objects but use them for classifying companies or persons. Thus, for a statistician, geographical areas appear as categories rather than as individual objects. In the end, both views are valid; which one applies depends on how reality is being modeled.

Typed and untyped classification

Often, classifications apply only to individual objects of a certain object type. Applying a color classification, whose categories are colors, to an object collection requires that the individual objects in the collection have a color, i.e. the classification applies only to "colored objects" (the object type). Classifications that apply to individual objects of a given object type are called typed classifications.

A typed classification produces a number of subsets (but not necessarily subclasses) when applied to an object collection. Since the individual objects in the collection belong to the same object type, the subset created for each category can be defined as a condition that individual objects have to fulfill in order to belong to that category (e.g. all person objects with an age between 20 and 29 belong to the twenties category).
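The age example can be sketched as a typed classification whose categories are conditions on a common object type. The names `Person`, `age_classification` and `apply_typed`, as well as the categories other than "twenties", are assumptions introduced for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

# Each category is a condition on Person objects; the ranges do not overlap.
age_classification = {
    "under_twenty": lambda p: p.age < 20,
    "twenties": lambda p: 20 <= p.age <= 29,
    "thirty_plus": lambda p: p.age >= 30,
}

def apply_typed(classification, objects, object_type):
    """Apply a typed classification: reject foreign types, then test each condition."""
    subsets = {name: [] for name in classification}
    for obj in objects:
        if not isinstance(obj, object_type):
            raise TypeError(f"{obj!r} is not a {object_type.__name__}")
        matching = [name for name, condition in classification.items() if condition(obj)]
        if len(matching) != 1:
            raise ValueError(f"{obj!r} matches {len(matching)} categories, expected exactly 1")
        subsets[matching[0]].append(obj)
    return subsets

people = [Person("Ann", 24), Person("Bob", 17), Person("Cy", 41)]
groups = apply_typed(age_classification, people, Person)
```

The type check mirrors the requirement that all objects share the classification's object type; the conditions can only be evaluated because every object is known to have an `age`.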

In practice, classifications may also apply to object collections containing individual objects of any object type (e.g. by explicitly assigning individual objects to categories). Classifications that do not require individual objects of a specific object type are called untyped classifications. Usually, however, classifications refer to properties that all individual objects must have in order to be classified, which means that the objects to be classified usually have a common object type, which becomes the object type of the classification.

Typed category

Categories of typed classifications may create subclasses. A person classification consisting of the categories male and female, for example, may create not only person subsets but also specialized person classes such as men and women. In order to produce specialized subclasses, the category requires an object type, which has to inherit from the classification's object type. Categories that produce subclasses rather than subsets are called typed categories.
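A typed category can be sketched by attaching an object type to each category and requiring it to inherit from the classification's object type. The classes `Person`, `Man` and `Woman` and the helper `specialize` are hypothetical names used only for this example.

```python
class Person:
    def __init__(self, name, sex):
        self.name = name
        self.sex = sex

class Man(Person):
    pass

class Woman(Person):
    pass

# Typed categories: each category carries an object type derived from Person.
sex_classification = {"male": Man, "female": Woman}

def specialize(person, classification, base_type):
    """Return the person as an instance of the subclass given by its category."""
    category_type = classification[person.sex]
    if not issubclass(category_type, base_type):
        # The category's type must inherit from the classification's object type.
        raise TypeError(f"{category_type.__name__} does not inherit from {base_type.__name__}")
    return category_type(person.name, person.sex)

eve = specialize(Person("Eve", "female"), sex_classification, Person)
```

The result is still a `Person`, but it also belongs to the more specialized class `Woman`, which is what distinguishes a subclass from a plain subset.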

Hierarchical classifications

Since a category creates a new object collection, another (sub)classification may be applied to this subset in order to divide it again. Thus, by combining classifications, hierarchical classifications can be defined. A set of hierarchical classifications may form a classification system, in which all (hierarchical) classifications included refer to a common set of (flat) classifications.

Within a hierarchical classification it becomes obvious that each category (except those on the lowest level) is itself a classification. Hence, categories are in general considered to be classifications.
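The observation that a category is itself a classification suggests a recursive structure. The sketch below assumes a hypothetical `Category` class whose `children` are again categories; the geographical names are sample data.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str
    # Sub-categories; an empty list marks a category on the lowest level.
    children: list = field(default_factory=list)

    def is_classification(self):
        """A category with sub-categories acts as a classification itself."""
        return bool(self.children)

    def leaves(self):
        """Names of all lowest-level categories below (or equal to) this one."""
        if not self.children:
            return [self.name]
        result = []
        for child in self.children:
            result.extend(child.leaves())
        return result

world = Category("World", [
    Category("Europe", [Category("France"), Category("Germany")]),
    Category("Asia", [Category("Japan")]),
])
```

In this representation there is no separate type for classifications: `world`, `Europe` and `Asia` are categories that happen to have sub-categories, while `France`, `Germany` and `Japan` are categories on the lowest level.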

Classification properties

Some classifications require additional characteristics in order to describe categories properly, e.g. the duration of study for education classifications. Thus, classifications support defining a number of properties for their categories.
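Category properties can be pictured as extra attributes stored with each category. In the sketch below, the `education` mapping, the property name `duration_years` and all values are invented for illustration, as is the helper `missing_property`.

```python
# Hypothetical education classification; "duration_years" is an illustrative property
# and the values are sample data only.
education = {
    "primary": {"duration_years": 4},
    "secondary": {"duration_years": 8},
    "tertiary": {},  # property not yet maintained for this category
}

def missing_property(classification, prop):
    """List the categories for which the given property has not been defined."""
    return [name for name, props in classification.items() if prop not in props]

incomplete = missing_property(education, "duration_years")
```

A check of this kind helps to verify that a property is available for every category before it is used in aggregations.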

Within a hierarchical classification, such extension properties are supposed to exist either for all categories referred to in the hierarchical classification or for all categories defined on the lowest hierarchy level.