Supported transformation methods
ARX supports a variety of common data transformation models, which can also be combined with each other.
Global and local transformation schemes
ARX can be configured to apply the same transformation scheme to all records in a dataset or apply different
transformation schemes to different subsets of the records. The user can specify the maximum number of
transformations that may be used. The exact nature of the resulting data transformation scheme depends on additional parameters set
by the user. If value generalization hierarchies have been specified, for example, performing global
transformation will result in full-domain generalization, where each value of an attribute's domain is transformed
to the same generalization level. With local transformation, different generalization levels may be used for the
same attribute value in different records. Analogously, if transformation rules have been specified that only
suppress values, a global transformation process will result in attribute suppression, while a local transformation
process will result in a cell suppression scheme.
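The difference between global and local transformation can be illustrated with a minimal Python sketch. It does not use ARX's API; the function generalize_age, its three-level hierarchy and the per-record levels are assumptions made only for this example.

```python
# Hypothetical hierarchy for an "age" attribute (an assumption for illustration):
# level 0 = original value, level 1 = decade interval, level 2 = fully generalized.
def generalize_age(age, level):
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"

ages = [23, 23, 34, 61]

# Global transformation (full-domain generalization):
# every record is generalized to the same level.
global_result = [generalize_age(a, 1) for a in ages]

# Local transformation: the level may differ between records,
# even for the same attribute value (here, the two records with age 23).
local_levels = [0, 1, 1, 2]  # illustrative per-record choice
local_result = [generalize_age(a, lvl) for a, lvl in zip(ages, local_levels)]

print(global_result)
print(local_result)
```

In the global result both records with age 23 receive the same interval, while in the local result one of them keeps its original value.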
Value generalization
User-specified generalization hierarchies form the backbone of ARX's data transformation mechanism. Hierarchies can
be used either to directly reduce the uniqueness of attribute values or to form clusters that are then transformed
using further methods, such as microaggregation.
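As a rough illustration of how a hierarchy reduces the uniqueness of values, the following Python sketch stores a hierarchy as a table that maps each original value to its representation at every level. The ZIP codes, the number of levels and the table layout are assumptions chosen for the example and do not reflect ARX's internal format.

```python
# Hypothetical hierarchy for a ZIP code attribute: each value maps to its
# representation at level 0 (original) up to level 3 (fully generalized).
hierarchy = {
    "81667": ["81667", "8166*", "816**", "*"],
    "81669": ["81669", "8166*", "816**", "*"],
    "81675": ["81675", "8167*", "816**", "*"],
    "81925": ["81925", "8192*", "819**", "*"],
}

column = ["81667", "81669", "81675", "81925"]

def distinct_values(level):
    """Count the distinct values in the column after generalizing to `level`."""
    return len({hierarchy[value][level] for value in column})

# The number of distinct values shrinks as the generalization level rises.
distinct_per_level = [distinct_values(level) for level in range(4)]
print(distinct_per_level)
```

The values that share a generalized representation at a given level can also be read as the clusters mentioned above.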
Random sampling
ARX supports multiple methods for drawing a sample from the input dataset. This can be used to relate a dataset to an
underlying population table or to reduce privacy risks. Random sampling is further used to introduce randomness
into the differential privacy mechanism supported by ARX.
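Two common ways of drawing a random sample are sketched below in Python: a fixed-size sample without replacement, and a sample in which each record is included independently with a given probability. Whether these correspond exactly to the methods offered by ARX is an assumption; the dataset, the seed and the parameter values are made up for the example.

```python
import random

records = list(range(1000))  # stand-in for the rows of an input dataset
rng = random.Random(42)      # fixed seed so that the sketch is reproducible

# Fixed-size sampling: draw exactly n records without replacement.
n = 100
fixed_size_sample = rng.sample(records, n)

# Probability-based sampling: include each record independently with
# probability p, so the sample size itself varies from run to run.
p = 0.1
probability_sample = [r for r in records if rng.random() < p]

print(len(fixed_size_sample), len(probability_sample))
```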
Record, attribute and cell suppression
As described previously, ARX also supports removing individual attributes, attribute values or complete records in the
transformation process. This can be controlled by defining appropriate hierarchies (which is supported by specific wizards),
by performing local or global transformation and by specifying a limit on the maximum number of records that may be removed.
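The three granularities of suppression can be sketched in Python as follows. This is an illustration only: the function names, the "*" placeholder, the sample rows and the expression of the limit as an absolute record count are assumptions, and the choice of which cells or records to suppress is supplied by hand rather than derived by an anonymization algorithm.

```python
SUPPRESSED = "*"  # placeholder for a suppressed value (an assumption)

rows = [
    {"age": 23, "zip": "81667"},
    {"age": 23, "zip": "81669"},
    {"age": 61, "zip": "81925"},
]

def suppress_attribute(rows, attribute):
    """Attribute suppression: remove the attribute's value in every record."""
    return [{**row, attribute: SUPPRESSED} for row in rows]

def suppress_cells(rows, cells):
    """Cell suppression: remove only the given (row index, attribute) cells."""
    return [
        {key: (SUPPRESSED if (i, key) in cells else value) for key, value in row.items()}
        for i, row in enumerate(rows)
    ]

def suppress_records(rows, indices, max_removed):
    """Record suppression: drop whole records, subject to a user-specified limit."""
    if len(indices) > max_removed:
        raise ValueError("suppression limit exceeded")
    return [row for i, row in enumerate(rows) if i not in indices]

print(suppress_attribute(rows, "zip"))
print(suppress_cells(rows, {(2, "age")}))
print(suppress_records(rows, {2}, max_removed=1))
```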
Microaggregation
Sets of numeric attribute values can be transformed into a common value by user-specified aggregation functions.
Prior to aggregation, clustering can be performed based on value generalization hierarchies.
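A minimal Python sketch of microaggregation with prior clustering is shown below. It assumes that the generalized value (here, the decade of an age) serves as the cluster key and that the arithmetic mean is the aggregation function; both choices, as well as the function names, are illustrative and not taken from ARX.

```python
from collections import defaultdict
from statistics import mean

ages = [21, 25, 29, 34, 36, 62]

def cluster_key(age):
    """Hypothetical generalization: ages in the same decade form one cluster."""
    return age // 10

def microaggregate(values, key, aggregate=mean):
    """Replace each value with the aggregate of the cluster it belongs to."""
    clusters = defaultdict(list)
    for value in values:
        clusters[key(value)].append(value)
    return [aggregate(clusters[key(value)]) for value in values]

result = microaggregate(ages, cluster_key)
print(result)
```

Other aggregation functions, such as the median, can be passed through the aggregate parameter of this sketch.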
Top- and bottom-coding
ARX's built-in wizards can be used to create hierarchies that truncate values falling outside a user-specified
range.
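The effect of top- and bottom-coding on individual values can be sketched in Python as follows. The function name, the label format and the bounds are assumptions for this example; in ARX the truncation is expressed through a hierarchy rather than a standalone function.

```python
def top_bottom_code(value, lower, upper):
    """Truncate values outside [lower, upper] to an open-ended label."""
    if value < lower:
        return f"<{lower}"   # bottom-coding
    if value > upper:
        return f">{upper}"   # top-coding
    return str(value)        # values inside the range are kept as they are

incomes = [5, 18, 42, 97, 130]
coded = [top_bottom_code(v, lower=10, upper=100) for v in incomes]
print(coded)
```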
Categorization
The wizards provided by the software can be used to create transformation rules that are represented as functions.
These rules perform on-the-fly categorization of continuous variables during anonymization.
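The idea of a transformation rule represented as a function can be sketched in Python: instead of listing every possible value of a continuous variable, a function maps any value to its category when it is needed. The helper make_categorizer, the interval boundaries and the label format are assumptions made for this example.

```python
from bisect import bisect_right

def make_categorizer(boundaries):
    """Build a function that maps a number to a half-open interval label."""
    edges = sorted(boundaries)

    def categorize(x):
        # Index of the first edge that is strictly greater than x.
        i = bisect_right(edges, x)
        low = "-inf" if i == 0 else str(edges[i - 1])
        high = "inf" if i == len(edges) else str(edges[i])
        return f"[{low}, {high})"

    return categorize

# Hypothetical boundaries for a continuous variable such as a body mass index.
categorize_bmi = make_categorizer([18.5, 25.0, 30.0])
labels = [categorize_bmi(x) for x in [17.2, 18.5, 24.9, 31.0]]
print(labels)
```

Because the rule is a function, values that were not known when the rule was defined are still assigned a category.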