Class HierarchyBuilderRedactionBased<T>

java.lang.Object
org.deidentifier.arx.aggregates.HierarchyBuilder<T>
org.deidentifier.arx.aggregates.HierarchyBuilderRedactionBased<T>
Type Parameters:
T -
All Implemented Interfaces:
Serializable

public class HierarchyBuilderRedactionBased<T> extends HierarchyBuilder<T> implements Serializable
This class enables building hierarchies for categorical and non-categorical values using redaction. Hierarchies are built in three steps: 1) data items are aligned left-to-right or right-to-left, 2) differences in length are filled with a padding character, and 3) the resulting equally long values are redacted character by character, either left-to-right or right-to-left.
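The three steps above can be illustrated with a minimal, self-contained sketch. It does not use or reproduce the ARX implementation: the class `RedactionSketch` and its methods `pad` and `levels` are hypothetical names introduced only for this illustration, and the exact output format of the real builder may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; this is NOT the ARX implementation.
class RedactionSketch {

    /**
     * Steps 1 and 2: aligns the value and fills it up to the given length.
     * With alignLeft == true the value stays on the left and padding is appended.
     */
    static String pad(String value, int length, char padding, boolean alignLeft) {
        StringBuilder fill = new StringBuilder();
        for (int i = value.length(); i < length; i++) {
            fill.append(padding);
        }
        return alignLeft ? value + fill : fill + value;
    }

    /**
     * Step 3: redacts one more character per level.
     * Level 0 is the padded value itself; the last level is fully redacted.
     */
    static List<String> levels(String padded, char redaction, boolean redactRightToLeft) {
        List<String> result = new ArrayList<>();
        char[] chars = padded.toCharArray();
        result.add(padded);
        for (int i = 0; i < chars.length; i++) {
            int index = redactRightToLeft ? chars.length - 1 - i : i;
            chars[index] = redaction;
            result.add(new String(chars));
        }
        return result;
    }

    public static void main(String[] args) {
        // Align "123" to the left within a width of 5, then redact from the right.
        String padded = pad("123", 5, ' ', true);
        for (String level : levels(padded, '*', true)) {
            System.out.println("[" + level + "]");
        }
    }
}
```

In this sketch, the value `123` padded to width 5 passes through the levels `123  `, `123 *`, `123**`, `12***`, `1****` and `*****`.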
  • Method Details

    • create

      public static <T> HierarchyBuilderRedactionBased<T> create(char redactionCharacter)
      Values are aligned left-to-right and redacted right-to-left. Redacted characters are replaced with the given character. The same character is used for padding.
      Type Parameters:
      T -
      Parameters:
      redactionCharacter - the character that replaces redacted characters and is also used for padding
      Returns:
    • create

      public static <T> HierarchyBuilderRedactionBased<T> create(File file) throws IOException
      Loads a builder specification from the given file.
      Type Parameters:
      T -
      Parameters:
      file - the file containing the builder specification
      Returns:
      Throws:
      IOException
    • create

      public static <T> HierarchyBuilderRedactionBased<T> create(HierarchyBuilderRedactionBased.Order alignmentOrder, HierarchyBuilderRedactionBased.Order redactionOrder, char redactionCharacter)
      Values are aligned according to the alignmentOrder and redacted according to the redactionOrder. Redacted characters are replaced with the given character. The same character is used for padding.
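      Type Parameters:
      T -
      Parameters:
      alignmentOrder - the order in which values are aligned
      redactionOrder - the order in which characters are redacted
      redactionCharacter - the character that replaces redacted characters and is also used for padding
      Returns: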
    • create

      public static <T> HierarchyBuilderRedactionBased<T> create(HierarchyBuilderRedactionBased.Order alignmentOrder, HierarchyBuilderRedactionBased.Order redactionOrder, char paddingCharacter, char redactionCharacter)
      Values are aligned according to the alignmentOrder and redacted according to the redactionOrder. Redacted characters are replaced with the given character. The padding character is used for padding.
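      Type Parameters:
      T -
      Parameters:
      alignmentOrder - the order in which values are aligned
      redactionOrder - the order in which characters are redacted
      paddingCharacter - the character used for padding
      redactionCharacter - the character that replaces redacted characters
      Returns: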
    • create

      public static <T> HierarchyBuilderRedactionBased<T> create(String file) throws IOException
      Loads a builder specification from the given file.
      Type Parameters:
      T -
      Parameters:
      file - the file containing the builder specification
      Returns:
      Throws:
      IOException
    • build

      public AttributeType.Hierarchy build()
      Creates a new hierarchy, based on the predefined specification.
      Specified by:
      build in class HierarchyBuilder<T>
      Returns:
    • build

      public AttributeType.Hierarchy build(String[] data)
      Creates a new hierarchy, based on the predefined specification.
      Specified by:
      build in class HierarchyBuilder<T>
      Parameters:
      data -
      Returns:
    • getAligmentOrder

      public HierarchyBuilderRedactionBased.Order getAligmentOrder()
      Returns the alignment order.
      Returns:
    • getAlphabetSize

      public Double getAlphabetSize()

      Returns properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies. May return null.

      Returns:
      Size of the alphabet: the possible number of elements per character of any value from the domain
    • getDomainSize

      public Double getDomainSize()

      Returns properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies. May return null.

      Returns:
      Size of the domain: the number of elements in the domain of the attribute
    • getMaxValueLength

      public Double getMaxValueLength()

      Returns properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies. May return null.

      Returns:
      Max. length of an element: the number of characters of the largest element in the domain
    • getPaddingCharacter

      public char getPaddingCharacter()
      Returns the padding character.
      Returns:
    • getRedactionCharacter

      public char getRedactionCharacter()
      Returns the redaction character.
      Returns:
    • getRedactionOrder

      public HierarchyBuilderRedactionBased.Order getRedactionOrder()
      Returns the redaction order.
      Returns:
    • isDomainPropertiesAvailable

      public boolean isDomainPropertiesAvailable()
      Returns whether domain-properties are available for this builder. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies.
      Returns:
    • prepare

      public int[] prepare(String[] data)
      Prepares the builder. Returns an array containing the number of equivalence classes per level.
      Specified by:
      prepare in class HierarchyBuilder<T>
      Parameters:
      data -
      Returns:
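To make "number of equivalence classes per level" concrete, the following sketch counts the distinct generalized values at each redaction level. It is an assumption-laden illustration, not the ARX implementation: the class `LevelCountSketch` and the method `countPerLevel` are hypothetical, values are assumed to be aligned left-to-right and redacted right-to-left, and whether the real method includes the unredacted level 0 is not stated here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; this is NOT the ARX implementation.
class LevelCountSketch {

    /**
     * Counts the distinct values per level, assuming left-aligned values
     * that are padded on the right and redacted from right to left.
     * Index 0 of the result refers to the unredacted (but padded) values.
     */
    static int[] countPerLevel(String[] data, char padding, char redaction) {
        int maxLength = 0;
        for (String value : data) {
            maxLength = Math.max(maxLength, value.length());
        }
        int[] counts = new int[maxLength + 1];
        for (int level = 0; level <= maxLength; level++) {
            Set<String> distinct = new HashSet<>();
            for (String value : data) {
                StringBuilder sb = new StringBuilder(value);
                // Fill shorter values with the padding character.
                while (sb.length() < maxLength) {
                    sb.append(padding);
                }
                // Redact the last 'level' characters.
                for (int i = maxLength - level; i < maxLength; i++) {
                    sb.setCharAt(i, redaction);
                }
                distinct.add(sb.toString());
            }
            counts[level] = distinct.size();
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] data = {"81667", "81675", "81925", "12345"};
        // Prints [4, 4, 3, 2, 2, 1]
        System.out.println(Arrays.toString(countPerLevel(data, ' ', '*')));
    }
}
```

The counts shrink as more characters are redacted, ending at a single class once every character has been replaced.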
    • setAlphabetSize

      public void setAlphabetSize(int alphabetSize, int maxValueLength)

      Sets properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies. Required properties are:

      • Size of the domain: the number of elements in the domain of the attribute
      • Size of the alphabet: the possible number of elements per character of any value from the domain
      • Max. length of an element: the number of characters of the largest element in the domain

      As a simplification, the domain values are assumed to be distributed uniformly with regard to their length and their characters from the alphabet.

      This method estimates the size of the domain as domainSize = alphabetSize^{maxValueLength}.

      Parameters:
      alphabetSize -
      maxValueLength -
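The estimate domainSize = alphabetSize^{maxValueLength} can be checked with a small worked example. The class `DomainSizeSketch` and the method `estimateDomainSize` are hypothetical names used only for this illustration; they are not part of ARX.

```java
// Illustrative sketch only; this is NOT the ARX implementation.
class DomainSizeSketch {

    /** Estimates the domain size as alphabetSize raised to the power of maxValueLength. */
    static double estimateDomainSize(int alphabetSize, int maxValueLength) {
        return Math.pow(alphabetSize, maxValueLength);
    }

    public static void main(String[] args) {
        // Five-digit codes over the digits 0-9: 10^5 possible values.
        System.out.println(estimateDomainSize(10, 5)); // prints 100000.0
    }
}
```

For example, five-character values over a ten-character alphabet give an estimated domain of 100,000 elements.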
    • setDomainAndAlphabetSize

      public void setDomainAndAlphabetSize(int domainSize, int alphabetSize, int maxValueLength)

      Sets properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies. Required properties are:

      • Size of the domain: the number of elements in the domain of the attribute
      • Size of the alphabet: the possible number of elements per character of any value from the domain
      • Max. length of an element: the number of characters of the largest element in the domain
      Parameters:
      domainSize -
      alphabetSize -
      maxValueLength -
    • setDomainMetadata

      public void setDomainMetadata(String[] data)

      Sets properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies.

      Parameters:
      data -
    • setDomainSize

      public void setDomainSize(int domainSize, int maxValueLength)

      Sets properties about the attribute's domain. Currently, this information is only used for evaluating information loss with the generalized loss metric for attributes with functional redaction-based hierarchies. Required properties are:

      • Size of the domain: the number of elements in the domain of the attribute
      • Size of the alphabet: the possible number of elements per character of any value from the domain
      • Max. length of an element: the number of characters of the largest element in the domain

      As a simplification, the domain values are assumed to be distributed uniformly with regard to their length and their characters from the alphabet.

      This method estimates the size of the alphabet as alphabetSize = pow(domainSize, 1.0d / maxValueLength).

      Parameters:
      domainSize -
      maxValueLength -
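The estimate alphabetSize = pow(domainSize, 1.0d / maxValueLength) is the inverse of the domain size estimate, as the following sketch shows. The class `AlphabetSizeSketch` and the method `estimateAlphabetSize` are hypothetical names used only for this illustration; they are not part of ARX.

```java
// Illustrative sketch only; this is NOT the ARX implementation.
class AlphabetSizeSketch {

    /** Estimates the alphabet size as the maxValueLength-th root of domainSize. */
    static double estimateAlphabetSize(int domainSize, int maxValueLength) {
        return Math.pow(domainSize, 1.0d / maxValueLength);
    }

    public static void main(String[] args) {
        // A domain of 100000 values of length 5 implies roughly 10 symbols per position.
        double alphabetSize = estimateAlphabetSize(100000, 5);
        System.out.println(alphabetSize);
    }
}
```

Because the root is computed in floating-point arithmetic, the result is generally not a whole number and may differ slightly from the exact value, so it should be compared with a tolerance rather than for equality.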