Class StatisticsBuilderInterruptible

java.lang.Object
org.deidentifier.arx.aggregates.StatisticsBuilderInterruptible

public class StatisticsBuilderInterruptible extends Object
A class offering basic descriptive statistics about data handles. Instances of this class can be interrupted and are thus suitable for use in multi-threaded environments.
  • Method Details

    • getClassificationPerformance

      public StatisticsClassification getClassificationPerformance(String clazz, ARXClassificationConfiguration<?> config) throws InterruptedException
      Creates a new set of statistics for the given classification task.
      Parameters:
      clazz - The class attribute
      config - The configuration
      Throws:
      ParseException
      InterruptedException
    • getClassificationPerformance

      public StatisticsClassification getClassificationPerformance(String[] features, String clazz, ARXClassificationConfiguration<?> config) throws InterruptedException
      Creates a new set of statistics for the given classification task.
      Parameters:
      features - The feature attributes
      clazz - The class attribute
      config - The configuration
      Throws:
      ParseException
      InterruptedException
    • getClassificationPerformance

      public StatisticsClassification getClassificationPerformance(String[] features, String clazz, ARXClassificationConfiguration<?> config, ARXFeatureScaling scaling) throws InterruptedException
      Creates a new set of statistics for the given classification task.
      Parameters:
      features - The feature attributes
      clazz - The class attribute
      config - The configuration
      scaling - The feature scaling
      Throws:
      ParseException
      InterruptedException
    • getContingencyTable

      public StatisticsContingencyTable getContingencyTable(int column1, boolean orderFromDefinition1, int column2, boolean orderFromDefinition2) throws InterruptedException
      Returns a contingency table for the given columns.
      Parameters:
      column1 - The first column
      orderFromDefinition1 - Whether the order of string data items in the first column should be derived from the hierarchy provided in the data definition (if any)
      column2 - The second column
      orderFromDefinition2 - Whether the order of string data items in the second column should be derived from the hierarchy provided in the data definition (if any)
      Returns:
      The contingency table
      Throws:
      InterruptedException
    • getContingencyTable

      public StatisticsContingencyTable getContingencyTable(int column1, AttributeType.Hierarchy hierarchy1, int column2, AttributeType.Hierarchy hierarchy2) throws InterruptedException
      Returns a contingency table for the given columns. The order of string data items is derived from the provided hierarchies.
      Parameters:
      column1 - The first column
      hierarchy1 - The hierarchy for the first column, may be null
      column2 - The second column
      hierarchy2 - The hierarchy for the second column, may be null
      Returns:
      The contingency table
      Throws:
      InterruptedException
    • getContingencyTable

      public StatisticsContingencyTable getContingencyTable(int column1, int column2) throws InterruptedException
      Returns a contingency table for the given columns. The order of string data items is derived from the hierarchies provided in the data definition (if any).
      Parameters:
      column1 - The first column
      column2 - The second column
      Returns:
      The contingency table
      Throws:
      InterruptedException
    • getContingencyTable

      public StatisticsContingencyTable getContingencyTable(int column1, int size1, boolean orderFromDefinition1, int column2, int size2, boolean orderFromDefinition2) throws InterruptedException
      Returns a contingency table for the given columns.
      Parameters:
      column1 - The first column
      size1 - The maximum size of the table in the first dimension
      orderFromDefinition1 - Whether the order of string data items in the first column should be derived from the hierarchy provided in the data definition (if any)
      column2 - The second column
      size2 - The maximum size of the table in the second dimension
      orderFromDefinition2 - Whether the order of string data items in the second column should be derived from the hierarchy provided in the data definition (if any)
      Returns:
      The contingency table
      Throws:
      InterruptedException
    • getContingencyTable

      public StatisticsContingencyTable getContingencyTable(int column1, int size1, AttributeType.Hierarchy hierarchy1, int column2, int size2, AttributeType.Hierarchy hierarchy2) throws InterruptedException
      Returns a contingency table for the given columns. The order of string data items is derived from the provided hierarchies.
      Parameters:
      column1 - The first column
      size1 - The maximum size of the table in the first dimension
      hierarchy1 - The hierarchy for the first column, may be null
      column2 - The second column
      size2 - The maximum size of the table in the second dimension
      hierarchy2 - The hierarchy for the second column, may be null
      Returns:
      The contingency table
      Throws:
      InterruptedException
    • getContingencyTable

      public StatisticsContingencyTable getContingencyTable(int column1, int size1, int column2, int size2) throws InterruptedException
      Returns a contingency table for the given columns. The order of string data items is derived from the hierarchies provided in the data definition (if any).
      Parameters:
      column1 - The first column
      size1 - The maximum size of the table in the first dimension
      column2 - The second column
      size2 - The maximum size of the table in the second dimension
      Returns:
      The contingency table
      Throws:
      InterruptedException
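      The idea behind a contingency table with a caller-supplied value order can be sketched in a few lines of plain Java. This is only an illustration of the concept, not the library's implementation: the class `ContingencySketch`, its methods, and the use of plain string arrays in place of data handles and hierarchies are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of the documented API.
final class ContingencySketch {

    /** Maps each value to its position in the given order. */
    static Map<String, Integer> index(String[] order) {
        Map<String, Integer> result = new HashMap<>();
        for (int i = 0; i < order.length; i++) {
            result.put(order[i], i);
        }
        return result;
    }

    /**
     * Counts how often each pair of values occurs in the same row.
     * counts[i][j] is the number of rows holding order1[i] in the first
     * column and order2[j] in the second column. Assumes that both columns
     * have the same length and that every value appears in its order array.
     */
    static int[][] count(String[] column1, String[] order1,
                         String[] column2, String[] order2) {
        Map<String, Integer> index1 = index(order1);
        Map<String, Integer> index2 = index(order2);
        int[][] counts = new int[order1.length][order2.length];
        for (int row = 0; row < column1.length; row++) {
            counts[index1.get(column1[row])][index2.get(column2[row])]++;
        }
        return counts;
    }
}
```

      Dividing each cell by the number of rows turns the counts into relative frequencies. The order arrays play the role that a hierarchy or the data definition plays in the methods above: they fix where each value appears along its axis.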
    • getDistinctValues

      public String[] getDistinctValues(int column) throws InterruptedException
      Returns the distinct set of data items from the given column.
      Parameters:
      column - The column
      Returns:
      The distinct data items
      Throws:
      InterruptedException
    • getDistinctValuesOrdered

      public String[] getDistinctValuesOrdered(int column) throws InterruptedException
      Returns an ordered list of the distinct data items in the given column. The order of string data items is derived from the hierarchy provided in the data definition (if any).
      Parameters:
      column - The column
      Returns:
      The distinct data items, in order
      Throws:
      InterruptedException
    • getDistinctValuesOrdered

      public String[] getDistinctValuesOrdered(int column, boolean orderFromDefinition) throws InterruptedException
      Returns an ordered list of the distinct data items in the given column.
      Parameters:
      column - The column
      orderFromDefinition - Whether the order of string data items should be derived from the hierarchy provided in the data definition (if any)
      Returns:
      The distinct data items, in order
      Throws:
      InterruptedException
    • getDistinctValuesOrdered

      public String[] getDistinctValuesOrdered(int column, AttributeType.Hierarchy hierarchy) throws InterruptedException
      Returns an ordered list of the distinct data items in the given column. The order of string data items is derived from the provided hierarchy.
      Parameters:
      column - The column
      hierarchy - The hierarchy, may be null
      Returns:
      The distinct data items, in order
      Throws:
      InterruptedException
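      A minimal sketch of ordering distinct values by a hierarchy, written in plain Java. It assumes that a hierarchy can be represented as a `String[][]` whose rows start with the original value, and that the row order defines the value order. The fallback to lexicographic order for a null hierarchy is also an assumption of this example, as are the class and method names; none of this is taken from the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the documented API.
final class DistinctValuesSketch {

    /**
     * Returns the distinct values of the column. If a hierarchy is given,
     * values are ordered by the row in which they first appear in it (values
     * missing from the hierarchy go last). Otherwise, values are ordered
     * lexicographically.
     */
    static String[] distinctOrdered(String[] column, String[][] hierarchy) {
        // Remove duplicates
        List<String> values = new ArrayList<>(new LinkedHashSet<>(Arrays.asList(column)));
        if (hierarchy == null) {
            values.sort(Comparator.naturalOrder());
        } else {
            // Position of each value = index of its first row in the hierarchy
            Map<String, Integer> position = new HashMap<>();
            for (int i = 0; i < hierarchy.length; i++) {
                position.putIfAbsent(hierarchy[i][0], i);
            }
            values.sort(Comparator.comparingInt(
                (String v) -> position.getOrDefault(v, Integer.MAX_VALUE)));
        }
        return values.toArray(new String[0]);
    }
}
```

      With a hierarchy listing "low", "mid", "high" in that order, the values come back in that semantic order rather than alphabetically, which is the purpose of deriving the order from a hierarchy.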
    • getEquivalenceClassStatistics

      public StatisticsEquivalenceClasses getEquivalenceClassStatistics() throws InterruptedException
      Returns statistics about the equivalence classes.
      Returns:
      The equivalence class statistics
      Throws:
      InterruptedException
    • getFrequencyDistribution

      public StatisticsFrequencyDistribution getFrequencyDistribution(int column) throws InterruptedException
      Returns a frequency distribution for the values in the given column. The order of string data items is derived from the hierarchy provided in the data definition (if any).
      Parameters:
      column - The column
      Returns:
      The frequency distribution
      Throws:
      InterruptedException
    • getFrequencyDistribution

      public StatisticsFrequencyDistribution getFrequencyDistribution(int column, boolean orderFromDefinition) throws InterruptedException
      Returns a frequency distribution for the values in the given column.
      Parameters:
      column - The column
      orderFromDefinition - Whether the order of string data items should be derived from the hierarchy provided in the data definition (if any)
      Returns:
      The frequency distribution
      Throws:
      InterruptedException
    • getFrequencyDistribution

      public StatisticsFrequencyDistribution getFrequencyDistribution(int column, AttributeType.Hierarchy hierarchy) throws InterruptedException
      Returns a frequency distribution for the values in the given column. The order of string data items is derived from the provided hierarchy.
      Parameters:
      column - The column
      hierarchy - The hierarchy, may be null
      Returns:
      The frequency distribution
      Throws:
      InterruptedException
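      As a rough illustration of what a frequency distribution over an ordered set of values contains, the following plain-Java sketch computes relative frequencies aligned with a given value order. The class `FrequencySketch` and its method are hypothetical, and the column is modeled as a simple string array rather than a data handle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of the documented API.
final class FrequencySketch {

    /**
     * Returns the relative frequency of each value in the column.
     * result[i] is the fraction of rows holding orderedValues[i].
     * Assumes a non-empty column whose values all appear in orderedValues.
     */
    static double[] frequencies(String[] column, String[] orderedValues) {
        // Position of each value in the requested order
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < orderedValues.length; i++) {
            index.put(orderedValues[i], i);
        }
        // Absolute counts
        int[] counts = new int[orderedValues.length];
        for (String value : column) {
            counts[index.get(value)]++;
        }
        // Relative frequencies
        double[] result = new double[orderedValues.length];
        for (int i = 0; i < counts.length; i++) {
            result[i] = counts[i] / (double) column.length;
        }
        return result;
    }
}
```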
    • getProgress

      public int getProgress()
      If supported by the underlying builder, this method reports a progress value in [0, 100]. Otherwise, it always returns 0.
      Returns:
      The progress value in [0, 100], or 0 if progress reporting is not supported
    • getQualityStatistics

      public StatisticsQuality getQualityStatistics() throws InterruptedException
      Returns data quality according to various models.
      Returns:
      The data quality statistics
      Throws:
      InterruptedException
    • getQualityStatistics

      public StatisticsQuality getQualityStatistics(DataHandle output) throws InterruptedException
      Returns data quality according to various models. This is a special variant of the method supporting arbitrary user-defined outputs.
      Parameters:
      output - The user-defined output data handle
      Returns:
      The data quality statistics
      Throws:
      InterruptedException
    • getQualityStatistics

      public StatisticsQuality getQualityStatistics(DataHandle output, Set<String> qis) throws InterruptedException
      Returns data quality according to various models. This is a special variant of the method supporting arbitrary user-defined outputs.
      Parameters:
      output - The user-defined output data handle
      qis - The quasi-identifying attributes
      Returns:
      The data quality statistics
      Throws:
      InterruptedException
    • getQualityStatistics

      public StatisticsQuality getQualityStatistics(Set<String> qis) throws InterruptedException
      Returns data quality according to various models.
      Parameters:
      qis - The quasi-identifying attributes
      Returns:
      The data quality statistics
      Throws:
      InterruptedException
    • getSummaryStatistics

      public Map<String,StatisticsSummary<?>> getSummaryStatistics(boolean listwiseDeletion) throws InterruptedException
      Returns summary statistics for all attributes.
      Parameters:
      listwiseDeletion - A flag enabling list-wise deletion
      Returns:
      A map from attribute names to their summary statistics
      Throws:
      InterruptedException
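      List-wise deletion, in its usual statistical meaning, discards every row that has a missing value in any column before statistics are computed. The sketch below shows the effect on a column mean. It is a generic illustration and makes assumptions that are not taken from the library: missing values are modeled as null, and the class `ListwiseDeletionSketch` and its methods are hypothetical.

```java
// Hypothetical illustration; not part of the documented API.
final class ListwiseDeletionSketch {

    /** Returns true if any cell of the row is missing (null). */
    static boolean hasMissing(Double[] row) {
        for (Double value : row) {
            if (value == null) {
                return true;
            }
        }
        return false;
    }

    /**
     * Computes the mean of one column. With list-wise deletion, rows with a
     * missing value in any column are skipped. Without it, only rows where
     * this particular column is missing are skipped.
     */
    static double mean(Double[][] rows, int column, boolean listwiseDeletion) {
        double sum = 0d;
        int n = 0;
        for (Double[] row : rows) {
            if (listwiseDeletion && hasMissing(row)) {
                continue;
            }
            if (row[column] == null) {
                continue;
            }
            sum += row[column];
            n++;
        }
        return n == 0 ? Double.NaN : sum / n;
    }
}
```

      The flag matters whenever a row is complete in the column of interest but incomplete elsewhere: list-wise deletion drops such a row, so the same column can yield different summary values depending on the setting.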
    • interrupt

      public void interrupt()
      Interrupts all computations.
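      The combination of `interrupt()`, `getProgress()` and methods that throw `InterruptedException` suggests a cooperative cancellation pattern. The following plain-Java sketch shows one common way to build such a pattern. It is not the library's implementation: the class `InterruptibleSketch`, the `sum` workload, and the choice to clear the flag once the interruption has been reported are assumptions made for this example.

```java
// Hypothetical illustration; not part of the documented API.
final class InterruptibleSketch {

    /** Set by interrupt() from any thread; volatile for cross-thread visibility. */
    private volatile boolean interrupted = false;

    /** Progress of the current computation in [0, 100]. */
    private volatile int progress = 0;

    /** Reports the progress of the current computation in [0, 100]. */
    public int getProgress() {
        return progress;
    }

    /** Asks the running (or next) computation to stop. */
    public void interrupt() {
        interrupted = true;
    }

    /** A stand-in workload that checks the flag once per element. */
    public long sum(int[] data) throws InterruptedException {
        progress = 0;
        long result = 0;
        for (int i = 0; i < data.length; i++) {
            if (interrupted) {
                // Clear the flag so the instance can be reused afterwards
                interrupted = false;
                throw new InterruptedException("Interrupted");
            }
            result += data[i];
            progress = (int) ((i + 1) * 100L / data.length);
        }
        progress = 100;
        return result;
    }
}
```

      In a typical setup, a worker thread runs the computation while another thread, such as a user interface thread, polls `getProgress()` and calls `interrupt()` when the user cancels. The worker then leaves its loop through the `InterruptedException` instead of being stopped forcibly.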