Get involved!

How to participate, contribute and provide feedback

General notes: Except for tasks that concern only the GUI or the documentation, features must first be integrated into the API (ARX Core). They can then optionally be made available through the GUI or covered by in-depth documentation. To contribute, please use our GitHub repository to create issues or pull requests. You can also contact us at arx.deidentifier@gmail.com.

Low difficulty

  • Giving feedback about experiences when using and testing ARX (GUI and API).
  • Proofreading and correcting text content (e.g. on our website, in the application itself or in the contained context-sensitive help), especially by native English speakers.
  • Translating text content into further languages (internationalization).
  • Contributing generalization hierarchies for common (e.g. biomedical) attributes.
  • Systematic evaluation with real-world datasets (see AnonBench as a starting point for performance-related evaluations).
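
As an illustration of what a contributed generalization hierarchy might look like, the following minimal sketch generates one for an age attribute. It is not part of ARX: the function names `age_hierarchy` and `to_csv`, the interval widths, and the semicolon-separated layout (one row per original value, followed by increasingly general levels and a final `*`) are assumptions made for this example. Please check the ARX documentation for the format the tool actually expects before submitting a hierarchy.

```python
import csv
import io


def age_hierarchy(max_age=99, widths=(5, 20)):
    """Build hierarchy rows of the form [age, interval per width..., '*'].

    Hypothetical helper for illustration; the widths are arbitrary choices.
    """
    rows = []
    for age in range(max_age + 1):
        row = [str(age)]
        for width in widths:
            # Lower bound of the interval of the given width containing this age.
            low = (age // width) * width
            row.append(f"{low}-{low + width - 1}")
        # Top level: the value is fully suppressed.
        row.append("*")
        rows.append(row)
    return rows


def to_csv(rows, delimiter=";"):
    """Serialize the rows as delimiter-separated text (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the first few rows of the generated hierarchy.
    print(to_csv(age_hierarchy()[:3]), end="")
```

Each row maps one original value to its generalizations, so a reviewer can check a contributed hierarchy by confirming that every value appears exactly once and that every row has the same number of levels.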

Medium difficulty

  • Adding new methods for importing data (e.g. from SPSS, SAS, or OpenOffice and its forks).
  • Adding new methods for importing data from relational databases into the GUI (e.g. from Oracle, DB2, MSSQL).
  • Extending data export facilities (e.g. to SQL databases, spreadsheet programs, statistics software).
  • Refactoring parts of the code making up the graphical user interface.
  • Implementing a simplified user interface for common tasks.
  • Computing and visualizing further data properties, e.g. summary statistics (measures of central tendency, measures of statistical dispersion, measures of dependence) with tables or charts.
  • Creating ARX plugins for open source ETL tools, e.g. for Talend Open Studio.

High difficulty

  • Making the framework more extensible by implementing a plugin infrastructure.
  • Adding methods for non-interactive Differential Privacy.
  • Adding methods for analyzing re-identification risks.
  • Adding methods for data masking (e.g. noise addition, time/value-shifts, shuffling).

Very high difficulty

  • Adding further anonymization algorithms (e.g. Mondrian).
  • Implementing further coding models (e.g. local recoding).
  • Adding further privacy criteria (e.g. LKC privacy).
  • Implementing further utility metrics (e.g. utility constraints).

Notes on tasks with very high difficulty: Many methods that could be implemented fall into this category. A method may seem easy to implement at first, but in the context of ARX this mostly holds only for a prototypical implementation that is loosely coupled with the system. Making a new method a first-class citizen of the ARX anonymization tool most likely requires a very deep understanding of how ARX is designed and implemented, because it will require significant changes to the core of the system. We will happily help you identify the major challenges involved in implementing a specific new feature. For a first overview of ARX, please refer to the description of our implementation framework and our publication list.
