AUTO-DataGenCARS+ is a tool with a graphical user interface (GUI) for generating synthetic data to evaluate Recommender Systems (RS) and Context-Aware Recommender Systems (CARS). It extends our previous tool DataGenCARS with a flexible GUI, user facilities, and new and improved functionalities.
IMPORTANT: this webpage focuses on AUTO-DataGenCARS+, a newer and more advanced version of the tool AUTO-DataGenCARS that is now developed in Python. AUTO-DataGenCARS was originally implemented in Java. The AUTO-DataGenCARS invention has been registered in Spain (University of Zaragoza, PII-2021-0019).
AUTO-DataGenCARS+ provides 'Help Information' for each functionality, which includes some videos demonstrating example scenarios of user interaction.
A comparison between AUTO-DataGenCARS+ and DataGenCARS can be seen in the following table:
Functionality | DataGenCARS library | AUTO-DataGenCARS+ tool |
---|---|---|
Generation of datasets with context information (to evaluate CARS) | ✓ | ✓ |
Creation of users, items and contexts through schema files | ✓ | ✓ |
Generation of completely synthetic datasets | ✓ | ✓ |
Increasing of ratings in an existing dataset | ✓ | ✓ |
Generation of a synthetic dataset similar to an existing one | ✓ | ✓ |
Generation of a dataset from an initial sample of an existing dataset | ✓ | ✓ |
Ability to remove unknown contextual information (replacing it with values) | ✓ | ✓ |
Mapping of item schemas into Java classes (and vice versa) | ✓ | |
Generation of dataset’s user profiles (manual and automatic) | ✓ | |
Powerful graphical user interface | | ✓ |
Generation of datasets with no context information (to evaluate RS) | | ✓ |
Graphical definition of workflows | | ✓ |
Export and import data files | | ✓ |
Automatic generation of the needed input files | | ✓ |
Ability to chain different types of generation actions | | ✓ |
Generation of graphs showing the input and output files | | ✓ |
Summarizing statistics through built-in graphs | | ✓ |
Evaluation functionalities integrated within the tool, with configuration of the required experimental settings | | ✓ |
Saving datasets and attributes for easy viewing and later use | | ✓ |
Transformation of attributes (from categorical to numerical attributes and from preferential to binary ratings, and vice versa) | | ✓ |
Other improvements and bug fixes in the synthetic data generation techniques | | ✓ |
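
To make the "transformation of attributes" row more concrete, the following is a minimal Python sketch of what converting categorical attributes to numerical codes (and back) and preferential ratings to binary ratings could look like. It is not taken from AUTO-DataGenCARS+: the function names, the first-appearance integer encoding, and the rating threshold of 4 are all assumptions chosen for illustration.

```python
# Illustrative sketch only: this is NOT AUTO-DataGenCARS+ code.
# Function names, the encoding scheme, and the threshold are assumptions.

def ratings_to_binary(ratings, threshold=4):
    """Map preferential ratings (e.g. 1-5) to binary ones.

    A rating becomes 1 if it is >= threshold (hypothetical default), else 0.
    """
    return [1 if r >= threshold else 0 for r in ratings]


def categorical_to_numerical(values):
    """Encode each distinct category as an integer, in order of first appearance.

    Returns the encoded list together with the mapping, so that the
    transformation can be reversed later.
    """
    mapping = {}
    encoded = []
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        encoded.append(mapping[v])
    return encoded, mapping


def numerical_to_categorical(codes, mapping):
    """Invert categorical_to_numerical using the mapping it returned."""
    inverse = {code: label for label, code in mapping.items()}
    return [inverse[c] for c in codes]


if __name__ == "__main__":
    weather = ["sunny", "rainy", "sunny", "cloudy"]
    codes, mapping = categorical_to_numerical(weather)
    print(codes)                                   # [0, 1, 0, 2]
    print(numerical_to_categorical(codes, mapping))  # original labels
    print(ratings_to_binary([5, 3, 4, 1]))         # [1, 0, 1, 0]
```

The categorical encoding is reversible because the mapping is kept. Going from binary back to preferential ratings is not uniquely determined, so any real implementation has to choose a rule for it; this sketch does not attempt one.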