Last update: January 2025

AUTO-DataGenCARS+: Advanced Python-Based User orienTed tOol DataGenCARS

Brief description

AUTO-DataGenCARS+ is a powerful graphical user interface for generating synthetic data to evaluate Recommender Systems (RS) and Context-Aware Recommender Systems (CARS). It extends our previous tool, DataGenCARS, with a flexible GUI, user-oriented facilities, and new and improved functionalities, which are summarized in the comparison table further down this page.

IMPORTANT: this webpage focuses on AUTO-DataGenCARS+, a newer and more advanced version of the tool AUTO-DataGenCARS, now developed in Python. The original AUTO-DataGenCARS was implemented in Java. The AUTO-DataGenCARS invention has been registered in Spain (University of Zaragoza, PII-2021-0019).

Documentation

AUTO-DataGenCARS+ provides 'Help Information' for each functionality, which includes some videos demonstrating example scenarios of user interaction.

Software

Videos

Some demo videos showing how to perform different actions with AUTO-DataGenCARS+, along with several use case scenarios, are listed below (the audio is in Spanish, but you can use YouTube's automatic subtitles in the language of your choice):

Contributors

Researchers

Students (final degree projects)

AUTO-DataGenCARS+ vs. DataGenCARS

A comparison between AUTO-DataGenCARS+ and DataGenCARS can be seen in the following table:

Functionality | DataGenCARS library | AUTO-DataGenCARS+ tool
Generation of datasets with context information (to evaluate CARS)
Creation of users, items and contexts through schema files
Generation of completely synthetic datasets
Increasing the number of ratings in an existing dataset
Generation of a synthetic dataset similar to an existing one
Generation of a dataset from an initial sample of an existing dataset
Ability to remove unknown contextual information (replacing it with values)
Mapping of item schemas into Java classes (and vice versa)
Generation of user profiles for a dataset (manual and automatic)
Powerful graphical user interface
Generation of datasets with no context information (to evaluate RS)
Graphical definition of workflows
Export and import data files
Automatic generation of the needed input files
Ability to chain different types of generation actions
Generation of graphs showing the input and output files
Summarizing statistics through built-in graphs
Evaluation functionalities integrated within the tool, with configuration of the required experimental settings
Saving datasets and attributes for easy viewing and later use
Transformation of attributes (from categorical to numerical attributes and from preferential to binary ratings, and vice versa)
Other improvements and bug fixes in the synthetic data generation techniques
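
As a rough illustration of the attribute transformations listed in the table above (categorical to numerical attributes, and preferential to binary ratings), the following minimal Python sketch shows one common way such conversions can be done. It is not taken from the AUTO-DataGenCARS+ code base: the function names (encode_categorical, decode_categorical, binarize_ratings), the integer coding scheme, and the rating threshold of 4.0 are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def encode_categorical(values: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct category to an integer code, in order of first appearance.

    Hypothetical helper for illustration only; not part of the tool's API.
    """
    mapping: Dict[str, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


def decode_categorical(codes: List[int], mapping: Dict[str, int]) -> List[str]:
    """Invert encode_categorical using the stored category-to-code mapping."""
    inverse = {code: value for value, code in mapping.items()}
    return [inverse[code] for code in codes]


def binarize_ratings(ratings: List[float], threshold: float = 4.0) -> List[int]:
    """Turn preferential ratings (e.g. 1-5) into binary ones.

    A rating at or above the (assumed) threshold becomes 1, otherwise 0.
    """
    return [1 if rating >= threshold else 0 for rating in ratings]


# Example: a contextual attribute and a few ratings.
weather = ["sunny", "rainy", "sunny", "cloudy"]
codes, mapping = encode_categorical(weather)
print(codes)                               # [0, 1, 0, 2]
print(decode_categorical(codes, mapping))  # ['sunny', 'rainy', 'sunny', 'cloudy']
print(binarize_ratings([5, 3, 4, 1]))      # [1, 0, 1, 0]
```

Note that the categorical encoding is reversible as long as the mapping is kept, whereas binarizing ratings discards information: going back from binary to preferential ratings requires an extra convention (for instance, assigning a fixed or sampled rating to each class), which this sketch does not attempt.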

Acknowledgments
