File created by Sergio Ilarri on April 30, 2023.
This refers to version 1.0 of the TextHealthDataAnonym prototype (http://webdiis.unizar.es/~silarri/prot/HealthDataAnonym/Software/HealthDataAnonym-appWebsite.zip).

Key files:
----------
-Script TextHealthDataAnonym-CommandLine.sh: runs the command-line version.
-Script TextHealthDataAnonym-GUI.sh: runs the version with a GUI.
-File Configuration.txt: it needs to be edited to configure some key aspects of the tool. Editing it allows the tool to be run from the command line in batches.
-The .py scripts are Python scripts needed to interact with spaCy.

For the execution to succeed, the following key folders must be present and all the supporting software mentioned in this document must be installed first.

Key folders:
------------
Colecciones (Collections): it contains dictionaries (named entities).
Informes (Reports): folder where the input reports to be anonymized can be placed.
Resultados (Results): folder that holds the anonymized documents.
libs: folder containing the required .jar libraries:
* jython-standalone-2.7.2.jar (https://www.jython.org/jython-old-sites/downloads.html -> https://repo1.maven.org/maven2/org/python/jython-installer/2.7.2/jython-installer-2.7.2.jar).
* Ngrams.jar (https://github.com/DanielJohnBenton/Ngrams.java -> https://raw.githubusercontent.com/DanielJohnBenton/Ngrams.java/master/dist/Ngrams.jar).
* weka-3.7.0.jar (https://sourceforge.net/projects/weka/files/weka-3-7/3.7.0/).
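Since the tool expects the key folders and .jar libraries listed above to be in place, a quick check before the first run may save some time. The following is only a minimal sketch and is not part of the prototype: the folder and library names are taken from the list above, while the function name missing_items and the assumption that all folders sit directly under one root directory are hypothetical.

```python
from pathlib import Path

# Names taken from the "Key folders" list; the layout (everything directly
# under a single root directory) is an assumption of this sketch.
REQUIRED_FOLDERS = ["Colecciones", "Informes", "Resultados", "libs"]
REQUIRED_JARS = ["jython-standalone-2.7.2.jar", "Ngrams.jar", "weka-3.7.0.jar"]


def missing_items(root):
    """Return the required folders and .jar files that are not found under root."""
    root = Path(root)
    # Folders are reported first, in the order listed above.
    missing = [name for name in REQUIRED_FOLDERS if not (root / name).is_dir()]
    # Libraries are reported with their path relative to root.
    missing += [
        "libs/" + jar
        for jar in REQUIRED_JARS
        if not (root / "libs" / jar).is_file()
    ]
    return missing
```

For example, calling missing_items(".") from the directory where the prototype was unzipped returns an empty list when everything is in place, and otherwise the names of the items that still need to be added.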

Dependencies and required supporting software:
----------------------------------------------
Installation of the required programming language interpreters and supporting packages (commands are shown for Linux, but the tool has also been tested successfully on Windows):

INSTALLATION OF JAVA'S JRE:
For Linux (Debian-based distributions, like Ubuntu):
    sudo apt update
    sudo apt install default-jre

CONFIGURATION OF REPOSITORIES:
For Linux (Debian-based distributions, like Ubuntu):
    sudo apt-add-repository -r ppa:gnome3-team/gnome3
    sudo apt-add-repository -r ppa:philip.scott/spice-up-daily
    sudo apt update

INSTALLATION OF PYTHON:
For Linux (Debian-based distributions, like Ubuntu):
    sudo apt update
    sudo apt install software-properties-common
    sudo add-apt-repository ppa:deadsnakes/ppa
    sudo apt install python3.9
    python3.9 --version

INSTALLATION OF PYTHON'S PACKAGE PATTERN3:
    pip install pattern3
For additional information, please see https://pypi.org/project/pattern3.

INSTALLATION OF JYTHON:
For Linux (Debian-based distributions, like Ubuntu):
    sudo apt install jython
For MacOS:
    brew install jython
or
    sudo port install jython
(see https://brew.sh/ or https://www.macports.org/)
As an alternative, you can install Jython following the instructions in https://www.jython.org/download.html.

INSTALLATION OF SPACY:
    pip install -U pip setuptools wheel
    pip install -U spacy
    python -m spacy download es_core_news_sm
or
    python3 -m spacy download es_core_news_sm

Warning:
--------
Please note that this is not production code. Some elements may be set for a specific environment and may require adjustment.
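
Because some elements may be environment-specific, it can help to confirm that the supporting software from the installation steps is actually reachable before running the tool. The following is a minimal sketch under stated assumptions and is not part of the prototype: it assumes that the installation steps make the java and jython commands available on the PATH and that spaCy and its es_core_news_sm model are importable as Python modules with those names. The function name missing_dependencies is hypothetical, and the pattern3 package is not checked here.

```python
import importlib.util
import shutil

# Assumed names: the commands and modules that the installation steps are
# expected to provide. Adjust these lists for your environment.
REQUIRED_COMMANDS = ["java", "jython"]
REQUIRED_MODULES = ["spacy", "es_core_news_sm"]


def missing_dependencies(commands=REQUIRED_COMMANDS, modules=REQUIRED_MODULES):
    """Return labels for the commands and Python modules that cannot be found."""
    # shutil.which returns None when a command is not on the PATH.
    missing = ["command: " + c for c in commands if shutil.which(c) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [
        "module: " + m
        for m in modules
        if importlib.util.find_spec(m) is None
    ]
    return missing
```

Run it with the same Python interpreter that the tool will use (for example, python3.9), since spaCy and the model are installed per interpreter. An empty list from missing_dependencies() suggests the steps completed; any returned label points to the installation step that should be repeated.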