Installation
Requirements
Python 3.8 or newer is required (tested on 3.8–3.11).
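Before installing, it can help to confirm that the active interpreter meets the stated minimum. The following is a minimal sketch; the helper name `python_is_supported` is made up for illustration and is not part of gradiend.

```python
import sys

# Minimum version taken from the requirement stated above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`.

    Illustrative helper only; gradiend does not ship this function.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} meets the 3.8+ requirement.")
else:
    print(f"Python {current} is too old; 3.8 or newer is required.")
```

Note that versions newer than 3.11 pass this check even though the text above only lists 3.8–3.11 as tested.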
Basic installation
pip install gradiend
This installs the core package and its required dependencies, which is sufficient for training with DataFrames or local data.
Recommended (plots, HuggingFace, tokenizer compatibility, safetensors)
For the full experience (plots, loading HuggingFace datasets, tokenizer compatibility, safetensors), install:
pip install "gradiend[recommended]"
This adds:
| Package | Purpose |
|---|---|
| matplotlib | Plotting (encoder distributions, convergence plots) |
| seaborn | Visualizations (encoder scatter, heatmaps) |
| safetensors | Faster, safer model serialization (preferred over .bin) |
| datasets | Loading HuggingFace datasets by id |
| sentencepiece | Tokenizer backend for some Hugging Face models (e.g. many T5/LLaMA variants); not required for all BERT/GPT tokenizers |
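To see which of the packages in the table are already present in the current environment, one option is to look them up without importing them. This is a sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list assumes each package's import name matches the name shown in the table.

```python
from importlib.util import find_spec

# Import names for the packages listed in the table above
# (assumed to match the distribution names).
RECOMMENDED = ["matplotlib", "seaborn", "safetensors", "datasets", "sentencepiece"]


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(RECOMMENDED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All recommended packages are available.")
```

Because `find_spec` only locates a top-level module and does not import it, the check stays fast even for heavy packages.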
Optional: data creation (spaCy)
To create training data from raw text with morphological filtering (e.g. German articles by gender/case), install:
pip install "gradiend[data]"
This adds:
| Package | Purpose |
|---|---|
| spacy | Morphological filtering via spacy_tags in TextFilterConfig (see Data generation) |
| datasets | Loading HuggingFace datasets as base data for TextPredictionDataCreator |
spaCy also needs a language model, e.g. de_core_news_sm for German:
python -m spacy download de_core_news_sm
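A common stumbling block is having spaCy installed but not the language model, or the reverse. The sketch below distinguishes the two cases. It relies on the fact that spaCy models downloaded this way are installed as regular Python packages. The function name `spacy_model_status` and its return strings are illustrative assumptions, not part of gradiend or spaCy.

```python
from importlib.util import find_spec


def spacy_model_status(model_name="de_core_news_sm"):
    """Report whether spaCy and the given model package can be found.

    Returns one of "spacy-missing", "model-missing", or "ok".
    Illustrative helper only.
    """
    if find_spec("spacy") is None:
        return "spacy-missing"
    if find_spec(model_name) is None:
        return "model-missing"
    return "ok"


status = spacy_model_status()
if status == "spacy-missing":
    print("spaCy is not installed; install the data extra first.")
elif status == "model-missing":
    print("Run: python -m spacy download de_core_news_sm")
else:
    print("spaCy and de_core_news_sm are available.")
```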
You can combine extras:
pip install "gradiend[recommended,data]"
This installs everything needed for plots, HuggingFace datasets, safetensors, and data creation.
Optional: interactive encoder scatter (Plotly)
The encoder scatter plot (trainer.plot_encoder_scatter()) uses Plotly for interactive hover and zoom. Plotly is optional and is not included in the recommended extra. Without Plotly, the function returns None and logs a warning.
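Since the scatter function returns None when Plotly is missing, calling code may want to guard against that case instead of assuming a figure exists. The sketch below shows one way to do so. `render_scatter` is a hypothetical wrapper, and `plot_fn` stands in for a bound method such as trainer.plot_encoder_scatter, because constructing a trainer is outside the scope of this page. What the function returns when Plotly is present is not specified here, so the sketch only tests for None.

```python
def render_scatter(plot_fn):
    """Call a plotting function and report whether it produced a result.

    `plot_fn` is any zero-argument callable; in practice it would be
    something like trainer.plot_encoder_scatter (assumed, not shown here).
    Returns True if a non-None result came back, False otherwise.
    """
    fig = plot_fn()
    if fig is None:
        # Matches the documented behavior when Plotly is unavailable.
        print("No figure returned; Plotly may not be installed.")
        return False
    return True


# Stand-in callables that imitate the two documented outcomes.
render_scatter(lambda: None)      # prints the hint and returns False
render_scatter(lambda: object())  # returns True
```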
To enable the interactive scatter (e.g. in Jupyter):
Dev (contributors)
For building docs and running tests:
From source
git clone https://github.com/aieng-lab/gradiend.git
cd gradiend
pip install -e .
# With recommended extras:
pip install -e ".[recommended]"