TextPredictionConfig
Bases: TrainerConfig
Configuration for TextPredictionTrainer.
Unified data contract: the internal representation uses the columns masked, split, factual_class, alternative_class, factual, alternative, and transition. Training uses only rows where transition ∈ {c1→c2, c2→c1}, where (c1, c2) is the configured pair.
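The row filter described above can be sketched as follows. This is an illustration, not the library's implementation: the helper names `transitions_for_pair` and `filter_training_rows` are hypothetical, rows are plain dicts standing in for DataFrame rows, and the string form `"c1→c2"` for the transition value is an assumption based on the notation used here.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, str]


def transitions_for_pair(pair: Tuple[str, str]) -> Set[str]:
    """Return both directed transitions for a class pair (assumed string format)."""
    c1, c2 = pair
    return {f"{c1}→{c2}", f"{c2}→{c1}"}


def filter_training_rows(rows: List[Row], pair: Tuple[str, str]) -> List[Row]:
    """Keep only rows whose transition belongs to the configured pair."""
    allowed = transitions_for_pair(pair)
    return [row for row in rows if row["transition"] in allowed]


# Hypothetical unified rows; only the first two belong to the (male, female) pair.
rows: List[Row] = [
    {"masked": "[MASK] is a doctor.", "split": "train",
     "factual_class": "male", "alternative_class": "female",
     "factual": "he", "alternative": "she", "transition": "male→female"},
    {"masked": "[MASK] is a pilot.", "split": "train",
     "factual_class": "female", "alternative_class": "male",
     "factual": "she", "alternative": "he", "transition": "female→male"},
    {"masked": "The [MASK] barks.", "split": "train",
     "factual_class": "singular", "alternative_class": "plural",
     "factual": "dog", "alternative": "dogs", "transition": "singular→plural"},
]

training_rows = filter_training_rows(rows, ("male", "female"))
```

Rows for any other pair, such as the singular/plural row here, are dropped before training.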
Data input:

- data as Dict[str, DataFrame]: per-class format. Key = factual_class; each df has masked, split, and columns = class names (factual = df[factual_class], alternative = df[alternative_class]). If a df has no column for another class (single-token-per-class, e.g. Gender), the target is inferred as the other class's token for the configured pair.
- data as DataFrame: merged format with label_class_col, label_col, and optionally target_col, target_class_col. If the target columns are omitted, pair is required and target = other class's token.
- hf_dataset: load from HuggingFace (merged format), then convert to unified.
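As a rough sketch of how per-class input might map onto the unified columns, the following uses lists of dicts in place of DataFrames. Everything here is an assumption made for illustration: the function `to_unified`, the `class_tokens` fallback mapping used when a row has no column for the other class, and the `"c1→c2"` transition string. The actual conversion may differ.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]


def to_unified(
    data: Dict[str, List[Row]],
    pair: Tuple[str, str],
    class_tokens: Optional[Dict[str, str]] = None,
) -> List[Row]:
    """Convert per-class rows into unified rows for one class pair."""
    c1, c2 = pair
    unified: List[Row] = []
    for factual_class, rows in data.items():
        if factual_class not in pair:
            continue  # classes outside the configured pair are skipped
        alternative_class = c2 if factual_class == c1 else c1
        for row in rows:
            if alternative_class in row:
                # The row carries its own token for the other class.
                alternative = row[alternative_class]
            elif class_tokens is not None:
                # Single-token-per-class case: fall back to the class's token.
                alternative = class_tokens[alternative_class]
            else:
                raise ValueError(f"no token available for {alternative_class!r}")
            unified.append({
                "masked": row["masked"],
                "split": row["split"],
                "factual_class": factual_class,
                "alternative_class": alternative_class,
                "factual": row[factual_class],
                "alternative": alternative,
                "transition": f"{factual_class}→{alternative_class}",
            })
    return unified


# Hypothetical per-class data with no column for the other class.
data = {
    "male": [{"masked": "[MASK] is a doctor.", "split": "train", "male": "he"}],
    "female": [{"masked": "[MASK] is a pilot.", "split": "test", "female": "she"}],
}
unified = to_unified(data, ("male", "female"),
                     class_tokens={"male": "he", "female": "she"})
```

In this sketch the first row gets `alternative = "she"` from the fallback mapping, mirroring the rule that the target is the other class's token for the configured pair.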
Attributes:
| Name | Type | Description |
|---|---|---|
| `run_id` | `Optional[str]` | Optional run identifier (subdir and display). |
| `data` | `Optional[Union[DataFrame, Dict[str, DataFrame], str, Path]]` | Per-class dict (label_class -> DataFrame) or merged DataFrame. |
| `hf_dataset` | `Optional[str]` | HuggingFace dataset ID; loads merged format and converts to unified. |
| `hf_subset` | `Optional[Union[str, List[str]]]` | Subset name(s) to load when using hf_dataset. |
| `hf_splits` | `Optional[List[str]]` | Splits to include (e.g. ['train', 'validation', 'test']). |
| `target_classes` | `Optional[List[str]]` | Target classes for training. Pair is automatically inferred when len(target_classes) == 2. |
| `all_classes` | `Optional[List[str]]` | All classes available in the dataset. If None (default), inferred from data. When loading from HuggingFace datasets, None means load all configs/subsets. |
| `masked_col`, `split_col` | | Column names. For merged format also label_col, label_class_col, and optionally target_col, target_class_col. |
| `use_class_names_as_columns` | `bool` | For per-class data, use class name as column name for tokens. |
all_classes
class-attribute
instance-attribute
All classes available in the dataset. If None (default), inferred from data. When loading from HuggingFace datasets, None means load all configs/subsets.
alternative_class_col
class-attribute
instance-attribute
class_merge_transition_groups
class-attribute
instance-attribute
decoder_eval_prob_on_other_class
class-attribute
instance-attribute
decoder_eval_restrict_to_target_classes
class-attribute
instance-attribute
max_counterfactuals_per_sentence
class-attribute
instance-attribute
target_classes
class-attribute
instance-attribute
Target classes for training. Pair is automatically inferred when len(target_classes) == 2.
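The pair inference rule can be illustrated with a small sketch. The helper `infer_pair` is hypothetical and not part of the documented API; it only mirrors the stated condition `len(target_classes) == 2`, and returning None in all other cases is an assumption.

```python
from typing import List, Optional, Tuple


def infer_pair(target_classes: Optional[List[str]]) -> Optional[Tuple[str, str]]:
    """Return the class pair when exactly two target classes are given, else None."""
    if target_classes is not None and len(target_classes) == 2:
        return (target_classes[0], target_classes[1])
    return None  # no pair can be inferred; it would have to be set explicitly


pair = infer_pair(["male", "female"])
no_pair = infer_pair(["a", "b", "c"])
```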