mllf.cb.train_deepset_autoencoder

Training script for DeepSet autoencoder pretraining.

This module implements Step 3 of the four-step pretraining process: training the autoencoder with a mean squared error (MSE) loss between the input and its reconstruction.
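As a rough illustration of an MSE reconstruction objective, the following sketch runs a few optimisation steps on synthetic data. TinyAutoencoder, train_step, the layer sizes, and the learning rate are all hypothetical stand-ins chosen for this example; this page does not show the actual DeepSet architecture or training settings, and the sketch does not reproduce them.

```python
import torch
from torch import nn

torch.manual_seed(0)


class TinyAutoencoder(nn.Module):
    """Hypothetical stand-in model, NOT the module's DeepSet autoencoder."""

    def __init__(self, n_features: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, latent_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16), nn.ReLU(), nn.Linear(16, n_features)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def train_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               batch: torch.Tensor) -> float:
    """Run one optimisation step on the reconstruction loss and return its value."""
    model.train()
    optimizer.zero_grad()
    reconstruction = model(batch)
    # MSE between the input and its reconstruction.
    loss = nn.functional.mse_loss(reconstruction, batch)
    loss.backward()
    optimizer.step()
    return loss.item()


model = TinyAutoencoder(n_features=8, latent_dim=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
batch = torch.randn(32, 8)  # synthetic "atom features" (assumed shape)
losses = [train_step(model, optimizer, batch) for _ in range(50)]
```

Because the target is the input itself, no labels are needed; the loss should fall as the model learns to compress and restore the features.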

Functions

train_all_systems(data_root, output_root, ...)

Train a separate autoencoder for each pretraining system.

train_autoencoder(train_data_path, output_dir)

Train a DeepSet autoencoder on atom features.

train_combined_model(data_root, output_dir)

Train a single DeepSet autoencoder on all pretraining systems combined.

Classes

AtomFeatureDataset(data_path)

PyTorch Dataset for atom features from a single system.
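A per-system dataset of this kind could look roughly like the sketch below. The on-disk format is not documented here, so the example assumes a single tensor of shape (n_atoms, n_features) written with torch.save; ToyAtomFeatureDataset and the file name system_a.pt are hypothetical and are not the class or data described above.

```python
import os
import tempfile

import torch
from torch.utils.data import DataLoader, Dataset


class ToyAtomFeatureDataset(Dataset):
    """Hypothetical sketch: one item per atom, read from a single tensor file."""

    def __init__(self, data_path: str):
        # Assumption: the file holds one (n_atoms, n_features) tensor.
        self.features = torch.load(data_path)

    def __len__(self) -> int:
        return self.features.shape[0]

    def __getitem__(self, idx: int) -> torch.Tensor:
        return self.features[idx]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "system_a.pt")  # hypothetical file name
    torch.save(torch.randn(10, 4), path)     # 10 atoms, 4 features each
    dataset = ToyAtomFeatureDataset(path)    # data is loaded eagerly

# Batch the per-atom feature vectors for training.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_shapes = [tuple(batch.shape) for batch in loader]
```

Loading everything in the constructor keeps indexing cheap, which is reasonable only when a system's features fit in memory.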

CombinedAtomFeatureDataset(data_paths)

PyTorch Dataset that pools atom features from all pretraining systems.
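One way to pool features from several systems is to concatenate the per-system tensors along the atom axis, as sketched below. This assumes each path holds a tensor of shape (n_atoms, n_features) with the same feature width; ToyCombinedDataset and the file names are hypothetical, and the actual class may pool differently.

```python
import os
import tempfile

import torch
from torch.utils.data import Dataset


class ToyCombinedDataset(Dataset):
    """Hypothetical sketch: pools atoms from several per-system tensor files."""

    def __init__(self, data_paths: list[str]):
        # Assumption: every file holds an (n_atoms, n_features) tensor
        # and all files share the same n_features.
        tensors = [torch.load(path) for path in data_paths]
        self.features = torch.cat(tensors, dim=0)

    def __len__(self) -> int:
        return self.features.shape[0]

    def __getitem__(self, idx: int) -> torch.Tensor:
        return self.features[idx]


with tempfile.TemporaryDirectory() as tmp:
    sizes = {"system_a.pt": 7, "system_b.pt": 5}  # hypothetical systems
    paths = []
    for name, n_atoms in sizes.items():
        path = os.path.join(tmp, name)
        torch.save(torch.randn(n_atoms, 4), path)
        paths.append(path)
    combined = ToyCombinedDataset(paths)

total_atoms = len(combined)
```

In a pooled dataset like this one, larger systems contribute more samples, so a combined model sees them more often unless the sampling is reweighted.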