mllf.cb.deepset_pretraining_dataset
Dataset generation for DeepSet autoencoder pretraining.
This module handles Step 1 of the 4-step pretraining process: it iterates through substituent PDB files, calculates atomic environment vectors (AEVs), concatenates them with partial charges, and generates training tensors.
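The concatenation step described above can be illustrated with a minimal sketch. It is not the module's implementation: the function name `concat_aev_with_charge` and the toy three-component AEVs are hypothetical, and the AEVs are assumed to have been computed elsewhere. The sketch only shows the idea of appending each atom's partial charge to its AEV to form one feature row per atom.

```python
def concat_aev_with_charge(aevs, charges):
    """Append each atom's partial charge to its AEV (hypothetical helper).

    aevs    -- one sequence of floats per atom, assumed precomputed
    charges -- one partial charge per atom, in the same atom order
    """
    if len(aevs) != len(charges):
        raise ValueError("need exactly one partial charge per atom")
    # Each output row is the AEV followed by the charge as the last feature.
    return [list(aev) + [float(q)] for aev, q in zip(aevs, charges)]


# Toy input: two atoms with made-up 3-component AEVs.
aevs = [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5]]
charges = [-0.27, 0.09]
features = concat_aev_with_charge(aevs, charges)
```

In practice the resulting rows would be converted to a tensor type before training; that choice is left out here because the source does not state which one the module uses.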
Functions
Detect the core PDB file in a prep directory.
Detect the protein PDB file in a prep directory.
Extract partial charges for a substituent from RTF file.
Generate bond-topology training datasets for all pretraining systems.
Generate training datasets for all pretraining systems.
Generate per-substituent bond-topology training data for AtomBondGNN pretraining.
Generate training data for one pretraining system.
Load metadata from a pretraining system directory.
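As an illustration of the partial-charge extraction listed above, the following sketch reads charges from CHARMM-style RTF text. It assumes that atom records have the form `ATOM <name> <type> <charge>` and that `!` starts a comment. The function name `parse_rtf_charges` is hypothetical, and how the module selects the atoms belonging to one substituent is not shown in the source, so the sketch simply collects every ATOM record.

```python
def parse_rtf_charges(rtf_text):
    """Return {atom_name: partial_charge} from ATOM records (hypothetical helper)."""
    charges = {}
    for line in rtf_text.splitlines():
        # Drop trailing comments, which begin with '!' in CHARMM files.
        fields = line.split("!", 1)[0].split()
        # Assumed record layout: ATOM <name> <type> <charge>
        if len(fields) >= 4 and fields[0].upper() == "ATOM":
            charges[fields[1]] = float(fields[3])
    return charges


# Made-up residue fragment used only to exercise the parser.
example = """\
RESI LIG 0.000
ATOM C1 CG331 -0.270 ! methyl carbon
ATOM H1 HGA3 0.090
BOND C1 H1
"""
charges = parse_rtf_charges(example)
```

Keying the result by atom name makes it straightforward to look up the charge for each atom read from a PDB file, provided the atom names in the two files match.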