mllf.file_handling.generate_combinations
Generate all combinations of site/sub files into separate directories.
This utility scans an input directory for files matching the pattern site{site}_sub{sub}_{label}.{ext} (e.g., site1_sub2_pres.rtf, site1_sub2_frag.pdb) and creates output subdirectories, one per combination. For each combination, it copies the relevant files, renaming them so sub-indices start at 1 within the new directory.
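The naming pattern and the renumbering step can be illustrated with a minimal sketch. The regular expression and the helper names `parse_name` and `renamed_files` are hypothetical and are not taken from the module. The sketch also assumes that new indices follow the order of the substituents in the combination.

```python
import re

# Assumed reading of site{site}_sub{sub}_{label}.{ext}; the module's real regex may differ.
FILE_RE = re.compile(r"^site(\d+)_sub(\d+)_([A-Za-z0-9]+)\.([A-Za-z0-9]+)$")


def parse_name(filename):
    """Return (site, sub, label, ext), or None if the name does not match."""
    m = FILE_RE.match(filename)
    if m is None:
        return None
    site, sub, label, ext = m.groups()
    return int(site), int(sub), label, ext


def renamed_files(filenames, site, subs):
    """Map original names to new names so the chosen subs are numbered 1..n.

    Assumption: the new index is the position of the sub in `subs`.
    """
    new_index = {old: new for new, old in enumerate(subs, start=1)}
    mapping = {}
    for name in filenames:
        parsed = parse_name(name)
        if parsed is None:
            continue  # support files such as top_all36_msld.rtf are not renamed
        s, sub, label, ext = parsed
        if s == site and sub in new_index:
            mapping[name] = f"site{s}_sub{new_index[sub]}_{label}.{ext}"
    return mapping
```

For instance, selecting subs 2 and 3 of site 1 would map site1_sub2_pres.rtf to site1_sub1_pres.rtf under these assumptions.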
Each generated combination directory contains:
- prep/: Copy of the input prep directory with renamed RTF/PDB files
- msld_flat.py: Simulation script (if included via an --include pattern)
- mapping.json: Records original file paths and new names
- info.py: Configuration dict with nsubs, nblocks, temp, etc.
- run.sh: Executable SLURM submission script for running simulations
Example
- input_dir/
    site1_sub1_pres.rtf
    site1_sub1_frag.pdb
    site1_sub2_pres.rtf
    site1_sub2_frag.pdb
    site1_sub3_pres.rtf
    site1_sub3_frag.pdb
- Running:
    python -m mllf.file_handling.generate_combinations input_dir --out combos_out
- Will produce directories like:
- combos_out/comb_0001_site1_subs_1_2/
    ├── prep/
    │   ├── site1_sub1_pres.rtf (renamed if necessary, see mapping.json)
    │   ├── site1_sub1_frag.pdb
    │   ├── site1_sub2_pres.rtf (renamed if necessary, see mapping.json)
    │   ├── site1_sub2_frag.pdb
    │   ├── top_all36_msld.rtf (unchanged from input prep/)
    │   ├── par_all36_msld.prm (unchanged from input prep/)
    │   └── … (other prep files)
    ├── msld_flat.py (if included via --include)
    ├── mapping.json
    ├── info.py
    └── run.sh
Combination Generation Logic:
- Generates both within-site and cross-site combinations
- Within-site: Each substituent can serve as the "anchor", with the others forming the tail
    - The anchor always comes first; the tail is sorted
    - Example: anchor=1 generates [1,2], [1,3], [1,2,3], etc.
    - Example: anchor=2 generates [2,1], [2,3], [2,1,3], etc.
    - Minimum of 2 substituents per combination
- Cross-site: Cartesian product of the within-site selections across sites
    - Each site contributes >= 2 substituents
    - Example: site1 has 75 selections and site2 has 186 selections, which generates 75 × 186 = 13,950 cross-site combinations
    - The total is the product of the per-site selection counts, so it grows quickly as sites are added
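The logic above can be sketched with itertools. This is an illustration of the described rules rather than the module's own implementation, and the names `within_site` and `cross_site` are hypothetical. Under this reading a site with n substituents yields n × (2^(n-1) - 1) selections, which reproduces the 75 and 186 in the example if site1 has 5 substituents and site2 has 6 (an inference, not stated in the source).

```python
from itertools import combinations, product


def within_site(subs, min_size=2):
    """Ordered selections for one site: an anchor followed by a sorted, non-empty tail."""
    selections = []
    for anchor in subs:
        others = sorted(s for s in subs if s != anchor)
        # The tail needs at least min_size - 1 members so the selection has min_size subs.
        for tail_len in range(max(min_size - 1, 1), len(others) + 1):
            for tail in combinations(others, tail_len):
                selections.append([anchor, *tail])
    return selections


def cross_site(site_subs):
    """Cartesian product of within-site selections, one selection per site.

    `site_subs` maps a site number to its list of substituent indices.
    """
    sites = sorted(site_subs)
    per_site = [within_site(site_subs[s]) for s in sites]
    return [dict(zip(sites, choice)) for choice in product(*per_site)]
```

Calling `within_site([1, 2, 3])` gives nine selections, starting with [1, 2], [1, 3], [1, 2, 3] for anchor 1 and [2, 1], [2, 3], [2, 1, 3] for anchor 2.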
Additional Features:
- RTF PRES tokens are automatically renumbered to match the new indices
- Include patterns allow copying extra files (e.g., prep/, msld_flat.py)
- Archive mode creates .tar.gz files for storage
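As a rough illustration of archive mode, a combination directory can be packed into a .tar.gz file with the standard library's tarfile module. The helper name `archive_dir` is hypothetical, and the module's own archiving function may behave differently (for example, it may remove the original directory afterwards).

```python
import tarfile
from pathlib import Path


def archive_dir(comb_dir):
    """Write <comb_dir>.tar.gz next to comb_dir and return the archive path."""
    comb_dir = Path(comb_dir)
    archive_path = comb_dir.parent / (comb_dir.name + ".tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(comb_dir, arcname=comb_dir.name)
    return archive_path
```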
Functions
- Generate all within-site and cross-site ordered combinations.
- Archive combination directories as .tar.gz files.
- Augment core.rtf and core.pdb with atoms from an excluded site's sub1.
- Create combination directories with renamed files and support files.
- Create a single combination directory with renamed files and support files.
- Scan input_dir and prep subdirectory for site/sub files.
- List all possible combinations without creating directories.
- Generate a directory name for a combination.
- Renumber PRES tokens in RTF file content.