Superpose with MDAnalysis-style algorithms

by Annie Pham

Introduction to the method

This method reimplements ideas found in MDAnalysis superposition algorithms.


All aligners in opencadd are factories to configure the method parameters.

This particular aligner follows the MDAnalysis tutorials in two stages: 1) generating pairwise sequence alignments and 2) matching, i.e., superimposing the structures according to those pairwise alignments.

The algorithm follows these steps:

  1. Sequence alignment – using biotite

  2. Mapping between sequence alignment and residues in the structures – using a modified fasta2select

  3. Structural superposition – using MDAnalysis
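
To make step 2 more concrete, here is a minimal sketch of how a gapped pairwise alignment can be turned into matched residue positions. It is not the modified fasta2select used by opencadd. The function name `aligned_residue_pairs` and the toy sequences are hypothetical and only illustrate the idea that alignment columns containing a gap contribute no residue pair to the superposition.

```python
def aligned_residue_pairs(aligned_a, aligned_b, gap="-"):
    """Return (index_a, index_b) pairs for alignment columns without gaps.

    Indices are 0-based positions in the ungapped sequences
    (hypothetical helper, for illustration only).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    pairs = []
    index_a = index_b = 0
    for char_a, char_b in zip(aligned_a, aligned_b):
        # Only columns where both sequences have a residue can be matched
        if char_a != gap and char_b != gap:
            pairs.append((index_a, index_b))
        # Advance each counter only when its sequence has a residue here
        if char_a != gap:
            index_a += 1
        if char_b != gap:
            index_b += 1
    return pairs


# Toy alignment: the second sequence has an extra residue, the first a longer tail
pairs = aligned_residue_pairs("MK-LVT", "MKALV-")
print(pairs)  # [(0, 0), (1, 1), (2, 3), (3, 4)]
```

In a real workflow the resulting index pairs would still have to be translated into residue selections on each structure, which is the part the aligner handles internally.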

Which structures do you choose?

6HG4 and 6HG9


The first thing you need to do is download the proteins and pass them to opencadd. We do that with Structure objects and the .from_pdbid() class method.

from opencadd.structure.core import Structure

structure1 = Structure.from_pdbid("6HG4")
structure2 = Structure.from_pdbid("6HG9")
from opencadd.structure.superposition.engines.mda import MDAnalysisAligner, mda_align
from MDAnalysis.analysis.rms import rmsd

# Configure the aligner: global sequence alignment, then match CA and CB atoms per residue
aligner = MDAnalysisAligner(alignment_strategy="global", per_residue_selection="name CA or name CB")
# Run the superposition on the two structures
results = aligner.calculate([structure1, structure2])
1 residues in <Universe with 4169 atoms> are missing backbone atoms. If this system was obtained from a larger structure using a selection, consider wrapping such selection with `same residue as (<your original selection>)` to avoid potential matching problems.
CPU times: user 966 ms, sys: 30.2 ms, total: 996 ms
Wall time: 963 ms


The metadata dictionary in the results contains the initial RMSD, while the scores dictionary holds the RMSD after superposition.

print(f'From RMSD = {results["metadata"]["initial_rmsd"]:.3f}A to optimized RMSD of {results["scores"]["rmsd"]:.3f}A')
From RMSD = 1.589A to optimized RMSD of 1.589A
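
For readers unfamiliar with the score, the following sketch shows the plain RMSD formula on two equally sized sets of matched coordinates. It is a pure-Python illustration, not the MDAnalysis `rmsd` function imported above. The name `plain_rmsd` and the toy coordinates are hypothetical, and no centering or rotation is applied before the distances are measured.

```python
import math


def plain_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Each list holds (x, y, z) tuples; element i of one list is compared
    with element i of the other (hypothetical helper, no fitting applied).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the matched atoms
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: the second set is the first one shifted by 1 along z
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(plain_rmsd(reference, shifted))  # 1.0
```

A superposition looks for the rotation and translation that make this value as small as possible for the matched atoms.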


If you have trouble with NGLview, follow this troubleshooting guide.

import nglview as nv

view = nv.show_mdanalysis(results["superposed"][0].atoms)