PREreview of Metric Ion Classification (MIC): A deep learning tool for assigning ions and waters in cryo-EM and x-ray crystallography structures

by Stephanie Wankowicz

Published: April 8, 2024
DOI: 10.5281/zenodo.10944563
License: CC BY 4.0

This paper develops an innovative strategy to classify modeled waters and/or ions in high-resolution X-ray crystallography or cryo-EM structures. Fingerprints based on local geometry are used as input for a deep metric learning model that predicts the classification of placed ions and waters within input structures based on their local chemical environments.

While this tool does not present significant improvements over existing tools such as UnDowser and CheckMyMetal (CMM), it presents a new method that combines geometric fingerprints with deep metric learning, and it demonstrates the ability to extend water/ion classification to high-resolution cryo-EM and RNA structures, along with the detection of halides.

Major:

1) At the beginning of the results section and introduction, please clarify that this tool is for checking already modeled waters and/or ions.

2) Please clarify what you mean by ‘we remove the initial features of both the density itself’. Is this referring to experimental density or data point density?

3) Please clarify your re-refinement scheme, as referred to in the Results section: ‘x-ray structures were re-refined with the alternative density and Fo-Fc maps were inspected in both cases’. Please put the entire refinement protocol in the Methods section.

4) Please provide the rationale for the -3 to 3 score used when re-assessing potentially incorrectly labeled positions. What goes into each number? ‘We assigned each structure a score between -3 and 3, with increasingly positive scores denoting more support for the MIC prediction and increasingly negative scores support for the original label.’

5) For splitting training/testing, as only individual sites were considered, did you also examine whether the training/testing split was even in terms of resolution, R-factors, or date of deposition? All of these would affect the goodness of fit of many waters/ions.
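As an illustration of the check suggested in point 5, here is a minimal sketch that compares the training and testing sets on each property using a two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distributions). This is not the authors' method. The function names, the field names (`resolution`, `r_free`, `year`), and the toy values are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))
    return max(
        abs(
            bisect_right(a_sorted, x) / len(a_sorted)
            - bisect_right(b_sorted, x) / len(b_sorted)
        )
        for x in points
    )


def split_balance(train, test, keys=("resolution", "r_free", "year")):
    """Return the KS statistic per property; values near 0 suggest a balanced split."""
    return {
        key: ks_statistic([site[key] for site in train], [site[key] for site in test])
        for key in keys
    }


# Hypothetical per-site metadata (one record per ion/water site), made up for illustration.
train_sites = [
    {"resolution": 1.2, "r_free": 0.18, "year": 2015},
    {"resolution": 1.8, "r_free": 0.21, "year": 2018},
    {"resolution": 2.1, "r_free": 0.24, "year": 2020},
]
test_sites = [
    {"resolution": 1.3, "r_free": 0.19, "year": 2016},
    {"resolution": 2.0, "r_free": 0.23, "year": 2021},
]

for key, value in split_balance(train_sites, test_sites).items():
    print(f"{key}: KS = {value:.2f}")
```

A large statistic for any property would suggest that the two sets differ systematically in that property. Because many sites come from the same deposited structure, it may also be worth checking whether sites from a single structure appear on both sides of the split.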

Minor:

1) There is a typo (‘ues’) on page 4 of the intro.

2) In the Results section, please provide information on the number of PDBs you used for training and their characteristics (resolution range, deposition year, whether they were re-refined, etc.). Likewise, what were the size and characteristics of your testing set?

3) It would be of great benefit to the community if the authors deposited the updated ion/water classifications that they manually reviewed in Zenodo or another public repository.

4) It would be interesting, but likely outside the scope of this paper, to understand how incorrectly modeled water and/or ions (i.e., if they were not placed at the center of the density peak) impact MIC or other picking algorithms.

5) Please provide information on which PDBs were chosen for CMM comparison.

Competing interests

The reviewer (Stephanie Wankowicz) is at the same institution as the first author and knows her personally.