Publication update

In this section, we regularly highlight publications by EUCANCan members, as well as other publications relevant to the efficient analysis, management and sharing of cancer genomic data.

EUCANCan consortium member Prof Dr Roland Eils from the Charité – Universitätsmedizin Berlin has contributed to the following publication:

Upmeier zu Belzen, J., Bürgel, T., Holderbach, S. et al. Leveraging implicit knowledge in neural networks for functional dissection and engineering of proteins. Nat Mach Intell 1, 225–235 (2019). doi:10.1038/s42256-019-0049-9


Proteins are nature’s most versatile molecular machines. Deep neural networks trained on large protein datasets have recently been used to tackle the unmet complexity of protein sequence–function relationships. The implicit knowledge contained in these networks represents a powerful, but thus far inaccessible, resource for understanding protein biology. Here, we show that occlusion-based sensitivity analysis can leverage the knowledge present in deep-neural-network-based protein sequence classifiers to identify functionally relevant parts of proteins. We first validated our approach by successfully predicting positions that mediate small molecule binding or catalytic activity across different protein classes. Next, we inferred the impact of point mutations on the activity of ERK and HRas, signalling factors frequently deregulated in cancer. Finally, we used our approach to identify engineering hotspots in CRISPR–Cas9 and anti-CRISPR protein AcrIIA4. Our work demonstrates how implicit knowledge in neural networks can be harnessed for protein functional dissection and protein engineering.
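
For readers unfamiliar with the term, the general idea behind occlusion-based sensitivity analysis can be sketched in a few lines of Python. This is only a toy illustration and not the authors' implementation: the function names, the mask character "X", the example sequence and the motif-based `toy_score` function (a stand-in for a trained neural network classifier) are all assumptions made for this example. Each position of the sequence is masked in turn, and the resulting drop in the classifier score is recorded; positions with a large drop are candidates for functional relevance.

```python
from typing import Callable, List


def occlusion_sensitivity(
    sequence: str,
    score_fn: Callable[[str], float],
    window: int = 1,
    mask: str = "X",
) -> List[float]:
    """For each start position, return the score drop caused by
    masking `window` consecutive residues starting there."""
    baseline = score_fn(sequence)
    drops = []
    for start in range(len(sequence) - window + 1):
        # Replace the residues in the window with the mask symbol.
        occluded = sequence[:start] + mask * window + sequence[start + window:]
        drops.append(baseline - score_fn(occluded))
    return drops


def toy_score(sequence: str) -> float:
    """Hypothetical stand-in for a trained classifier:
    returns 1.0 if the made-up motif 'GKS' is present, else 0.0."""
    return 1.0 if "GKS" in sequence else 0.0


# Made-up example sequence containing the toy motif at positions 3 to 5.
seq = "MAAGKSLLV"
drops = occlusion_sensitivity(seq, toy_score)

# Positions whose occlusion lowers the score are flagged as sensitive.
hotspots = [i for i, d in enumerate(drops) if d > 0]
print(hotspots)
```

In practice, `score_fn` would wrap the class probability output of a trained network, and the choice of mask (for example, zeroing the input encoding at a position) depends on how sequences are encoded for that model.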