A catalogue of 863 Rett-syndrome-causing MECP2 mutations and lessons learned from data integration
Rett syndrome (RTT) is a rare neurological disorder mostly caused by genetic variation in MECP2. Making new MECP2 variants and the related phenotypes available provides data for a better understanding of disease mechanisms and faster identification of variants for diagnosis. This is, however, currently hampered by the lack of interoperability between genotype-phenotype databases. Here, we demonstrate, using MECP2 in RTT as an example, that making genotype-phenotype data more Findable, Accessible, Interoperable, and Reusable (FAIR) facilitates the prioritization and analysis of variants. In total, 10,968 MECP2 variants were successfully integrated. Among these, 863 unique confirmed RTT-causing variants and 209 unique confirmed benign variants were found. This dataset was used for comparison of pathogenicity prediction tools, analysis of protein consequences, and identification of ambiguous variants. Prediction tools generally recognised the RTT-causing and benign variants; however, there was a broad range of overlap. Nineteen variants were identified that were annotated as both disease-causing and benign, suggesting that there are additional factors in these cases contributing to disease development. ; The authors would like to thank the Mutalyzer team for support and feedback, Henk van Kranen for support in liftover of ancient genetic variant descriptions, and Eric Smeets for collection of the Maastricht Rett dataset. This work was funded by ELIXIR (funded by the European Commission within the Research Infrastructures programme of Horizon 2020), the research infrastructure for life-science data (MolData2). FE and LC were also funded by The Dutch Rett Syndrome Foundation (Stichting Terre). CE, AJ, RK, AV, SCG, MB, MRi and MR also received funding from EXCELERATE (H2020, Grant No. 676559). AJ, RK, MR, MB, and SCG also received funding from RD-Connect, European Union Seventh Framework Programme (FP7/2007–2013, Grant No. 305444). 
FE, CE, AJ, RK, MR, MB, and SCG received funding from the European Union's Horizon 2020 Research and Innovation Program under grant agreement EJP RD No. 825575. RK was also funded by NWO in project VWData (grant no. 400.17.605) and BBMRI-NL (NWO, National Roadmap for Large-Scale Research Facilities, grant no. 184.033.111). AV and SCG also received funding from INB Grant (Grant No. PT17/0009/0001 - ISCIII-SGEFI / ERDF). ; Peer Reviewed ; Postprint (published version)
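
The abstract describes pooling variant annotations from several sources and then separating confirmed RTT-causing variants, confirmed benign variants, and variants annotated as both. A minimal Python sketch of that grouping step is given below, purely as an illustration. The source names, variant identifiers, labels, and the function `classify_variants` are all hypothetical placeholders, not data or code from the study, and the sketch assumes the variant descriptions have already been normalised to a single nomenclature.

```python
from collections import defaultdict

# Toy records: (source, variant, label). All values are hypothetical
# placeholders and are NOT taken from the published dataset.
RECORDS = [
    ("db_one", "toy:c.10A>G", "pathogenic"),
    ("db_two", "toy:c.10A>G", "pathogenic"),
    ("db_one", "toy:c.20C>T", "benign"),
    ("db_two", "toy:c.30G>A", "pathogenic"),
    ("db_three", "toy:c.30G>A", "benign"),
]


def classify_variants(records):
    """Group annotations per variant and sort variants into three bins.

    Assumes every variant string is already normalised, so that identical
    variants from different sources compare equal.
    """
    labels_by_variant = defaultdict(set)
    for _source, variant, label in records:
        labels_by_variant[variant].add(label)

    bins = {"pathogenic": set(), "benign": set(), "ambiguous": set()}
    for variant, labels in labels_by_variant.items():
        if {"pathogenic", "benign"} <= labels:
            # Annotated as both disease-causing and benign.
            bins["ambiguous"].add(variant)
        elif "pathogenic" in labels:
            bins["pathogenic"].add(variant)
        elif "benign" in labels:
            bins["benign"].add(variant)
        # Variants carrying only other labels are left out of all bins.
    return bins


if __name__ == "__main__":
    for name, variants in classify_variants(RECORDS).items():
        print(name, sorted(variants))
```

Keying on the normalised variant description is the design choice that matters here: duplicates across sources collapse into one entry, so the counts are of unique variants, and conflicting labels surface as a separate bin rather than being silently overwritten.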