TY - JOUR
T1 - MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters
AU - MiBIG 3.0
AU - Terlouw, Barbara R
AU - Blin, Kai
AU - Navarro-Muñoz, Jorge C
AU - Avalon, Nicole E
AU - Chevrette, Marc G
AU - Egbert, Susan
AU - Lee, Sanghoon
AU - Meijer, David
AU - Recchia, Michael J J
AU - Reitz, Zachary L
AU - van Santen, Jeffrey A
AU - Selem-Mojica, Nelly
AU - Tørring, Thomas
AU - Zaroubi, Liana
AU - Alanjary, Mohammad
AU - Aleti, Gajender
AU - Aguilar, César
AU - Al-Salihi, Suhad A A
AU - Augustijn, Hannah E
AU - Avelar-Rivas, J Abraham
AU - Avitia-Domínguez, Luis A
AU - Barona-Gómez, Francisco
AU - Bernaldo-Agüero, Jordan
AU - Bielinski, Vincent A
AU - Biermann, Friederike
AU - Booth, Thomas J
AU - Carrion Bravo, Victor J
AU - Castelo-Branco, Raquel
AU - Chagas, Fernanda O
AU - Cruz-Morales, Pablo
AU - Du, Chao
AU - Duncan, Katherine R
AU - Gavriilidou, Athina
AU - Gayrard, Damien
AU - Gutiérrez-García, Karina
AU - Haslinger, Kristina
AU - Helfrich, Eric J N
AU - van der Hooft, Justin J J
AU - Jati, Afif P
AU - Kalkreuter, Edward
AU - Kalyvas, Nikolaos
AU - Kang, Kyo Bin
AU - Kautsar, Satria
AU - Kim, Wonyong
AU - Kunjapur, Aditya M
AU - Li, Yong-Xin
AU - Lin, Geng-Min
AU - Loureiro, Catarina
AU - Louwen, Joris J R
AU - Sokolova, Nika
PY - 2023
Y1 - 2023
N2 - With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.
U2 - 10.1093/nar/gkac1049
DO - 10.1093/nar/gkac1049
M3 - Article
SN - 0305-1048
VL - 51
SP - D603
EP - D610
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - D1
ER -