Structure Prediction, Characterization, and Functional Annotation of Uncharacterized Protein BCRIVMBC126_02492 of Bacillus cereus: An In Silico Approach

Bacillus cereus is an enteropathogenic bacterium that is widely distributed in the environment and is mainly associated with food poisoning. In the intestine, B. cereus produces enterotoxins that cause diarrhoea, abdominal distress, and vomiting, as well as a range of infections in humans. BCRIVMBC126_02492 is a functional protein of B. cereus that is related to the oxidation of glutathione persulfide in the mitochondria and to cyanide fixation, and it also has a variety of other biological functions. Nevertheless, protein BCRIVMBC126_02492 has not been explored. Therefore, structure prediction, functional annotation, and characterization of the protein are proposed in this study. Modeller, Swiss-Model, and Phyre2 were used to generate tertiary structures. The structural quality of the protein was assessed by Ramachandran plot analysis, the Swiss-Model Interactive Workplace, and Verify 3D. Furthermore, Z-scores were applied to assess the overall quality of the tertiary model. A comparison of the results showed that the models generated by Modeller were more suitable than those from Phyre2 and Swiss-Model. This investigation decoded the role of this unexplored protein of B. cereus and can therefore pave the way for enriching our knowledge of pathogenesis and of drug and vaccine targeting opportunities against B. cereus infection.


INTRODUCTION
Bacillus cereus is an omnipresent, Gram-positive, spore-forming, rod-shaped, non-capsulated, aerobic or facultative anaerobic bacterium (Sankararaman and Velayuthan, 2013). The saprophytic life cycle of B. cereus takes place mainly in soil. Based on 16S rRNA gene sequences, it is closely related to other members of the B. cereus group, including B. anthracis and B. thuringiensis (Granum,  Generally, B. cereus causes two types of food poisoning: the emetic syndrome, which appears 0.5-6 hours after the ingestion of contaminated food, and the diarrheal syndrome, which appears 8-16 hours after ingestion (Tallent et al., 2015). B. cereus produces the potent toxin cereulide, a small, acid- and highly heat-resistant depsipeptide toxin that poses several challenges for the food industry (Rouzeau-Szynalski et al., 2020). B. cereus causes the most side effects in the microecological preparations. The underlying causes are the overuse of antibiotics in animal feed and of drug additives, which leads to an unbalanced condition in the intestinal micro-ecosystem. This is also responsible for weakened immunity and drug resistance (Berthold-Pluta et al., 2015; Guo et al., 2020). B. cereus inhibits the growth of detrimental bacteria and selectively promotes the activity of the microorganisms that live in the gastrointestinal tract (Riol et al., 2018).
Additionally, B. cereus invigorates and boosts the growth of the host by producing advantageous metabolites (Raymond and Bonsall, 2013). The protein BCRIVMBC126_02492 of B. cereus is associated with the oxidation of glutathione persulfide to glutathione and sulfite in the mitochondria, with cyanide fixation, and with other functions in biological systems. However, its tertiary structure, ligand-binding active sites, and physicochemical characteristics have not yet been reported. Therefore, the tertiary structure of the uncharacterized protein BCRIVMBC126_02492, together with its ligand-binding active sites and functional annotations, is proposed in this study through an in silico approach.

Sequence retrieval
The amino acid sequence of BCRIVMBC126_02492 was obtained from the National Center for Biotechnology Information (NCBI) under the accession ID SCN08319.1. Its 3D structure is not available in the Protein Data Bank (PDB). The 478 amino acid long protein BCRIVMBC126_02492 of B. cereus was therefore taken forward for secondary and tertiary structure modeling, as well as for characterization and functional annotation.
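As an illustration of handling a retrieved sequence, the following minimal sketch parses FASTA-formatted text and reports the sequence length. The record shown is a short invented placeholder, not the SCN08319.1 sequence, and the function name parse_fasta is hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; keep the header without the '>' sign.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Placeholder record: header and residues are invented for illustration only.
example = """>placeholder_protein hypothetical example
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(example)
for header, seq in records.items():
    print(header.split()[0], len(seq))
```

A length check of this kind is a simple way to confirm that the retrieved record matches the expected number of residues before it is submitted to downstream servers.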

Physicochemical characterization
We used two web-based servers to determine the physicochemical properties of the uncharacterized protein. The ProtParam tool was applied to predict the instability index, aliphatic index, amino acid composition, and GRAVY (Gasteiger et al., 2005). In addition, the Sequence Manipulation Suite (SMS) version 2 tool was used to determine the theoretical isoelectric point (pI) (Martin, Garrity and Yao, 2016).
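Two of these parameters have simple closed-form definitions, sketched below as an illustration rather than as the servers' own code. GRAVY is taken as the mean Kyte-Doolittle hydropathy value per residue, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue. The function names and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy scale, used for the GRAVY calculation.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_pct(aa):
        return 100.0 * seq.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))


# Toy peptide, invented for illustration; it is not the study protein.
peptide = "MKTAYIAKQRQISFVKSHFS"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
```

A negative GRAVY value points to an overall hydrophilic sequence, while a higher aliphatic index reflects a larger relative volume of aliphatic side chains.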

Secondary structure prediction
The self-optimized prediction method with alignment (SOPMA) was used to predict secondary structure elements (Combet et al., 2000), and the PSIPRED program (Jones, 1999) was used to predict the secondary structure of BCRIVMBC126_02492. The DISOPRED tool was used for disorder prediction (Thakur and Kumar, 2018).
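Predictions of this kind are usually reported as the percentage of residues in each state. The sketch below summarizes a per-residue state string under the assumption that it uses the one-letter codes h (alpha helix), e (extended strand), t (beta turn), and c (random coil); the example string is invented and is not the output obtained for BCRIVMBC126_02492.

```python
from collections import Counter

# Assumed one-letter state codes; adjust to the format of the actual output.
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def summarize_states(states):
    """Return the percentage of residues assigned to each secondary structure state."""
    counts = Counter(states.lower())
    total = len(states)
    return {
        name: round(100.0 * counts.get(code, 0) / total, 2)
        for code, name in STATE_NAMES.items()
    }


# Invented 20-residue state string, for illustration only.
example_states = "cchhhhhhhhcceeeettcc"
print(summarize_states(example_states))
```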

Tertiary structure modeling and validation
Homology modeling of the protein BCRIVMBC126_02492 of B. cereus was performed because no tertiary structure was available in the Protein Data Bank (PDB). Three tools, namely Modeller (Webb and Sali, 2016) following template selection with the HHpred tool (Zimmermann et al., 2018), Swiss-Model (Gasteiger et al., 2005), and Phyre2 (Kelley et al., 2015), were used to predict the tertiary structure of the protein. The tertiary structures generated by Modeller, Swiss-Model, and Phyre2 were compared, and the most suitable one was selected for final validation. The modeled tertiary structure was validated by Ramachandran plot analysis with PROCHECK and by Verify 3D (https://servicesn.mbi.ucla.edu/Verify3D/). The Swiss-Model Interactive Workplace (https://swissmodel.expasy.org/assess) was also applied for the final quality validation of the tertiary structure. In addition, Z-scores derived from ProSA-web were used to assess the overall quality of the tertiary model.
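A Ramachandran plot is built from the backbone dihedral angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). As an illustration of the underlying geometry only, and not of how PROCHECK is implemented, the sketch below computes a dihedral angle from four points; the function name and the coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points (0 = cis, +/-180 = trans)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond p1 -> p2.
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of the outer bonds perpendicular to the central bond.
    k0 = _dot(b0, b1)
    k2 = _dot(b2, b1)
    v = _sub(b0, (k0 * b1[0], k0 * b1[1], k0 * b1[2]))
    w = _sub(b2, (k2 * b1[0], k2 * b1[1], k2 * b1[2]))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so that the expected angle is easy to check.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying such a function to consecutive backbone atoms of a model gives the (phi, psi) pair for each residue, which validation tools then assign to core, allowed, generously allowed, or disallowed regions.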

Physicochemical characterization
The amino acid sequence of BCRIVMBC126_02492 of B. cereus was retrieved in FASTA format and used as the query sequence for determining the physicochemical parameters. The instability index of BCRIVMBC126_02492 is 34.60 (<40), which indicates that the protein is stable (Guruprasad et al., 1990). The protein is acidic (pI 5.76, 6.04*), with a molecular weight of 54188.16 Da. Its aliphatic index is suggested as a decisive factor for increased thermostability over a wide temperature range. The protein is hydrophilic and likely to interact well with water (Uddin et al., 2017), as indicated by the low grand average of hydropathicity (GRAVY) value (-0.256) shown in Table 1. The amino acid composition, obtained from the ExPASy ProtParam tool, is shown in Table 2. The amino acid composition can help to reveal the active amino acid pocket for drug and vaccine targeting against the protein.
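The interpretation of these values can be written down as a few threshold checks. The sketch below uses the instability cutoff of 40 quoted above and assumes the conventional readings that a negative GRAVY indicates a hydrophilic protein and a pI below 7 an acidic one; the function name is hypothetical.

```python
def interpret(instability_index, gravy, pI):
    """Classify a protein from three physicochemical parameters."""
    return {
        # Instability index below 40 is read as a stable protein.
        "stable": instability_index < 40,
        # Assumed convention: negative GRAVY means overall hydrophilic.
        "hydrophilic": gravy < 0,
        # Assumed convention: pI below 7 means acidic.
        "acidic": pI < 7,
    }


# Values reported in the text for BCRIVMBC126_02492.
print(interpret(instability_index=34.60, gravy=-0.256, pI=5.76))
```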
The uncharacterized protein has several functions. It is related to persulfide dioxygenase, a non-heme iron-dependent oxygenase that catalyzes the oxidation of glutathione persulfide to glutathione and sulfite in the mitochondria and is involved in a variety of biological functions (Sattler et al., 2015). It also has a sulfide dehydrogenase enzymatic function and plays a vital role in cyanide fixation and other processes in biological systems (Spallarossa et al., 2001).

Homology Modeling and Structural Validation
The target sequence of BCRIVMBC126_02492 in FASTA format was submitted to the HHpred template selection tool, and the most suitable template (3TP9_A) was selected from 250 hits, with a probability of 100%, an E-value of 6.7e-53, an SS score of 53.7, 462 aligned columns (Cols), and a target length of 474 (data not shown). The tertiary structure predicted by Modeller was then stored in PDB format (Fig 3). For the assessment of the tertiary structure of the uncharacterized protein, the Ramachandran map generated by PROCHECK (Fig 4) showed that 92.6% of the residues (387) were in the core regions [A,B,L], 6.0% were in the additional allowed regions [a,b,l,p], 1.0% were in the generously allowed regions [~a,~b,~l,~p], and 0.5% were in the disallowed regions. The number of non-glycine and non-proline residues was 418 (taken as 100%); there were 2 end-residues (excl. Gly and Pro), 33 glycine residues, and 20 proline residues, for a total of 473 residues (Table 7). Verify 3D, a tertiary structure assessment tool, showed that the predicted tertiary structure passed the assessment (data not shown). The Swiss-Model Interactive Workplace, another tertiary structure assessment tool, gave a MolProbity score of 3.25 and 95.54% Ramachandran favored residues, with QMEAN (Qualitative Model Energy Analysis), Cβ, all-atom, solvation, and torsion values of -2.57, -2.45, -3.63, -0.43, and -1.89, respectively (data not shown).
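The PROCHECK figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative.

```python
# Residue counts reported for the Modeller model (PROCHECK summary).
core = 387          # residues in the core regions [A,B,L]
non_gly_pro = 418   # non-glycine, non-proline residues (taken as 100%)
end_residues = 2    # end-residues (excl. Gly and Pro)
glycine = 33
proline = 20

# Total residues in the model.
total = non_gly_pro + end_residues + glycine + proline

# Core percentage is expressed relative to non-Gly, non-Pro residues.
core_pct = round(100.0 * core / non_gly_pro, 1)

print(total, core_pct)  # 473 92.6
```

Both results agree with the reported total of 473 residues and the 92.6% of residues in the core regions.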

CONCLUSION
This study concludes that the structural model of the protein BCRIVMBC126_02492, with its predicted active sites for ligand binding, is useful for understanding the nature of the protein. The predicted physicochemical parameters and the functional annotation are useful for understanding the activity of this protein. The homology-modeled protein provides insights into the functional role of BCRIVMBC126_02492 in pathogenesis, which would help in designing potential therapeutic drugs against the protein.

ACKNOWLEDGEMENTS
We are grateful to the Department of Biochemistry and Molecular Biology, and the Department of Pharmacy of Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh, for supporting this study.

CONFLICTS OF INTEREST
The authors declare no conflict of interest.