In silico Structural Modelling of Ribokinase from Salmonella Typhi

The knowledge of identifiable differences in metabolism and macromolecular structure between infective agents and their hosts can be exploited in rational drug design. Ribokinase, an enzyme that plays an important role in the phosphorylation of several metabolites, is one such target. This study was therefore aimed at structurally modelling ribokinase from Salmonella Typhi, the causative agent of typhoid fever, which has several known multidrug-resistant strains. A similarity search was run with NCBI BLASTp against the Protein Data Bank (PDB). Multiple sequence alignment between the query sequence and the templates was carried out using Clustal Omega and MEGA6.0 software. The amino acid sequence was submitted to modelling servers. The predicted models from the servers were evaluated with RAMPAGE and superimposed on the template using PyMOL. The model with the highest Ramachandran plot score was further validated. The BLASTp result showed a low identity (41%) with pyridoxal kinase from Trypanosoma brucei in the PDB. Conserved sequence motifs were confirmed. Template 4X8F was chosen based on its high identity, query cover and appearance in the modelling tools. The SWISS-MODEL model showed the best Ramachandran plot score (94.9%). ERRAT analysis showed a quality factor of 92.9078, and the VERIFY3D server showed that 84.43% of the residues have an average 3D-1D score >= 0.2. Superimposition confirmed the alignment of the active site residues, with aspartic acid as the catalytic residue. This study can serve as a basis for rational drug design for the treatment of typhoid fever.
Hassana Abubakar, Yakubu Ndatsu, Achimugu Dickson Musa, Cyril Ogbiko et al. http://www.earthlinepublishers.com


Introduction
Salmonella enterica serovar Typhi (Salmonella Typhi), formerly known as Salmonella typhi, is a Gram-negative bacterium belonging to the family Enterobacteriaceae and to the first class of Salmonella serovars, Salmonella enterica subspecies enterica (I) [1][2][3]. This organism is a human-specific pathogen that causes typhoid fever, a systemic febrile illness acquired by consuming water or food contaminated with human faeces [1,4]. In developing nations where sanitation is poor, typhoid fever is a major health problem, causing many deaths [5]. Infection occurs when Salmonella Typhi from contaminated water or food enters the gastrointestinal tract of the host [6]. The organism reaches the distal ileum and then the specialized intestinal epithelial M cells of the Peyer's patches. After the intestinal invasion, the pathogen moves into the mesenteric lymph nodes and persists in macrophages. Through the blood and lymph systems, the organism reaches the liver, spleen and bone marrow, where it multiplies to produce systemic infection [7,8].
The treatment of typhoid fever with various antibiotics such as chloramphenicol, ampicillin, co-trimoxazole, fluoroquinolones, cephalosporins and azithromycin, among others, has been reported [9][10][11]. However, multidrug-resistant (MDR) S. Typhi strains have resurfaced and are rapidly spreading worldwide, resulting in high rates of morbidity and mortality [9]. According to Ogbunude et al. [12], post-genome bioinformatics and experimental research can be used to discover drug targets more easily. Identifiable dissimilarities between an infective agent and its host with respect to metabolism and macromolecular structure provide scope for detailed characterization of target proteins and other macromolecules for the purpose of rational drug design [13]. Salmonella has been reported to exploit diverse nutrients in the host cell to survive, including various carbohydrates, lipids, nucleosides, amino acids and pro-vitamins [14]. D-ribose is one of these nutrients, and its metabolism is of great interest because of its use in the production of nucleic acids and various cofactors. Before D-ribose can be used in the cell, it must be phosphorylated to D-ribose-5-phosphate by the enzyme ribokinase (EC 2.7.1.15) [15]. A thorough knowledge of the properties of enzymes that are unique to an infective agent is essential in order to design specific compounds that target them [13]. Ribokinase is classified in the phosphofructokinase B (PfkB) family of sugar kinases [16]. The enzyme plays a vital role in the phosphorylation of ribose in the presence of ATP and magnesium. The product, D-ribose-5-phosphate, can then be used in the biosynthesis of nucleotides, tryptophan and histidine, or as an intermediate in the pentose phosphate pathway. Ribokinase is also essential in the recycling of sugar generated from nucleotide degradation [15].
The catalytic mechanism of ribokinase in the metabolism of D-ribose is important, as D-ribose is the precursor of nucleic acids and various cofactors. Crystal structures of ribokinase from organisms such as E. coli [15] and human [17] have been determined by crystallography and have explained the role of ribokinase in the metabolism of ribose. Similarly, the activity of ribokinase from protozoa has been studied using wet-laboratory methods [18]. Although Zhang et al. [19] reported the 3-dimensional structure of an aminoimidazole riboside kinase from Salmonella enterica, believed to have evolved from the ribokinase superfamily, to the best of our knowledge there is no structural model of ribokinase from Salmonella Typhi. Thus, this study was aimed at determining the 3-dimensional structure of ribokinase from Salmonella Typhi, which we believe could be useful in structure-based drug design.

Sequence retrieval and analysis
The amino acid sequence (404 residues) of ribokinase from Salmonella Typhi, with accession number Q8Z239, was obtained from UniProtKB [20] in FASTA format. The sequence was submitted to NCBI for BLASTp analysis [21], labelled as ribokinase, and run against proteins in the Protein Data Bank (PDB) [22]. Conserved regions between the query sequence and the templates were confirmed by multiple sequence alignment using Clustal Omega [23] and MEGA6.0 software [24].
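As a minimal sketch of the first step, a FASTA record can be read and its length checked before submission to a search or modelling server. The function name `read_fasta` and the toy record below are hypothetical illustrations; the toy sequence is not the Q8Z239 sequence.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical toy record, used only to show the call.
toy_fasta = ">toy|example\nMSTGG\nKGCNQ\n"
sequences = read_fasta(toy_fasta)
for name, seq in sequences.items():
    print(name, len(seq))
```

Checking the parsed length against the expected residue count is a cheap way to catch a truncated download before running BLASTp.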

Homology modeling
The amino acid sequence was submitted to the SWISS-MODEL [25], Phyre2 [26], RaptorX [27] and I-TASSER [28] servers for model prediction of ribokinase. The models from the four servers were evaluated with RAMPAGE. The model structure was visualized using PyMOL software [29].

Model validation
The predicted models from each server were assessed and validated to determine the model with the best quality profile. Ramachandran plot analysis was done to check the stereochemical features of the predicted 3-dimensional structures. The model with the highest Ramachandran plot score was further validated with ERRAT analysis and the VERIFY3D server.
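A Ramachandran plot is built from the backbone dihedral angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue) of each residue. The sketch below shows how one such dihedral can be computed from four atom positions. It is an illustration only, not the RAMPAGE implementation, and the coordinates used are made-up test points rather than atoms from the model.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def dot(a, b):
        return sum(a[i] * b[i] for i in range(3))

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)  # central bond
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = [b0[i] - dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - dot(b2, b1) * b1[i] for i in range(3)]
    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up points: central bond along z, outer atoms at right angles.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Each residue's (phi, psi) pair is then classified as favoured, allowed or outlier by comparing it with empirically derived regions, which is the part that servers such as RAMPAGE supply.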

Sequence analysis
The similarity search carried out using NCBI BLASTp showed a low identity of 40% (Table 1) with other proteins deposited in the PDB. It also showed specific hits, superfamily hits and the conserved domain (Figure 1). Multiple sequence alignment between the query, 6CW5, 2FV7, 1VM7, 3RY7, 1RK2 and 4X8F sequences using Clustal Omega and MEGA6.0 software confirmed the active site residues along with the putative conserved amino acids (Figures 2 and 3).
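The two quantities reported by the similarity search, percent identity and query cover, can be illustrated with a short calculation on a gapped pairwise alignment. This is a sketch assuming identity is counted over the alignment length and cover over the full query length; the function name and the aligned strings are hypothetical and are not taken from Table 1.

```python
def identity_and_coverage(aligned_query, aligned_subject, query_length):
    """Return (percent identity, percent query cover) for one gapped alignment."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    query_residues = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q != "-":
            query_residues += 1  # query position covered by the alignment
            if q == s:
                matches += 1     # identical residue in both sequences
    identity = 100.0 * matches / len(aligned_query)
    coverage = 100.0 * query_residues / query_length
    return identity, coverage


# Hypothetical aligned fragments ('-' marks a gap); query assumed 20 residues long.
identity, coverage = identity_and_coverage("GGKGCNQ-LA", "GGKGANQSLA", 20)
print(identity, coverage)
```

A hit can therefore have a moderate identity and still be a poor template if the aligned region covers only a small part of the query.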

Homology Modelling
Among all the 3D models predicted by the different modelling servers, the model from the SWISS-MODEL server (Figure 4) had the highest Ramachandran plot score, with 94.9% of residues in the favoured region, 4.1% in the allowed region and 1.0% in the outlier region (Table 2). The cartoon and surface views of the predicted model are shown in Figures 4 and 5 respectively. Among the templates, 4X8F was chosen based on its high identity, query cover and appearance in the modelling tools. The RAMPAGE results for the models from each server were analysed and compared based on the percentage of residues in the favoured, allowed and outlier regions.
Figure: Model (blue) superimposed with the template (yellow); superimposition of active site residues for model (green) and template (blue).

Model validation
ERRAT analysis and the VERIFY3D server were used to further validate the model predicted by the SWISS-MODEL server, since it had the highest Ramachandran plot score. ERRAT analysis showed that the model is of high quality: the amino acid residues shown in green are within the accepted value (Figure 8), and the model had an overall score of 92.9078%. The VERIFY3D result also confirmed the quality of the model, showing that about 80% of the residues scored >= 0.2 in the 3D/1D profile (Figure 9). In addition, the phylogenetic tree showing the evolutionary relationship between ribokinase from S. Typhi and other ribokinases in the gene bank is shown in Figure 10.
Figure 8. ERRAT analysis showing the overall quality of the model. The two lines drawn on the axis of the ERRAT error values indicate the confidence level with which it is possible to reject regions that exceed that error value.
Figure 9. Profile for 3-dimensional identity of the model.


Discussion
From the sequence analysis result, the similarity search using NCBI BLAST shows that the query sequence belongs to the phosphofructokinase B (PfkB) family (Figure 1). Members of this protein family catalyze ATP-dependent phosphorylation reactions [30,31]. The BLASTp result in Table 1 showed a low identity of 41% with pyridoxal kinase from Trypanosoma brucei (3ZS7), an enzyme that transfers the gamma-phosphate of ATP in vitamin B6 biosynthesis [32] and belongs to the ribokinase superfamily. However, this enzyme is not found among the templates, which could be due to its low query cover of 12%. Multiple sequence alignment between the query sequence and the templates generated from the BLAST results, using Clustal Omega and MEGA software, confirmed the conserved regions. Members of this family of enzymes have well-conserved residues [15,30,33], as seen in Figures 2 and 3. The active site residues correspond to GGKGANQ and AGD, although the first Ala (A) present in the active site of the enzyme from other organisms is replaced by Cys (C) in the amino acid sequence of ribokinase from S. Typhi. The second Ala (A), together with the last Gly (G), Asp (D) and Gln (Q), is among the residues reported to form indirect hydrogen bonds, through their main-chain nitrogen and a water molecule, with the oxygen of the ribose. The first Gly (G), Lys (K) and Asn (N) are involved in direct hydrogen bonding with the ribose sugar, and the remaining Gly (G) residues are needed for conformational and steric reasons [15].
Figures 2 and 3 also show an absolutely conserved motif (blue highlight) known as NXXE. This motif is a common feature of the ribokinase family [16,34]. The asparagine residue (N) and the glutamic acid residue (E) have been found to be related to phosphate and metal binding to the enzyme respectively [30]. Park and Gupta [16] proposed that when a phosphate or an activator compound binds to the NXXE motif, it facilitates binding of free Mg2+ and the substrate adenosine to the active site. Through site-directed mutagenesis, Maj et al. [34] discovered the critical role that the asparagine and glutamic acid residues of the NXXE motif play in the binding of the phosphate and magnesium ions needed for adenosine kinase activity. In addition, mutation of the glutamic acid residue in the NXXE motif of phosphofructokinase-2 from E. coli results in an enzyme with very low affinity for magnesium [35].
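The conserved motifs discussed above can be located in any protein sequence with a simple pattern scan. The sketch below is an illustration under stated assumptions: the helper `find_motif` and the toy sequence are hypothetical, and the pattern `GGKG[AC]NQ` simply encodes the Ala-to-Cys replacement described in the text; it is not a curated family signature.

```python
import re


def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for every match, including overlaps."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


NXXE = "N..E"              # Asn, any two residues, Glu
GLY_RICH = "GGKG[AC]NQ"    # Ala or Cys at the fifth position (assumption from the text)
AGD = "AGD"

# Hypothetical toy sequence, not the S. Typhi ribokinase sequence.
toy = "MSTGGKGCNQAVLNAGEWTAGD"

print(find_motif(toy, GLY_RICH))
print(find_motif(toy, NXXE))
print(find_motif(toy, AGD))
```

The lookahead form reports overlapping hits as well, which matters for short motifs such as NXXE that can occur more than once in close succession.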
The active site residues of the model and the template (4X8F) were found to be well aligned by superimposition using PyMOL. The residues of ribokinase from S. Typhi, with their V. cholerae counterparts in parentheses, were Gly128 (Gly41), Gly129 (Gly42), Lys130 (Lys43), Gly131 (Gly44), Cys132 (Ala45), Asn133 (Asn46), Gln134 (Gln47) and Ala341 (Ala253), Gly342 (Gly254), Asp343 (Asp255) (Figure 8). The aspartic acid in the active site (Asp343) has been reported to be strictly conserved in this family of enzymes. Reduced or completely lost activity has been observed in kinases from different organisms when this residue was mutated to other amino acids [30]. This residue is proposed to act as the catalytic base that removes a proton from the ribose O5'-hydroxyl group. The O5' atom, now negatively charged, then makes a nucleophilic attack on the γ-phosphate of ATP, resulting in a pentacovalent transition state that is stabilized by the anion hole in which the γ-phosphate of ATP is situated [15,16,33]. Using PyMOL, the 3-dimensional structure of the ribokinase from S. Typhi is shown in Figure 4 (cartoon view) and Figure 5 (surface view). The active site is shown in Figure 6, and the superimposition of the model and the template in Figure 7. Sigrell et al. [15] determined the first 3D crystal structure of ribokinase, from E. coli. Studies have found that most members of this enzyme family exist structurally as dimers, with each monomer having two domains. A nine-stranded β-sheet flanked by 10 α-helices constitutes the large domain, where all the interactions with ribose and ATP are made. The smaller domain is made up of four β-strands that form a cover for the active site. A flattened β-barrel is formed by the dimer interaction between the two β-sheets [15,33,36].
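The residue correspondences listed above can be tabulated to check the numbering offset between the model and the template and to flag substituted positions. The sketch below uses only the residue pairs given in the text; the variable and function names are hypothetical.

```python
# (model residue, model number, template residue, template number), from the text.
ACTIVE_SITE_PAIRS = [
    ("Gly", 128, "Gly", 41), ("Gly", 129, "Gly", 42), ("Lys", 130, "Lys", 43),
    ("Gly", 131, "Gly", 44), ("Cys", 132, "Ala", 45), ("Asn", 133, "Asn", 46),
    ("Gln", 134, "Gln", 47), ("Ala", 341, "Ala", 253), ("Gly", 342, "Gly", 254),
    ("Asp", 343, "Asp", 255),
]


def numbering_offsets(pairs):
    """Model residue number minus template residue number for each pair."""
    return [model_no - template_no for _, model_no, _, template_no in pairs]


def substitutions(pairs):
    """Pairs where the model and the template carry different residue types."""
    return [(f"{m}{m_no}", f"{t}{t_no}") for m, m_no, t, t_no in pairs if m != t]


offsets = numbering_offsets(ACTIVE_SITE_PAIRS)
print(sorted(set(offsets)))              # distinct offsets across the two motifs
print(substitutions(ACTIVE_SITE_PAIRS))  # positions that differ in residue type
```

With the numbers quoted above, the first motif is offset by 87 and the second by 88, so the two sequences differ by a net one residue somewhere between the motifs, and the only substituted position is Cys132 in place of Ala45.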
The model predicted by SWISS-MODEL compared favourably with the models from the other servers, with 94.9% of the residues falling in the favoured region, 4.1% in the allowed regions and just 1.0% in the outlier region (Table 2). This indicates good stereochemical quality [37]. ERRAT analysis, carried out to check the backbone conformation of the model, indicated high quality, with the amino acid residues shown in green falling within the accepted value (Figure 8). The accepted range for ERRAT is a score above 50%, and the higher the score, the better the accuracy and quality of the model structure [38]. The predicted model has a score of 92.9078%, which indicates a structural conformation of high quality. To further assess the quality of the predicted model, it was submitted to the VERIFY3D server. The results show that 84.43% of the residues have an average 3D-1D score >= 0.2 (Figure 9), which exceeds the literature requirement that at least 80% of the amino acid residues in the structure have a score of >= 0.2 [39]. This result supports the reliability of the predicted model. The phylogenetic tree is shown in Figure 10, where the enzyme is found among the PfkB family proteins.
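The acceptance criteria used in this paragraph can be written as a small check. The sketch below applies the thresholds stated in the text (ERRAT above 50%, at least 80% of residues with a 3D-1D score >= 0.2) to the reported values; the function name and the extra sanity check that the Ramachandran percentages sum to about 100 are illustrative additions, not part of any server's output.

```python
def validation_report(errat_quality, verify3d_percent, rama_percentages,
                      errat_min=50.0, verify3d_min=80.0):
    """Compare model quality scores with the thresholds quoted in the text."""
    favoured, allowed, outlier = rama_percentages
    return {
        "errat_ok": errat_quality > errat_min,
        "verify3d_ok": verify3d_percent >= verify3d_min,
        # Sanity check on the table values, allowing for rounding.
        "rama_sums_to_100": abs(favoured + allowed + outlier - 100.0) < 0.1,
    }


# Values reported for the SWISS-MODEL model in this study.
report = validation_report(92.9078, 84.43, (94.9, 4.1, 1.0))
print(report)
```

Keeping the thresholds as parameters makes it easy to apply the same check to the models from the other servers once their scores are available.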

Conclusion
From the model validation results, the 3D structure of ribokinase predicted by SWISS-MODEL in this study is a good model. This is further supported by the superimposition of the model on the template, especially the alignment of the active site residues. This study can hence serve as a guide for rational drug design aimed at the treatment of typhoid fever by targeting the aspartic acid residue that acts as the catalytic base in the active site. In essence, inhibition of ribokinase would prevent the pathogen from utilizing ribose from the host cell for the synthesis of nucleic acids.