Example 2.  SARS nsp9 (polymerase)

I. Using Fold Recognition to find the appropriate templates
II. Model Building - 1st trial
III. Model evaluation

Primary sequence of nsp9:

>nsp9
SADASTFLNRVCGVSAARLTPCGTGTSTDVVYRAFDIYNEKVAGFAKFLK
TNCCRFQEKDEEGNLLDSYFVVKRHTMSNYQHEETIYNLVKDCPAVAVHD
FFKFRVDGDMVPHISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYN
CCDDDYFNKKDWYDFVENPDILRVYANLGERVRQSLLKTVQFCDAMRDAG
IVGVLTLDNQDLNGNWYDFGDFVQVAPGCGVPIVDSYYSLLMPILTLTRA
LAAESHMDADLAKPLIKWDLLKYDFTEERLCLFDRYFKYWDQTYHPNCIN
CLDDRCILHCANFNVLFSTVFPPTSFGPLVRKIFVDGVPFVVSTGYHFRE
LGVVHNQDVNLHSSRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAA
LTNNVAFQTVKPGNFNKDFYDFAVSKGFFKEGSSVELKHFFFAQDGNAAI
SDYDYYRYNLPTMCDIRQLLFVVEVVDKYFDCYDGGCINANQVIVNNLDK
SAGFPFNKWGKARLYYDSMSYEDQDALFAYTKRNVIPTITQMNLKYAISA
KNRARTVAGVSICSTMTNRQFHQKLLKSIAATRGATVVIGTSKFYGGWHN
MLKTVYSDVETPHLMGWDYPKCDRAMPNMLRIMASLVLARKHNTCCNLSH
RFYRLANECAQVLSEMVMCGGSLYVKPGGTSSGDATTAYANSVFNICQAV
TANVNALLSTDGNKIADKYVRNLQHRLYECLYRNRDVDHEFVDEFYAYLR
KHFSMMILSDDAVVCYNSNYAAQGLVASIKNFKAVLYYQNNVFMSEAKCW
TETDLTKGPHEFCSQHTMLVKQGDDYVYLPYPDPSRILGAGCFVDDIVKT
DGTLMIERFVSLAIDAYPLTKHPNQEYADVFHLYLQYIRKLHDELTGHML
DMYSVMLTNDNTSRYWEPEFYEAMYTPHTVLQ
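
Before submitting a sequence to any server, it helps to confirm that the FASTA record was copied completely. The following is a minimal Python sketch and not part of the original tutorial. The function name parse_fasta is an assumption made for illustration, and the toy input contains only the first and last lines of the record above.

```python
def parse_fasta(text):
    """Return a dict mapping each record name to its joined sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()  # header text without the '>'
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence lines may be wrapped
    return {n: "".join(parts) for n, parts in records.items()}


# Toy input: only the first and last sequence lines of the nsp9 record.
example = """>nsp9
SADASTFLNRVCGVSAARLTPCGTGTSTDVVYRAFDIYNEKVAGFAKFLK
DMYSVMLTNDNTSRYWEPEFYEAMYTPHTVLQ
"""

seqs = parse_fasta(example)
print(len(seqs["nsp9"]))  # 50 + 32 = 82 residues in this shortened example
```

Running the same function on the full record gives the length of the target, which is useful later when deciding how much of the sequence a template covers.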

Is it possible to use homology modelling to predict the structure of nsp9? Why not?

I. Using Fold Recognition to find the appropriate templates

1. Go to bioinfo.pl meta-server.
Enter your email address, target name and the primary sequence of the target protein.



2. Please do not press the submit button now, because we do not want to overload the server. I have already submitted the job. Click here to view the result.



3. The results from the different servers are scored by the "3D-Jury" system. In the default setting, the top predictor (according to the 3D-Jury score) is BasD. Several top scorers identified 1khv as the template structure for nsp9. The PDB code 1khv corresponds to the crystal structure of the RNA-dependent RNA polymerase of an RNA virus, rabbit hemorrhagic disease virus (RHDV). Also notice that only the C-terminal region of nsp9 can be aligned with 1khv, suggesting that the ~500 residues at the C-terminus of nsp9 may form a polymerase domain.
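
The idea behind a consensus score such as 3D-Jury can be illustrated with a short Python sketch: a model scores highly when many other models resemble it. This is a simplified illustration and not the server's implementation. The similarity matrix, the model labels, the threshold of 40 and the normalisation by the number of models plus one are all assumptions chosen for the example, and the pairwise similarities (counts of matching C-alpha pairs after superposition) are taken as given rather than computed.

```python
def consensus_scores(sim, threshold=40):
    """Score each model by its summed similarity to all other models.

    sim[i][j] is assumed to hold the number of equivalent C-alpha pairs
    between models i and j. Similarities below `threshold` are ignored.
    """
    n = len(sim)
    scores = []
    for i in range(n):
        total = sum(
            sim[i][j]
            for j in range(n)
            if j != i and sim[i][j] >= threshold
        )
        scores.append(total / (n + 1))
    return scores


# Hypothetical similarity matrix for four models (made-up numbers).
labels = ["model_A", "model_B", "model_C", "model_D"]
sim = [
    [0, 120, 100, 10],
    [120, 0, 90, 20],
    [100, 90, 0, 30],
    [10, 20, 30, 0],
]

scores = consensus_scores(sim)
for label, score in sorted(zip(labels, scores), key=lambda p: -p[1]):
    print(label, score)
# model_A 44.0
# model_B 42.0
# model_C 38.0
# model_D 0.0
```

In this toy case model_D agrees with none of the others and scores zero, while the three mutually similar models rise to the top. This is why several independent servers pointing at the same template is encouraging.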

4. You can download the CA trace by clicking the [pdb] button on the right. If you have a license key for MODELLER, you can ask the server to build the full-atom model by clicking the [prog] button. You can obtain the license key for MODELLER at http://salilab.org/modeller/. The academic license is free of charge.

II. Model Building - 1st trial

1. In this tutorial, we will stick to SWISS-MODEL for model building.

2. Firstly, the N- and C-terminal residues were trimmed to simplify the alignment:


4. Now start SWISS-PDBViewer by double-clicking spdbv.exe. Load the primary sequence (nsp9C.fas) and the template structure (1khvA.pdb) into the program. I have converted the above alignment into foldfit format. After loading the foldfit alignment, you will notice that the sequence of nsp9 is threaded onto the structure of 1khvA.
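
The trimming described in step 2 can also be scripted. The sketch below is a minimal Python example under stated assumptions: the residue boundaries used in this tutorial are not given in the text, so start and end are hypothetical parameters that you must choose from the fold-recognition alignment, and the helper names trim and to_fasta are made up for illustration.

```python
def trim(seq, start, end):
    """Return residues start..end of seq (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("start/end must satisfy 1 <= start <= end <= len(seq)")
    return seq[start - 1:end]


def to_fasta(name, seq, width=50):
    """Format a sequence as a FASTA record wrapped at `width` residues."""
    lines = [">" + name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


# Toy demonstration on a 10-letter string; the boundaries are placeholders.
toy = "ABCDEFGHIJ"
core = trim(toy, start=3, end=7)
print(core)                           # CDEFG
print(to_fasta("toy_trimmed", core))  # header line followed by CDEFG
```

Writing the returned text to a file such as nsp9C.fas gives an input that SWISS-PDBViewer can load.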

5. Now submit the modelling request to SWISS-MODEL.

III. Model evaluation

1. So far, we have been able to predict the structure with only minimal human intervention. This may give you the false impression that protein structures can be predicted automatically. However, never trust an automatically generated structure - you should inspect the alignment and the predicted structure manually to avoid obvious mistakes and to improve the prediction.

2. To save time, I have already done the first round of model building, and the result can be found here (round1.pdb).

3. Now check the alignment. In particular, pay attention to gaps in the alignment. There is a big gap between Ser-335 and Asp-354 (residue numbers correspond to the sequence of 1khv). This results in the insertion of a big loop between strands A and B (see figure):



4. Is this insertion reasonable? How do you judge whether it is reasonable or not? It is time to do a literature search. The more you understand the target, the better your chance of predicting the structure correctly. In fact, Asp-354 and Asp-355 are the active site residues of the polymerase. The insertion, which would block the active site, is very likely to be wrong.
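
This kind of check can be partly automated. The Python sketch below scans a pairwise alignment for insertions in the target (runs of gap characters in the template row) and flags those that flank known functional residues. It is an illustration only: the function names, the window of 2 residues and the toy alignment are assumptions, and only the active site positions 354 and 355 (1khv numbering) come from the text above.

```python
def find_insertions(target_aln, template_aln, template_start=1):
    """List insertions in the target relative to the template.

    Returns tuples (left, right, length): the template residue numbers
    flanking the insertion and the number of inserted target residues.
    `right` is None for an insertion after the last template residue.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned rows must have equal length")
    insertions = []
    resnum = template_start - 1  # number of the last template residue seen
    run = 0                      # length of the current insertion
    for t_char, p_char in zip(target_aln, template_aln):
        if p_char == "-":
            if t_char != "-":
                run += 1
        else:
            if run:
                insertions.append((resnum, resnum + 1, run))
                run = 0
            resnum += 1
    if run:
        insertions.append((resnum, None, run))
    return insertions


def near_site(insertion, site_residues, window=2):
    """True if either flanking residue lies within `window` of a site residue."""
    flanks = [r for r in insertion[:2] if r is not None]
    return any(abs(f - s) <= window for f in flanks for s in site_residues)


# Toy alignment (not the real nsp9/1khv alignment); template starts at 350.
target_row = "ABCDXXXXEFG"
template_row = "ABCD----EFG"
active_site = [354, 355]  # Asp-354 and Asp-355, 1khv numbering

found = find_insertions(target_row, template_row, template_start=350)
for ins in found:
    print(ins, near_site(ins, active_site))
# (353, 354, 4) True
```

A flagged insertion is not automatically wrong, but it marks a region where the alignment deserves a manual look and a comparison with alternative alignments.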

5. Now it is time to check the result from bioinfo.pl again. Here is the alignment of the region for the top 10 scorers of bioinfo.pl server:


In this region, the alignment of the 3rd scorer is better. In this case, the insertion is moved away from the active site:


6. You can resubmit the job to SWISS-MODEL to see your result. The result is shown here.