Methodology
Step 1: Interrogating the PB-PENTAdb database
The program slides a window of five residues (a pentapeptide) along each query protein sequence. For each pentapeptide, it queries PB-PENTAdb and retrieves the list of all local structures (protein blocks, PBs) associated with that pentapeptide. If the pentapeptide has no hit, the program searches the database for its first four residues instead and reports any protein blocks associated with them (see figure below).
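The lookup described above can be sketched in Python as follows. This is a minimal illustration, not the program's actual code: the dictionaries `penta_db` and `tetra_db` are hypothetical in-memory stand-ins for PB-PENTAdb, and the function name `candidate_pbs` is an assumption. How the real database indexes four-residue fragments is not stated here, so the sketch simply assumes a separate tetrapeptide index.

```python
from typing import Dict, List, Tuple


def candidate_pbs(
    sequence: str,
    penta_db: Dict[str, List[str]],
    tetra_db: Dict[str, List[str]],
    window: int = 5,
) -> List[Tuple[str, List[str]]]:
    """Return (pentapeptide, candidate PBs) for every window position.

    penta_db and tetra_db are hypothetical stand-ins for PB-PENTAdb:
    they map a peptide string to the list of protein blocks observed for it.
    """
    results = []
    # Slide a window of `window` residues along the sequence, one residue at a time.
    for start in range(len(sequence) - window + 1):
        penta = sequence[start:start + window]
        hits = penta_db.get(penta)
        if not hits:
            # No pentapeptide hit: fall back to the first four residues (assumed index).
            hits = tetra_db.get(penta[:4], [])
        results.append((penta, list(hits)))
    return results


# Toy data, invented purely for illustration.
toy_penta_db = {"ACDEF": ["m", "m", "d"]}
toy_tetra_db = {"CDEF": ["a"]}
toy_result = candidate_pbs("ACDEFG", toy_penta_db, toy_tetra_db)
```

With the toy data, the first window (`ACDEF`) is found directly, while the second (`CDEFG`) is absent and is resolved through its first four residues (`CDEF`). A sequence shorter than five residues yields an empty list.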
Step 2: Methods for selecting among all possible PBs at each position
To select the best PB at each position, two methods were developed: (i) the majority rule method and (ii) the hybrid method. The majority rule method reports the PB that occurs most frequently among the candidates at that position. The hybrid method reports the most probable PB by also taking into account the local structures of adjacent residues.
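The majority rule can be illustrated with a short Python sketch. The function name `majority_rule` is hypothetical, and the tie-breaking behaviour (the PB encountered first among equally frequent ones is kept) is an assumption of this sketch, since the text does not say how ties are resolved. The hybrid method is not sketched because its scoring of adjacent residues is not specified here.

```python
from collections import Counter
from typing import List, Optional


def majority_rule(candidates: List[str]) -> Optional[str]:
    """Return the most frequent PB among the candidates, or None if there are none.

    Assumption: ties are resolved in favour of the PB that appears first,
    which is how Counter.most_common orders equal counts.
    """
    if not candidates:
        return None
    return Counter(candidates).most_common(1)[0][0]


# Toy candidate list, invented for illustration: "m" occurs twice, "d" once.
toy_choice = majority_rule(["m", "d", "m"])
```

Applied to the output of a database lookup, the function would be called once per window position on that position's candidate list.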
Figure: General scheme
Figure: Hybrid method