Methodology

Step 1: Interrogating the PB-PENTAdb database

The program uses a sliding window of 5 residues (a pentapeptide) to parse each query protein sequence. For each pentapeptide, it queries PB-PENTAdb and retrieves the list of all local structures (protein blocks, PBs) that the pentapeptide is associated with. If there is no hit, it searches the database for the first 4 residues of the pentapeptide and reports any protein blocks associated with them (see figure below).

figure02b_kpred_database_query_and_hits
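
The query step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual code: PB-PENTAdb is stood in for by a small in-memory dictionary (toy_db) that maps a pentapeptide to the PBs observed for it, and the names build_prefix_index and query_hits, the hit labels "penta", "tetra" and "none", and the toy entries are all hypothetical. The fallback is assumed to pool the PBs of every database pentapeptide that shares the same first 4 residues.

```python
from collections import defaultdict

def build_prefix_index(penta_db):
    """Pool the PB lists of all pentapeptides sharing the same first 4 residues."""
    index = defaultdict(list)
    for pentapeptide, pbs in penta_db.items():
        index[pentapeptide[:4]].extend(pbs)
    return index

def query_hits(sequence, penta_db, prefix_index, window=5):
    """Slide a 5-residue window over the sequence; return one hit per position."""
    hits = []
    for start in range(len(sequence) - window + 1):
        pentapeptide = sequence[start:start + window]
        if pentapeptide in penta_db:
            # Exact pentapeptide hit: report every PB stored for it.
            hits.append((pentapeptide, "penta", list(penta_db[pentapeptide])))
        else:
            # No hit: fall back to the first 4 residues (assumed pooling rule).
            pbs = prefix_index.get(pentapeptide[:4], [])
            hits.append((pentapeptide, "tetra" if pbs else "none", list(pbs)))
    return hits

# Toy stand-in for PB-PENTAdb (hypothetical entries, not real data).
toy_db = {"ACDEF": ["m", "m", "d"], "CDEFK": ["d"], "DEFGA": ["c", "d"]}
toy_index = build_prefix_index(toy_db)
hits = query_hits("ACDEFGHI", toy_db, toy_index)
```

Here the first window (ACDEF) is an exact hit, the next two fall back to their first 4 residues, and the last window has no hit at all, so its PB list is empty.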

Step 2: Methods for selecting among all possible PBs at each position

To select the best PB at each position, two methods were developed: (i) the majority rule method and (ii) the hybrid method. The majority rule method reports the PB that occurs most frequently among the hits at that position. The hybrid method reports the most probable PB by also taking into account the local structures of adjacent residues.
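
As a rough illustration of the majority rule only, the sketch below picks the most frequent PB from the list of candidate PBs at each position. The function name majority_rule, the "-" placeholder for positions without any hit, and the alphabetical tie-break are assumptions made for this example; the text above does not say how the program handles ties or empty positions. The hybrid method is not sketched because its scoring is not described here.

```python
from collections import Counter

def majority_rule(pbs_per_position, unknown="-"):
    """Return one PB letter per position: the most frequent candidate."""
    prediction = []
    for pbs in pbs_per_position:
        if not pbs:
            # No database hit at this position (placeholder is an assumption).
            prediction.append(unknown)
            continue
        counts = Counter(pbs)
        top = max(counts.values())
        # Assumed tie-break: alphabetically first PB among the most frequent.
        prediction.append(min(pb for pb, n in counts.items() if n == top))
    return "".join(prediction)

# Candidate PBs at four consecutive positions (toy data).
predicted = majority_rule([["m", "m", "d"], ["d"], ["c", "d"], []])
```

With this toy input, "m" wins the first position by count, the third position is a tie resolved alphabetically, and the last position has no candidates.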

General scheme

figure02_scheme_for_kpred_predictions_using_pentadb

Hybrid method

figure02c_kpred_hybrid_method