CMSC 435 Algorithm Analysis & Design

RNA Folding

We want to minimize the free energy of an RNA molecule by maximizing the number of pairs.
Constraints – Let S be the maximal set of pairs in the RNA folding problems. Then we must have:
  1. (No sharp turns) The ends of each pair are separated by at least 4 intervening bases. If (bi,bj) ϵ S, then j – i > 4

  2. Only bases A and T or G and C will pair with one another.

  3. S is a matching. No base appears in more than 1 pair.

  4. (The no crossing condition) If (bi,bj) and (bk,bh) are two pairs in S, then we cannot have i < k < j < h.

         Then any pair of bases that satisfy the above constraints may or may not be paired.

         If we let OPT (i, j) represent the maximum number of pairs in the sub-string from bi to bj then the optimal folding will either leave bj unpaired, or pair it with one of the bases bt, where t lies in the permissible range between i and j - 4

         OPT(i, j) = max[OPT(i, j – 1), max(i < t < j - 4)(1 + OPT(i, t -1) + OPT(t + 1, j – 1))]

        Construct 2 tables of size n+1 by n+1 – one to hold the number of base pairs and one to hold the identity of the paired bases.

        Let b = <b1, b2, b3, ...bn> be a sequence of bases in an RNA molecule – denoted by the characters A, C, G, T where A and T can pair with each other and C and G can form pairs.

                 Algorithm RNAfold (p) {

                       // initialize the forbidden pairs to 0;

                       for(int i = 1; i <= n – 5; i++)

                            for (int j = 0; j < i + 4; j++) {

                                 M[i][j] = 0;

                                P[i][j] = 0;

                            }

                       }

                      for (int k = 5; k < n; k++) //examine all allowable separations

                            for (i = 1; i <= n – k; i++) {

                                j = i + k;

                               int max = 0, pair;

                              for (int t = i; t < j – k; t++) {

                                  if (b[t] &&b[j] in the same set) //the two bases can be paired

                                       int temp = 1 + M[i][t-1] + M[t+1][j-1];

                                        if (temp > max) {

                                            max = temp; pair = t;

                                        }

                                  } // end if in same set

                                  if (max > M[i][j-1]) {

                                     M[i][j] = max;

                                     P[i][j] = pair;

                                 } // end if

                                 else {

                                    M[i][j] = M[i][j-1];

                                    P[i][j] = 0;

                                 } // end else

                              } //end for t

                            } // end for i

                      } //end for k

                 } // end algorithm