and
This cleaned things up relatively well ... although there are still large chunks of unmeasured space. These are due to the low-entropy cutoff, and I'm not sure there's much I can do about that one :(. The only fix I can think of would be to allow a window to grow indefinitely while it's below the entropy cutoff ... right now, if the "conserved window" is greater than 5 AAs, I just skip past it. While this would fix my problem, I'm not sure it's worth the time it would take to re-compute everything (and it wouldn't add much biological significance).
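To make the trade-off concrete, here is a rough sketch of the window-growing rule. It is not my actual pipeline code: the names `column_entropy` and `grow_window` are made up for illustration, and I'm assuming the window's entropy is just the sum of per-column Shannon entropies, which may not match how the real cutoff is computed.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, given as a string of residues."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def grow_window(entropies, start, cutoff, max_width=5):
    """Extend a window from `start` until its summed entropy reaches `cutoff`.

    Returns (start, end) with `end` exclusive. Returns None when the window
    would have to grow past `max_width` columns (the current "skip past it"
    behaviour) or when it runs off the end of the sequence.
    Pass max_width=None to let the window grow indefinitely.
    """
    total = 0.0
    for end in range(start, len(entropies)):
        total += entropies[end]  # assumption: window entropy = sum of column entropies
        width = end - start + 1
        if total >= cutoff:
            return (start, end + 1)
        if max_width is not None and width >= max_width:
            return None  # conserved stretch is too long, so skip it
    return None
```

With `max_width=5` a long conserved stretch comes back as `None` (the unmeasured gaps), while `max_width=None` keeps extending until it hits a variable column, at the cost of re-computing everything.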
Alternatively, I could take the adjacent values (since, if I allowed the window to extend, its value would be the same as the adjacent one). When I do that I get this:
That looks pretty bad ... and misleading, since it seems to imply that the whole genome can be predicted from any other part with at least 70% accuracy. That is NOT true ... if you were given one of those low-entropy regions (without the nearby AAs) you'd be out of luck.
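For reference, the neighbour-filling step amounts to something like the sketch below. Again, this is illustrative rather than the code I ran: `fill_from_neighbours` is a hypothetical name, unmeasured positions are assumed to be stored as `None`, and I'm assuming each gap takes the value to its left (with a leading gap taking the first measured value to its right). It also returns a mask of which positions were filled in, so they can be told apart from real measurements.

```python
def fill_from_neighbours(values):
    """Fill None entries with the nearest measured value.

    Gaps take the value to their left; a leading gap takes the first
    measured value to its right. Returns (filled, imputed), where
    `imputed[i]` is True for positions that were not actually measured.
    """
    filled = list(values)
    imputed = [v is None for v in values]

    # Forward pass: carry the last measured value into each gap.
    last = None
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last
        else:
            last = v

    # Backward pass: only a leading gap is still None at this point.
    nxt = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]

    return filled, imputed


scores = [None, 0.9, None, None, 0.7]
filled, imputed = fill_from_neighbours(scores)
```

Keeping the `imputed` mask around makes it possible to plot the filled positions in a different colour, so they don't read as independent predictions.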


