Thursday, March 3, 2011

Today's project

Now that SequenceDownloader has retrieved 50 species, I need to refactor the AnalysisCode so it can deal with MULTIPLE species. The question is whether I should use a "depth-first" or a "breadth-first" method.

Breadth First: For each step (i.e., Download, Processing, Alignment, Linkage, Enrichment, Figure Generation), process all species before advancing to the next step. The advantage is that this is the "easiest" option to code, since all I have to do is modify the generating functions to take new inputs. The disadvantage is that I won't get any results until every species has passed through every step (it's an "all or nothing" type of approach). I guess I could start by running breadth-first on only a few species (maybe the top 5).

Depth First: For each species, do all steps before advancing to the next species. This would be nice because I would get incremental results as each species finishes. However, I think it would require either a complete re-tooling or some drastic "kludging" to get everything working.
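
To make the difference concrete, here is a rough sketch of the two orderings, assuming each step can be called as a function of a single species. The names `run_step`, `breadth_first`, and `depth_first` are made up for illustration (they are not the actual AnalysisCode functions), and the step bodies are placeholders that only record what ran.

```python
# Step names come from the list above; everything else is hypothetical.
STEPS = ["download", "processing", "alignment", "linkage", "enrichment", "figures"]


def run_step(step, species, log):
    """Placeholder for the real work: just record which step ran on which species."""
    log.append((step, species))


def breadth_first(species_list, steps=STEPS):
    """Run each step on ALL species before moving to the next step."""
    log = []
    for step in steps:
        for species in species_list:
            run_step(step, species, log)
    return log


def depth_first(species_list, steps=STEPS):
    """Run ALL steps on one species before moving to the next species."""
    log = []
    for species in species_list:
        for step in steps:
            run_step(step, species, log)
    return log


if __name__ == "__main__":
    # Placeholder species names, not real data.
    demo_species = ["species_A", "species_B"]
    for name, order in (("breadth", breadth_first), ("depth", depth_first)):
        print(name, order(demo_species, steps=["download", "alignment"]))
```

Both functions do the same set of (step, species) calls and differ only in loop nesting. In the depth-first version the first species is completely finished after `len(steps)` calls, while in the breadth-first version nothing reaches the last step until every species has gone through all the earlier ones.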

As for how I am going to keep all of the settings together ... I plan to use a YAML document and keep it in source control. That way everything stays in one place and can be easily modified when required.
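
Something like the following might work as a starting point for that file. Every key name and value here is a placeholder I made up to show the shape, not the actual configuration.

```yaml
# Hypothetical layout; key names and values are placeholders.
run:
  order: breadth_first   # or depth_first
  max_species: 5         # start with only a few species
steps:
  - download
  - processing
  - alignment
  - linkage
  - enrichment
  - figures
species:
  - name: species_A      # placeholder species names
  - name: species_B
```

Keeping the step list and the species list in the same file means a change to either one shows up as a single diff in source control.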
