From: Frank Sabouri on
Hello -

I have a dataset generated by Illumina Solexa sequencing. The following shows the "start" and "end" points of genes on chromosome 1. The size of the data is 34000-by-2.

"Chromosom" "Start" "End" "Gene"
"chr1" 4224 7502 "Gene A"
"chr1" 4224 19233 "Gene B"
"chr1" 24474 25944 "Gene C"
"chr1" 58953 59871 "Gene D"
.. . . .
.. . . .

I would like to create a plot from this data as follows:

0:1:4224 = 0;
4224:1:7502 = 1;
4224:1:19233 = 2;
19233:1:24474 = 0;
24474:1:25944 = 1;
25944:1:58953 = 0;
58953:1:59871 = 1;

I would appreciate it if someone could point me to a script that generates this plot.

Thanks,
Frank
From: Darren Rowland on
Frank,

What's the rule for assigning numbers to the genes for plotting? Is it something to do with overlapping?

And what sort of plot are you expecting, e.g. a line chart, a bar chart, etc.?

Darren
From: Frank Sabouri on
"Darren Rowland" <darrenjremovethisrowland(a)hotmail.com> wrote in message <hsii7d$jmf$1(a)fred.mathworks.com>...
> Frank,
>
> what's the rule for assigning numbers to the gene for plotting? Something to do with overlapping?
>
> And what sort of plot are you expecting i.e. line chart, bar chart etc.
>
> Darren

For the "X axis", each number (for example in 4224:1:7502) corresponds to a nucleotide, whereas for the "Y axis" I set three arbitrary values: "0" (indicating no open reading frame), "1" (indicating an open reading frame), and "2" (indicating overlapping open reading frames). There were some overlaps in the open reading frames, as you noticed, and I used the value "2" to handle them. Do you have another idea? A dot plot ('*') works.

Frank
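
One possible reading of the scheme discussed above is a step function that counts, at each nucleotide position, how many genes cover it: 0 for no open reading frame, 1 for one, 2 where two overlap. The following is only a minimal sketch of that idea in Python, not a tested solution for the real 34000-row dataset. The function name `coverage_segments` is hypothetical, the end coordinate is assumed to be exclusive, and the axis is assumed to start at 0 as in the `0:1:4224` line. Note that under this reading the overlapping stretch 4224 to 7502 gets the value 2 and the remainder of Gene B gets 1, which differs slightly from the values listed in the original post.

```python
from collections import defaultdict


def coverage_segments(intervals):
    """Return (start, end, depth) segments.

    depth is the number of intervals covering every position in the
    half-open range [start, end). Treating 'end' as exclusive is an
    assumption; the original post does not say.
    """
    # +1 where a gene starts, -1 where it ends (difference-array idea,
    # stored sparsely so we never allocate one entry per nucleotide).
    delta = defaultdict(int)
    for start, end in intervals:
        delta[start] += 1
        delta[end] -= 1

    segments = []
    depth = 0
    prev = 0  # assumption: the axis starts at position 0
    for pos in sorted(delta):
        if pos > prev:
            segments.append((prev, pos, depth))
        depth += delta[pos]
        prev = pos
    return segments


# The four example rows from the post (Gene A to Gene D).
genes = [(4224, 7502), (4224, 19233), (24474, 25944), (58953, 59871)]
segments = coverage_segments(genes)
for start, end, depth in segments:
    print(f"{start}:{end} = {depth}")
```

The resulting breakpoints and depths can then be drawn as a step plot rather than one dot per nucleotide, which keeps the plot manageable for a whole chromosome. If more than two genes overlap, the depth simply keeps counting upward; cap it at 2 if only the three values 0, 1, and 2 are wanted.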