From: Srikanth on
Hi all,

I know this might be a very simple question, but for me it is a bit tough.
I have a string of n characters.

For example:
string s = "ATGCGCGAGACGTCGATAGC"

Now I want to replicate the string, replacing each character in turn with each one of the 4 characters [A,T,G,C].

For example,

string s = "ATGCGCGAGACGTCGATAGC"

Desired output:
1st letter changed
"ATGCGCGAGACGTCGATAGC"
"TTGCGCGAGACGTCGATAGC"
"GTGCGCGAGACGTCGATAGC"
"CTGCGCGAGACGTCGATAGC"

2nd letter changed
"ATGCGCGAGACGTCGATAGC"
"AAGCGCGAGACGTCGATAGC"
"AGGCGCGAGACGTCGATAGC"
"ACGCGCGAGACGTCGATAGC"

and so on for all the characters. That means that for a sequence of 30 characters the output will be 120 strings.

Can anyone please tell me how to code this in MATLAB? I am very much a beginner in coding.


Thanks a lot,
Srikanth
From: Roger Stafford on
Srikanth <srikanth.duddela(a)gmail.com> wrote in message <1541730942.245194.1275138197028.JavaMail.root(a)gallium.mathforum.org>...
- - - - - - - - - - -
Put the original string in 'S'. Each column of the table 'T' gives the replacement sequence for the character at its top: as defined below, for example, the 'C' at the top of the fourth column is followed down the column by 'T', 'G', and then 'A'. The requested string matrix ends up in 'S2'.

% Define string & table
S = 'ATGCGCGAGACGTCGATAGC';
T = ['ATGC';
'TATT';
'GGAG';
'CCCA'];

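% Create multiple strings in S2
n = length(S);
[D,~,p] = unique([T(:).',S]);   % D = sorted distinct letters, p = index of each letter in D
p = p(:).';                     % force a row vector (newer releases return a column)
t = reshape(p(1:16),[],4);      % T as indices into D
s = p(17:16+n);                 % S as indices into D
S2 = reshape(D(repmat(s,4*n,1)),[],n);   % 4*n copies of S, one per row
[ix,iy] = meshgrid(1:n,1:4);
q = p(1:4);
S2(iy+4*(n+1)*(ix-1)) = D(t(:,q(s(:))));  % overwrite position ix in rows 4*(ix-1)+iy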

% Result:
ATGCGCGAGACGTCGATAGC
TTGCGCGAGACGTCGATAGC
GTGCGCGAGACGTCGATAGC
CTGCGCGAGACGTCGATAGC
ATGCGCGAGACGTCGATAGC
AAGCGCGAGACGTCGATAGC
AGGCGCGAGACGTCGATAGC
ACGCGCGAGACGTCGATAGC
ATGCGCGAGACGTCGATAGC
ATTCGCGAGACGTCGATAGC
ATACGCGAGACGTCGATAGC
ATCCGCGAGACGTCGATAGC
.......
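
For readers who find the vectorized indexing hard to follow, the same table-driven replacement can be sketched as a plain loop. This is only an illustration, not the code posted above: it assumes every character of 'S' appears in the top row of 'T', and the loop variable names (k, col, rows) are made up for the example.

```matlab
% Loop-based sketch of the table-driven replacement.
% Assumption: every character of S occurs exactly once in the top row of T.
S = 'ATGCGCGAGACGTCGATAGC';
T = ['ATGC';
     'TATT';
     'GGAG';
     'CCCA'];

n  = length(S);
S2 = repmat(S, 4*n, 1);            % start with 4*n copies of S, one per row
for k = 1:n
    col  = find(T(1,:) == S(k));   % column of T headed by the k-th character
    rows = 4*(k-1) + (1:4);        % the four rows in which position k varies
    S2(rows, k) = T(:, col);       % write that column of T down position k
end
disp(S2(1:8, :))                   % first eight strings
```

The loop does one small assignment per character, so it is slower than the vectorized version for long strings, but each step can be checked by hand.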

Roger Stafford
From: Bruno Luong on
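Is this what you want?

S = 'ATGCGCGAGACGTCGATAGC';

% Engine
template = 'ATGC';
[trash X] = ismember(S,template);            % X(k) = position of S(k) in template
X = mod(bsxfun(@plus, X(:)-1,0:3),4)+1;      % row k: that position, then the next three, cyclically
X = template(X).'                            % 4-by-n: column k holds the four letters for position k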
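
The snippet above yields a 4-by-n matrix of replacement letters rather than the full strings. One possible way to expand it into the 4*n strings asked for is sketched below. The names S2 and idx are assumptions for this example, and the replacement order is cyclic in 'template' order (for 'T' it is T, G, C, A), which differs from the order in the original question.

```matlab
% Sketch: expand the 4-by-n letter matrix into 4*n full strings.
S = 'ATGCGCGAGACGTCGATAGC';
template = 'ATGC';

[~, X] = ismember(S, template);              % position of each character in template
X = mod(bsxfun(@plus, X(:)-1, 0:3), 4) + 1;  % n-by-4 cyclic positions
X = template(X).';                           % 4-by-n replacement letters

n   = length(S);
S2  = repmat(S, 4*n, 1);                     % 4*n copies of S, one per row
% Row r modifies position ceil(r/4); kron repeats each column number 4 times.
idx = sub2ind(size(S2), 1:4*n, kron(1:n, ones(1,4)));
S2(idx) = X(:);                              % column-major X matches the row order
disp(S2(1:8, :))
```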

Bruno
From: Roger Stafford on
"Roger Stafford" <ellieandrogerxyzzy(a)mindspring.com.invalid> wrote in message <htt7go$3u0$1(a)fred.mathworks.com>...
> ......
> % Create multiple strings in S2
> n = length(S);
> [D,~,p] = unique([T(:).',S]);
> t = reshape(p(1:16),[],4);
> s = p(17:16+n);
> S2 = reshape(D(repmat(s,4*n,1)),[],n);
> [ix,iy] = meshgrid(1:n,1:4);
> q = p(1:4);
> S2(iy+4*(n+1)*(ix-1)) = D(t(:,q(s(:))));
> ......

I need to make one change in the above code to make it work for all 'T'. In place of the line

q = p(1:4);

it should be

q = 1:4; q(p(1:4:13)) = q;
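
To see why the change matters, note that the corrected q is the inverse of the permutation formed by the top row of 'T': it maps each letter's index in D to the column of 'T' headed by that letter. For the table used earlier in this thread both forms happen to give the same q. The sketch below uses a hypothetical table, made up for illustration, where they differ.

```matlab
% Hypothetical table whose top row is 'CGTA' (not from the thread).
T = ['CGTA';
     'AAAC';
     'GCCG';
     'TTGT'];

[D, ~, p] = unique(T(:).');   % D = 'ACGT'; p = index of each letter of T in D
p = p(:).';                   % force a row vector

q_old = p(1:4);               % indices of the FIRST COLUMN of T
q = 1:4; q(p(1:4:13)) = q;    % inverse permutation of the TOP ROW of T

disp(T(1, q))                 % equals D: column q(k) is headed by D(k)
disp(T(1, q_old))             % not D for this table
```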

Roger Stafford
From: Srikanth on
Hi

Thanks a lot for the help. That was Awesome.

Thanks again.