From: Oleg Komarov on 23 Dec 2009 07:19

> cellfun(@(x) regexp(x,'(\S+)(\d+)','match'),myCellstr,'un',0);
>
> Branko

Thanks Branko! Just two thoughts:

cellfun(@(x) regexp(x,'(\S+)(\d+)','match'),myCellstr); % no need to enclose in cell output

This solution is slightly slower on my system.

Oleg
From: Jason Breslau on 23 Dec 2009 09:36

regexp and regexprep both support cell arrays, so you don't need the call to cellfun:

regexprep(myCellstr,'\([A-z]+\)','')

-=>J
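A minimal sketch of the cell-array call suggested above. The sample strings are hypothetical, since the thread does not show the original data; only the pattern and the variable name myCellstr come from the posts.

```matlab
% Hypothetical sample data; the real strings are not shown in the thread.
myCellstr = {'abc(DEF)', 'xyz(Q)', 'plain'};

% regexprep accepts a cell array of strings directly and returns a cell
% array of the same size, so no cellfun wrapper is needed.
cleanCellStr = regexprep(myCellstr, '\([A-z]+\)', '');

disp(cleanCellStr)
```

Note that the range A-z also covers the six ASCII characters between 'Z' and 'a' (the square brackets, backslash, caret, underscore and backtick). If only letters should match, '\([A-Za-z]+\)' is the stricter pattern.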
From: Jan Simon on 23 Dec 2009 16:15

Dear Oleg!

> regexprep(myCellstr,'\([A-z]+\)','')

REGEXPREP can be slow for large cell strings. So let's try something different:

myStr = [myCellstr{:}];
newCellStr = dataread('string', myStr, '%s', 'delimiter', '()');
cleanCellStr = newCellStr(1:2:length(newCellStr));

For large cell strings, the [C{:}] concatenation can be accelerated by CStr2String:
http://www.mathworks.com/matlabcentral/fileexchange/26077

I'll try it with some test data and post the times soon.

Jan
From: Jan Simon on 23 Dec 2009 16:40

Dear Oleg!

> > regexprep(myCellstr,'\([A-z]+\)','')
>
> REGEXPREP can be slow for large cell strings. So let's try something different:
> myStr = [myCellstr{:}];
> newCellStr = dataread('string', myStr, '%s', 'delimiter', '()');
> cleanCellStr = newCellStr(1:2:length(newCellStr));
>
> For large cell strings, the [C{:}] can be accelerated by CStr2String:
> http://www.mathworks.com/matlabcentral/fileexchange/26077

Nope, that no longer helps in Matlab 2009a; it only did in Matlab 6.5. REGEXPREP is now twice as fast as the DATAREAD(CAT) method for a {10000 x 1} cell.

Another idea: does the initial part always have the same length?

cleanCellStr = dataread('string', sprintf('%.12s#', myCellstr{:}), '%s', 'delimiter', '#');

But forget this: REGEXPREP is 12 times faster...

Better:

cleanCellStr = cellfun(@(x) x(1:12), myCellstr, 'UniformOutput', false);

This is 3.5 times faster than REGEXPREP, but assuming that the initial part always has the same length may be too sloppy.

More general:

cleanCellStr = cellfun(@(x) x(1:findstr(x, 'C')), myCellstr, 'UniformOutput', false);

This is at least 35% faster than the REGEXPREP method, but it fails if there is not exactly one opening bracket.

Kind regards, Jan
From: Jan Simon on 23 Dec 2009 19:11

Typo: Replace 'C' by '(':

cleanCellStr = cellfun(@(x) x(1:findstr(x, 'C')), myCellstr, 'UniformOutput', false);
==>
cleanCellStr = cellfun(@(x) x(1:findstr(x, '(')), myCellstr, 'UniformOutput', false);

Jan
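The bracket-search idea can also be sketched so that it tolerates strings with no bracket or with several. This is only an illustration under assumptions not stated in the thread: the sample strings are hypothetical, strfind is swapped in for findstr, the bracket itself is dropped (the indexing x(1:findstr(x, '(')) keeps it), and each string is cut at its first opening bracket.

```matlab
% Hypothetical sample data; the real strings are not shown in the thread.
myCellstr = {'abc(DEF)', 'noBracket', 'a(b)(c)'};

cleanCellStr = myCellstr;            % strings without '(' stay unchanged
for k = 1:numel(myCellstr)
    idx = strfind(myCellstr{k}, '(');    % positions of all opening brackets
    if ~isempty(idx)
        % Keep everything before the first '(' (assumption: cut there).
        cleanCellStr{k} = myCellstr{k}(1:idx(1)-1);
    end
end

disp(cleanCellStr)
```

Unlike the REGEXPREP call, this discards any text that follows the first bracketed part, so the two approaches only agree when the bracket group sits at the end of each string.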