From: Arthur Zheng on 22 Nov 2009 00:35

I'm trying to download Genomes in Progress from NCBI. The accession number can be, for example, "NZ_ACIR00000000". The code is as simple as this:

    seq = getgenbank('NZ_ACIR00000000', 'SequenceOnly', 'true');

I tried a number of times, but I always got this error:

    ??? Error using ==> getncbidata>accession2gi at 370
    The key NZ_ACIR00000000 has more than one sequence file associated with it in the nucleotide database.

    Error in ==> getncbidata at 179
    [giID,db] = accession2gi(accessnum,db,'quick');

    Error in ==> getgenbank at 82
    gb = getncbidata(accessnum,varargin{:},'database','nucleotide','fileformat','FASTA');

What's wrong? Thanks.

From: Paola Favaretto on 1 Dec 2009 11:16

Hi Arthur,

Currently GETGENBANK can retrieve only one sequence at a time. The record you are trying to access (NZ_ACIR00000000) is associated with 216 sequences. Therefore, you could do one of the following:

1) Access the information using the E-Utilities. See the demo (ncbieutilsdemo - Accessing NCBI Entrez Database with E-Utilities) that ships with the toolbox for more information on how to use the E-Utilities from MATLAB.

2) Alternatively, retrieve each sequence separately. Because the sequences have consecutive accession numbers, from NZ_ACIR01000001 to NZ_ACIR01000216, you can automate the search by building each accession number string and then calling getgenbank with that accession. However, the NCBI site might restrict how many searches of this type you can run. Using the E-Utilities is preferable.

I hope this helps.

-Paola

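Option 2 in the reply above could be sketched roughly as follows. This is only a minimal sketch, not a tested recipe: the variable names, the doFetch switch, and the one-second pause are assumptions made for illustration, and the accession range is taken from the reply itself. It passes the logical value true for 'SequenceOnly' rather than the string used in the original call.

```matlab
% Build the contig accessions named in the reply:
% NZ_ACIR01000001 ... NZ_ACIR01000216.
prefix   = 'NZ_ACIR01';   % project prefix plus the two-digit version
nContigs = 216;           % number of sequences stated in the reply
accessions = arrayfun(@(k) sprintf('%s%06d', prefix, k), ...
                      1:nContigs, 'UniformOutput', false);

% Hypothetical switch: leave false to only build the list,
% set to true to actually contact NCBI.
doFetch = false;
seqs = cell(1, nContigs);

if doFetch
    for k = 1:nContigs
        try
            % One request per contig, sequence only.
            seqs{k} = getgenbank(accessions{k}, 'SequenceOnly', true);
        catch err
            % Keep going if a single record fails; seqs{k} stays empty.
            warning('Fetch failed for %s: %s', accessions{k}, err.message);
        end
        pause(1);   % assumed delay between requests, to avoid flooding NCBI
    end
end
```

Any cell of seqs that is still empty after the loop marks an accession worth retrying. For large batches the E-Utilities route recommended in the reply is likely the better fit, since it is meant for programmatic access.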
From: Arthur Zheng on 1 Dec 2009 23:21

"Paola Favaretto" <myname.mylastname(a)mathworks.com> wrote in message <hf3fg6$o40$1(a)fred.mathworks.com>...
> Hi Arthur,
>
> Currently GETGENBANK can retrieve only one sequence at a time. The record you are trying to access (NZ_ACIR00000000) is associated with 216 sequences. Therefore, you could do one of the following:
>
> 1) You can access the information using the E-Utilities. See the demo (ncbieutilsdemo - Accessing NCBI Entrez Database with E-Utilities) that ships with the toolbox for more information on how to use the E-Utilities from MATLAB.
>
> 2) Alternatively, you will have to retrieve each sequence separately. Because the sequences have consecutive accession numbers starting from NZ_ACIR01000001 up to NZ_ACIR01000216, you can even automate the search by creating the accession number string and then calling getgenbank with that accession. However, there might be restrictions on how many searches of this type can be done at the NCBI site. Using the E-Utilities is preferable.
>
> I hope this helps.
>
> -Paola

Hi Paola,

Thanks for your response. I'll try your suggestions.

Hao