From: Sal
Hello everyone,
I am new to Matlab and was wondering if anyone could help me here.

I am trying to read wind data (a huge 3D array, 129x109x55516) from a netCDF-4 file.
I can only get the data out of this file by treating it as an HDF5 file (apparently the netCDF-4 package/library has not been installed in the lab I work in), so for now HDF5 is all I have.

The dataset of concern is called "uwnd."

When I used the following syntax, it took a very long time and I had to abort the operation:

uwnd = hdf5read( 'geowinds.nc', '/uwnd')

I also tried the HDF5 low-level functions, but that also ran on indefinitely:

fileID = H5F.open('geowinds.nc', 'H5F_ACC_RDONLY', 'H5P_DEFAULT');

dataID = H5D.open(fileID, '/uwnd')

u_wnd = H5D.read(dataID, 'H5ML_DEFAULT', 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT');

Main goal: get all of the data stored in the dataset uwnd (without skipping any) and store it in an array.

I would appreciate any help on reading this enormous amount of data without it taking ages.

regards

salman
From: Steven Lord

"Sal " <salman.hafeez(a)mail.mcgill.ca> wrote in message
news:i0bfgs$7fk$1(a)fred.mathworks.com...
> Hello everyone,
> I am new to Matlab and was wondering if anyone could help me here.
> I am trying to read wind data (a huge 3D array i.e. 129x109x55516), from a
> netcdf-4 file.

If the data stored in that file is real (not complex) and double precision,
then you will need just under 6 GB of contiguous memory to store the result
(129 x 109 x 55516 elements x 8 bytes per element is about 6.24e9 bytes).
Do you have a 6 GB block of contiguous memory available to MATLAB? Even if
you do, allocating the memory, reading that many elements from disk, and
copying them into the (large) block of memory will take some time.

> Main Goal: is to get all(and not skip any) of the data stored in the
> dataset uwnd, and store it within some array.

My advice: if you can break whatever calculation you're planning to
perform into pieces, read in a portion of the data, process it, and
write the results out to disk, so you don't have to have the very large
block of memory allocated all at once. Then repeat this for each piece
of the data.
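A rough sketch of that pattern using the low-level HDF5 interface you already
have (the chunk size, file names, and the mean() step are placeholders for
whatever processing you actually need):

```matlab
% Sketch only: process '/uwnd' in slabs of nchunk time steps, saving each
% partial result to its own MAT-file instead of holding ~6 GB in memory.
dims   = [129 109 55516];   % [lon lat time], from the file
nchunk = 5000;              % time steps per slab (tune to taste)

fileID  = H5F.open('geowinds.nc', 'H5F_ACC_RDONLY', 'H5P_DEFAULT');
dsetID  = H5D.open(fileID, '/uwnd');
spaceID = H5D.get_space(dsetID);

for t0 = 0:nchunk:dims(3)-1
    count = [dims(1) dims(2) min(nchunk, dims(3)-t0)];
    % HDF5 stores dimensions in row-major (C) order, hence the fliplr
    H5S.select_hyperslab(spaceID, 'H5S_SELECT_SET', ...
        fliplr([0 0 t0]), [], [], fliplr(count));
    % memory dataspace must match the slab, not the whole dataset
    memID = H5S.create_simple(3, fliplr(count), fliplr(count));
    slab  = H5D.read(dsetID, 'H5ML_DEFAULT', memID, spaceID, 'H5P_DEFAULT');
    H5S.close(memID);

    result = mean(slab, 3);   % placeholder for your processing
    save(sprintf('uwnd_part_%06d.mat', t0), 'result');
end

H5S.close(spaceID);
H5D.close(dsetID);
H5F.close(fileID);
```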

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on
http://www.mathworks.com


From: Sal
Hello again,
I tried reading the data in increments, but it didn't really work. (I used a while loop, which may seem really primitive; sorry, I'm really new to MATLAB!)

So following is the code.
Part 1 calls the function in increments
Part 2 is the actual function itself

Part 1
% call hdf5readslab in increments of 5000 time steps
it = 5000;
offset_vari = 0;
while it <= 55516

    if it == 55000
        % last chunk is partial: 55516 - 50000 = 5516 time steps
        it = 55516;
        uwind = hdf5readslab('geowinds.nc', '/uwnd', [0, 0, offset_vari], [129, 109, it - offset_vari]);
    else
        uwind = hdf5readslab('geowinds.nc', '/uwnd', [0, 0, offset_vari], [129, 109, 5000]);
    end

    offset_vari = offset_vari + 5000;
    it = it + 5000;
end
________________________________

Part 2

function data = hdf5readslab(filename, datasetname, offset, count)
% Read a hyperslab (offset and count given in MATLAB dimension order)
% from a dataset in an HDF5 file.

% open the HDF5 file read-only
fileID = H5F.open(filename, 'H5F_ACC_RDONLY', 'H5P_DEFAULT');

% open the dataset
datasetID = H5D.open(fileID, datasetname);

% get the dataspace of the dataset
dataspaceID = H5D.get_space(datasetID);

% select the hyperslab; HDF5 uses row-major (C) dimension order,
% so the MATLAB-order vectors are flipped
stride = [1 1 1];
H5S.select_hyperslab(dataspaceID, 'H5S_SELECT_SET', ...
    fliplr(offset), fliplr(stride), fliplr(count), []);

% create a memory dataspace the size of the slab (not the whole
% 129x109x55516 dataset, or the read will try to fill all of it)
memspaceID = H5S.create_simple(3, fliplr(count), fliplr(count));

% read only the selected slab
data = H5D.read(datasetID, 'H5ML_DEFAULT', memspaceID, dataspaceID, 'H5P_DEFAULT');

H5S.close(dataspaceID);
H5S.close(memspaceID);
H5D.close(datasetID);
H5F.close(fileID);
end
_____________________________

Problems:
1. I couldn't figure out a way to write/save the results (printing them in the command window took too much time).

2. I realized that every time I called the function in Part 1, the data from the previous iteration was overwritten. I confirmed this by comparing against values I had obtained previously (e.g. uwind(1,1,1) from my earlier code did not match uwind(1,1,1) from this code). Is there a way to create a variable-sized array that keeps expanding?

3. I tried declaring uwind as an array but couldn't figure out how to make the data from each iteration start right after the last data of the previous iteration.

4. I was thinking of writing the data to a text or MAT-file, but the question there is the same: how can I ensure that the data from each iteration is saved immediately after the data from the previous one?
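For problems 2 and 3, is the right approach something like preallocating the
full array once and assigning each slab into its own range of indices? This
is just a sketch of what I have in mind (hdf5readslab is the function from
Part 2 above; single precision is a guess to halve the memory):

```matlab
% preallocate once, then drop each slab into place along dimension 3
uwind = zeros(129, 109, 55516, 'single');

offset = 0;
while offset < 55516
    count = min(5000, 55516 - offset);
    slab  = hdf5readslab('geowinds.nc', '/uwnd', [0, 0, offset], [129, 109, count]);
    % the slab starts right after the previous one (offsets are 0-based,
    % MATLAB indices are 1-based)
    uwind(:, :, offset+1 : offset+count) = slab;
    offset = offset + count;
end
```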

I would really appreciate any guidance/pointers that could send me in the right direction.

Salman
From: TideMan
On Jul 5, 9:14 am, "Sal " <salman.haf...(a)mail.mcgill.ca> wrote:
> Hello again,
> I tried reading the data in increments but it didn't really work..(used the while loop, which may seem really primitive..sorry, I'm really new to Matlab!)
>
> [snip]
>
>
> Salman

All this seems very difficult, when it should be easy.
It sounds like you are simply trying to read a standard netCDF file
written by GFS or a similar NOAA model.
The way I would do this with my ancient version of Matlab, using mexcdf,
is simply:
nc = netcdf('geowinds.nc', 'nowrite');
u = nc{'uwnd'}(itime, ilat, ilon);

where itime is the range of indices for time, ilat for latitude and
ilon for longitude.
For example, to get all the data for one point in the grid, it would
be:
u = nc{'uwnd'}(:,100,100);

With a modern version of Matlab, netCDF support is built in.
AFAIK netCDF-4 is not a problem; I read from GFS, NWW3, and MWW3
model .nc files daily.
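With the built-in support it might look like this (a sketch; it assumes a
release recent enough to provide ncread, and that the variable is named
uwnd as in your file):

```matlab
% ncread takes 1-based start/count in MATLAB dimension order
start = [1 1 50001];      % lon, lat, time offsets
count = [129 109 5516];   % read the last 5516 time steps
u = ncread('geowinds.nc', 'uwnd', start, count);

% or all time steps for a single grid point (Inf = "to the end")
u_pt = ncread('geowinds.nc', 'uwnd', [100 100 1], [1 1 Inf]);
```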