From: CyberFrog on
Peter Perkins <Peter.Perkins(a)MathRemoveThisWorks.com> wrote in message <heet1i$5u8$1(a)fred.mathworks.com>...
> Bob wrote:
>
> > I have replicated the problem with the following code:
> >
> > data = {1 4 'ab' 7; 2 5 'cd' 8; 3 6 'ef' 9};
> > data_fields = {'w' 'x' 'y' 'z'};
> > S = dataset();
> > for i=1:length(data_fields)
> > S.(data_fields{i})=[data{:,i}]';
> > end
> >
> > This should give a dataset with the same dimensions - 3 rows, 4 columns. However, it actually gives a 6x4 dataset due to the two-character strings being expanded to separate rows.
> >
> > S =
> > w x y z
> > 1 4 a 7
> > 2 5 b 8
> > 3 6 c 9
> > 0 0 d 0
> > 0 0 e 0
> > 0 0 f 0
> >
> > Needless to say, this is unexpected behavior, and I am unsure whether I don't understand cell arrays well enough or whether this is an actual bug in the language. I'm leaning towards a bug, because if this behavior were by design, I would expect the integer rows to be duplicated as the strings in that row were expanded, rather than the dataset being padded with zeros, e.g.
>
> Not a bug, and only partially having to do with the finer points of cell arrays. It's this line here:
>
> S.(data_fields{i})=[data{:,i}]';
>
> that blindly takes {'ab'; 'cd'; 'ef'}, concatenates the strings horizontally into 'abcdef', and transposes the result into a 6x1 char column vector. When you assign that as a new variable into the existing 3-row dataset array, you implicitly add rows with default values for the other variables (i.e., zeros). This is essentially the same thing that would happen if you assigned to the (6,4)th element of a 4x2 double array, by the way: the array grows to 6x4 and the new elements are filled with zeros.
>
> So the code I wrote was simple, but not flexible enough for your needs. You'll have to improve on it. Put in an if test, and no worries.
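
For what it's worth, here is one way the "if test" mentioned above might look. This is only a sketch, not necessarily what Peter had in mind: it assumes every column of data is either all strings or all numeric scalars, and the temporary variable col is just a name I picked. String columns are stored as a cell array of strings, so each string stays in its own row instead of being concatenated.

```matlab
data = {1 4 'ab' 7; 2 5 'cd' 8; 3 6 'ef' 9};
data_fields = {'w' 'x' 'y' 'z'};
S = dataset();
for i = 1:length(data_fields)
    col = data(:,i);                          % 3x1 cell column (col is a made-up name)
    if iscellstr(col)
        % all strings: keep as a 3x1 cell array of strings, one per row
        S.(data_fields{i}) = col;
    else
        % assumed all numeric scalars: stack into a 3x1 numeric column
        S.(data_fields{i}) = vertcat(col{:});
    end
end
```

With that change size(S) should come out as 3 rows by 4 columns, with y holding 'ab', 'cd', 'ef'. A column that mixes strings and numbers would still need its own branch.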

On the subject of arrays, how would this work if there were more than two cell arrays?