From: Neha on
"Faraz Afzal" <farazafzal(a)gmail.com> wrote in message <i0a73m$hlj$1(a)fred.mathworks.com>...
> "Neha " <smile_neha16(a)yahoo.co.in> wrote in message <i0a63r$c5g$1(a)fred.mathworks.com>...
> > Hi,
> >
> > There is one problem regading reading of the data. Actually I am imputing the missing values using pchip but the problem in computation is that there are around 200 different categories in the excel sheet for which I need to calculate the values independently. in total there are around 16000 rows. Now I want to compute the missing value for 1-100, then 100-200 etc. I am using the following code
> >
> > for i=1:100
> > y(i)=data(i,6);
> > end
> >
> > x=1:100
> > B=isnan(y);
> > for i=1:100
> >
> > if (B(i))
> > data (i,6) = ceil((pchip(x,y,x(i)))
> > end
> > end
> >
> > Now what I want is that I used pchip to compute the missing values for first 100 observations then 100-200. For this I changed in the above code i= 100-200 (I am not sure if this is right to do).Similarly I will change for different rows.
> >
> > Please suggest how can I do that, since it will be really cumbersome to make sepearte excel sheets for each category and then import them seperately.
> >
> > Regards
> > Neha
>
> Dear Neha,
>
> Tell me what You mean by categories..
> If you are saying categories are the columns of an excel sheet your problem is failry simple..
> As I understood you have an excel sheet of size (16000 x 200)
> rows = 16000
> columns = 200
> and you want to use pchip to interpolate the missing data within all 200 columns and 16000 rows..
> If this is what u want tell me and i will show u how u will get thta..
> Lastly, if u want the interpolation to be realistic why u r using CEIL ??
> You problem bit more explained would let me help you. Perhaps an example will surve the prupose.
>
> Regards,
> Faraz

Hi Faraz,

Thanks for your response.

Actually,
I am working for a company which have their count data for customers. This data is for each day. Also there are two different counts for different classes. But the problem is that the counts for each day is not available i.e. its missing. But this count is necessary since company want to forecast the demand for the future. I started working on the algorithm part of Matlab and used the above mentioned code.

Here in my case categories are not columns.

My excel sheet looks like as follow:

Category Day of week Date First Class Second Class
A Mon 19-04-2010 23 125
A Mon 26-04-2010 NAN 200
A Mon 03-05-2010 35 NAN
A Tues 20-04-2010 2 65
A Tues 27-04-2010 13 NAN
A Wed 21-04-2010 NAN 85
A Wed 28-04-2010 2 NAN
B Mon 19-04-2010 7 NAN
B Mon 26-04-2010 35 13
B Tues 20-04-2010 NAN 26
B Tues 27-04-2010 NAN NAN
B Thurs 22-04-2010 NAN 123
B Thurs 29-04-2010 8 NAN

Now these are different categories/codes the count of which is independent to each other i.e. missing value of A should be computed on the basis of values in B. Also for a particular category there is no correlation in the demand pattern for different days of the weeks i.e. Mon is independent from Tuesday and so on.

This means I need to compute first values of only Mondays in category A, then tues and so on.

So now I want the excel to read these rows with same days seperately and compute values i.e. it should interpolate first withing first three rows i.e. all mondays in category A and so on.

I know its really cumbersome.

For CEIL: The concern is its a count data which cannot be in decimals and I have to take the integer. Probably I can interpolate and then round off. Please suggest.

Initially I started the interpolation using Spline but then the values were coming negative which is not feasible in my case so with the suggestion I changed it to Pchip.

Any suggestions on ground of interpolation method will also be help.

Regards
Neha
 | 
Pages: 1
Prev: Generation of pdf from Matlab
Next: Axes in GUI