From: Neha on 29 Jun 2010 09:30 "Neha " <smile_neha16(a)yahoo.co.in> wrote in message <i0as0g$8sc$1(a)fred.mathworks.com>... > "Faraz Afzal" <farazafzal(a)gmail.com> wrote in message <i0an6u$j67$1(a)fred.mathworks.com>... > > "Neha " <smile_neha16(a)yahoo.co.in> wrote in message <i0a8il$ois$1(a)fred.mathworks.com>... > > > "Faraz Afzal" <farazafzal(a)gmail.com> wrote in message <i0a73m$hlj$1(a)fred.mathworks.com>... > > > > "Neha " <smile_neha16(a)yahoo.co.in> wrote in message <i0a63r$c5g$1(a)fred.mathworks.com>... > > > > > Hi, > > > > > > > > > > There is one problem regading reading of the data. Actually I am imputing the missing values using pchip but the problem in computation is that there are around 200 different categories in the excel sheet for which I need to calculate the values independently. in total there are around 16000 rows. Now I want to compute the missing value for 1-100, then 100-200 etc. I am using the following code > > > > > > > > > > for i=1:100 > > > > > y(i)=data(i,6); > > > > > end > > > > > > > > > > x=1:100 > > > > > B=isnan(y); > > > > > for i=1:100 > > > > > > > > > > if (B(i)) > > > > > data (i,6) = ceil((pchip(x,y,x(i))) > > > > > end > > > > > end > > > > > > > > > > Now what I want is that I used pchip to compute the missing values for first 100 observations then 100-200. For this I changed in the above code i= 100-200 (I am not sure if this is right to do).Similarly I will change for different rows. > > > > > > > > > > Please suggest how can I do that, since it will be really cumbersome to make sepearte excel sheets for each category and then import them seperately. > > > > > > > > > > Regards > > > > > Neha > > > > > > > > Dear Neha, > > > > > > > > Tell me what You mean by categories.. > > > > If you are saying categories are the columns of an excel sheet your problem is failry simple.. > > > > As I understood you have an excel sheet of size (16000 x 200) > > > > rows = 16000 > > > > columns = 200 > > > > and you want to use pchip to interpolate the missing data within all 200 columns and 16000 rows.. > > > > If this is what u want tell me and i will show u how u will get thta.. > > > > Lastly, if u want the interpolation to be realistic why u r using CEIL ?? > > > > You problem bit more explained would let me help you. Perhaps an example will surve the prupose. > > > > > > > > Regards, > > > > Faraz > > > > > > Hi Faraz, > > > > > > Thanks for your response. > > > > > > Actually, > > > I am working for a company which have their count data for customers. This data is for each day. Also there are two different counts for different classes. But the problem is that the counts for each day is not available i.e. its missing. But this count is necessary since company want to forecast the demand for the future. I started working on the algorithm part of Matlab and used the above mentioned code. > > > > > > Here in my case categories are not columns. > > > > > > My excel sheet looks like as follow: > > > > > > Category Day of week Date First Class Second Class > > > A Mon 19-04-2010 23 125 > > > A Mon 26-04-2010 NAN 200 > > > A Mon 03-05-2010 35 begin_of_the_skype_highlighting 03-05-2010 35 end_of_the_skype_highlighting NAN > > > A Tues 20-04-2010 2 65 > > > A Tues 27-04-2010 13 NAN > > > A Wed 21-04-2010 NAN 85 > > > A Wed 28-04-2010 2 NAN > > > B Mon 19-04-2010 7 NAN > > > B Mon 26-04-2010 35 13 > > > B Tues 20-04-2010 NAN 26 > > > B Tues 27-04-2010 NAN NAN > > > B Thurs 22-04-2010 NAN 123 > > > B Thurs 29-04-2010 8 NAN > > > > > > Now these are different categories/codes the count of which is independent to each other i.e. missing value of A should be computed on the basis of values in B. Also for a particular category there is no correlation in the demand pattern for different days of the weeks i.e. Mon is independent from Tuesday and so on. > > > > > > This means I need to compute first values of only Mondays in category A, then tues and so on. > > > > > > So now I want the excel to read these rows with same days seperately and compute values i.e. it should interpolate first withing first three rows i.e. all mondays in category A and so on. > > > > > > I know its really cumbersome. > > > > > > For CEIL: The concern is its a count data which cannot be in decimals and I have to take the integer. Probably I can interpolate and then round off. Please suggest. > > > > > > Initially I started the interpolation using Spline but then the values were coming negative which is not feasible in my case so with the suggestion I changed it to Pchip. > > > > > > Any suggestions on ground of interpolation method will also be help. > > > > > > Regards > > > Neha > > > > I was expecting something that could help me understand your problem but this has made it even more complex... Let me help u according to my understanding of problem. > > > > Say you want to separate the DAY OF THE WEEK column with MON in it only.. > > If this is what you want then you can make use of logical operators for your operations.. > > > > you can use if -else statements.. > > > > e.g. > > for i = 1:100 > > Here you will define a loop which scans each and every row of your excel sheet > > If 'MON' == 'MON' > > Here you can supply your operation what ever you want to do with MONDAYS. > > else > > Here you can supply the loop to g0 further to scan other categories when u r up with all ur mondays. > > > > Lastly I wonder why u r using the Hermitian distribution for the interpolation. If your data is not changing widely linear interpolation is the best way to get you through.. > > Try using interp1.. Spline is true for fairly smooth data in the curves.. > > > > I dont know if i really helped u in any ways.. > > > > All you can do is to write me in steps what u want and i will see if i can come up with solution. > > > > E.g. > > 1. Scan all the 16000 rows for word, letter, character, number etc > > 2. once extracted do this bla bla. > > 3. now compare them with data on these bla bla rows.. > > 4. if condition is fulfilled do this bla bla.. > > > > I know tehre is too much of bla bla as i was not able to come up with something better. > > Regards, > > Faraz > > Hi Faraz, > > Thanks for this. > > I think I am not so perfect in explaining the problem. Well, this time I try and explain my concerns in better way. > > If you don't mind inspite of posting them here, can I mail you. Since its not easy to send anything in table form? If not then I will explain the problem here only. > > Regards > Neha Hi Faraz, Well, I did something using what you suggested last but thats giving some error. I will say my problem once again if that make some sense else I will try and figure out something. 1) In the excel sheet as I told you before I have 16000 rows, now I have segmented the whole sheet under verious segments and have different starting and ending point points for each segment, i.e. I am having different slopes for different segments. 2) The code I have written will draw just one slope for all 16,000 rows. Whereas I want to draw difefrent slope for each segment and impute the values seperately for each segment within that slope. (Each segment might have 50-100 rows in it) 3) I want the code that should read my excel sheet as follow: 3.1.) for first 100 rows i.e. for i=1:100 when it is monday then compute the value for first and second class seperately - This means I want to take 1st and 100th value as starting and ending point to calculate all Mondays value lying in that segment 3.2.) for rows 100-200 when it is tuesday then compute the values for first and second class - Similarly, in rows 100 to 200 I have all tuesday data so I want 100th and 200th as starting and ending value respectively and set another slope and compute values. I have my sheet sorted as per my segments and days of the week. I hope this will explain my problem. For Hermitian distribution for the interpolation: Actually my data is right skewed with a very long tail. So, I thought of using pchip but I will try with linear as well. Thanks for this. Thanks for your help. Regards Neha
|
Pages: 1 Prev: How to close m-file with code Next: Deriving mean of each row in a matrix, disregarding zeros |