From: UKPhD on
I am using reduced rank regression (RRR) to derive dietary patterns at
a baseline time point, and I want to apply this pattern to diet over a
10-year period, which requires repeated scores.

Using the xweights ODS output from PROC GLM RRR, we can use PROC
SCORE to produce a score based on each individual's food intake.
However, we have hit a problem. To check the methodology, we applied
PROC SCORE to the same data used in the PROC GLM. We expected the
applied and natural scores to be identical, but they are not: they
differ systematically by 11.86%. We have found the same systematic
difference when running these analyses in two quite different cohorts,
a child cohort and a severely obese adult cohort.

I have checked the code and PROC SCORE appears to be used correctly.
I have also calculated the score manually, and the systematic
difference is still present.

Does anyone have any idea why this very consistent discrepancy keeps
cropping up? Does anyone know how PROC GLM METHOD=RRR applies its
x-weights to the data to create the score?

I hope that makes some sense. I can post the code if needed.

Thank you for any help.
From: UKPhD on
CORRECTION: that should be PROC PLS METHOD=RRR, not PROC GLM.

From: UKPhD on
Below is a simplified version of the code, with the list of predictor
variables abbreviated as $foods

************************************************************************;
* DIETARY PATTERNS (RRR)


*********************************************************************************
EXPLORATORY RRR;

ods output percentvariation=rrr10percentvariation;
ods output xloadings=rrr10loadings;
ods output xweights=rrr10xweights;
ods output yweights=rrr10yweights;
ods output codedcoef=rrr10codedcoef;
ods output parameterestimates=rrr10parameterestimates;

proc pls data=temp method=RRR nfac=3 varss details ;
model pcfat_10 fd_10 ded_10
= $foods /solution ;

output xscore=pred10score yscore=resp10score out=pattern10
/*output centred and scaled predictor and response variables
produced by RRR*/
stdx= $foods2
stdy= pcfat02 fd02 ded02;
run;
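
/* A possible sanity check, sketched under assumptions rather than taken
from the original run: the comment in the OUTPUT statement assumes that
STDX= returns the centred and scaled predictors, which is worth
confirming against the PROC PLS documentation. Here the raw predictors
are standardised independently and both sets are summarised. temp_std
is a hypothetical dataset name, and $foods / $foods2 are the same
placeholders for the real variable lists as above. */
proc standard data=temp out=temp_std mean=0 std=1;
var $foods;
run;

/* independently standardised predictors: mean 0 and SD 1 */
proc means data=temp_std n mean std min max;
var $foods;
run;

/* STDX= variables: means close to 0 and SDs close to 1 would be
expected if they really are the centred and scaled predictors */
proc means data=pattern10 n mean std min max;
var $foods2;
run;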


*KEEP CENTRED AND SCALED PREDICTOR (FOOD GROUP) VARIABLES & NATURAL DP
SCORE PRODUCED BY EXPL RRR
& REMOVE RAW DATA;
data scaled;
set pattern10;
keep cid_477a qlet $foods2
pred10score1 ;
run;



************************************************************************
CONFIRMATORY RRR USING CENTRED AND SCALED DATA;


*MAKE XWEIGHTS (SCORING FILE) SUITABLE FOR PROC SCORE;

data scores;
set rrr10xweights;

if Numberoffactors > 1 then delete;*only interested in 1st pattern;
drop Numberoffactors;

_TYPE_="SCORE";
_NAME_="Factor1";

/* rename scoring variables to match scaled predictor variable names*/
rename $foods = $foods2;
run;
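
/* A diagnostic sketch (an assumption, not a confirmed explanation): a
constant ratio between two scores can come from how a weight vector is
normalised, so it may help to print the length of the first-factor
weight vector and compare it, and its reciprocal, with the observed
ratio. weightnorm and wlength are hypothetical names; replace $foods2
with the real variable list, since a literal $ in an ARRAY statement
would declare character variables. */
data weightnorm;
set scores;
array w{*} $foods2;
wlength = sqrt(uss(of w{*})); /* square root of the sum of squared weights */
keep wlength;
run;

proc print data=weightnorm;
run;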


*RE-SCORE SCALED AND CENTRED PREDICTOR VARIABLES using scoring
coefficients to test confirmatory RRR;

proc score data=scaled out=pattern10_1 score=scores type="SCORE"
nostd;
var $foods2;
run;


***************************************************************************
COMPARE 'NATURAL' AND 'APPLIED' SCORES;

*check correlation between natural and applied scores;
proc corr data=pattern10_1;
var pred10score1 factor1;
run;

*calculate differences and ratio b/w natural and applied scores;

proc rank data=pattern10_1 out=ranks;
ranks rankpred10 rankfact1;
var pred10score1 factor1;
run;

proc sort data=ranks;
by pred10score1;
run;

data rankdiff;
set ranks;
difpat1=factor1 - pred10score1;
ratiopat1=factor1/pred10score1;
difrank=rankpred10 - rankfact1;
run;

proc means data=rankdiff;
var difpat1 ratiopat1 difrank;
run;
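
/* One further way to summarise the discrepancy, as a sketch: if the
applied score is simply a constant multiple of the natural score, a
regression through the origin should give an R-square close to 1, and
the slope estimates that constant. Unlike ratiopat1, this is not
distorted by natural scores equal or very close to zero. */
proc reg data=rankdiff;
model factor1 = pred10score1 / noint;
run;
quit;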