
ERIC ED563036: Approaches to Incorporating Late Pretests in Experiments: Evaluation of Two Early Mathematics and Self-Regulation Interventions PDF

2013·0.08 MB·English
by  ERIC


Abstract Title Page

Title: Approaches to Incorporating Late Pretests in Experiments: Evaluation of Two Early Mathematics and Self-Regulation Interventions

Authors: Fatih Unlu, Abt Associates; Carolyn Layzer, Abt Associates; Douglas Clements, University of Denver; Julie Sarama, University of Denver; David Cook, Abt Associates

SREE Fall 2013 Conference Abstract Template

Abstract Body

Background/Context: Many educational randomized controlled trials (RCTs) collect baseline versions of outcome measures (pretests) to be used in the estimation of impacts at posttest. Although pretest measures are not necessary for unbiased impact estimates in well-executed experimental studies, using them increases the precision of impact estimates and reduces the sample size required to detect desired effect sizes (e.g., Raudenbush and Liu, 2000; Bloom, Richburg-Hayes, & Black, 2005; Hedges and Hedberg, 2007; Schochet, 2008b). Ideally, baseline data collection should occur before the start of program/intervention implementation, but unforeseen factors may delay it, leading to “contaminated” baseline measures (henceforth, “late” pretests).

Incorporating late pretests in impact regressions has two consequences (Schochet, 2008a). First, the resulting impact estimates may be biased, reflecting early intervention effects. The magnitude and direction of this bias depend on the size and direction of the early effects (e.g., large and positive early effects cause impact estimates with large downward bias). Second, late pretests tend to be correlated with the treatment indicator(s) included in impact regressions because of the early treatment effect, yielding larger standard errors than uncontaminated pretests. Despite these adverse effects, using late pretests in impact analyses may still be preferable because they may explain a significant portion of the outcome variance and help with precision, offsetting the bias they introduce. This bias-precision tradeoff depends on the size of early treatment effects, the growth trajectory of treatment effects, and how well late pretests explain posttest measures.
Purpose/Objective/Research Question/Focus of Study: This paper provides a theoretical overview of the late pretest issue and empirically assesses the bias-precision tradeoff in the experimental evaluation of the efficacy of adding a curriculum designed to support the development of children’s self-regulation skills (Scaffolded Self-Regulation, SSR) to an established early mathematics curriculum (Building Blocks, BB) to form a synthesized curriculum, BBSSR.

Setting: The RCT described in this paper was conducted in three large districts in Southern California.

Population/Participants/Subjects: Our analytic sample includes 807 students who participated in both baseline and posttest data collection. These students were in 84 classrooms for 4-year-olds across the three districts. A large proportion of their classrooms were multi-racial/multi-ethnic. In the three districts, Hispanic children were the largest minority group at 39% on average, with Asian/Pacific Islander children at 18%, African-American children at 11%, and non-Hispanic White children at 31%. On average, 27% of the students were English Language Learners (with roughly 20% having Spanish as the primary language).

Intervention/Program/Practice: Both of the evaluated interventions are theoretically and empirically grounded. The NSF-supported Building Blocks (BB) project produced a research-based math curriculum that addresses geometric and spatial ideas and skills and quantitative ideas and skills. The approach of BB is finding the math in, and developing math from, children's activity.
Three RCT evaluations, funded by the NSF and IES, have documented BB’s positive effects on young children's math achievement (e.g., Authors, 2007, 2008). Increasing math proficiency is significant: in each of six longitudinal datasets, the strongest predictors of later achievement were early math skills, followed by reading skills and then attention (Duncan et al., 2007). The self-regulation approach combines current research on the development of self-regulation and executive function with Lev Vygotsky’s cultural-historical theory of child development to design optimal ways to support the development of self-regulation in young children (Bodrova & Leong, 2007). Studies indicate that scaffolding that promotes self-regulation improves mathematics learning (Barnett et al., 2006). The BBSSR intervention is a theoretically grounded synthesis of this scaffolding and BB.

Significance/Novelty of Study: Many recent educational evaluations collect and control for baseline outcome measures, thanks to the vast literature recommending them for boosting statistical power. The dynamic nature of school districts and schools, coupled with unforeseen logistical factors, often delays collection of such data, yielding contaminated pretest measures, which in turn produce biased impact estimates and diminish gains in statistical power. Building on Schochet (2008a), we examine this important issue within the context of a cluster RCT analyzing the effects of two preschool interventions. We also propose a novel approach that aims to decontaminate late pretest measures and can be applied in many similar situations.
Statistical, Measurement, or Econometric Model: Schochet (2008a) theoretically and empirically examines the implications of using late pretests in RCTs. Specifically, he compares the performance of the following four impact estimators with respect to their mean squared errors and minimum detectable effect sizes:

• Posttest-only estimator: does not use any pretest measures;
• Difference-in-differences (DID) estimator: uses gain scores as the outcome variable;
• ANCOVA estimator: controls for pretest measure(s); and
• Unbiased ANCOVA (UANCOVA) estimator: controls for uncontaminated pretest measure(s) obtained from extant sources (e.g., school-level test scores for prior student cohorts).

Schochet conducts simulations for typical RCTs and data-generating parameters such as intra-class correlations, pretest-posttest correlations, and impact growth trajectories. He finds that:

• Both the DID and ANCOVA estimators are preferred to the posttest-only estimator unless early treatment effects are large (e.g., larger than 0.10 standard deviations for the ANCOVA estimator in a cluster RCT);
• The ANCOVA estimator is preferred to the UANCOVA estimator unless the predictive power of alternative pretests and early treatment effects are large; and
• The DID estimator typically has larger bias and variance than the ANCOVA estimator.

Following Schochet, we assess the extent of early treatment effects and compare estimated impacts from the posttest-only and ANCOVA estimators for the BB and BBSSR conditions.
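To make the bias-precision tradeoff among the posttest-only, DID, and ANCOVA estimators concrete, the following is a minimal simulation sketch. It is not Schochet's (2008a) simulation design: it assumes individual-level randomization (no clustering), a single pretest without measurement error, and made-up parameter values (`early_effect`, `final_effect`, `rho`) chosen only for illustration. All function and variable names are hypothetical.

```python
import numpy as np


def simulate_estimators(n=2000, early_effect=0.2, final_effect=0.3,
                        rho=0.7, n_reps=500, seed=0):
    """Return (mean, sd) of three impact estimators across n_reps samples.

    Assumed data-generating process (illustrative only):
      true_pre ~ N(0, 1)                       uncontaminated baseline
      late_pre = true_pre + early_effect * T   pretest taken after program start
      post     = rho * true_pre + final_effect * T + noise
    """
    rng = np.random.default_rng(seed)
    estimates = {"posttest_only": [], "did": [], "ancova": []}
    for _ in range(n_reps):
        treat = rng.integers(0, 2, size=n).astype(float)
        true_pre = rng.normal(size=n)
        late_pre = true_pre + early_effect * treat
        noise = rng.normal(scale=np.sqrt(1 - rho**2), size=n)
        post = rho * true_pre + final_effect * treat + noise

        t, c = treat == 1, treat == 0
        # Posttest-only: difference in mean posttest scores.
        estimates["posttest_only"].append(post[t].mean() - post[c].mean())
        # DID: difference in mean gain scores (posttest minus late pretest).
        gain = post - late_pre
        estimates["did"].append(gain[t].mean() - gain[c].mean())
        # ANCOVA: OLS of posttest on intercept, treatment, and late pretest.
        X = np.column_stack([np.ones(n), treat, late_pre])
        beta, *_ = np.linalg.lstsq(X, post, rcond=None)
        estimates["ancova"].append(beta[1])
    return {k: (float(np.mean(v)), float(np.std(v)))
            for k, v in estimates.items()}


if __name__ == "__main__":
    for name, (mean, sd) in simulate_estimators().items():
        print(f"{name:14s} mean={mean:.3f} sd={sd:.3f}")
```

Under this assumed model, the posttest-only estimator centers on `final_effect`, the DID estimator on `final_effect - early_effect`, and the ANCOVA estimator on `final_effect - rho * early_effect`, so the late pretest buys a smaller sampling spread at the cost of a downward bias that grows with the early effect.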
In addition, we implement an alternative ANCOVA estimator that removes the contaminated portion of pretests, yielding unbiased impact estimates while still explaining some portion of the outcome. This approach entails: (i) building a model that uses the contaminated pretest as the outcome and all available exogenous covariates and the time between school start and baseline testing as covariates, (ii) estimating this model using only the control students’ pretest measures, and (iii) creating predicted pretests for all students using the estimated model but setting the time between school start and pretest date to zero. Note that this process yields predicted pretest scores that are free of early treatment effects because the prediction model is estimated using covariates that were collected before the treatment and only control students’ scores.

Usefulness/Applicability of Method: Numerous educational studies rely on pretest measures to boost statistical power, but unanticipated factors lead to contamination of these measures in many cases. The approach described above aims to address this issue and can be applied in all such cases with exogenous predictors of the pretest measures.

Research Design: We analyze a three-armed cluster RCT in which classrooms in early childhood centers or schools were randomized to the two intervention conditions and a business-as-usual control condition. Randomization was conducted separately for schools with one or two participating classrooms (groups A and B). Schools/centers in group A were placed into five randomization blocks such that each block consisted of all half-day or full-day PreK classrooms in one of the three study districts. Within each block, schools/centers were sorted with respect to prior math achievement, % reduced-price lunch eligible, and % ELL and randomly assigned to the three conditions three at a time, starting with a randomly chosen point in the sorted list. This process was used to ensure balanced experimental groups. In each group B center/school, the two study classrooms were randomly assigned to two conditions that were randomly determined.
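The group A assignment procedure described above can be read in more than one way, so the sketch below shows one possible reading rather than the study's actual procedure. It assumes a lexicographic sort on the three school characteristics, a cyclic walk through the sorted list from a random starting point, and a fresh random ordering of the three conditions for each consecutive group of three schools. The field names and the `assign_block` helper are hypothetical.

```python
import random

CONDITIONS = ("BB", "BBSSR", "Control")


def assign_block(schools, seed=None):
    """Assign the schools of one randomization block to the three conditions.

    `schools` is a list of dicts with hypothetical keys:
    "id", "prior_math", "pct_reduced_lunch", "pct_ell".
    Returns a dict mapping school id -> condition.
    """
    rng = random.Random(seed)
    # Sort so that neighbouring schools are similar (assumed lexicographic order).
    ordered = sorted(
        schools,
        key=lambda s: (s["prior_math"], s["pct_reduced_lunch"], s["pct_ell"]),
    )
    if not ordered:
        return {}
    # Start at a random position and walk the sorted list cyclically.
    start = rng.randrange(len(ordered))
    ordered = ordered[start:] + ordered[:start]

    assignment = {}
    for i in range(0, len(ordered), 3):
        triplet = ordered[i:i + 3]
        # Each triplet receives the three conditions in a fresh random order;
        # a final incomplete triplet receives a random subset of them.
        conditions = rng.sample(CONDITIONS, k=len(triplet))
        for school, condition in zip(triplet, conditions):
            assignment[school["id"]] = condition
    return assignment
```

Because similar schools sit next to each other in the sorted list and each group of three is split across all three conditions, the arms end up with comparable distributions of the sorting variables, which is the balance property the design aims for.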
To examine the late pretest issue, we present separate impact estimates for the BB and BBSSR conditions, contrasting their outcomes with the control condition, from three different models. The first model excludes pretest controls (posttest-only estimator). The second model includes four pretest measures described subsequently (ANCOVA with late pretests). The third model uses decontaminated versions of pretests aggregated to the school level because most of the available predictors used in the correction process were at the school level: age and gender of students and school-level covariates including size, average class size, percent minority and low income, and prior achievement (ANCOVA with corrected pretests). All three models (i) are two-level hierarchical linear models (HLM; Raudenbush & Bryk, 2002) that nest students within classrooms; (ii) include two indicator variables for the BB and BBSSR conditions as primary predictors, yielding impacts of BB and BBSSR; and (iii) include students’ age at posttest and gender and indicators for randomization blocks (fixed effects) as covariates.

Data Collection and Analysis: Exhibit 1 describes the pretest and posttest measures used in this paper. Measures of self-regulation included the Pencil Tapping task (Diamond & Taylor, 1996), Head-Toes-Knees-Shoulders (Ponitz & McClelland, 2008), Self-Ordered Pointing (Blair & Willoughby, 2006), and Forward and Backward Digit Span (Gathercole & Pickering, 2000; Wechsler, 1986). The Expressive Vocabulary Test, Second Edition (EVT-2; Williams, 2007) is the language measure analyzed. Measures of early mathematics ability were the Tools for Early Assessment in Mathematics (TEAM; Clements, Sarama, & Wolfe, 2011) and the mathematics battery from the Early Childhood Longitudinal Study-Birth Cohort (ECLS-B; NCES). In addition, we are currently collecting scores from the Desired Results Developmental Profile (DRDP), an early childhood assessment administered by teachers in the study districts, to be used in further iterations of the late pretest-correction process described above.
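The three-step correction process described above (fit the pretest model on control students only, then predict for every student with the testing delay set to zero) can be sketched as follows. This is an illustrative ordinary least squares version on simulated data, not the authors' implementation: the function name `decontaminate_pretest`, the linear functional form, and every simulated value are assumptions.

```python
import numpy as np


def decontaminate_pretest(pretest, covariates, days_to_test, is_control):
    """Return predicted pretests with the testing delay set to zero.

    pretest:      (n,) observed late pretest scores
    covariates:   (n, k) exogenous covariates measured before treatment
    days_to_test: (n,) days between school start and baseline testing
    is_control:   (n,) boolean mask for control-condition students
    """
    n = len(pretest)
    design = np.column_stack([np.ones(n), covariates, days_to_test])
    # Steps (i)-(ii): fit the pretest model on control students only.
    coef, *_ = np.linalg.lstsq(design[is_control], pretest[is_control],
                               rcond=None)
    # Step (iii): predict for all students with days_to_test set to zero.
    design_at_zero = design.copy()
    design_at_zero[:, -1] = 0.0
    return design_at_zero @ coef


# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(42)
n = 6000
is_control = rng.random(n) < 1 / 3
covariates = rng.normal(size=(n, 2))
days = rng.uniform(30, 90, size=n)
# Assumed early effect: treated students gain 0.004 SD per day before testing.
early_effect = np.where(is_control, 0.0, 0.004 * days)
late_pretest = (covariates @ np.array([0.5, 0.3]) + 0.002 * days
                + early_effect + rng.normal(scale=0.8, size=n))

corrected = decontaminate_pretest(late_pretest, covariates, days, is_control)
raw_gap = late_pretest[~is_control].mean() - late_pretest[is_control].mean()
corrected_gap = corrected[~is_control].mean() - corrected[is_control].mean()
print(f"raw treatment-control pretest gap:       {raw_gap:.3f}")
print(f"corrected treatment-control pretest gap: {corrected_gap:.3f}")
```

The corrected scores depend only on pre-treatment covariates and coefficients learned from control students, so any remaining treatment-control gap reflects chance covariate imbalance rather than early treatment effects. In the study the corrected pretests were additionally aggregated to the school level, a step this sketch omits.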
In most schools, baseline data were collected in fall 2010, between 30 and 90 days after the start of the school year, as seen in Figure 1.[1] Factors that contributed to this delay included delays in school district approvals and parent permission, and delays in security clearance for data collectors. Posttest measures were collected in spring 2011 without any major issues.

Findings/Results: Exhibit 2 presents various impact estimates. Columns 1-4 show impacts of BB, while Columns 5-8 present BBSSR impacts. Specifically, Column 1 presents the estimated early impact of BB on three pretest measures, while Columns 2-4 show BB impact estimates at posttest from the three estimators described above (no-pretest, ANCOVA with actual pretests, and ANCOVA with corrected pretests), respectively. Column 1 suggests that the early impacts of BB were sizeable: 0.24 standard deviations (sds) and statistically significant on the Head-Toes-Knees-Shoulders score and 0.17 sds on the TEAM score. Consequently, controlling for these pretest measures in impact regressions for posttest measures dramatically reduces impact estimates for almost all outcomes, as seen in Columns 2 and 3. For example, for ECLS-B, the impact is 0.29 sds and statistically significant from the posttest-only model, while it decreases to 0.13 sds and is not significant when late pretests are included as covariates. Also note that adding contaminated pretests hardly improves the precision of estimated impacts (standard errors of impact estimates decrease by 10-20%), which may be due to sizeable early treatment effects. These results suggest that when estimating impacts of BB, controlling for contaminated pretests is not worthwhile due to large early treatment effects and the low explanatory power of late pretests. Comparing Columns 2 and 4 shows that using decontaminated pretests changes neither the magnitude of the impact estimates nor their standard errors. While the former result is expected, the latter suggests that predictors used in the decontamination process (student demographics and school characteristics) did not predict pretest measures well.
Columns 5-8 of Exhibit 2 display results for BBSSR. Column 5 shows that early treatment effects are much smaller. Hence, controlling for late pretests does not change the estimated impacts much, while the boost in precision is somewhat larger than for BB, around 20-30%. Examining Column 8 suggests that corrected pretest measures do not change the impact estimates or standard errors. These results suggest that controlling for late pretests in this case may not be as problematic as it was in the previous case.

Conclusions: Analyses conducted thus far show that controlling for contaminated pretest measures in impact regressions can lead to substantially biased impact estimates while their effect on the precision of impact estimates is less profound, causing one to question the overall merit of using them. The preliminary application of our decontamination approach with a limited set of predictors yields unbiased estimates, but it does not seem to help much with precision. These analyses will be extended and improved in time for the conference with additional data collection currently underway. It will be interesting to observe how corrected pretests perform when DRDP scores (presumably better predictors of study-administered pretests) are used in the correction process.[2]

[1] There were no statistically significant differences in the timing of baseline testing across the three conditions.
[2] We will also compare results from our approach with those from the unbiased ANCOVA estimator that will directly control for these scores.

Appendices

Appendix A. References

Barnett, W. S., Yarosz, D. J., Thomas, J., & Hornbeck, A. (2006). Educational effectiveness of a Vygotskian approach to preschool education: A randomized trial. National Institute for Early Education Research.

Blair, C. B., & Willoughby, M. T. (2006). Measuring executive function in young children: Self-Ordered Pointing. Chapel Hill, NC: The Pennsylvania State University and The University of North Carolina at Chapel Hill.

Bloom, H. S., Richburg-Hayes, L., & Black, A. R. (2005). Using Covariates to Improve Precision: Empirical Guidance for Studies that Randomize Schools to Measure the Impacts of Educational Interventions. MDRC Working Paper.

Bodrova, E., & Leong, D. J. (2007b). Play and early literacy: A Vygotskian approach. In K. A. Roskos & J. F. Christie (Eds.), Play and literacy in early childhood (2nd ed., pp. 185-200). Mahwah, NJ: Lawrence Erlbaum Associates.

California Department of Education. (2007). Desired Results Developmental Profile-Revised (DRDP-R).

Clements, D. H., Sarama, J., & Wolfe, C. B. (2011). TEAM: Tools for early assessment in mathematics. Columbus, OH: McGraw-Hill Education.

Diamond, A., & Taylor, C. (1996). Development of an aspect of executive control: Development of the abilities to remember what I said and to “Do as I say, not as I do”. Developmental Psychobiology, 29, 315-334.

Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P., …, Japel, C. (2007). School readiness and later achievement. Developmental Psychology, 43(6), 1428-1446.

Gathercole, S. E., & Pickering, S. J. (2000). Working memory deficits in children with low achievements in the national curriculum at 7 years of age. British Journal of Educational Psychology, 70(2), 177-194.

Hedges, L. V., & Hedberg, E. C. (2007). Intraclass correlation values for planning group randomized trials in education. Educational Evaluation and Policy Analysis, 29, 60-87.

Ponitz, C. E. C., McClelland, M. M., Jewkes, A. M., Connor, C. M., Farris, C. L., & Morrison, F. J. (2008). Touch your toes! Developing a direct measure of behavioral regulation in early childhood. Early Childhood Research Quarterly, 23, 141-158.

Raudenbush, S. W., & Liu, X. F. (2000). Statistical power and optimal design for multisite randomized trials. Psychological Methods, 5(2), 199-213.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and data analysis methods (2nd ed.). Newbury Park: Sage.

Schochet, P. Z. (2008a). The Late Pretest Problem in Randomized Control Trials of Education Interventions (NCEE 2009-4033). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Schochet, P. Z. (2008b). Statistical Power for Random Assignment Evaluations of Education Programs. Journal of Educational and Behavioral Statistics, 33(1), 62-87.

U.S. Department of Education, National Center for Education Statistics. Early Childhood Longitudinal Study-Birth (ECLS-B) Cohort. Washington, DC: Institute of Education Sciences.

Wechsler, D. (1986). Wechsler Adult Intelligence Scale, Revised (WAIS-R). New York, NY: The Psychological Corporation.

Williams (2007). Expressive Vocabulary Test, Second Edition (EVT-2). Minneapolis, MN: Pearson.

Appendix B. Tables and Figures

Figure 1: Days Between School Start and Baseline Testing
[Figure not reproduced. x-axis: Days (0 to 150); y-axis: Fraction (0 to .2)]

Exhibit 1: Outcome Measures

Measure | Pre-test / Post-test | Description
Pencil Tapping (“Pencil”) | X / X | In this measure of inhibitory control, children are asked to tap a pencil once if the assessor taps twice and tap twice if the assessor taps once.
Head, Toes, Knees, and Shoulders Score | X / X | In this measure of behavioral regulation (specifically, inhibitory control and working memory), the child must do an action that is systematically different from the oral instruction given by the assessor (and not follow the instruction given by the assessor).
TEAM Scaled Score | X / X | This is a measure of developmental progressions in number (e.g., verbal counting, object counting, subitizing, number comparison and sequencing, number composition and decomposition, adding and subtracting, and place value) and geometry (e.g., shape recognition, shape composition and decomposition, congruence, construction of shapes, spatial imagery, geometric measurement, and patterning).
PPVT-III | X | The Peabody Picture Vocabulary Test, 3rd Edition, is a test of receptive vocabulary.
ECLS-B Math Score | X | This is the math battery from the ECLS-B test and is comprised of a collection of items from other widely used tests of early mathematics.
Forward Digit Span Score | X | This is a measure of working memory, testing a child’s ability to reproduce an increasingly long string of numbers that the assessor says. The assessor presents (orally) the subject with a series of digits (e.g., '8, 3, 4'), and the subject must immediately repeat them back. If s/he does this successfully, s/he is given a longer list (e.g., '9, 2, 4, 6'). The length of the longest list a person can remember is that person's digit span.
Backward Digit Span Score | X | This is a measure of working memory, testing a child’s ability to reproduce an increasingly long string of numbers that the assessor says, in reverse order. While the participant is asked to repeat the digits in the given order in the forward digit-span task, s/he is asked to repeat them in reverse order in the backward digit-span task (e.g., if presented with ‘8, 3, 4’, the subject must produce ‘4, 3, 8’). As in Forward Digit Span, the length of the longest list the subject is able to repeat is the subject’s digit span.
EVT-2 Score | X | This is a measure of expressive oral language ability (English). The subject is shown a picture and asked standardized prompts to elicit an English word for the object’s label, depicted action, or use.
Self-Ordered Pointing Score | X | This is a measure of working memory in which the child is shown multiple pages that contain the same pictures in different arrangements. The child is asked to identify a picture which s/he has not selected on previous pages, so s/he must recall which item(s) s/he has already selected.
