Data analysis of agroforestry experiments
Workshop overview & session summaries

by
Richard Coe, ICRAF, Kenya
Roger Stern, SSC, UK
Eleanor Allan, SSC, UK

edited by
Jan Beniest, ICRAF, Kenya
Janet Awimbo, ICRAF, Kenya

The World Agroforestry Centre (ICRAF) is the international leader in Agroforestry - the science and practice of integrating 'working trees' on smallholder farms and in rural landscapes. Agroforestry is an effective and innovative means to reduce poverty, create food security, and improve the environment. The Centre and its many partners provide improved, high quality tree seeds and seedlings, and the knowledge needed to use them effectively. We combine excellence in scientific research and development to address poverty, hunger and environmental needs through collaborative programs and partnerships that transform lives and landscapes, both locally and globally.

Founded in 1983, the Statistical Services Centre (SSC) is a not-for-profit body within the School of Applied Statistics at The University of Reading, UK. The SSC provides training and consultancy in both statistics and data management in the international arena. We aim to encourage good statistical practice, and the use of modern statistical methods in applied problems. The SSC currently has nine statisticians, plus computing professionals and administrative staff.
ICRAF
The World Agroforestry Centre
United Nations Avenue
PO Box 30677
Nairobi, Kenya
Tel: + 254 2 524 000
Fax: + 254 2 524 001
Contact via the USA
Tel: + 1 650 833 6645
Fax: + 1 650 833 6646
E-mail: [email protected]
Internet: www.worldagroforestrycentre.org

Statistical Services Centre
The University of Reading
Harry Pitt Building
Whiteknights Road
P.O. Box 240
Reading RG6 6FN, UK
Tel: +44 (0) 118 378 8025
Fax: +44 (0) 118 975 3169
E-mail: [email protected]
Internet: www.rdg.ac.uk/ssc/

© World Agroforestry Centre 2002
ISBN 92 9059 145 5
Design: Mariska Koornneef
Printed by: Kul Graphics Ltd, Nairobi, Kenya

The Training Materials

These Training Materials were developed to help us present a series of courses on the analysis of data from agroforestry experiments. They are published here to help others give similar training in the future.

The course is very practical and built around the analysis of real data sets. Concepts are explained largely without using mathematics. The computer software takes care of calculations and hence formulae are not used. Instead the course emphasises understanding which analyses it is sensible to use, and how to interpret the results. We distinguish between learning to use the statistical software (buttons to press or commands to use) and understanding the statistical concepts, models and methods.

The course was designed initially to help with analysis of agroforestry experiments, and the examples given are from agroforestry trials. However, both the statistical and teaching ideas can be applied to trials from agriculture, forestry and other application areas. Only one out of 17 sessions is dedicated to peculiarities of agroforestry research, and it should be easy to substitute other examples when using the materials.

The materials refer to both on-station and on-farm trials. Emphasising the distinction between on-station and on-farm experiments is not necessary or helpful for this course.
The approaches and methods for the analysis of a trial depend on its objectives, treatments, layout and measurements, not on where it was carried out.

The materials are presented in four printed parts together with a computer CD.

Part 1 contains an overview of the course and teaching approaches, with suggestions on how the materials may be used and adapted. It also contains a summary of each of 17 teaching sessions.

Part 2 contains the lecture notes, one for each of the sessions. They form a useful and readable resource in their own right and hence are presented as a separate document.

Part 3 contains suggested exercises for each session. These are presented as a separate document as they are most likely to be adapted and modified to use local examples.

Part 4 contains a protocol describing each of 16 experiments, the data from which are used in examples.

The CD contains
- a data file (in Microsoft Excel format) for each of the 16 example experiments
- files (in pdf format) for each of the 4 parts, so that further copies can be printed
- the original word processor files of all the text (in Microsoft Word format), so users may modify and adapt the text
- some additional documents (in pdf format) that are referred to in the materials

We encourage the copying and modification of these materials as long as the original source is acknowledged, and resulting products are not sold without our permission. We would appreciate being informed of any use and developments of these materials.

The materials were produced through a long term collaboration between the World Agroforestry Centre (ICRAF) in Nairobi, Kenya and the Statistical Services Centre of the University of Reading, UK.
Table of contents

The course structure and strategy
  Introduction
  Audience
  Resource persons
  Datasets and data management
  Duration
  Teaching style
  Software
  Course content
  Resource materials
  Strategy
  Acknowledgements
Workshop sessions
  Session 1. Review of experimental design
  Session 2. Objectives and steps in data analysis
  Session 3. Software familiarization
  Session 4. Descriptive analysis and data exploration
  Session 5. Analysis of variance as a descriptive tool
  Session 6. Ideas of simple inference
  Session 7. An introduction to statistical modelling
  Session 8. An introduction to multiple levels
  Session 9. Writing up and presenting results
  Session 10. Where are we now? - Review of basic statistics
  Session 11. Design and analysis complexity
  Session 12. Dealing with categorical data
  Session 13. Getting more out of on-farm trials and multilevel problems
  Session 14. Complications in agroforestry trials
  Session 15. Complications in data
  Session 16. Data analysis
  Session 17. On your return

The course structure and strategy

Introduction

These course notes are on the analysis of data from experiments. They result from a series of statistics training courses organized by ICRAF/World Agroforestry Centre. These courses were originally on the design and analysis of agroforestry experiments, but they have been used more widely than this.

The first component was on the design of experiments. This analysis course assumes familiarity with the main concepts from the design course. A brief review is given in Session 1.

The second component was on data entry and management. This is a key area because poor data management often limits the processing of data. In this analysis course the examples provided have been 'managed' so that the concepts related to the analyses could be illustrated easily.
We anticipate that an initial phase in the course preparation will be to organize datasets from participants similarly. Hence the data management component, though normally undertaken prior to this component, is not a necessary prerequisite.

This course is divided into two parts. The first part is entitled The Everyday Toolkit and covers the concepts that we believe scientists should be able to understand fully and the corresponding analyses that they should be able to undertake unaided following the training.

The second part is called Handling Complexities. This examines how experimental data can be processed where there are complications. These complications are divided into three broad types. The first is due to complexities in the design, which may arise from a complex treatment structure, from difficulties in the layout of the trial, or from the way data were measured. For example, a measurement of farmers' responses may be on a 5-point scale ranging from very good to very poor. The analysis of this type of 'categorical data' is described here.

We do not consider on-farm trials as a special category and hence examples of them will be used throughout the course. However, their analysis is often complicated because of their combination of a complex layout (many farmers, with few plots per farm) and the nature of measurement. The complexities arise from the lack of control of factors that would be within the treatment structure in an on-station trial, and the fact that this lack of control occurs both within and between farms. The handling of these complexities is discussed in the course.

The second type of complexity is that which is due to the particular field of application. Particular features of agroforestry trials include 'repeated measures', both in time and space, and difficulties that arise from the need to measure multiple components (e.g. concerning both trees and crops) within each plot.
Courses for other audiences need to replace this section, as each subject area has a set of problems and methods specific to it. As an example, we provide a parallel session that considers some of the complications that are commonly encountered in livestock experiments.

Finally we consider complexities that arise because of the nature of the data. Coping with zeros in the data and missing values are among the topics considered here.

Our main aim in this second part of the course is for scientists to be aware of the methods that now exist to handle complex data. These are methods where scientists, at least initially, might want to work jointly with statisticians.

Audience

This course is intended primarily for scientists undertaking agricultural research. In targeting the course for scientists we are assuming some prerequisites. We assume scientists have some practical experience in the design and analysis of trials. They may have felt diffident in their write-up of an experiment for which they have been responsible, but they have an awareness of the process of conducting and processing an experiment.

We assume basic computing skills. Scientists who do not have regular access to a computer and who are not comfortable with the use of a word processor and a spreadsheet should not take this course. Most scientists will already have some experience in data processing using a statistical package. Those without this experience should make themselves aware of the capabilities of this type of software before the course.

We assume some basic statistical knowledge, usually from statistics courses taken while participants were students. We do not assume that they liked the course, or that they understood all the content. We hope that at least something was understood and a little is still remembered!
The design course is also a good preparation for this training and we hope that participants who have followed it still remember (and use!) some of the concepts that were covered in that course.