INTERNATIONAL UNION OF CRYSTALLOGRAPHY BOOK SERIES

IUCr BOOK SERIES COMMITTEE
E. N. Baker, New Zealand
J. Bernstein, Israel
P. Coppens, USA
G. R. Desiraju, India
E. Dodson, UK
A. M. Glazer, UK
J. R. Helliwell, UK
P. Paufler, Germany
H. Schenk (Chairman), The Netherlands

IUCr Monographs on Crystallography
1 Accurate molecular structures
  A. Domenicano, I. Hargittai, editors
2 P. P. Ewald and his dynamical theory of X-ray diffraction
  D. W. J. Cruickshank, H. J. Juretschke, N. Kato, editors
3 Electron diffraction techniques, Vol. 1
  J. M. Cowley, editor
4 Electron diffraction techniques, Vol. 2
  J. M. Cowley, editor
5 The Rietveld method
  R. A. Young, editor
6 Introduction to crystallographic statistics
  U. Shmueli, G. H. Weiss
7 Crystallographic instrumentation
  L. A. Aslanov, G. V. Fetisov, G. A. K. Howard
8 Direct phasing in crystallography
  C. Giacovazzo
9 The weak hydrogen bond
  G. R. Desiraju, T. Steiner
10 Defect and microstructure analysis by diffraction
  R. L. Snyder, J. Fiala and H. J. Bunge
11 Dynamical theory of X-ray diffraction
  A. Authier
12 The chemical bond in inorganic chemistry
  I. D. Brown
13 Structure determination from powder diffraction data
  W. I. F. David, K. Shankland, L. B. McCusker, Ch. Baerlocher, editors
14 Polymorphism in molecular crystals
  J. Bernstein
15 Crystallography of modular materials
  G. Ferraris, E. Makovicky, S. Merlino
16 Diffuse x-ray scattering and models of disorder
  T. R. Welberry
17 Crystallography of the polymethylene chain: an inquiry into the structure of waxes
  D. L. Dorset
18 Crystalline molecular complexes and compounds: structure and principles
  F. H. Herbstein

IUCr Texts on Crystallography
1 The solid state
  A. Guinier, R. Julien
4 X-ray charge densities and chemical bonding
  P. Coppens
5 The basics of crystallography and diffraction, second edition
  C. Hammond
6 Crystal structure analysis: principles and practice
  W. Clegg, editor
7 Fundamentals of crystallography, second edition
  C. Giacovazzo, editor
8 Crystal structure refinement: a crystallographer’s guide to SHELXL
  P. Müller, editor

IUCr Crystallographic Symposia
1 Patterson and Pattersons: Fifty years of the Patterson Function
  J. P. Glusker, B. K. Patterson, and M. Rossi, editors
2 Molecular structure: Chemical reactivity and biological activity
  J. J. Stezowski, J. Huang, and M. Shao, editors
3 Crystallographic computing 4: Techniques and new technologies
  N. W. Isaacs and M. R. Taylor, editors
4 Organic crystal chemistry
  J. Garbarczyk and D. W. Jones, editors
5 Crystallographic computing 5: From chemistry to biology
  D. Moras, A. D. Podjarny, and J. C. Thierry, editors
6 Crystallographic computing 6: A window on modern crystallography
  H. D. Flack, L. Parkanyi, and K. Simon, editors
7 Correlations, transformation, and interactions in organic crystal chemistry
  D. W. Jones and A. Katrusiak, editors

Crystal Structure Refinement
A Crystallographer’s Guide to SHELXL

Peter Müller
Massachusetts Institute of Technology, Cambridge, USA

Regine Herbst-Irmer
University of Göttingen, Germany

Anthony L. Spek
Utrecht University, The Netherlands

Thomas R. Schneider
The FIRC Institute of Molecular Oncology, Biocrystallography, and Structural Bioinformatics, Italy

Michael R. Sawaya
University of California, Los Angeles, USA

Edited by
Peter Müller

INTERNATIONAL UNION OF CRYSTALLOGRAPHY

Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© Oxford University Press, 2006

The moral rights of the authors have been asserted
Database right Oxford University Press (maker)

First published 2006

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer

British Library Cataloguing in Publication Data
Data available

Library of Congress Cataloging in Publication Data
Crystal structure refinement : a crystallographer’s guide : SHELXL / Peter Müller ... [et al.]; edited by Peter Müller.
p. cm. -- (International Union of Crystallography monographs on crystallography; 19)
Includes bibliographical references.
ISBN-13: 978–0–19–857076–9 (alk. paper)
ISBN-10: 0–19–857076–7 (alk. paper)
1. Crystals--Structure. 2. Crystals--Refining. 3. Crystals--Data processing.
I. Müller, Peter, 1970– II. Series.
QD921.C772 2006
548'.81--dc22 2006011794

Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by Biddles Ltd, King’s Lynn, Norfolk

ISBN 0–19–857076–7 978–0–19–857076–9

1 3 5 7 9 10 8 6 4 2

Foreword

A Short History of SHELX

The 5000 lines of FORTRAN code that became known as SHELX-76 had their origins around 1970 when the University of Cambridge replaced the ICL Titan computer with an IBM-370. My previous attempts to write programs used Titan Autocode, a simple but efficient programming language closer to assembler than to a modern high-level language. With the IBM computer came two major innovations: a FORTRAN compiler and punched cards. Being forced to rewrite my first attempt at a crystallographic least-squares refinement program (called NOSQUARES) in another language was a good opportunity to learn from my mistakes, but, since I was too lazy to read the FORTRAN manual or attend a course, I rewrote the program in a very simple subset of FORTRAN that bore a curious resemblance to Titan Autocode, and avoided features that might have been difficult to port to other computers so that I would never have to rewrite it again. This had the advantage that it produced efficient code, essential in view of the limited speed and memory of the mainframe computers of the time (about 0.0001 times that of current PCs). Actually SHELX-76 still compiles and runs correctly using almost any modern FORTRAN-95 compiler.

At the time I would have regarded myself as an inorganic chemist who was interested in applying a variety of physical methods; the title of my Ph.D. thesis (under the supervision of Evelyn Ebsworth) was ‘NMR Studies of Inorganic Hydrides’. When I moved to the Georg-August University of Göttingen in 1978 I discovered that my German colleagues were so much better at ‘cooking’ (preparative chemistry) than I was that it would be better if I concentrated on crystal structure determination, for which there was a pressing need in order to characterize all the compounds they were synthesizing.
One of the methods we made good use of in the 1960s, for example, for determining the structures of relatively unstable -SiH3 derivatives that had a habit of exploding in contact with air, was gas phase electron diffraction. This required synthesizing the compounds in Cambridge and taking them to Glasgow University and later to UMIST (Manchester) where Durward Cruickshank had the only operational gas phase electron diffraction machine in the country. On one visit I mentioned to Durward that I would need to do some X-ray crystallography because not all our samples were volatile enough to determine the structures in the gas phase. I had managed to find an X-ray generator and a Weissenberg camera but still needed to write a suitable Autocode program for the Titan computer to analyze the data. Durward very kindly provided me with a set of notes describing least-squares refinement that he later presented at the 1969 computing school in Ottawa: these form the basis of the least-squares refinement in SHELX to this day.

SHELX-76

SHELX was written for use by myself and my students and I never imagined that it would ever find use outside the ivory towers of Cambridge. However, after a few years during which the program was fairly well debugged, it became clear that it would be a good idea to have one definitive ‘export’ version. This was named SHELX-76 and was intended to be the final definitive version. SHELX-76 included Lp and absorption corrections for Weissenberg data; in addition to the camera we had acquired a Weissenberg geometry diffractometer for which I wrote the control program in binary to get the most out of the 4K 12-bit words of memory. Fortunately the use of the concept of direction cosines made it possible to handle data from other sources. In SHELX-76 this was followed by the merging of the data to produce a list of unique reflections, some primitive direct and Patterson methods for structure solution, least-squares structure refinement, calculation of dependent parameters and Fourier syntheses. The resulting program was so large that the 5000 FORTRAN statements were too heavy to carry around as individual punched cards, so I wrote a little compression program for the FORTRAN and another one for the test data (it averaged 9 reflections per card, i.e. ∼9 bytes per reflection, but compromised a little on precision). The program, test data and (uncompressed) FORTRAN decompression program all fitted into a standard 2000-card box that could be posted and came with me on trips abroad. In fact these ‘compressed data’ can still be read using ‘HKLF 1’ and were subject to a brief renaissance when BITNET was introduced. Once when I was on vacation one of my students dropped the only dataset from a valuable crystal and distributed the cards all over the floor, but succeeded in cracking the code and putting the cards back in the right order before I returned!

One problem that soon became apparent was that the restriction of the array dimensions to allow only 160 atoms (including hydrogen atoms) was a little on the small side; I had thought that this would never need increasing. Dobi Rabinovitch worked out how to increase the arrays to hold 400 atoms and this became the standard version.
When I was faced with the problem of converting the program to the first (Data General) minicomputers I was able to overcome the memory restrictions (the program and operating system had to fit into 64 Kbytes) by extensive use of ‘overlay’ (only holding a small part of the executable code in memory) and with a rather efficient blocked cascade least-squares refinement algorithm that refined the structure in small dynamically selected blocks, but only recalculated the structure factor contributions for atoms that had changed in the previous cycle. This was the basis of the XLS refinement program in the SHELXTL version that I had adapted for Syntex (who later became Nicolet, then Siemens and finally Bruker). XLS only fitted into the Nova computer memory with two bytes to spare, so further extension and even bug-fixing were difficult.

SHELX-97

I learnt a great deal about direct methods of structure solution at the excellent schools that Michael Woolfson and Lodovico Riva di Sanseverino organized, first in Parma (1970) and then from 1974 on in Erice. By the 1980s direct methods had made such progress that I decided to produce a separate structure solution program (SHELXS-86). This was eventually followed by a new refinement program SHELXL in 1993, partly because Syd Hall and the editors of Acta Crystallographica were pestering me to produce CIF output. However CIF is by no means the ideal answer to the data exchange and archiving problem; even though the CIF file is longer than the corresponding SHELXL .res file, it lacks much information, for example about the constraints and restraints applied in the refinement. Both SHELXS and SHELXL were updated again in 1997 and proved sufficiently reliable that no further updates were required. Both programs (and the subsequent SHELXC, SHELXD and SHELXE for macromolecular phasing) were tested for many years before they were released, with the result that they were by that stage already fairly well debugged. This contrasts with the current general programming philosophy that code is distributed as soon as possible and the users will find the bugs!
This package, which included the program CIFTAB for working with CIF format files and SHELXPRO that acted as an interface to the macromolecular world, became known as SHELX-97.

Documentation is always a problem, so I sent out the first beta-test copies of this package (starting in 1992) one at a time. A potential beta-tester was sent a copy of the manual and was told that he or she would be sent the programs only after sending me at least three errors in the documentation or good suggestions for improving it. I then made all the corrections before sending it to the next guinea-pig. The first testers ran it through their spelling checkers and found plenty of mistakes (my spelling was never very good) but after a couple of hundred beta-tests had been sent out, people began to complain that it was all a diabolical plot and that I had simply written an error-free manual for programs that I had no intention of sending out and that probably didn’t even exist!

Program Style

Few computer programs of the antiquity of SHELX are in wide use today (though ORTEP is an even older survivor). One possibility is that the use of a very simple standard subset of FORTRAN, true even of the more recent additions to the SHELX system, makes it trivial to port the programs to new computer hardware. In comparison with other computer languages, FORTRAN has remained remarkably stable and upwards compatible. In the meantime I have learnt some C and C++ and even (several years ago) held PASCAL courses, but consider that FORTRAN is still the language of choice for rugged number-crunching programs. FORTRAN shows no signs of fading out, as exemplified by the excellent selection of FORTRAN compilers available for Linux systems, and the sheer inertia of the vast base of scientific FORTRAN code will ensure its survival for a long time to come. There are many excellent numerical libraries available for FORTRAN, but I preferred to write every line of SHELX myself and did not use these libraries; over the years this has certainly enhanced portability because the programs have not become time-locked into a particular computing environment.
The programs are written with (by modern standards) totally excessive attention to optimizing execution speed and the use of memory; a negative side-effect of this is that compiler optimization rarely produces much improvement in performance. Maybe the spartan programming style, for example, the restriction to a few single dimension arrays with one letter names and the terse comments (originally so that the punched cards could be squeezed into one box), has simply deterred ‘improvements’ to the program code.

The User Interface

An important part of SHELX, and one to which I gave a great deal of thought, is the user interface. The number of input and output files is kept to an absolute minimum and the programs use no configuration files or environment variables. So for a structure refinement, the (usually statically linked) executable SHELXL should be put somewhere in the PATH and two input data files with the extensions .hkl (for the reflection data) and .ins (for everything else) are all that are required. SHELX-76 was often run from a single card-deck by concatenating the condensed reflection data (see above) onto the end of the remaining data using ‘HKLF -1’. If one could find a card-reader, the same card-deck could be fed into SHELXL-97 today and would produce sensible results. Some users have still not forgiven me for the last small change I made to the format of the .hkl reflection data file (in 1975). Since remaining compatible has the highest priority, I could not change this format again now, though it would make a great deal of sense to put the unit-cell that corresponds to the indexing in the file before the first reflection.

The .ins input file was designed to be edited by humans, not computers. Extensive use of default values keeps it short. Default values require careful planning because they get used 99% of the time! Free format input was a rarity when SHELX-76 came out; it was not supported by FORTRAN-66 and so had to be encoded in FORTRAN, character by character, but at least this was fully portable. Four-letter words play an important role both in the SHELX input and in the English language in general!
In addition to the default values, there is also another feature of the .ins file that makes it very difficult to parse with another computer program; to save space I did not start each atom with ‘ATOM’, as the PDB and other formats do, so an atom name is simply a keyword that does not have some other defined meaning. Again, it would be nice to change this but retaining upwards compatibility is even more important.

Refinement Strategy

Most of SHELX is based primarily on ideas of other people, in particular users of the program. About 90% of my own innovations that I tried to include in the program turned out to be useless; I was careful to eradicate all traces of these so that no one would be tempted to misuse them. The few innovations that turned out to be useful in structure refinement are worth commenting on here. One of these was the introduction of free variables, which enabled linear constraints to be applied in a simple and general way; to do this with other programs often required the user to write a special subroutine for each case. Special position constraints were a major application of free variables in SHELX-76; by SHELX-97 the recognition and constraint of special positions had been fully automated, and the most common application of free variables today is probably their use to couple the refinement of the occupancies of different disordered atoms and groups. Many other protein refinement programs still lack special position and occupancy constraints. Rigid group definition (and the removal of rigid group constraints) is very simple and intuitive in SHELXL, though the potentially powerful use of quaternions to fit standard fragments to selected electron density peaks has been widely ignored by users. The use of a connectivity array and PART numbers provides a simple and effective framework for defining disorder and generating hydrogen atoms and various restraints; other macromolecular programs tend to use a much more complicated template approach in which all bonds, hydrogen atoms, etc. are defined in template libraries (this is why some protein graphics programs cannot draw disulfide bonds!). The use of a circular difference Fourier to find the best positions for hydrogen atoms in -OH and -CH3 groups was another SHELXL-97 innovation. The ‘similar distance’ restraints and the restraints on the anisotropic displacement parameters (DELU, SIMU and ISOR) also first came into wide use in SHELXL-97, though the rigid bond restraint was probably first used by John Rollett. These restraints are essential both for macromolecular refinement and for handling disorder (often of solvent molecules) in small molecule structures. I am sure that we will be able to find better ways of restraining the displacement parameters in the future; this was never intended to be the last word.

I had never imagined that SHELX would eventually find application in macromolecular refinement, and the introduction of several essential features for this purpose can be attributed to encouragement from Zbigniew Dauter and Keith Wilson, who in the early 1990s were looking for ways to refine protein structures against the very high resolution data collected on the EMBL beamlines at the DESY synchrotron in Hamburg.
These features included the solvent model (based on the method used in Dale Tronrud and Lynn Ten Eyck’s TNT program) and conjugate gradient solution of the least-squares normal equations (as in John Konnert and Wayne Hendrickson’s PROLSQ program). I did introduce some convergence acceleration into this CGLS method by taking into account the shifts in the previous cycle; in fact CGLS should be more widely used for large small-molecule structures, as it is very robust. However CGLS does not enable the standard uncertainties in the parameters to be estimated, so a final L.S. cycle (usually with BLOC 1 and DAMP 0 0) is required to obtain these esds for macromolecules. The most complex part of SHELXL to program was probably the derivation of standard uncertainties in all derived parameters taking all correlation terms from the full inverted least-squares matrix into account.

One area, still neglected by macromolecular crystallographers, is the refinement of merohedral and non-merohedral twins. Tests by Garib Murshudov and others have shown that a significant fraction of structures deposited in the PDB are seriously in error because twinning had not been taken into account. My colleague Regine Herbst-Irmer made major contributions to the ways of handling and refining twins with SHELXL-97.
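
As an editorial illustration of the conjugate gradient solution of the least-squares normal equations mentioned above, the following minimal Python sketch solves a toy linear least-squares problem. It is not SHELXL code and does not include the convergence acceleration based on the shifts of the previous cycle; the function names (normal_equations, dot, conjugate_gradient) and the toy data are hypothetical, chosen only for this example.

```python
"""Conjugate-gradient solution of least-squares normal equations.

Illustrative sketch only: plain textbook conjugate gradients on a toy
problem, with hypothetical function names. Not taken from SHELXL.
"""


def normal_equations(design, observed, weights):
    """Build A = J^T W J and b = J^T W y for a linear least-squares problem."""
    n_par = len(design[0])
    a = [[0.0] * n_par for _ in range(n_par)]
    b = [0.0] * n_par
    for row, y, w in zip(design, observed, weights):
        for i in range(n_par):
            b[i] += w * row[i] * y
            for j in range(n_par):
                a[i][j] += w * row[i] * row[j]
    return a, b


def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(a, b, tol=1e-10, max_iter=100):
    """Solve A x = b iteratively for a symmetric positive-definite A.

    The matrix A is only multiplied by vectors and never inverted, so the
    result contains parameter values but no covariance matrix.
    """
    x = [0.0] * len(b)
    residual = list(b)          # b - A x, with x = 0 initially
    direction = list(residual)  # first search direction
    rs_old = dot(residual, residual)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        a_dir = [dot(row, direction) for row in a]
        step = rs_old / dot(direction, a_dir)
        x = [xi + step * di for xi, di in zip(x, direction)]
        residual = [ri - step * adi for ri, adi in zip(residual, a_dir)]
        rs_new = dot(residual, residual)
        direction = [ri + (rs_new / rs_old) * di
                     for ri, di in zip(residual, direction)]
        rs_old = rs_new
    return x


# Toy problem: fit y = c0 + c1 * t to three exact observations of y = 1 + 2t.
design = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
observed = [1.0, 3.0, 5.0]
weights = [1.0, 1.0, 1.0]
a, b = normal_equations(design, observed, weights)
solution = conjugate_gradient(a, b)
print(solution)  # approximately [1.0, 2.0]
```

Because the normal matrix is never inverted in this scheme, no covariance matrix is available at the end, which is consistent with the remark above that a final full-matrix L.S. cycle is needed to obtain standard uncertainties.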