research papers

Acta Crystallographica Section D
Biological Crystallography
ISSN 0907-4449

Intensity statistics in the presence of translational noncrystallographic symmetry

Randy J. Read (a*), Paul D. Adams (b) and Airlie J. McCoy (a)

(a) Department of Haematology, Cambridge Institute for Medical Research, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 0XY, England, and (b) Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA

Correspondence e-mail: [email protected]

Received 5 October 2012
Accepted 2 November 2012

In the case of translational noncrystallographic symmetry (tNCS), two or more copies of a component in the asymmetric unit of the crystal are present in a similar orientation. This causes systematic modulations of the reflection intensities in the diffraction pattern, leading to problems with structure determination and refinement methods that assume, either implicitly or explicitly, that the distribution of intensities is a function only of resolution. To characterize the statistical effects of tNCS accurately, it is necessary to determine the translation relating the copies, any small rotational differences in their orientations, and the size of random coordinate differences caused by conformational differences. An algorithm to estimate these parameters and refine their values against a likelihood function is presented, and it is shown that by accounting for the statistical effects of tNCS it is possible to unmask the competing statistical effects of twinning and tNCS and to more robustly assess the crystal for the presence of twinning.

1. Introduction

There have been great advances in the methods available for macromolecular crystallography, such that a significant fraction of structure determinations are now relatively straightforward. However, there is still the potential for serious complications when the crystals possess features that break the assumptions underlying the routine structure-solution pathways.
The presence of translational noncrystallographic symmetry (tNCS) is particularly insidious in causing difficulties in all stages of crystal structure determination, from indexing the diffraction pattern to refining the structure.

In tNCS, two or more crystallographically independent copies are in the same (or nearly the same) orientation in the unit cell. Their contributions to a structure factor have the same (or similar) amplitudes but have relative phases determined by the projection of the translation vector on the diffraction vector. As a result, they interfere constructively for some reflections and destructively for others, so that there is a systematic modulation of the sum of their contributions. The most serious case is when the translation is approximately, but not exactly, equal to a potential lattice translation such as a centring operator or a cell doubling. The exact relationship is often broken by a small rotation (typically less than 10°) in addition to the translation. Such translations are referred to as pseudo-translations or pseudo-centrings because of their pseudo-crystallographic nature, and they lead to pronounced effects, with large numbers of systematically very weak and very strong reflections. The perturbation of the distribution of intensities leads to difficulties with statistical tests based on intensity statistics, as well as violating the assumptions behind likelihood targets for phasing and refinement, which assume that the data follow an isotropic Wilson distribution.

Translational NCS is a frequent issue in solved macromolecular crystal structures. The frequency of tNCS has been investigated by Zwart et al. (2005). The existence of tNCS can be detected by the presence of a large non-origin Patterson peak. Using the criterion that a non-origin peak greater than 20% of the origin peak was present in a Patterson map computed using data to 5 Å resolution, it was found that about 8% of structures deposited in the Protein Data Bank (PDB; Berman et al., 2000) probably possess tNCS. Translational NCS can also prevent structure solution, for which there are anecdotal accounts but no statistical records.

In the following, the effect of tNCS on structure-factor intensity statistics is investigated. A method to characterize the parameters describing the tNCS has been developed and tested, and it is shown that corrected intensity statistics can be used to detect the presence of twinning. The implications for molecular replacement, experimental phasing and refinement will be explored in subsequent publications.

2. Statistical effects of noncrystallographic symmetry

A full maximum-likelihood treatment of NCS would cover the very general case of a number of different components that are related by different noncrystallographic symmetries. In practice, the NCS-related deviations in structure-factor intensities from an isotropic Wilson distribution are most serious when there is exact translational NCS or nearly exact translational NCS (a small rotation is present), particularly if these are translations close to crystallographic centring operators and if only one set of NCS operators is present. For this reason, and for simplicity of notation, we will only deal with the case where there is one set of NCS operators, although the formulae are presented in a way that may be generalized to multiple sets of operators. In order to deal with the very common case that the relationship is not a perfect translation but is rather a translation combined with a small rotation, we start with the case of NCS operations that combine translations with rotations of any size.

2.1. Covariance elements sensitive to the effects of noncrystallographic symmetry

The statistical effects of NCS are easiest to evaluate by considering correlations between NCS-related contributions to the structure factors and then assembling them into a picture of the overall effects of NCS.

As pointed out by Bricogne (1997), the presence of NCS leads to modulations in the intensities, which can be used to characterize the nature of the NCS. The following treatment of intensity statistics is similar in spirit to that of Bricogne, with the addition of an allowance for small random differences among the NCS-related copies in the positions and scattering factors of the atoms that make them up. As in Bricogne (1997) we will not consider correlations among structure factors, so the structure factors are all implicitly assumed to be for reflection $\mathbf{h}$.

Consider a crystal containing in its asymmetric unit two or more copies of components with similar structure. The total structure factor ($F$) is made up of contributions from copies related by a combination of $N_{\mathrm{ncs}}$ noncrystallographic and $N_{\mathrm{sym}}$ crystallographic operations,

$$
F = \sum_{k=1}^{N_{\mathrm{sym}}} \sum_{m=1}^{N_{\mathrm{ncs}}} F_{km}, \qquad
F_{km} = \sum_{j=1}^{N} f_{jm} \exp(2\pi i\, \mathbf{h}\cdot\mathbf{x}_{jkm}), \tag{1}
$$

where

$$
\begin{aligned}
\mathbf{x}_{jkm} &= \mathbf{T}_k\left[\mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_m \mathbf{O}\,(\mathbf{x}_j + {}_{F}\boldsymbol{\Delta}_{jm}) + {}_{F}\mathbf{v}_m\right] + \mathbf{t}_k \\
&= \mathbf{T}_k \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_m \mathbf{O}\,(\mathbf{x}_j + {}_{F}\boldsymbol{\Delta}_{jm}) + (\mathbf{T}_k\,{}_{F}\mathbf{v}_m + \mathbf{t}_k).
\end{aligned}
$$

In this, there is an allowance for differences in the scattering factors for atoms in different copies ($f_{jm}$ could differ among NCS-related molecules $m$, particularly because of differences in the incorporated effects of $B$ factors). The coordinates are represented in terms of those from a canonical copy of the molecule centred on the origin and conformational differences relative to that molecule (${}_{F}\boldsymbol{\Delta}_{jm}$). For convenience, we can take the canonical copy to be in the same orientation as the copy with $k = m = 1$, so that $\mathbf{x}_j = \mathbf{x}_{j11} - {}_{F}\mathbf{v}_1 - {}_{F}\boldsymbol{\Delta}_{j1}$ and ${}_{F}\boldsymbol{\Omega}_1$ is an identity matrix. Note that since conformational differences are assigned even to the first copy, the canonical copy can be considered to be an average structure. The number of atoms in one copy of the component is given by $N$. The NCS rotations could be represented in terms of one matrix, $\mathbf{C}$, in the notation used by Bricogne (1997), but the physical meaning is easier to understand in terms of rotations (${}_{F}\boldsymbol{\Omega}_m$) in orthogonal space, so that the transformations from ($\mathbf{O}$) and to ($\mathbf{O}^{-1}$) fractional coordinates must be included explicitly. The crystallographic symmetry operations are represented by a rotation matrix, $\mathbf{T}_k$, and a translation vector, $\mathbf{t}_k$.

We start by considering the covariances among the contributions to the structure factor where (similar to the case of experimental phasing; Read, 2003) terms between common atoms will dominate,

$$
\langle F_{km} F_{ln}^{*}\rangle \simeq \sum_{j=1}^{N} \left\langle f_{jm} f_{jn} \exp[2\pi i\, \mathbf{h}\cdot(\mathbf{x}_{jkm} - \mathbf{x}_{jln})]\right\rangle. \tag{2}
$$

For covariances involving atoms within the same copy ($k = l$ for crystallographic symmetry and $m = n$ for noncrystallographic symmetry), we can consider the atoms to be independent because we have factored out any relationships leading to correlations,

$$
\langle F_{km} F_{km}^{*}\rangle = \Sigma_{Fm} \simeq \sum_{j=1}^{N} f_{jm}^{2}. \tag{3}
$$

If the expressions for the transformed coordinates are entered explicitly, the dot product inside the exponential in (2) can be expanded as follows:

$$
\begin{aligned}
\mathbf{h}\cdot(\mathbf{x}_{jkm} - \mathbf{x}_{jln}) = \mathbf{h}\cdot[\,&(\mathbf{T}_k \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_m \mathbf{O} - \mathbf{T}_l \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_n \mathbf{O})\,\mathbf{x}_j \\
&+ (\mathbf{T}_k\,{}_{F}\mathbf{v}_m + \mathbf{t}_k) - (\mathbf{T}_l\,{}_{F}\mathbf{v}_n + \mathbf{t}_l) \\
&+ (\mathbf{T}_k \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_m \mathbf{O}\,{}_{F}\boldsymbol{\Delta}_{jm} - \mathbf{T}_l \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_n \mathbf{O}\,{}_{F}\boldsymbol{\Delta}_{jn})\,].
\end{aligned} \tag{4}
$$

With some rearrangement and changes of variable, this can be expressed more succinctly:

$$
\mathbf{h}\cdot(\mathbf{x}_{jkm} - \mathbf{x}_{jln}) = {}_{FF}\mathbf{h}_{klmn}\cdot\mathbf{x}_j + \mathbf{h}\cdot{}_{FF}\mathbf{v}_{klmn} + \mathbf{h}\cdot{}_{FF}\boldsymbol{\Delta}_{jklmn}, \tag{5}
$$

where

$$
\begin{aligned}
{}_{FF}\mathbf{h}_{klmn} &= \left[\mathbf{O}^{T}\,{}_{F}\boldsymbol{\Omega}_m^{T} (\mathbf{O}^{-1})^{T} \mathbf{T}_k^{T} - \mathbf{O}^{T}\,{}_{F}\boldsymbol{\Omega}_n^{T} (\mathbf{O}^{-1})^{T} \mathbf{T}_l^{T}\right]\mathbf{h}, \\
{}_{FF}\mathbf{v}_{klmn} &= (\mathbf{T}_k\,{}_{F}\mathbf{v}_m + \mathbf{t}_k) - (\mathbf{T}_l\,{}_{F}\mathbf{v}_n + \mathbf{t}_l), \\
{}_{FF}\boldsymbol{\Delta}_{jklmn} &= \mathbf{T}_k \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_m \mathbf{O}\,{}_{F}\boldsymbol{\Delta}_{jm} - \mathbf{T}_l \mathbf{O}^{-1}\,{}_{F}\boldsymbol{\Omega}_n \mathbf{O}\,{}_{F}\boldsymbol{\Delta}_{jn},
\end{aligned}
$$

so that

$$
\langle F_{km} F_{ln}^{*}\rangle \simeq \sum_{j=1}^{N} \left\langle f_{jm} f_{jn} \exp(2\pi i\, {}_{FF}\mathbf{h}_{klmn}\cdot\mathbf{x}_j)\, \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\mathbf{v}_{klmn})\, \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\boldsymbol{\Delta}_{jklmn})\right\rangle. \tag{6}
$$

The first exponential term in (6) accounts for the effect of rotation on interference, with ${}_{FF}\mathbf{h}_{klmn}$ being equal to the difference between two copies of the original index $\mathbf{h}$ rotated by different combinations of crystallographic and noncrystallographic symmetry in the crystal; the closer ${}_{FF}\mathbf{h}_{klmn}$ is to zero, the larger the interference effect. The second exponential accounts for a systematic translation-derived phase shift between the contributions of the two copies of the component. The third exponential (along with the scattering factors) accounts for the effects of differences among the NCS-related copies. Note that if the coordinate differences are considered to be drawn randomly from a spherically symmetric distribution, then rotating these differences (e.g. in the variable ${}_{F}\boldsymbol{\Delta}_{jm}$) will not change the nature of their probability distributions, so that the distribution of ${}_{FF}\boldsymbol{\Delta}_{jklmn}$ will be independent of $k$ and $l$. (The subscripted prefix $FF$ indicates terms relating two contributions to the observed structure factor, $F$, to distinguish them from terms relating contributions involving calculated structure factors, $G$. Such terms will be needed for subsequent work on applications to molecular replacement, experimental phasing and refinement.)

For the covariances between copies related purely by crystallographic symmetry ($m = n$ but $k \neq l$), the presence or absence of tNCS is not relevant. These terms will only differ significantly from zero when the symmetry rotation is parallel to the diffraction vector ($\mathbf{T}_k^{T}\mathbf{h} = \mathbf{T}_l^{T}\mathbf{h}$, so that ${}_{FF}\mathbf{h}_{klmn} = 0$). When there is no phase shift between the contributions of these copies, they will contribute to increasing the expected intensity factor; otherwise, they will lead to systematic absences. Such pairs of contributions can be handled in a simple fashion by setting the covariance terms for $m = n$, $k \neq l$ to zero and then multiplying the remaining diagonal elements in the covariance matrix by the usual expected intensity factor $\varepsilon$.

The interesting covariances are those between copies related by noncrystallographic symmetry ($m \neq n$). If we assume that the differences in scattering factors and atomic positions are independent of the positions of the atoms within the components, then the expected value can be treated as a product of expected values, separating the correlation (${}_{FF}\rho_{mn}$) of the structure factors for the components if they were in the same position and orientation from the interference effects,

$$
\langle F_{km} F_{ln}^{*}\rangle \simeq {}_{FF}\rho_{mn} (\Sigma_{Fm}\Sigma_{Fn})^{1/2} \left\langle \exp(2\pi i\, {}_{FF}\mathbf{h}_{klmn}\cdot\mathbf{x}_j)\right\rangle \left\langle \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\mathbf{v}_{klmn})\right\rangle, \tag{7}
$$

where

$$
{}_{FF}\rho_{mn} (\Sigma_{Fm}\Sigma_{Fn})^{1/2} = \left\langle \sum_{j=1}^{N} f_{jm} f_{jn} \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\boldsymbol{\Delta}_{jklmn}) \right\rangle.
$$

If there is an atomic model, then at least the approximate locations of the atoms in each component are known, so that the expected value of the rotational interference term can be computed. However, if we are characterizing translational NCS prior to structure solution, the best we will have is some idea of the envelope containing the component. In this case, the expected value of the interference term is an integral over the volume of the envelope (denoted $U_F$ for the volume of a unique component contributing to the structure factor $F$), which is equivalent to the Fourier transform of the envelope or a G-function (Rossmann & Blow, 1962). Because the envelope is finite in volume and does not possess crystallographic symmetry, it is convenient to index it in terms of a diffraction vector (in units of Å$^{-1}$),

$$
\left\langle \exp(2\pi i\, {}_{FF}\mathbf{h}_{klmn}\cdot\mathbf{x}_j)\right\rangle = \int_{U_F} \exp(2\pi i\, {}_{FF}\mathbf{h}_{klmn}\cdot\mathbf{x}_j)\,\mathrm{d}\mathbf{x}_j = G_F({}_{FF}\mathbf{s}_{klmn}), \tag{8}
$$

where

$$
{}_{FF}\mathbf{s}_{klmn} = (\mathbf{O}^{-1})^{T}\,{}_{FF}\mathbf{h}_{klmn} = \left[{}_{F}\boldsymbol{\Omega}_m^{T} (\mathbf{O}^{-1})^{T} \mathbf{T}_k^{T} - {}_{F}\boldsymbol{\Omega}_n^{T} (\mathbf{O}^{-1})^{T} \mathbf{T}_l^{T}\right]\mathbf{h}.
$$

Before the shape of the molecule (or at least its orientation) is known, it may be appropriate to approximate it as a sphere with radius $r$, so that the G-function is the Fourier transform of a sphere (Rossmann & Blow, 1962),

$$
G(r, |{}_{FF}\mathbf{s}_{klmn}|) = \frac{3\left[\sin(2\pi r |{}_{FF}\mathbf{s}_{klmn}|) - 2\pi r |{}_{FF}\mathbf{s}_{klmn}| \cos(2\pi r |{}_{FF}\mathbf{s}_{klmn}|)\right]}{(2\pi r |{}_{FF}\mathbf{s}_{klmn}|)^{3}}. \tag{9}
$$

A G-function computed from a sphere centred on the origin (Fig. 1) gives insight into the general behaviour of the interference term; the G-function differs significantly from zero only for values of ${}_{FF}\mathbf{s}_{klmn}$ with a magnitude substantially less than the reciprocal of the sphere radius. G-functions from volumes with finer details in their shapes and lacking symmetry will also lack spherical symmetry and will have features extending to higher resolution, although the largest values will still be close to the origin.

Figure 1. G-function computed from the Fourier transform of a sphere centred on the origin plotted as a function of the product of $r$ and $s$, i.e. the ratio of the sphere radius and the resolution.

The argument of the G-function, ${}_{FF}\mathbf{s}_{klmn}$, will be near zero either when the two corresponding copies of the structure component (related by combinations of crystallographic and noncrystallographic symmetry) are in nearly the same orientation or when the rotation axis is nearly parallel to the diffraction vector, so that

$$
{}_{F}\boldsymbol{\Omega}_m^{T} (\mathbf{O}^{-1})^{T} \mathbf{T}_k^{T}\mathbf{h} \simeq {}_{F}\boldsymbol{\Omega}_n^{T} (\mathbf{O}^{-1})^{T} \mathbf{T}_l^{T}\mathbf{h}. \tag{10}
$$

The former condition will apply for all structure factors, leading to an overall modulation of the diffraction pattern, while the latter condition will lead to spikes in the diffraction pattern with a significant modulation (Bricogne, 1997). The maximum modulation along the direction of the spikes arising from this component of the symmetry would be equal to the number of copies in the asymmetric unit. However, the maximum would only be reached if the direction of the rotation axis coincided with the diffraction vector and if the disposition of the copies were such that they were equally spaced between the Bragg planes. In principle, knowing the directions of such spikes would contribute to understanding the rotational part of the NCS, and the pattern of intensity modulation along these spikes would give information about the relative positions of copies of components. However, this is a minor contribution to the overall modulation of the structure-factor intensities in the case of translational NCS. Including this term does not significantly alter the corrective factors, but does significantly increase the computation time (results not shown). In the remainder we will neglect the contribution to the covariances of copies in significantly different orientations.

Although a noncrystallographic translation can be generated by a combination of crystallographic symmetry and noncrystallographic symmetry (for example, a crystallographic twofold and a nearly parallel noncrystallographic twofold), we can choose without loss of generality to consider the copies related by noncrystallographic translations as belonging to the same asymmetric unit, so that $k = l$ for the pairs we will consider; the covariance elements $\langle F_{km} F_{ln}^{*}\rangle$ will be approximated as zero for $k \neq l$. (As above, we deal with the case in which the symmetry rotation is parallel to the diffraction vector by multiplying included terms by the expected intensity factor $\varepsilon$.) This leads to simplification of the expressions in the covariances,

$$
\begin{aligned}
{}_{FF}\mathbf{v}_{kkmn} &= \mathbf{T}_k ({}_{F}\mathbf{v}_m - {}_{F}\mathbf{v}_n), \\
{}_{FF}\mathbf{s}_{kkmn} &= ({}_{F}\boldsymbol{\Omega}_m^{T} - {}_{F}\boldsymbol{\Omega}_n^{T}) (\mathbf{O}^{-1})^{T} \mathbf{T}_k^{T}\mathbf{h}, \\
{}_{FF}\boldsymbol{\Delta}_{jkkmn} &= \mathbf{T}_k \mathbf{O}^{-1} ({}_{F}\boldsymbol{\Omega}_m \mathbf{O}\,{}_{F}\boldsymbol{\Delta}_{jm} - {}_{F}\boldsymbol{\Omega}_n \mathbf{O}\,{}_{F}\boldsymbol{\Delta}_{jn}).
\end{aligned} \tag{11}
$$

Note that the phase-shift term containing ${}_{FF}\mathbf{v}_{kkmn}$ now only depends on the translation vector between the NCS-related copies and not on the translational component of the crystallographic symmetry operators. This has the advantage that an analysis of the effects of tNCS can be carried out when the Laue group is known but not necessarily the particular space group.

2.2. Effect of tNCS on the expected intensity of the observed structure factor

Correlations among the components of the structure factor lead to systematic modulation of the observed intensities.

The variance (expected intensity) of the structure factor that is the sum of the contributions of the different components is the sum of all of the covariances between these contributions. This is simplified by the fact that we are ignoring terms between different crystallographic symmetry operators and collecting their influence in the expected intensity factor $\varepsilon$. To allow simply for the possibility of a part of the crystal that does not obey these NCS operators, we can add a term $\Sigma_{Fr}$ for the rest of the structure. (Note that $\Sigma_{Fr}$ could include the contribution of another component with a different set of NCS operators, showing how the treatment presented here could easily be generalized.)

$$
\langle F^{2}\rangle = \varepsilon\,\Sigma_{Fr} + \varepsilon \sum_{k=1}^{N_{\mathrm{sym}}} \sum_{m=1}^{N_{\mathrm{ncs}}} \left\{ \Sigma_{Fm} + 2 \sum_{n=m+1}^{N_{\mathrm{ncs}}} {}_{FF}\rho_{mn} (\Sigma_{Fm}\Sigma_{Fn})^{1/2}\, \mathrm{Re}\!\left[G_F({}_{FF}\mathbf{s}_{kkmn}) \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\mathbf{v}_{kkmn})\right] \right\}. \tag{12}
$$

In this expression, terms with $m < n$ have been paired with their complex conjugates, i.e. the terms with $m > n$, so that the imaginary parts cancel. The unmodulated terms can be collected into a term representing the intensity that would be expected after averaging over the modulations, $\Sigma_N$,

$$
\begin{aligned}
\langle F^{2}\rangle &= \varepsilon\,\Sigma_N + 2\varepsilon \sum_{k=1}^{N_{\mathrm{sym}}} \sum_{m=1}^{N_{\mathrm{ncs}}} \sum_{n=m+1}^{N_{\mathrm{ncs}}} {}_{FF}\rho_{mn} (\Sigma_{Fm}\Sigma_{Fn})^{1/2}\, \mathrm{Re}\!\left[G_F({}_{FF}\mathbf{s}_{kkmn}) \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\mathbf{v}_{kkmn})\right] \\
&= \varepsilon\,\Sigma_N \left\{ 1 + 2 \sum_{k=1}^{N_{\mathrm{sym}}} \sum_{m=1}^{N_{\mathrm{ncs}}} \sum_{n=m+1}^{N_{\mathrm{ncs}}} \frac{{}_{FF}\rho_{mn} (\Sigma_{Fm}\Sigma_{Fn})^{1/2}}{\Sigma_N}\, \mathrm{Re}\!\left[G_F({}_{FF}\mathbf{s}_{kkmn}) \exp(2\pi i\, \mathbf{h}\cdot{}_{FF}\mathbf{v}_{kkmn})\right] \right\}.
\end{aligned} \tag{13}
$$

The term in the curly braces can be thought of as an extra $\varepsilon$ factor accounting for the modulation of the intensities by NCS. This general expression could be applied when there is an atomic model, which defines the envelope enclosing the parts of the structure that obey tNCS, and the rotations and translations that relate these parts of the structure. Before the structure is solved, there is no way to know the shape of the envelope (or at least how it should be oriented, if there is a molecular-replacement model), so it is simplest to assume a sphere, in which case the G-function is real and depends only on the resolution. This approach should capture the most important effects of tNCS even when there is a detailed atomic model,

$$
\langle F^{2}\rangle = \varepsilon\,\Sigma_N \left\{ 1 + 2 \sum_{k=1}^{N_{\mathrm{sym}}} \sum_{m=1}^{N_{\mathrm{ncs}}} \sum_{n=m+1}^{N_{\mathrm{ncs}}} \frac{{}_{FF}\rho_{mn} (\Sigma_{Fm}\Sigma_{Fn})^{1/2}}{\Sigma_N}\, G_F(|{}_{FF}\mathbf{s}_{kkmn}|) \cos(2\pi\, \mathbf{h}\cdot{}_{FF}\mathbf{v}_{kkmn}) \right\}. \tag{14}
$$

For the very common special case in which there is only one translational NCS operator, the equation can be simplified further,

$$
\langle F^{2}\rangle = \varepsilon\,\Sigma_N \left\{ 1 + 2\,{}_{F}D_{\mathrm{ncs}} \sum_{k=1}^{N_{\mathrm{sym}}} G_F(|{}_{FF}\mathbf{s}_{kk12}|) \cos\!\left[2\pi\, \mathbf{T}_k^{T}\mathbf{h}\cdot({}_{F}\mathbf{v}_1 - {}_{F}\mathbf{v}_2)\right] \right\}, \tag{15}
$$

where

$$
{}_{F}D_{\mathrm{ncs}} = \frac{{}_{FF}\rho_{12} (\Sigma_{F1}\Sigma_{F2})^{1/2}}{\Sigma_N}.
$$

In this form, the weight ${}_{F}D_{\mathrm{ncs}}$ applied to the modulation term is effectively the fraction of the scattering of one component in the unit cells that obeys the translational NCS, corrected for the effect of differences among tNCS-related copies. Note that this automatically allows the presence of a component that does not obey tNCS.

3. Simulations to test the probability distributions

The probability distributions describing the statistical effects of tNCS have been tested by simulations in Mathematica (v.8.0; Wolfram Research, Champaign, Illinois, USA). In these simulations, data have been generated for a crystal in space group P1 containing two 'molecules' related by tNCS. For the first copy of the molecule, atoms were generated randomly within a sphere and copies of these atoms were then generated by applying a small rotation, a translation and a random shift. Since the molecules have a spherical envelope, the G-function is the Fourier transform of a sphere, as discussed by Rossmann & Blow (1962). The simulations show that accounting for the effects of orientation and conformation differences between tNCS-related copies will be essential to gain a good agreement between theory and observation.

3.1. Modulations of observed intensities

As described by (13), tNCS introduces a modulation of the expected intensities depending primarily on the phase shift of the contributions from copies related by tNCS. The modulation drops in strength if there are differences in the conformations or the orientations of the copies. Fig. 2 illustrates the effects of random coordinate differences (assumed to be drawn from a Gaussian distribution) and differences in orientation on the strength of modulation for structure factors obtained from a crystal with two spherical molecules. Note that when the model is complete and the two copies scatter with the same strength then the term ${}_{F}D_{\mathrm{ncs}}$ in (15) is equal to half of the complex correlation between these copies, ${}_{FF}\rho_{12}$. When the coordinate differences are drawn from a Gaussian distribution with an r.m.s. coordinate difference of $\sigma_r$, then this complex correlation can be calculated using the appropriate formula for $\sigma_A$, which is also a complex correlation (Read, 1990),

$$
{}_{FF}\rho_{12} = \exp\!\left(-\frac{2\pi^{2}}{3}\,\sigma_r^{2}\,|\mathbf{s}|^{2}\right). \tag{16}
$$

As shown in Fig. 2, random conformational differences and rotational differences between the copies can have a similar effect on the strength of the intensity modulation, except that there is a direction-dependence of the effect of the rotation difference: a rotation around the diffraction vector has no effect (because it does not change the positions of the atoms relative to the Bragg planes), whereas a rotation around an axis perpendicular to the diffraction vector has a large effect. This figure also shows that the information to distinguish the effects of random conformational differences and rotational differences may be most obvious at higher resolution.

Figure 2. Predicted average intensity in the direction parallel to c* for a crystal (space group P1, unit-cell parameters a = b = c = 50 Å, α = β = γ = 90°) containing two copies [separated by a fractional translation of (0.47, 0.47, 0.47), i.e. approximately body-centred] of a spherical molecule (r = 20 Å) comprised of 200 single-electron point scatterers. The solid lines show the case in which the two copies are identical in conformation but differ by a 5° rotation around the x axis (black line) or around the z axis (grey line). The dashed line shows the case in which the two copies are in the same orientation but have r.m.s. coordinate differences of 1.5 Å.

The simulation in Fig. 3 demonstrates that (15) provides an excellent description of the average intensities for different reciprocal-lattice vectors, even when there is a combination of conformational and orientation differences between the copies.

Figure 3. Comparison of predicted average intensity (line) with simulated average intensity (points). The crystal is equivalent to that used for Fig. 2, except that the two copies differ by a rotation of 2° around the x axis and an r.m.s. coordinate difference of 0.5 Å. Each point (corresponding to a 00l reflection) is obtained by carrying out 1000 simulations in which 200 atoms are generated randomly within the spherical envelope of the first molecule (centred on the origin); the second copy is then generated by perturbing these atomic positions followed by rotation and translation. The points for the first-order and second-order reflections are omitted because the assumptions behind the Wilson (1949) distribution are violated when the Bragg spacings are large compared with the size of the molecular envelope.

4. Refining parameters characterizing tNCS

To characterize tNCS from a data set, parameters describing the NCS translation, the difference in orientation of the tNCS-related copies and the random differences between the structures of the copies must be estimated and refined. This has been implemented with the following algorithm in Phaser (McCoy et al., 2007). The current implementation is optimized for the common case of two copies related by tNCS. Multiple tNCS copies can also be handled, as long as the copies are generated by successive applications of the same translation vector, but a more general treatment has not yet been implemented. The parameters characterizing the tNCS are refined against a likelihood function given by the Wilson (1949) distribution of amplitudes for acentric reflections,

$$
p_a(F) = \frac{2F}{\langle F^{2}\rangle} \exp\!\left(-\frac{F^{2}}{\langle F^{2}\rangle}\right), \tag{17}
$$

or centric reflections,

$$
p_c(F) = \left(\frac{2}{\pi \langle F^{2}\rangle}\right)^{1/2} \exp\!\left(-\frac{F^{2}}{2\langle F^{2}\rangle}\right). \tag{18}
$$

In this likelihood function, the expected value of the intensity is computed using (14), so the refined parameters are the parameters from that equation.

An initial estimate of the translation vector between the two copies (or the first two of successive copies), ${}_{F}\mathbf{v}_1 - {}_{F}\mathbf{v}_2$, is obtained from the largest off-origin peak in a native Patterson map. If the translation is close to a centring operator, symmetry-related copies of the Patterson peak will merge into a single peak on a special position. Refinement would not be able to move this translation vector to one of the equidistant symmetry copies so, if the Patterson peak is on a special position, the translation vector is first perturbed by a small translation of $d_{\min}/6$ in each of the x, y and z directions; we have found this to be sufficient to avoid the refinement being trapped on an exact centring translation.

A refinement of the relative orientation is carried out if there are two copies related by tNCS; for multiple copies, we currently approximate the effect of rotational differences as random differences among copies related by a pure translation. Because the orientation refinement does not always converge uniquely from any starting point, refinements are started from several relative orientations and that giving the best agreement with the data is chosen. The rotational difference between the two copies is parameterized as a combination of small rotations about the x, y and z axes, which behave well in refinement because they are approximately orthogonal. Note that when the exact shape and size of the molecule that obeys tNCS is not known, there is a trade-off between the assumed radius of the sphere that approximates the molecular envelope and the size of the rotation angles. The rotational difference enters the likelihood target through the G-function term, which depends on the amount by which the rotational difference rotates the diffraction vector. For small rotations, the absolute size of the movement of the diffraction vector is, to a good approximation, proportional to the rotation angle, so an error in the assumed sphere radius can be compensated by a reciprocal change in the size of the rotation angle.

Finally, the complex correlation between pairs of tNCS-related copies (${}_{FF}\rho_{mn}$ in 14) is currently assumed to be equivalent for all pairs when there is more than one NCS translation, and we do not currently account for the possibility of different overall $B$ factors among the copies. In this case, we can refine the resolution-dependent parameter ${}_{F}D_{\mathrm{ncs}}$, assumed to be equivalent for all pairs of tNCS-related copies. In Phaser this is reported as a Luzzati D factor (Luzzati, 1952). In fact, the refined parameter is given by the corresponding variance term

$$
\sigma_{\mathrm{ncs}}^{2} = 1 - {}_{F}D_{\mathrm{ncs}}^{2}, \tag{19}
$$

which has better refinement properties, as the likelihood function is more nearly quadratic when expressed in terms of this parameter.

5. Intensity moments in the presence of tNCS

Intensity moments can be a useful diagnostic for the presence of twinning (Stanley, 1972; Rees, 1980), but their usefulness can be reduced by other influences on the distribution of intensities, such as overall anisotropy and, in particular, tNCS (Padilla & Yeates, 2003; Lebedev et al., 2006). Corrections for overall anisotropy are now well established (Popov & Bourenkov, 2003; McCoy et al., 2007). We were interested in determining whether a further correction for the statistical effects of tNCS would at least partially unmask the statistical effects of twinning.

E-values that have been corrected for the statistical effects of tNCS can be computed using the expression for the expected intensity in (14),

$$
E = \frac{F}{\langle F^{2}\rangle^{1/2}}, \tag{20}
$$

and then these E-values can be used in the standard moment tests.

Several test data sets were selected from the PDB for structures with pairs of molecules or assemblies in the asymmetric unit related by tNCS: PDB entries 2fuq (Shaya et al., 2006), 1un7 (Vincent et al., 2004), 1y9r (Fagart et al., 2005), 1eh4 (Mashhoon et al., 2000) and 1upp (Karkehabadi et al., 2003). These cases were chosen to illustrate the effects of anisotropy, twinning and small rotational deviations from a pure translation. One of these cases, 1upp, was also chosen by Lebedev et al. (2006) to illustrate the effect of combining twinning and tNCS.

Table 1 shows the results that are obtained by computing second intensity moments for centric and acentric reflections before and after correction for overall anisotropy and for the effects of tNCS. Note that if the data obey standard Wilson distributions the expected value for this moment is 3 for centric reflections and 2 for acentric reflections, but in the presence of perfect twinning the moments would be reduced to 2 for centric reflections and 1.5 for acentric reflections (Stanley, 1972). To assess the significance of any deviation from the values expected for untwinned data, a p-value is also shown; this p-value is the probability (computed from the observed distribution of intensities) that the true value of the second moment for the acentric reflections is 2 or greater. In Phaser, a p-value of 0.001 or less triggers a warning that the crystal is likely to be twinned.

Table 1. Second moments of intensity ($\langle E^{4}\rangle/\langle E^{2}\rangle^{2}$) in the presence and absence of twinning.

| PDB code | Before anisotropy correction: centric | Before anisotropy correction: acentric | ΔB_aniso (Å²) | Before tNCS correction: centric | Before tNCS correction: acentric | After tNCS correction: centric | After tNCS correction: acentric | Twin fraction | p-value |
|---|---|---|---|---|---|---|---|---|---|
| 2fuq | 5.04 | 3.10 | 29.0 | 4.42 | 2.81 | 3.01 | 2.01 | -† | 1 |
| 1un7 | 4.33 | 2.84 | 4.5 | 4.44 | 2.88 | 2.73 | 1.97 | -† | 0.221 |
| 1y9r | - | 1.88 | 0.2 | - | 1.88 | - | 1.75 | 0.08 | 1.4 × 10⁻²⁰ |
| 1eh4 | 2.45 | 2.35 | 0.0 | 2.45 | 2.38 | 2.56 | 1.81 | 0.10 | 3.6 × 10⁻⁶ |
| 1upp | 3.05 | 1.84 | 2.3 | 3.04 | 1.84 | 2.52 | 1.71 | 0.46 | 2.1 × 10⁻⁷⁶ |

† No merohedral or pseudomerohedral twin operator possible.

As an objective measure of twinning, the twin fraction obtained by twin refinement in phenix.refine (Afonine et al., 2012) is shown for the structures in cells that support merohedral or pseudomerohedral twinning. In addition, Table 2 compares the refined values for the tNCS operators with the values determined from the deposited models to allow an assessment of the simplified model of the crystal used to characterize tNCS.

Table 2. Comparison of estimated and refined tNCS operators.

| PDB code | Rotation angle (°): refined | Rotation angle (°): PDB | Angular difference† (°) | Translation vector (fractional): refined | Translation vector (fractional): PDB‡ |
|---|---|---|---|---|---|
| 2fuq | 0.33 | 0.89 | 0.78 | −0.038, 0.497, 0.000 | −0.038, 0.499, 0.000 |
| 1un7 | 2.05 | 2.52 | 1.05 | 0.487, 0.500, 0.500 | 0.482, 0.499, 0.500 |
| 1y9r | 2.49 | 1.24 | 2.60 | 0.325, 0.662, 0.589 | 0.324, 0.662, 0.589 |
| 1eh4 | 3.63 | 4.00 | 1.24 | 0.009, 0.007, 0.493 | 0.002, 0.010, 0.493 |
| 1upp | 4.01 | 3.43 | 2.93 | 0.004, −0.496, 0.494 | 0.007, −0.498, 0.496 |

† Angular difference measured using the symmetry-related transformation that agrees most closely with the NCS translation in the PDB file and choosing the (arbitrary) direction of rotation that minimizes the angular difference. ‡ PDB translation vector measured as a vector between centres of mass of common main-chain atoms.

These tests demonstrate that the correction for the statistical effects of tNCS can indeed unmask the statistical effects of twinning. The p-values for twinned crystals are significantly lower than the threshold of 0.001 even when the twin fraction is as low as about 0.1. However, for the case of nearly perfect twinning in 1upp, the second moment is 1.71, which is significantly larger than the value of 1.5 that would be expected for perfect twinning. This may, at least in part, be because the molecular assembly differs significantly from the assumed spherical shape with a radius of about 33 Å; it is a U-shape fitting into a box of approximately 88 × 54 × 42 Å. More importantly, the twin-related reflections in this case will be affected by different modulations, so that the model of the effects of tNCS will be a compromise. In 1upp the two molecules are related by a translation of approximately 0, 1/2, 1/2 and a rotation of 3.43° about an axis very nearly parallel to the y axis. The largest modulations will therefore be seen for reflections with small h and l indices, for which the rotation has very little effect on scattering. However, the twin law is k, h, −l, so that reflections near the h00 axis, with large values of the h index and thus relatively little modulation, are superimposed on reflections near the 0k0 axis with significant modulation.

The results in Table 2 show that the method is able to detect deviations from exact centring operators, even when the Patterson peaks merge into a single peak consistent with a perfect centring operation. The refined translation vectors agree well with the vectors determined from the refined models. Also, even though the assumption of spherical molecules is not necessarily obeyed well, the refined rotations are correlated to the true rotations. The rotations are determined more accurately when the translations are closer to centring operators. In this situation, more of the reflections are affected by strong modulations, so that there is more signal from which the rotational parameters can be deduced.

To test whether it is important to model the rotational difference between pairs of tNCS-related molecules, or whether the refinement of the Luzzati D parameters can compensate, we repeated the test calculations for two of the crystals that showed a significant rotational difference, 1un7 and 1eh4, but not allowing the modelled rotation to refine away from zero. For 1un7, the mean value of the second moment of the intensity was 2.25, compared with 1.97 when the rotation was modelled. For 1eh4, the second moment without refining the rotational parameters was 1.94, compared with 1.81. Note that a second moment of 1.94 does not differ significantly from the value of 2 expected for an untwinned crystal, with a p-value of 0.148. These results demonstrate that it is indeed important to model the rotational differences when characterizing tNCS.

6. Conclusions

This analysis has shown that the effects of tNCS depend on the exact values of the translation, which can be estimated

This work was supported by the Wellcome Trust (Principal Research Fellowship award 082961 to RJR) and by the NIH/NIGMS (P01GM063210 to PDA and RJR). This work was supported in part by the US Department of Energy under Contract No. DE-AC02-05CH11231.

Acta Cryst. (2013). D69, 176–183. doi:10.1107/S0907444912045374

References

Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352–367.
Berman, H.
M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., precisely,andonsmalldifferencesinorientationbetweenthe Weissig,H.,Shindyalov,I.N.&Bourne,P.E.(2000).NucleicAcids Res.28,235–242. NCS-related copies, which can be given better than random Bricogne,G.(1997).MethodsEnzymol.276,361–423. estimates even under conditions where the simplifying Fagart,J.,Huyet,J.,Pinon,G.M.,Rochel,M.,Mayer,C.&Rafestin- assumptions of spherical molecules are not valid. By taking Oblin,M.-E.(2005).NatureStruct.Mol.Biol.12,554–555. accountofthestatisticaleffectsoftNCS,thestatisticaleffects Karkehabadi, S., Taylor,T. C. & Andersson, I. (2003). J. Mol. Biol. of twinning can be unmasked sufficiently to provide a clear 334,65–73. Lebedev,A.A.,Vagin,A.A.&Murshudov,G.N.(2006).ActaCryst. diagnostic for twinning.This is important in practice because D62,83–95. tests that depend on twin laws rely on having the symmetry Luzzati,V.(1952).ActaCryst.5,802–810. correctlyassigned(Lebedevetal.,2006).Ifthedatahavebeen Mashhoon,N.,DeMaggio,A.J.,Tereshko,V.,Bergmeier,S.C.,Egli, mergedwithtoohighsymmetrythesetestscannotbeapplied, M.,Hoekstra,M.F.&Kuret,J.(2000).J.Biol.Chem.275,20052– butifthedatahavebeenmergedwithtoolowsymmetrythen 20060. McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., these tests will generate false positives. Note that when the Storoni,L.C.&Read,R.J.(2007).J.Appl.Cryst.40,658–674. symmetryiscorrectlyassigned,testssuchastheL-test(Padilla Padilla,J.E.&Yeates,T.O.(2003).ActaCryst.D59,1124–1130. & Yeates, 2003) are preferable for their ability to assess the Popov,A.N.&Bourenkov,G.P.(2003).ActaCryst.D59,1145–1153. twin fraction reasonably reliably. In the application of the Read,R.J.(1990).ActaCryst.A46,900–912. L-test, reflections with indices differing by even numbers are Read,R.J.(2003).ActaCryst.D59,1891–1902. Rees,D.C.(1980).ActaCryst.A36,578–581. typically chosen to minimize the statistical effects of tNCS Rossmann,M.G.&Blow,D.M.(1962).ActaCryst.15,24–31. 
arising from pseudo-centring (Padilla & Yeates, 2003); Shaya, D., Tocilj, A., Li, Y., Myette, J., Venkataraman, G., however, when the tNCS differs from a pseudo-centring Sasisekharan, R. & Cygler,M. (2006). J. Biol. Chem.281,15525– operationitmaybehelpfultocorrectforthestatisticaleffects 15535. oftNCS before applyingthe L-test. Stanley,E.(1972).J.Appl.Cryst.5,191–194. Vincent, F.,Yates,D., Garman, E., Davies, G. J. & Brannigan, J.A. Infuturework,wewillshowhowthisunderstandingofthe (2004).J.Biol.Chem.279,2809–2816. statisticaleffectsoftNCScanbeusedtoimprovemethodsfor Wilson,A.J.C.(1949).ActaCryst.2,318–321. molecular replacement, phasing by single-wavelength anom- Zwart,P.H.,Grosse-Kunstleve,R.W.&Adams,P.D.(2005).CCP4 alous diffraction and structure refinement. Newsl.42,contribution10. 183 ActaCryst.(2013).D69,176–183 Readetal. (cid:7) IntensitystatisticsinthepresenceoftranslationalNCS
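As an illustrative aside, the reference values of the second moment quoted with Table 1 (2 and 3 for untwinned acentric and centric data, 1.5 and 2 for perfect twinning) can be checked with a small simulation. This is only a sketch under simplifying assumptions and is not the procedure implemented in Phaser: intensities follow ideal Wilson distributions with no tNCS or anisotropy, the expected intensity is taken as 1 for every reflection, and twinning is modelled as a weighted sum of two independent intensities. The function names `e_value`, `second_moment` and `simulate_e_values` are hypothetical and introduced here for illustration.

```python
import math
import random


def e_value(amplitude, expected_intensity):
    """Normalised amplitude E = F / <F^2>^(1/2), in the form of equation (20)."""
    return amplitude / math.sqrt(expected_intensity)


def second_moment(e_values):
    """Return <E^4> / <E^2>^2 for a list of E-values."""
    n = len(e_values)
    mean_e2 = sum(e ** 2 for e in e_values) / n
    mean_e4 = sum(e ** 4 for e in e_values) / n
    return mean_e4 / mean_e2 ** 2


def wilson_intensity(rng, centric):
    """Draw one unit-mean intensity from the ideal Wilson distribution."""
    if centric:
        # Centric: the structure factor is a real Gaussian, so I = x^2.
        return rng.gauss(0.0, 1.0) ** 2
    # Acentric: the intensity is exponentially distributed.
    return rng.expovariate(1.0)


def simulate_e_values(n, twin_fraction=0.0, centric=False, seed=0):
    """Simulate n E-values; twinning mixes two independent intensities (assumption)."""
    rng = random.Random(seed)
    e_values = []
    for _ in range(n):
        i1 = wilson_intensity(rng, centric)
        i2 = wilson_intensity(rng, centric)
        observed = (1.0 - twin_fraction) * i1 + twin_fraction * i2
        # The expected intensity is 1 by construction in this toy model.
        e_values.append(e_value(math.sqrt(observed), 1.0))
    return e_values


if __name__ == "__main__":
    for centric in (False, True):
        for alpha in (0.0, 0.5):
            moment = second_moment(simulate_e_values(200000, alpha, centric))
            print(f"centric={centric} twin_fraction={alpha}: {moment:.2f}")
```

With a large sample, the untwinned acentric and centric cases should come out close to 2 and 3, and the perfectly twinned cases (`twin_fraction=0.5`) close to 1.5 and 2. Under the same toy model, the acentric moment for a twin fraction α is 2 − 2α(1 − α). In real data with tNCS the expected intensity passed to `e_value` would differ from reflection to reflection, which is what the correction described in the text supplies.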