Journal of Information, Control and Management Systems, Vol. 8, (2010), No.3 257 DOSE-EFFECT MODELING OF EXPERIMENTAL DATA Kaloyan YANKOV Medical Faculty, Trakia University, Stara Zagora, Bulgaria [email protected] Abstract This paper proposes a mathematical model for estimation of the dose-response relationship of experimental data. А modified logistic function is used as analytical model for determining the dose-effect. Parameters of the model are identified with the optimization procedure based on the cyclic coordinate descent method. Formulas are derived to calculate the effective dose level and the standard deviation of doses. The approach is implemented in the computer program KORELIA-Dynamics. Keywords: dose-time-response, toxicity, effective dose level, lethal dose, modeling, coordinate descent method. 1 INTRODUCTION To analyze the effects of a drug, the dose-response relationship is studied in pharmacology and toxicology. Dose-response experiments are routinely conducted in preclinical and clinical trials to study the relationship between the dose level of a drug and the probability of a response, be it “cured” or “poisoned”. Therefore the response of these experiments is binary or quantal, which means either the drug takes effect and the subject has a reaction or not. The underlying assumption for quantal bioassay is that a subject is administered a stimulus at a dose level x, and that the response Y is a binary random variable with success probability p(x), i.e. Y ~ Bin(1,p(x)). The function p:[0,1]is called a dose response curve and is usually assumed to be strictly monotone. This curve describes the probability of success of the analyzed drug. For a given (p(0),p(1)) the parameter of interest is the effective dose level: ED = p−1(α) α 258 Dose-Effect Modeling of Experimental Data This is the dose, where 100α % of the subjects reacts. In case the main effect is death, a lethal dose LD is defined. Depending on the value of α several doses of practical importance are defined. α = 0. Effective/lethal dose ED0/LD0 - the estimated dose at which none of the population is expected to react or die. α = 1. One hundred percent effective/lethal dose ED100/LD100 – the lowest possible dose that causes effects in all studied cases. α = 0.5. Median effective/lethal dose ED50/LD50 - the dose of a drug predicted to produce a characteristic effect in 50 percent of the subjects to whom it is given. The ED50 is the most frequently used standardized dose by means of which the potencies of drugs are compared. An ED50 can be determined only from data involving all or none (quantal) response. Further in the text the term used is ‘effective dose’, being understood that the results will apply also to the lethal dose. Experimental animals are treated with the drugs in order to study the effects and to determine the value of the base dose. The experiment is conducted under n dose levels. For each dose level x (i = 1,2,…n), N animals are examined. They are observed i i over a period of time T sufficient for the investigated drug to demonstrate its action and at the end of the experiment the number of reactived animals z is noted. The i probability that an individual animal on dose level x responds by time T is defined as: i p = z/ N i i i It is obvious that 0 = p ≤ p ≤···≤ p ≤ p = 1. 0 1 n n+1 The function presented by arranged pairs of points (x , p) is cumulative i i distribution function. It belongs to the class of the so-called S-shaped curves [1]. Many publications deal with the problem of calculating the effective dose. The approaches for calculation of dose-effect based on the cumulative curve used in most of them are described below. 1.1 Linear regression models. Linear interpolation. The effects determined during the experiment of the applied doses are associated with line segments. The desired effective dose is calculated taking into account the coordinates in the relevant line segment. In case of explicit non-linearity a logarithmic scaling is performed upon the data. Thus the graph acquires a relatively linear form, and then linear interpolation is applied. This approach uses the assumption that there is a logarithmic correlation between doses and effect of doses. For the entire range of applied doses that is not true, but in a narrow range around the relevant dose the method is appropriate. These two approaches were used before the era of modern computers when the most powerful computational tools were a pocket calculator and graph paper for Journal of Information, Control and Management Systems, Vol. 8, (2010), No.3 259 drawing graphs. Publications dealing with these methods are available from the 1970s till now [2,3,4]. 1.2 Nonlinear regression models. Interpolation of the data with cubic spline [5,6,7]. Piece-wise spline inperpolation is a convenient tool for interpolation of data being continuously differentiable to the second derivative. When appropriate software is available, this approach is very convenient and has great precision.. Modeling techniques based on the analytical form of mathematical function. They assume a functional relationship between “dose” and “response”, according to a pre-specified parametric model, such as a linear, quadratic, logarithmic, exponential, logistic and so on [8]. By the application of minimizing numerical procedure, parameters of the function are determined, i.e. data identification is implemented. This approach provides the greatest accuracy due to the use of computers and specialized software. Another benefit of this approach is the additional parameters that the software can produce and the possibility for graphical illustration of the results. The most commonly used optimization methods are the method of least squares, the method of weighting by least squares or the maximum likelihood method [9]. This paper aims to formulate a mathematical model of dose-response relationship using experimental data. The goal is to model the curve in the domain of definition and to minimize the possible error of approximation. The model is derived applying algorithm and software for system identification developed by the author [10,11,12]. 2 MODEL STRUCTURE One of the possible equations for describing a sigmoidal curve is the Verhulst – Pearl equation: dp(x) p(x) r1 p(x) (1) dx K p(x ) = C 0 0 Where: r – the rate constant of the process K –the asymptotic limit of growth (carrying capacity). This equation express quantitative changes in the presence of limited resources, represented by the constant К. The solution p(x) of Eq.(1) is the logistic function: 260 Dose-Effect Modeling of Experimental Data K p(x) K (2) 11 e(r.x) C 0 C = p(0) 0 This equation may be revised in view of the goal to identify an effective dose, thereby making K=1. Тo avoid the case C =0 in Eq.(2), аs training logistic function as 0 proposed in [11] is used: 1 p(x) (3) 1 Ae(r*x) A>0, r > 0 Three parameters are necessary to specify the curve, A, r and β, defined as follows: A is the number of times that the initial value p(0) must grow to reach 100%. r is the slope or the growth rate parameter that specifies "width" or "steepness" of the S-curve. β is a dose correction parameter. These parameters must be identified by fitting the Eq.(3) to the experimental data. 3 DATA IDENTIFICATION To fit the function p(x) to the data, an identification procedure will be performed. The values p in n number of points are known as a result of the measurement. They i are used to identify the parameters А,r, β. Let us chose to subject to system identification the function 1 p(x,A,r,) p (4) i 1 Ae(r*xi) i A > 0, r > 0, i=1,2,…,n The identification vector Q(A,r,β) must be defined from the system (4). But this system is not well defined. For each point x the residual d is: i i d(Q) = || p-p(x,Q) || i i i The aim is to minimize d(Q) for the whole interval of identification [x -x ]: i n 1 D(Q) = inf|| d(Q) || < , >0 (5) i Thus the identification goal is translated into an optimization problem. The optimization method is defined depending on the way of minimization of D(Q). Least square fitting. Journal of Information, Control and Management Systems, Vol. 8, (2010), No.3 261 An object of minimization is the functional: n D(Q)d2 min (6) i i1 Uniform fitting The uniform fitting is aimed at minimizing the maximal deviation in the base points p: i D(Q)max|d |min (7) i The Cyclic coordinate descent (CCD) method is an iterative heuristic search technique that attempts to minimize the error by varying one parameter at a time [10]. It reduces the multi-criteria optimization to the single-criteria one. A number of passes are made over the model equation to find the global minimum D for system (5). 4 CALCULATION OF DOSE-EFFECT PARAMETERS To calculate the effective or lethal dose, it is necessary to find the inverse function p-1(x). After trivial transformations of Eq.(3): 1 p() x() lnAln (9) r 1 p() As a result of the identification, the vector Q(A,r,β) is known and x(α) is calculated directly from Eq.(9). For most commonly used dose ED50 (LD50) at p(α)=0.5: 1 ED50 x(0.5) lnA (10) r The final result is easy for computing. The main burden is the process of identification of these three parameters. The received analytical model of the sigmoidal curve allows obtaining other characteristics of dose-effect curve. After differentiation of Eq.(3) the frequency f(x) of the effectiveness of doses analytically and graphically (fig.1) is obtained: dp(x) Are(rx) f(x) (11) dx [1 Ae(rx)]2 The extreme point corresponds to ED50/LD50. Thus the dose value of x(0.5) can be calculated from the equation df(x) dx 0. Solution of the equation d2f(x) 0 dx2 262 Dose-Effect Modeling of Experimental Data gives the coordinates of inflexion points I , I : x1 x2 2 3 2 3 ln ln A A I I x1 r x2 r and they are used to calculate the standard deviation of the frequency of doses (fig.1): 2 3 ln I I 2 3 1.317 (12) SD x2 x1 2 2r r It is possible to define three dose ranges: Low response dose. These doses are lower than (x(0.5)-2SD) Mean response dose. Doses in range (x (0.5)-2SD, x (0.5)+2SD) High response dose. Doses greater than (x (0.5)+2SD) Figure.1. Dose frequency 5 EXAMPLE The method is implemented in program KORELIA–Dynamics using its ability for system identification. A dialog window is appended for calculation of ED50/LD50, standard deviation and visualization of sigmoidal curve and frequency of doses. The example from page 44 of [2] will be considered. The author discusses a toxicity of antibiotic tubazid. The calculations are performed using the Behrens approach (cit. Behrens B., Arch.exper. Path. Pharm., 140, 237, 1929) and Кoerber (cit. Кoerber G., Arch.exper. Path. Pharm., 162, 480, 1931). The experimental data are scattered on fig.2. Both methods are based on linear interpolation to determine LD50 and standard deviation SD. Cited data are: Behrens method: LD50±SD = 172.3 ±9.5 [mg/kg]. Кoerber method: LD50±SD =172.5 ±15 [mg/kg]. The scatterplot on fig.2 should clearly indicate the appropriateness of using a logistic model (3) to fit this data. After identification with KORELIA-Dynamics the calculated model parameters are: A = 927.3; r = 0.1855; β = -25.13 Journal of Information, Control and Management Systems, Vol. 8, (2010), No.3 263 Identification errors are: absolute = 0.0346; relative = 0.0834; quadratic = 0.00453. The identified equation for data is: 1 p(x) 1927.3e(0.1855*x25.13) Thus, the LD50 and SD according to equations (10) and (12) are LD50±SD = 172.3034 ±7.0997 [mg/kg]. On fig.2 the experimental data, identification curve and frequency of doses are presented. Figure.2. Graph of the identification curve and frequency of doses 6 CONCLUSIONS This work proposes an analytical model for determining the dose-effect. А modified logistic function is used for data identification. Parameters of the model are calculated with the optimization procedure that targets minimal residual using least- square or uniform criteria. From the analytical model formulas for effective dose and standard deviation of doses are derived. The advantages of the proposed analytical model are: Explicit provides the frequency of doses and standard deviation. Can provide the smallest error of approximation, because it reflects the nonlinearity of the process by selecting the appropriate function; Shows the behavior of the system throughout the range of tested doses; Drops some requirements as strictly regulated modification of doses to conduct the experiment at pre-selected law;