
Stochastic Optimal Control for Modeling Reaching Movements in the Presence of Obstacles: Theory and Simulation

Arun Kumar Singh (1,2), Member, IEEE, Sigal Berman (1), Member, IEEE, and Ilana Nisky (2), Member, IEEE

(1) Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Israel. (2) Department of Biomedical Engineering, Ben-Gurion University of the Negev, Israel. The research was supported by the Helmsley Charitable Trust through the Agricultural, Biological and Cognitive Robotics Initiative of Ben-Gurion University of the Negev, Israel, and by the Israeli Science Foundation (grant 823/15).

Abstract: In many human-in-the-loop robotic applications, such as robot-assisted surgery and remote teleoperation, predicting the intended motion of the human operator may be useful for successful implementation of shared control, guidance virtual fixtures, and predictive control. Developing computational models of human movements is a critical foundation for such motion prediction frameworks. With this motivation, we present a computational framework for modeling reaching movements in the presence of obstacles. We propose a stochastic optimal control framework that consists of probabilistic collision avoidance constraints and a cost function that trades off between effort and end-state variance in the presence of signal-dependent noise. First, we present a series of reformulations to convert the original non-linear and non-convex optimal control problem into a parametric quadratic programming problem. We show that the parameters can be tuned to model various collision avoidance strategies, thereby capturing the quintessential variability associated with human motion. Then, we present a simulation study that demonstrates the complex interaction between avoidance strategies, control cost, and the probability of collision avoidance. The proposed framework can benefit a variety of applications that require teleoperation in cluttered spaces, including robot-assisted surgery. In addition, it can also be viewed as a new optimizer that produces smooth and probabilistically safe trajectories under signal-dependent noise.

I. INTRODUCTION

Motion prediction is important for finding the middle ground between pure teleoperation and autonomous control of robotic systems.
It allows the robot to anticipate the future motions of the users and, consequently, their intention, and assist them in performing a given task. To improve the performance of motion prediction algorithms, it is beneficial to ground the prediction in experimentally validated computational models of human movement [1]. Optimal control is used extensively in computational motor control, and provides a powerful framework for explaining a wide range of empirical phenomena associated with human motion [2], [3], [4]. In this view, it is hypothesized that human motion is driven by well-defined rewards or cost functions. The complementary Inverse Optimal Control (IOC) framework attempts to identify the structure and parameters of these cost functions from a set of observed trajectories [7]. Thus, IOC allows for the transition from modeling of human motion to motion prediction in a particular task [8]. However, the accuracy of the model that is used for a particular problem is critical to the success of IOC-based approaches. While many studies considered optimal control for modeling reaching trajectories in free space [2], [4], [5], [6], there has been much less effort towards modeling reaching in the presence of obstacles using optimal control [22]. This, in turn, hinders the development of efficient IOC-based approaches for prediction.

In the current paper, we propose a stochastic optimal control framework for modeling human reaching trajectories in the presence of obstacles. This framework is designed to be incorporated in motion prediction for a variety of applications of teleoperation in cluttered spaces. Our proposed framework is built on experimental studies that suggest that reaching movements amongst obstacles are optimized considering the likelihood of collision [9], [10], [12], and that obstacle avoidance is sensitive to human perception of free space [10]. In line with these findings, the proposed optimal control model incorporates probabilistic collision avoidance constraints to ensure that the likelihood of collision is below a specified threshold. We also consider signal-dependent noise in human movement control [11], and the uncertainty in the perception of the size of the obstacle, to model the error in estimation of free space.

We present a series of reformulations based on [14] to approximate a difficult non-linear and non-convex optimal control problem by a parametric quadratic optimization problem. The reformulations use substitution of chance constraints with a family of surrogate constraints. Satisfaction of each member of the family of surrogate constraints can be mapped to a lower-bound probability with which the original chance constraints would be satisfied. Further, we show that the parameters of the reformulated quadratic optimization problem can be tuned to generate a diverse class of trajectories, and simulate the quintessential diversity of human motion. We present a simulation study highlighting the interaction of the parameters with the control cost and the probability of collision avoidance. We also discuss how the main insights derived from the simulation study agree with the existing experimental findings on obstacle-avoiding reaching movements.

The rest of the paper is organized as follows. Section II reviews the existing works that consider collision avoidance within the context of optimal control. Section III presents the optimal control problem, followed by a series of reformulations to convert it into a tractable parametric quadratic optimization problem. Section IV presents a sequential quadratic programming based algorithm for solving the reformulated optimization problem. Section V presents simulation results that demonstrate how the parameters of the reformulated problem result in a diverse set of trajectories and control costs. In Section VI we discuss the results of our simulations in light of the existing experimental findings on reaching movements among obstacles and present future directions.

II. RELATED WORK

Optimal control or optimization based obstacle avoidance has been extensively studied in the robotics literature, and to a smaller extent, in computational motor control.

A. Obstacle Avoidance in Robotics

Optimal control or optimization are used extensively to plan collision-avoiding trajectories that also optimize a specified cost function [15], [16]. In [16], optimal control is applied to stochastic systems with additive noise, and collision avoidance is ensured by introducing a penalty on trajectories that come close to the obstacles. An expectation over the cost is taken, which suggests that the optimization is risk neutral; that is, it does not model the probability of collision avoidance. Trajectory optimizers like [17], [19] incorporate a penalty on the probability of collision avoidance. Other studies put hard constraints on the probability of collision avoidance [18], [20]. However, these studies assumed an additive noise model. We aim at planning trajectories for a human arm, which is assumed to be modeled as a stochastic system with signal-dependent noise [11]. An optimal control based framework presented in [21] achieves collision avoidance under signal-dependent noise, but for single integrator systems.
B. Obstacle Avoidance in Computational Motor Control

Reaching trajectories in the presence of obstacles were studied in computational motor control for understanding movement coordination. Experimental studies [9], [10], [12], [13] investigated the effects of obstacle position and size on obstacle avoidance. In particular, [13] observed that the obstacle avoidance strategy exhibited by human subjects during reach-to-grasp movements consisted of two basic but coupled components, namely moving around the obstacle or slowing down near it.

An optimal control model for single-obstacle avoidance was proposed in [22]. They solved the optimal control problem using simulated annealing. Simple obstacle configurations, predominantly with a single obstacle, were considered. Our proposed approach differs from [22] in terms of the technical approach followed to solve the optimal control problem. In particular, we exploit some efficient structures in the problem. Moreover, we consider complex obstacle configurations to highlight the interaction between parameters, control cost and probability of collision avoidance.

III. PROBLEM FORMULATION

A. Dynamics and Task Description

We consider the task of reaching movements in a 2D cluttered environment. We chose a simple linear model for the movement of the endpoint of the hand: a triple integrator system. We denote the state of the hand at time instant t by X_t = (x_t, y_t, \dot{x}_t, \dot{y}_t, \ddot{x}_t, \ddot{y}_t), where the individual state variables are defined as the following normal distributions:

x_t \sim \mathcal{N}(\mu_{x_t}, \sigma^2_{x_t}), \quad y_t \sim \mathcal{N}(\mu_{y_t}, \sigma^2_{y_t}),
\dot{x}_t \sim \mathcal{N}(\mu_{\dot{x}_t}, \sigma^2_{\dot{x}_t}), \quad \dot{y}_t \sim \mathcal{N}(\mu_{\dot{y}_t}, \sigma^2_{\dot{y}_t}),        (1)
\ddot{x}_t \sim \mathcal{N}(\mu_{\ddot{x}_t}, \sigma^2_{\ddot{x}_t}), \quad \ddot{y}_t \sim \mathcal{N}(\mu_{\ddot{y}_t}, \sigma^2_{\ddot{y}_t}).

The parameters of the distributions, i.e., their means and variances, are obtained from the following discrete time dynamics with the jerk U_t = (u_x, u_y) = (\dddot{x}, \dddot{y}) as the control input:

X_{t+1} = A X_t + B (U_t + \varepsilon^t_U),        (2)

where A and B represent state transition and control scaling matrices of dimensions conforming to that of the state, and

\varepsilon^t_U = \sum_{i=1}^{2} \phi_i M_i U_t,        (3)

M_1 = \begin{bmatrix} c_x & 0 \\ 0 & 0 \end{bmatrix}, \quad M_2 = \begin{bmatrix} 0 & 0 \\ 0 & c_y \end{bmatrix}.        (4)

The term \varepsilon^t_U in (3) represents the time-varying signal-dependent noise, and is formulated in terms of constant scaling matrices M_i and \phi_i, which are a set of zero-mean unit-variance normal random variables. This form of (3) ensures that the standard deviation of the noise grows linearly with the magnitude of the control signal [4], and the constants c_x, c_y determine the magnitude of the noise as a fraction of the control input.
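
The paper does not write out A and B explicitly. As a hedged illustration only, the following Python sketch assumes a standard zero-order-hold discretization of the triple integrator with step dt, and shows how the signal-dependent noise of (3)-(4) enters the dynamics (2); the step size and noise constants are illustrative choices, not values taken from the paper's simulations.

```python
# Minimal sketch (not from the paper) of the triple-integrator dynamics (2) with the
# signal-dependent noise model (3)-(4). A and B below assume a zero-order-hold
# discretization with step dt, which the paper does not spell out.
import numpy as np

dt = 0.01              # assumed integration step
cx, cy = 0.15, 0.15    # noise magnitude as a fraction of the control input (Section V uses 0.15)

# State X = (x, y, xdot, ydot, xddot, yddot); control U = (jerk_x, jerk_y).
A = np.eye(6)
A[0, 2] = A[1, 3] = A[2, 4] = A[3, 5] = dt
A[0, 4] = A[1, 5] = 0.5 * dt**2
B = np.array([[dt**3 / 6, 0],
              [0, dt**3 / 6],
              [dt**2 / 2, 0],
              [0, dt**2 / 2],
              [dt, 0],
              [0, dt]])

M1 = np.array([[cx, 0.0], [0.0, 0.0]])
M2 = np.array([[0.0, 0.0], [0.0, cy]])

def step(X, U, rng):
    """One noisy step of (2): the noise standard deviation scales linearly with |U|."""
    phi = rng.standard_normal(2)                # zero-mean, unit-variance phi_1, phi_2
    eps = phi[0] * M1 @ U + phi[1] * M2 @ U     # signal-dependent noise, eq. (3)
    return A @ X + B @ (U + eps)

rng = np.random.default_rng(0)
X = np.zeros(6)
for _ in range(100):                            # constant jerk command, purely for illustration
    X = step(X, np.array([1.0, 0.5]), rng)
print(X[:2])                                    # noisy end-point position
```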

B. Optimal Control

The discrete time optimal control problem can be represented by the following set of equations:

\min_{U_t} J_{opt} = J_{U_t} + J_{X_t}
\text{s.t. } \Pr(C_j^t(x_t, y_t, x_j, y_j, R_j) \le 0) \ge \eta, \quad j = 1, 2, \dots, n,        (5)

where

J_{U_t} = \|U\|^2, \quad J_{X_t} = \sum_{t=t_0}^{t_f} E[L(X_t, U_t)],        (6)

L(X_t, U_t) = \sum_{i=1}^{6} w_i (X^i_t - X^i_{t_f})^2,        (7)

R_j \sim \mathcal{N}(\mu_{R_j}, \sigma^2_{R_j}).        (8)

The objective function in (5) consists of a control effort term and a state-dependent term which penalizes the end-point variance of the trajectory. The term w_i determines the relative weighting between the components of the state-dependent cost term. The constraints C_j^t(.) <= 0 in (5) represent the collision avoidance requirement in a deterministic setting. Thus, the set of inequalities in (5) signify constraints that the collision avoidance requirement is satisfied with a particular lower-bound probability \eta. The terms x_j, y_j and R_j denote the position and size of the j-th obstacle. To model the uncertainty in the estimation of obstacle size, R_j is defined as a normally-distributed random variable.

The optimization (5) is difficult to solve due to the constraints on the probability of collision avoidance, also known as chance constraints, which are in general computationally intractable [23]. Hence, we reformulate the chance constraints, and show that the reformulation naturally leads to an efficient optimization structure.

C. Reformulating Chance Constraints

We follow [14], and substitute \Pr(C_j^t(.)) with:

\Pr(C_j^t(x_t, y_t, x_j, y_j, R_j) \le 0) \ge \eta
\;\Rightarrow\; E[C_j^t(.)] + k\sqrt{\mathrm{Var}[C_j^t(.)]} \le 0, \quad \eta \ge \frac{k^2}{1+k^2},        (9)

where E[C_j^t(.)] and Var[C_j^t(.)] represent the expectation and variance of the constraints C_j^t(.) with respect to the random variables x_t, y_t. This suggests that satisfaction of the deterministic surrogate in (9) ensures satisfaction of the original probabilistic constraints with at least a probability k^2/(1+k^2). In [14], it is shown that computing an analytical expression for E[C_j^t(.)] and Var[C_j^t(.)] in terms of the random variable arguments x_t, y_t, R_j, etc. is simpler compared to computing that for \Pr(C_j^t(.)).

We can further simplify (9) by approximating obstacle regions in 2D as circles. This simplifies the collision avoidance inequality C_j^t(.):

C_j^t : -(x_t - x_j)^2 - (y_t - y_j)^2 + R_j^2 \le 0.        (10)

Because (10) is purely concave in terms of the hand position variables x_t and y_t, an affine upper bound can be obtained by linearizing C_j^t around an initial trajectory guess (x_t^*, y_t^*) [24]:

C_j^t \approx C_j^t\big|_{*} + \nabla_{x_t} C_j^t (x_t - x_t^*) + \nabla_{y_t} C_j^t (y_t - y_t^*) \le 0,        (11)

where C_j^t|_* is obtained by evaluating (10) at (x_t^*, y_t^*). Similarly, \nabla_{x_t} C_j^t and \nabla_{y_t} C_j^t represent the partial derivatives of C_j^t(.) with respect to x_t and y_t, evaluated at (x_t^*, y_t^*). The affine approximation (11) can be further improved by updating (x_t^*, y_t^*) during the course of the optimization. This sequential linearization of concave constraints forms the basis of the convex-concave procedure [24].

In light of (11), E[C_j^t(.)] and Var[C_j^t(.)] take the form

E[C_j^t(.)] = \sigma^2_{R_j} + h_1(\mu_{x_t}, x_t^*, \mu_{y_t}, y_t^*, \sigma^2_{x_t}, \sigma^2_{y_t}, \mu_{x_j}, \mu_{y_j}, \mu_{R_j}),        (12)

\mathrm{Var}[C_j^t(.)] = C_{R_j}\sigma^2_{R_j} + 2\sigma^4_{R_j} + h_2(\mu_{x_t}, x_t^*, \mu_{y_t}, y_t^*, \sigma^2_{x_t}, \sigma^2_{y_t}, \mu_{x_j}, \mu_{y_j}, \mu_{R_j}),        (13)

where the terms (\mu_{x_t}, \mu_{y_t}) and (\sigma^2_{x_t}, \sigma^2_{y_t}) represent the mean and variance of the hand position (x_t, y_t). The term C_{R_j} and the functions h_1(.) and h_2(.) are given below. It can be noted that h_2(.) can be represented as a sum of squares and thus is non-negative.

C_{R_j} = 4\mu^2_{R_j},        (14)

h_1 = \mu^2_{R_j} + 2\mu_{x_t}\mu_{x_j} - \mu^2_{x_j} + 2\mu_{y_t}\mu_{y_j} - \mu^2_{y_j} - 2\mu_{x_t}x_t^* + (x_t^*)^2 - 2\mu_{y_t}y_t^* + (y_t^*)^2,        (15)

h_2 = 2\big(2\mu^2_{x_j}\sigma^2_{x_t} + 2\mu^2_{y_j}\sigma^2_{y_t} - 4\mu_{x_j}\sigma^2_{x_t}x_t^* - 4\mu_{y_j}\sigma^2_{y_t}y_t^* + 2\sigma^2_{x_t}(x_t^*)^2 + 2\sigma^2_{y_t}(y_t^*)^2\big)
    = 4\sigma^2_{x_t}(x_t^* - \mu_{x_j})^2 + 4\sigma^2_{y_t}(y_t^* - \mu_{y_j})^2.        (16)
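
As a quick, non-authoritative sanity check of the surrogate in (9), the following sketch draws samples of the hand position and obstacle radius, evaluates the linearized constraint (11), and compares the empirical probability Pr(C <= 0) against the lower bound k^2/(1+k^2). All numerical values (obstacle location, radius statistics, linearization point) are made up for illustration and are not from the paper.

```python
# Monte Carlo sanity check (not part of the paper) of the surrogate constraint in (9):
# if E[C] + k*sqrt(Var[C]) <= 0 holds for the linearized constraint, then Pr(C <= 0)
# should be at least k^2/(1+k^2).
import numpy as np

rng = np.random.default_rng(1)

# Illustrative numbers: Gaussian hand position and obstacle radius, fixed obstacle center.
mu_x, mu_y, sig_x, sig_y = 0.10, 0.05, 0.005, 0.005
x_obs, y_obs = 0.10, 0.15
mu_R, sig_R = 0.03, 0.005
xs, ys = 0.10, 0.04            # linearization point (x_t^*, y_t^*)

def C_lin(x, y, R):
    """Affine-in-(x, y) upper bound of (10), linearized at (xs, ys) as in (11)."""
    C_star = -(xs - x_obs) ** 2 - (ys - y_obs) ** 2 + R ** 2
    return C_star - 2 * (xs - x_obs) * (x - xs) - 2 * (ys - y_obs) * (y - ys)

x = rng.normal(mu_x, sig_x, 200_000)
y = rng.normal(mu_y, sig_y, 200_000)
R = rng.normal(mu_R, sig_R, 200_000)
C = C_lin(x, y, R)

eta = 0.95
k = np.sqrt(eta / (1.0 - eta))          # smallest k with k^2/(1+k^2) >= eta
surrogate = C.mean() + k * C.std()
print(f"surrogate value E[C] + k*sqrt(Var[C]) = {surrogate:.5f}  (should be <= 0 here)")
print(f"empirical Pr(C <= 0) = {(C <= 0).mean():.4f}  vs. guaranteed lower bound {eta}")
```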

D. Reformulated Optimal Control Problem

To arrive at the final reformulated version of (5), we make the following sequence of observations. The second term of the surrogate constraints proposed in (9) is non-negative. Thus, for a given k, the surrogate constraints (9) are satisfied when the first term, E[C_j^t(.)], is sufficiently negative and the second term, \sqrt{Var[C_j^t(.)]}, is sufficiently small in magnitude. Due to (13) and (16), we note that \sqrt{Var[C_j^t(.)]} is a non-decreasing function of the positional variance at each point of the trajectory (\sigma^2_{x_t}, \sigma^2_{y_t}). Thus, making \sqrt{Var[C_j^t(.)]} small is equivalent to minimizing the positional variance at each point of the trajectory. In light of all these arguments, FOC (5) can be replaced with the following simpler problem:

J_{aug} = \|U\|^2 + \sum_{t=t_0}^{t_f} E[L(X_t, U_t)] + \lambda \sum_{t=t_0}^{t_f} (\sigma^2_{x_t} + \sigma^2_{y_t})
\text{s.t. } E[C_j^t(.)] + \tau \le 0.        (17)

The original trajectory optimization (5) has been converted to the new formulation (17) by substituting the parameter \eta, which represented the probability of avoidance in (5), with two new sets of variables \tau and \lambda. The positive constant \tau can be manipulated to make E[C_j^t(.)] as negative as required and consequently control the clearance from a given set of obstacles. Similarly, \lambda is a positive constant which can be manipulated to minimize the positional variance at each point along the trajectory. Hence, we can manipulate \tau and \lambda to achieve a particular probability of avoidance \eta. Moreover, each \eta can be mapped to various choices of \tau and \lambda, leading to a diverse set of collision avoidance behaviors. Within this diverse set, \tau determines the geometry of the path, and \lambda determines the velocity profile along the path.

The reformulated FOC (17) is very different from those typically used in the context of human motion modeling. A central hypothesis in current frameworks is that the relative weighting of each term in the cost function can be tuned to produce a diverse set of trajectories. The FOC (17) takes a different approach: its parameters appear not only in the cost function but also in the constraints.

The reformulated FOC (17) can be solved in one shot if the right set of \tau and \lambda is given. For the cases where such a set is not available, we present a framework for mapping a probability of collision avoidance \eta to \tau and \lambda and solving (17) in the process. This is discussed next.

IV. EFFICIENTLY SOLVING THE PROPOSED FOC

Algorithm 1 summarizes a sequential quadratic programming (SQP) routine for solving FOC (17). The optimization starts with an initial guess trajectory (x_t^*, y_t^*) and initialization of an index counter i and two non-negative variables \tau and \lambda. The outermost loop checks whether the constraints are satisfied and whether the reduction in the cost function between two consecutive iterations is within a specified threshold, \xi. If either of these checks is violated, the algorithm proceeds to the inner check, where we test whether the surrogate constraints (9) are satisfied. If not, we increment the value of \tau by \delta and \lambda by a factor of \Delta. Thereafter, (17) is solved with the current values of \tau and \lambda, and the solution obtained is used to update the initial guess trajectory, which in turn is used to obtain a better estimate of C_j^t(.) through (11) for the next iteration.

Algorithm 1: Sequential Quadratic Programming for solving FOC
  Initialization: initial guess for the optimal trajectory (x_t^*, y_t^*); i = 0, \tau = 0, \lambda = 1
  while |J_{opt}^{i+1} - J_{opt}^{i}| \ge \xi or E[C_j^t(.)] + k\sqrt{Var[C_j^t(.)]} > 0 do
      if E[C_j^t(.)] + k\sqrt{Var[C_j^t(.)]} > 0 then
          \tau \leftarrow \tau + \delta
          \lambda \leftarrow \Delta \lambda
      end if
      U \leftarrow \arg\min J_{aug} \text{ s.t. } E[C_j^t(.)] + \tau \le 0
      update (x_t^*, y_t^*) through U
      i \leftarrow i + 1
  end while
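
The following is a simplified sketch of Algorithm 1, not the authors' implementation. It keeps the sequential linearization of (10)-(11) and the \tau-increment logic, but drops the variance terms of (17) (and hence \lambda), so every inner problem reduces to a QP over the mean trajectory; it also uses CVXPY in place of the MATLAB CVX package cited in [25]. The dynamics discretization, obstacle layout, weights, and step sizes are assumptions made only for this sketch.

```python
# Simplified, deterministic sketch of Algorithm 1 (not the authors' code): sequential
# linearization of the concave obstacle constraint (10)-(11) plus a tau-margin update,
# with each inner problem solved as a QP via CVXPY. All numbers are illustrative.
import numpy as np
import cvxpy as cp

T, dt = 60, 0.02
A = np.eye(6); A[0, 2] = A[1, 3] = A[2, 4] = A[3, 5] = dt; A[0, 4] = A[1, 5] = 0.5 * dt**2
B = np.zeros((6, 2))
B[0, 0] = B[1, 1] = dt**3 / 6; B[2, 0] = B[3, 1] = dt**2 / 2; B[4, 0] = B[5, 1] = dt

x0 = np.zeros(6)
goal = np.array([0.2, 0.1, 0, 0, 0, 0])
obstacles = [(0.10, 0.06, 0.03)]       # (x_j, y_j, R_j), illustrative
w_goal = 1e4                            # terminal-state weight, playing the role of w_i in (7)

def solve_qp(guess, tau):
    """Linearize (10) around `guess` and solve the resulting QP, cf. (11) and (17)."""
    U = cp.Variable((T, 2))
    X = [x0]
    for t in range(T):
        X.append(A @ X[t] + B @ U[t])
    cons = []
    for t in range(1, T + 1):
        for (xo, yo, R) in obstacles:
            px, py = guess[t]
            C_star = -(px - xo)**2 - (py - yo)**2 + R**2
            # affine global upper bound of the concave constraint (10):
            C_lin = C_star - 2*(px - xo)*(X[t][0] - px) - 2*(py - yo)*(X[t][1] - py)
            cons.append(C_lin + tau <= 0)
    cost = cp.sum_squares(U) + w_goal * cp.sum_squares(X[T] - goal)
    cp.Problem(cp.Minimize(cost), cons).solve()
    traj = np.vstack([x0[:2]] + [np.array([X[t][0].value, X[t][1].value])
                                 for t in range(1, T + 1)])
    return traj, cost.value

def clear(traj, tau):
    """Deterministic clearance-margin check used here in place of the surrogate check (9)."""
    return all(-(p[0] - xo)**2 - (p[1] - yo)**2 + R**2 + tau <= 0
               for p in traj for (xo, yo, R) in obstacles)

def run_algorithm1(delta, iters=20, xi=1e-6):
    """Outer loop of Algorithm 1 (deterministic variant): grow tau by delta until clearance holds."""
    guess = np.linspace(x0[:2], goal[:2], T + 1)   # straight-line initial guess
    tau, prev_cost = 0.0, np.inf
    for _ in range(iters):
        if not clear(guess, tau):
            tau += delta                           # "tau <- tau + delta" step of Algorithm 1
        guess, cost = solve_qp(guess, tau)
        if clear(guess, tau) and abs(prev_cost - cost) < xi:
            break
        prev_cost = cost
    return guess, tau, cost

if __name__ == "__main__":
    traj, tau, cost = run_algorithm1(delta=5e-4)
    print(f"final tau = {tau:.4f}, control + terminal cost = {cost:.3f}")
```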

Algorithm 1 has two important features. Firstly, E[C_j^t(.)] is affine and J_{aug} is convex quadratic in terms of the control variables. Thus, solving (17) for a given \tau and \lambda amounts to solving a quadratic programming (QP) problem. This in turn can be accomplished efficiently through open source solvers like CVX [25]. Secondly, Algorithm 1 is different from the standard SQP routines used to solve general non-convex problems in the sense that it does not require a trust region update. This, in turn, is because the affine approximation of C_j^t in (11) acts as a global upper bound for the original collision constraints (10).
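
The absence of a trust region rests on the fact that the linearization of a concave function upper-bounds it globally, so any position satisfying (11) also satisfies (10). A tiny numerical illustration of this property, with made-up obstacle values:

```python
# Quick numerical illustration (not from the paper) of why no trust region is needed:
# the linearization (11) of the concave constraint (10) is a global upper bound.
import numpy as np

rng = np.random.default_rng(2)
xo, yo, R = 0.10, 0.06, 0.03        # illustrative obstacle
xs, ys = 0.05, 0.02                  # arbitrary linearization point

def C(p):                            # original concave constraint function, eq. (10)
    return -(p[0] - xo) ** 2 - (p[1] - yo) ** 2 + R ** 2

def C_lin(p):                        # its affine approximation at (xs, ys), eq. (11)
    return C((xs, ys)) - 2 * (xs - xo) * (p[0] - xs) - 2 * (ys - yo) * (p[1] - ys)

pts = rng.uniform(-0.5, 0.5, size=(100_000, 2))
assert np.all(C_lin(pts.T) >= C(pts.T) - 1e-12)   # upper bound holds at every sampled point
print("the linearization upper-bounds (10) at all sampled points")
```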

Each \eta can be mapped to numerous combinations of \tau and \lambda. This redundancy is captured in Algorithm 1 by manipulating the update rates of \tau and \lambda. We discuss this in more detail in Section V with the help of specific examples.

V. SIMULATION RESULTS

A. Collision Avoidance Strategies

To ensure collision avoidance, a human can choose to maintain high clearance from the obstacles, resulting in a large deviation from straight-line paths. Alternatively, they can choose to reduce the deviation but compensate for it by moving with high precision near the obstacles (reduced positional variance), which, in light of the signal-dependent noise (3), requires moving with low velocities near the obstacles. For ease of exposition, from here on we will refer to the slowing-down strategy as "Low Velocity" (LV) and the strategy of maintaining large clearance from the obstacles as "High Clearance" (HC).

Both these strategies can be modeled through (17) by using appropriate values for the parameters \tau and \lambda. For example, Fig. 1 shows two solution trajectories of (17) between the same start and goal configurations. The probability of avoidance, \eta, for both trajectories is 0.94. However, both trajectories achieve this probability of collision avoidance through different combinations of \tau and \lambda. The trajectory resulting from strategy LV was obtained with \tau = 0.0009, \lambda = 2.28 * 10^6, while that resulting from strategy HC was obtained with \tau = 0.0012, \lambda = 0.9 * 10^6. These values of \tau and \lambda were obtained using different update rates of \tau and \lambda in Algorithm 1. For simulating strategy LV we used \delta = 0.00005, \Delta = 10 in the update rule of \tau and \lambda, and for simulating strategy HC we used \delta = 0.0001, \Delta = 10. Since \tau controls the clearance from the obstacles, setting higher update rates for \tau resulted in trajectories belonging to strategy HC. On the other hand, a lower update rate for \tau puts a higher emphasis on \lambda and consequently on manipulation of the positional variance through velocity control for collision avoidance, thus resulting in trajectories belonging to strategy LV (see the sketch following Fig. 1 below).

The velocity profiles shown in Fig. 1(b) demonstrate that a higher \lambda forces the velocity magnitude along the trajectory closer to the obstacle (strategy LV) to be small during the initial stages, i.e., while the trajectory is near the obstacles. Consequently, the positional variance is reduced and the desired probability of collision avoidance is maintained. In contrast, the trajectory with higher clearance from the obstacle (strategy HC) has the liberty to move with faster velocity and let the variance of the movement grow. The velocity magnitude along the trajectory resulting from strategy LV increases eventually, but this happens towards the end of the movement, after crossing the obstacles.

Fig. 1. Demonstration of the effect of the choice of \tau and \lambda on the collision avoidance strategies. Two sets of trajectories between the same start and goal locations and having the same probability of avoidance, \eta, were computed. However, to generate these two trajectories we used a different set of \tau and \lambda to achieve the specified probability of avoidance. The trajectories shown in green were computed using \tau = 0.0012, \lambda = 0.9 * 10^6, while the trajectories shown in red were computed using \tau = 0.0009, \lambda = 2.28 * 10^6. (Panels: (a) paths in the x-y plane [m]; (b) total velocity [m/s] versus simulation step.)
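
Continuing the deterministic sketch from Section IV, the effect of the update rate \delta on clearance can be mimicked by running its outer loop with two different \tau increments: the larger increment ends with a larger margin (loosely HC-like), while the smaller one hugs the obstacle (loosely LV-like). The velocity-shaping role of \lambda is not captured by that sketch, and the increments below are illustrative only, not the values quoted above.

```python
# Hypothetical usage of the run_algorithm1() sketch from Section IV: two different tau
# increments (stand-ins for the delta update rates) give different clearance margins.
import numpy as np

def min_clearance(traj, obstacles):
    """Smallest distance from the path to any obstacle boundary."""
    return min(np.hypot(p[0] - xo, p[1] - yo) - R for p in traj for (xo, yo, R) in obstacles)

for delta in (5e-5, 1e-3):                      # illustrative "low" and "high" update rates
    traj, tau, cost = run_algorithm1(delta=delta)
    print(f"delta={delta:.0e}: tau={tau:.4f}, "
          f"clearance={min_clearance(traj, obstacles):.4f} m, cost={cost:.1f}")
```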

B. Mapping Avoidance Strategies to Control Cost

If we were to derive a variant of the optimization (17) for a system with additive constant-variance noise, the probability of collision avoidance, \eta, would solely depend on the clearance from the obstacles. Thus, an increase in \eta would directly lead to an increase in arc lengths and, consequently, control costs. However, to develop a framework that is suitable for modeling human arm movements, we incorporated signal-dependent noise [11]. In the presence of signal-dependent noise, the control cost of trajectories depends on the probability of avoidance \eta, and more importantly, on the combination of \tau and \lambda that is used in the optimization (17) to achieve this \eta. In other words, the control cost depends on the strategy that is used to achieve a particular probability of collision avoidance.

In Fig. 2(a)-2(d) we present simulated trajectories that correspond to both strategy LV and HC for probabilities of collision avoidance \eta = 0.86 and \eta = 0.95. The paths that resulted from strategy HC indeed have higher clearance from the obstacles. In contrast, the paths that resulted from strategy LV have lower clearance and thus heavily rely on modifying the velocity magnitudes, and consequently the positional variance, for collision avoidance. Consequently, paths resulting from strategy HC have higher arc lengths as compared to paths resulting from strategy LV. In Fig. 2(e) the ratio of control costs for trajectories resulting from both strategies is presented as a function of \eta. For low \eta, paths resulting from strategy LV, which have lower arc lengths, are less costly. But as \eta increases, the higher arc length paths resulting from strategy HC become less costly.

The observations discussed above are apparent from the structure of the optimization (17). Increasing either \tau, \lambda, or both leads to an increase in the control cost. At low values of \eta, there is very little restriction on the growth of positional variance, and thus the control cost is dictated by \tau, which controls the arc length. But as \eta increases, the effect of \lambda becomes prominent. This is consistent with the significant reduction in positional variance that is depicted in Fig. 2(c) and the corresponding skewed velocity profile shown in Fig. 2(d). Since trajectories resulting from strategy LV have lesser clearance from the obstacles, they require a higher value of \lambda to achieve the same \eta (similar to the result shown in the previous section). Thus, at higher probabilities, trajectories resulting from strategy LV become more costly in spite of having lower arc lengths.

Fig. 2. Control costs vary with probability of avoidance. (a)-(d) Movements with different strategies between the same start and goal locations, the same obstacle configurations, and with noise level c_x, c_y = 0.15. (a), (c) present the paths with standard deviation ellipses of the two strategies. The obstacles are represented as blue filled circles and the grey shaded region around them represents uncertainty about the size of the obstacle. (b), (d) present the velocity profiles. (e) The ratio of the control costs between the two strategies, J_U^{LV} / J_U^{HC}, as a function of \eta.

The results presented above were obtained with c_x = c_y = 0.15 in (3). That is, the noise was 15% of the control input. Next, we examined how the cost ratio shown in Fig. 2(e) changes with a reduction in noise. Fig. 3 depicts the ratio of control costs for trajectories resulting from strategies LV and HC for c_x = c_y = 0.05. With lower noise, strategy LV becomes less costly even for higher probabilities. This result agrees with the common intuition: with lesser noise there is no need to ensure high clearance from the obstacles, thereby making strategy HC redundant. In fact, for a zero-noise system, the trajectory with the least control cost would just graze the obstacle.

We would like to highlight that Fig. 2(e) and Fig. 3 are intended to demonstrate the general trend in the ratio of control costs. An in-depth analysis of the exact values and their dependence on the initial conditions of the optimization is beyond the scope of the current study.

Fig. 3. The ratio of the control costs between the two strategies, J_U^{LV} / J_U^{HC}, as a function of \eta for a noise level of c_x, c_y = 0.05.
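
For reference (not stated explicitly in the text), the \eta values used in this section translate, through the bound \eta \ge k^2/(1+k^2) in (9), into the smallest k that would be used in the surrogate check of Algorithm 1:

k = \sqrt{\frac{\eta}{1-\eta}}: \quad \eta = 0.86 \Rightarrow k \approx 2.48, \qquad \eta = 0.94 \Rightarrow k \approx 3.96, \qquad \eta = 0.95 \Rightarrow k \approx 4.36.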

C. Modeling Choice of Homotopies

In this section, we discuss how the choice of collision avoidance strategy, or in other words the choice of \tau and \lambda for a given \eta, affects the control cost of trajectories in different homotopies.

1) Strategy LV: In Fig. 4(a) and 4(c), solution trajectories of (17) having the same start and goal positions, but belonging to different homotopies and having different probabilities of avoidance, \eta, are depicted. The trajectories in both homotopies were generated by choosing such values for \tau and \lambda that ensure collision avoidance by slowing down near the obstacles and reducing positional variance (strategy LV), rather than by taking a large deviation from them. Thus, as \eta increases from 0.9 (Fig. 4(a)) to 0.965 (Fig. 4(c)), we observe only a small change in arc length, but a significant change in the positional variance along the trajectories. Moreover, since trajectories of homotopy 2 move through a more cluttered environment, the reduction of positional variance along them is higher than that along trajectories of homotopy 1.

It is possible to relate the change in positional variance as \eta increases to the change in the control costs through the velocity profiles. Firstly, in contrast to Fig. 4(b), the velocity profiles shown in Fig. 4(d) are skewed; i.e., they have low magnitudes during the initial phases and a peak which is shifted towards the right. This is to ensure that the velocity magnitudes (and thus the positional variance) are low near the obstacles and reach their peak only after crossing the obstacles. Since trajectories in homotopy 2 require a larger reduction in positional variance, the skewness observed in their velocity profiles is also higher. Finally, the skewness in velocity profiles is accompanied by higher peak velocities. This is because of the fixed final time paradigm of the optimization (17). Since magnitudes are low during the initial phases of the trajectories, this needs to be compensated by moving faster in obstacle-free space to ensure that the goal position is reached in the specified time. Now, it is easy to deduce that a skewed velocity profile with higher peaks would mean higher accelerations and jerks and thus, consequently, higher control costs.

To summarize, for collision avoidance strategy LV, maintaining high \eta requires a larger reduction in positional variance, leading to larger skewness in velocity profiles and consequently higher control costs. However, since trajectories in homotopy 2 require a larger reduction in positional variance, the control costs along it increase at a higher rate than along trajectories in homotopy 1. We demonstrate this last observation in Fig. 4(e), which shows the ratio of control costs along homotopy 1 and homotopy 2 for various values of \eta. For lower values (until \eta = 0.9), the costs along homotopy 1 and homotopy 2 are similar, owing to their similar velocity profiles. However, for higher \eta, the cost along homotopy 1 is significantly lower than that along homotopy 2.

2) Strategy HC: Here we re-analyze the cost along homotopies for the same configuration as shown in Fig. 4, but with respect to strategy HC, where there is a bigger reliance on clearance from the obstacles for collision avoidance. The trajectories along both homotopies are shown in Fig. 5(a) and 5(c). Comparing these trajectories with Fig. 4(a) and 4(c) demonstrates that there is a significant increase in clearance from the obstacles with increase in \eta. Thus, a lesser restriction is required on the growth of positional variance and, consequently, the velocity profiles along trajectories in both homotopies become very similar even at higher \eta (Fig. 5(d)). This is very different from the comparisons shown in Fig. 4(d). The similarity in velocity profiles in turn results in similar control costs along both homotopies (Fig. 5(e)). In particular, the control costs along homotopies 1 and 2 are similar for a larger range of \eta. The lowest ratio of cost is 0.67 in Fig. 5(e), for \eta = 0.9615. In comparison, the ratio was 0.33 in Fig. 4(e) for the same \eta.

Fig. 4. Movements between the same start and goal locations and obstacle configurations but with different probability of avoidance. (a), (c) present the paths with standard deviation ellipses of the two homotopies. The obstacles are represented as blue filled circles and the grey shaded region around them represents uncertainty about the size of the obstacle. (b), (d) present the velocity profiles. (e) The ratio of the control costs between the two homotopies, J_U^{H1} / J_U^{H2}, with respect to strategy LV. For the chosen avoidance strategy LV, the control cost along the homotopies is similar for low \eta; for higher \eta, the control cost along homotopy 1 is significantly less.

Fig. 5. Movements between the same start and goal locations and obstacle configurations but with different probability of avoidance, \eta. The results are similar to those shown in Fig. 4, but the trajectories are now computed with respect to strategy HC, which gives higher emphasis on clearance from obstacles for obstacle avoidance. (e) Ratio of control cost along the two homotopies, J_U^{H1} / J_U^{H2}, as a function of \eta.

VI. DISCUSSION AND FUTURE WORK

In this paper, we presented a stochastic optimal control problem with signal-dependent noise and probabilistic collision avoidance constraints as a model of human reaching among obstacles. We then reformulated it into a parameter optimization problem. The parameters \tau and \lambda which appeared in the reformulated optimization problem (17) served as a mapping between the probability of collision avoidance, \eta, and possible collision avoidance strategies.

The parameter \tau models the clearance from the obstacles, and the parameter \lambda models the effect of slowing down near the obstacles. In our simulations, we demonstrate that the effect of these two parameters on movement paths and velocity profiles is in agreement with the experimental findings reported in [13] for reach-to-grasp movements around obstacles. Specifically, they discussed two basic but coupled strategies of collision avoidance, which consist of moving around the obstacle and slowing down.

We showed how each avoidance strategy results in a unique variation of control costs with respect to \eta, both within and across homotopies. These variations in control costs can be used as a basis for predicting user behavior between a given start and goal position and for a given obstacle environment. For example, in Fig. 2(a)-2(c) and 2(e), we showed that a risk-seeking behavior (low \eta) is more likely to use strategy LV for collision avoidance, as it requires less control effort. In contrast, a risk-averse behavior (high \eta) would likely choose strategy HC.

We also showed how the control cost along different homotopies is dictated by the choice of avoidance strategy. This variation in control cost can be used to predict the homotopy selection by the human. In particular, if two competing homotopies have similar control costs, then the human would have equal affinity towards either of them, thus leading to random behavior. However, as the ratio of control costs departs from unity, the selection would incline towards the lesser control cost, thus leading to a more well-defined pattern.

Our proposed framework has the following limitations. Firstly, we used a very simple dynamic model, and thus we necessarily do not capture every aspect of the motion of the human arm. A second-order linear mechanical system or a non-linear model of a serial-link robotic manipulator would be a better alternative. The second-order mechanical system can be easily incorporated because, as long as the system is linear, the structure of the optimization (17) would not change. In contrast, incorporating even a simple planar two-link manipulator model is more challenging, and may require methods similar to that proposed in [26]. Secondly, the fixed final time paradigm of the optimization (17) is not equipped to capture the effect of an increase in the traversal time of reaching motions due to the presence of obstacles. A possible solution to this could be developed using the time scaling concepts of [14].

From our simulation study, we conclude that if the parameters \tau and \lambda are known, the possible choice of homotopy as well as the choice of trajectory within that homotopy can be predicted. Thus, our current efforts are focused towards developing an inverse optimization framework which can automatically recover these parameters from example trajectories demonstrated by the human. Furthermore, our simulations yielded testable predictions about homotopy and strategy choices that we plan to confirm in human experiments. These two advancements taken together will eventually lay the ground towards using our framework for motion prediction in realistic scenarios.

REFERENCES

[1] Wolpert, Daniel M., and Zoubin Ghahramani. "Computational principles of movement neuroscience." Nature Neuroscience 3 (2000): 1212-1217.
[2] Flash, Tamar, and Neville Hogan. "The coordination of arm movements: an experimentally confirmed mathematical model." The Journal of Neuroscience 5, no. 7 (1985): 1688-1703.
[3] Flash, Tamar, Yaron Meirovitch, and Avi Barliya. "Models of human movement: Trajectory planning and inverse kinematics studies." Robotics and Autonomous Systems 61, no. 4 (2013): 330-339.
[4] Todorov, Emanuel. "Optimality principles in sensorimotor control." Nature Neuroscience 7, no. 9 (2004): 907-915.
[5] Scott, Stephen H. "Optimal feedback control and the neural basis of volitional motor control." Nature Reviews Neuroscience 5, no. 7 (2004): 532-546.
[6] Diedrichsen, Jörn, Reza Shadmehr, and Richard B. Ivry. "The coordination of movement: optimal feedback control and beyond." Trends in Cognitive Sciences 14, no. 1 (2010): 31-39.
[7] Li, Weiwei, Emanuel Todorov, and Dan Liu. "Inverse optimality design for biological movement systems." IFAC Proceedings Volumes 44, no. 1 (2011): 9662-9667.
[8] Mainprice, Jim, Rafi Hayne, and Dmitry Berenson. "Predicting human reaching motion in collaborative tasks using inverse optimal control and iterative re-planning." In 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 885-892. IEEE, 2015.
[9] Chapman, Craig S., and Melvyn A. Goodale. "Missing in action: the effect of obstacle position and size on avoidance while reaching." Experimental Brain Research 191, no. 1 (2008): 83-97.
[10] Chapman, Craig S., and Melvyn A. Goodale. "Seeing all the obstacles in your way: the effect of visual feedback and visual feedback schedule on obstacle avoidance while reaching." Experimental Brain Research 202, no. 2 (2010): 363-375.
[11] Harris, Christopher M., and Daniel M. Wolpert. "Signal-dependent noise determines motor planning." Nature 394, no. 6695 (1998): 780-784.
[12] Mon-Williams, M., and R. D. McIntosh. "A test between two hypotheses and a possible third way for the control of prehension." Experimental Brain Research 134, no. 2 (2000): 268-273.
[13] Tresilian, James R. "Attention in action or obstruction of movement? A kinematic analysis of avoidance behavior in prehension." Experimental Brain Research 120, no. 3 (1998): 352-368.
[14] Gopalakrishnan, Bharath, Arun Kumar Singh, and K. Madhava Krishna. "Closed form characterization of collision free velocities and confidence bounds for non-holonomic robots in uncertain dynamic environments." In Proc. of Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, pp. 4961-4968. IEEE, 2015.
[15] Zucker, Matt, Nathan Ratliff, Anca D. Dragan, Mihail Pivtoraiko, Matthew Klingensmith, Christopher M. Dellin, J. Andrew Bagnell, and Siddhartha S. Srinivasa. "CHOMP: Covariant Hamiltonian optimization for motion planning." The International Journal of Robotics Research 32, no. 9-10 (2013): 1164-1193.
[16] Kalakrishnan, Mrinal, Sachin Chitta, Evangelos Theodorou, Peter Pastor, and Stefan Schaal. "STOMP: Stochastic trajectory optimization for motion planning." In Proc. of Robotics and Automation (ICRA), 2011 IEEE International Conference on, pp. 4569-4574. IEEE, 2011.
[17] Schulman, John, Yan Duan, Jonathan Ho, Alex Lee, Ibrahim Awwal, Henry Bradlow, Jia Pan, Sachin Patil, Ken Goldberg, and Pieter Abbeel. "Motion planning with sequential convex optimization and convex collision checking." The International Journal of Robotics Research 33, no. 9 (2014): 1251-1270.
[18] Du Toit, Noel E., and Joel W. Burdick. "Robot motion planning in dynamic, uncertain environments." IEEE Transactions on Robotics 28, no. 1 (2012): 101-115.
[19] Müller, Jörg, and Gaurav S. Sukhatme. "Risk-aware trajectory generation with application to safe quadrotor landing." In Proc. of 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3642-3648. IEEE, 2014.
[20] Blackmore, Lars, Masahiro Ono, and Brian C. Williams. "Chance-constrained optimal path planning with obstacles." IEEE Transactions on Robotics 27, no. 6 (2011): 1080-1094.
[21] Van Den Berg, Jur, Sachin Patil, and Ron Alterovitz. "Motion planning under uncertainty using iterative local optimization in belief space." The International Journal of Robotics Research 31, no. 11 (2012): 1263-1278.
[22] Hamilton, Antonia F. de C., and Daniel M. Wolpert. "Controlling the statistics of action: obstacle avoidance." Journal of Neurophysiology 87, no. 5 (2002): 2434-2440.
[23] Nemirovski, Arkadi. "On safe tractable approximations of chance constraints." European Journal of Operational Research 219, no. 3 (2012): 707-718.
[24] Boyd, Stephen. "Sequential convex programming." Lecture Notes, Stanford University (2008).
[25] Grant, Michael, and Stephen Boyd. "CVX: Matlab software for disciplined convex programming, version 1.21." Available: cvxr.com/cvx (2010).
[26] Todorov, Emanuel, and Weiwei Li. "A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems." In Proc. of 2005 American Control Conference, pp. 300-306. IEEE, 2005.
