ROBUST AND REAL-TIME DEEP TRACKING VIA MULTI-SCALE DOMAIN ADAPTATION

Xinyu Wang1, Hanxi Li1*, Yi Li2, Fumin Shen3, Fatih Porikli4

1 Jiangxi Normal University, China
2 Toyota Research Institute of North America, USA
3 University of Electronic Science and Technology of China, China
4 Australian National University, Australia
ABSTRACT

Visual tracking is a fundamental problem in computer vision. Recently, some deep-learning-based tracking algorithms have been achieving record-breaking performances. However, due to the high complexity of deep learning, most deep trackers suffer from low tracking speed and are thus impractical in many real-world applications. Some new deep trackers with smaller network structures achieve high efficiency, but at the cost of a significant decrease in precision. In this paper, we propose to transfer the features for image classification to the visual tracking domain via convolutional channel reductions. The channel reduction can simply be viewed as an additional convolutional layer with a specific task. It not only extracts the useful information for object tracking but also significantly increases the tracking speed. To better accommodate the useful features of the target at different scales, the adaptation filters are designed with different sizes. The yielded visual tracker is real-time and also illustrates state-of-the-art accuracies in experiments involving two well-adopted benchmarks with more than 100 test videos.

Fig. 1. The high-level concept of the proposed MSDAT tracker. Left: most of the deep neural network is pretrained for image classification, where the learning algorithm focuses on object classes. Right: an adaptation is performed to transfer the classification features to the visual tracking domain, where the learning algorithm treats each individual object independently.

Index Terms — visual tracking, deep learning, real-time

1. INTRODUCTION
Visual tracking is one of the long-standing computer vision tasks. During the last decade, with the surge of deep learning, more and more tracking algorithms have benefited from deep neural networks, e.g. Convolutional Neural Networks [1, 2] and Recurrent Neural Networks [3, 4]. Despite this well-admitted success, a dilemma still exists in the community: deep learning increases the tracking accuracy, but at the cost of high computational complexity. As a result, most well-performing deep trackers suffer from low efficiency [5, 6]. Recently, some real-time deep trackers were proposed [7, 8]. They achieve very fast tracking speeds, but cannot beat the shallow methods in some important evaluations, as we illustrate later.

In this paper, a simple yet effective domain adaptation algorithm is proposed. The facilitated tracking algorithm, termed Multi-Scale Domain Adaptation Tracker (MSDAT), transfers the features from the classification domain to the tracking domain, where the individual objects, rather than the image categories, serve as the learning subjects. In addition, the adaptation can also be viewed as a dimension-reduction process that removes information redundant for tracking and, more importantly, reduces the channel number significantly. This leads to a considerable improvement in tracking speed. Figure 1 illustrates the adaptation procedure. To accommodate the various features of the target object at different scales, we train filters of different sizes in the domain adaptation layer, as proposed in the Inception network [9]. Our experiments show that the proposed MSDAT algorithm runs at around 35 FPS while achieving tracking accuracy very close to the state-of-the-art trackers. To our best knowledge, MSDAT is the best-performing real-time visual tracker.
2. RELATED WORK

Similar to other fields of computer vision, in recent years more and more state-of-the-art visual trackers have been built on deep learning. [1] is a well-known pioneering work that learns deep features for visual tracking. The DeepTrack method [10, 2] learns a deep model from scratch, updates it online, and achieves higher accuracy. [11, 12] adopt similar learning strategies, i.e., learning the deep model offline with a large number of images while updating it online for the current video sequence. [13] achieves real-time speed by replacing the slow model update with a fast inference process.

The HCF tracker [5] extracts hierarchical convolutional features from the VGG-19 network [14], then feeds the features into correlation filters to regress the response map. It can be considered a combination of deep learning and the fast shallow trackers based on correlation filters. It achieves high tracking accuracy, while its speed is around 10 fps. Hyeonseob Nam et al. proposed to pre-train deep CNNs in multiple domains, with each domain corresponding to one training video sequence [6]. The authors claim that there exist some common properties that are desirable for target representations in all domains, such as robustness to illumination changes. To extract these common features, the authors separate domain-independent information from domain-specific layers. The yielded tracker, termed MD-net, achieves excellent tracking performance, while its tracking speed is only 1 fps.

Recently, some real-time deep trackers have also been proposed. In [7], David Held et al. learn a deep regressor that predicts the location of the current object based on its appearance in the last frame. The tracker obtains a much faster tracking speed (over 100 fps) compared to conventional deep trackers. Similarly, in [8] a fully-convolutional siamese network is learned to match the object template in the current frame. It also achieves real-time speed. Even though these real-time deep trackers illustrate high tracking accuracy, there is still a clear performance gap between them and the state-of-the-art deep trackers.

3. THE PROPOSED METHOD

In this section, we introduce the details of the proposed tracking algorithm, i.e., the Multi-Scale Domain Adaptation Tracker (MSDAT).

3.1. Network structure

In HCF [5], deep features are first extracted from multiple layers of the VGG-19 network [14], and a set of KCF [15] trackers are run on those features, respectively. The final tracking prediction is obtained in a weighted-voting manner. Following the setting in [5], we also extract the deep features from the conv3_5, conv4_5 and conv5_5 layers of the VGG-19 model. However, the VGG-19 network is pre-trained using the ILSVRC dataset [16] for image classification, where the learning algorithm usually focuses on the object categories. This is different from visual tracking tasks, where the individual object is distinguished from other objects (even those of the same category) and the background. Intuitively, it is better to transfer the classification features into the visual tracking domain.

Fig. 2. The network structure of the proposed MSDAT tracker. Three layers, namely conv3_5, conv4_5 and conv5_5, are selected as the feature source. The domain adaptation (shown in yellow lines) reduces the channel number by 8 times and keeps the feature map size unchanged. Better viewed in color.

In this work, we propose to perform the domain adaptation in a simple way. A "tracking branch" is "grafted" onto each feature layer, as shown in Fig. 2. The tracking branch is actually a convolution layer which reduces the channel number by 8 times and keeps the feature map size unchanged. The convolution layer is then learned by minimizing a loss function tailored for tracking, as introduced below.

3.2. Learning strategy

The parameters of the aforementioned tracking branch are learned in a similar manner to the Single Shot MultiBox Detector (SSD), a state-of-the-art detection algorithm [17]. When training, the original layers of VGG-19 (i.e., those before convx_5) are fixed and each "tracking branch" is trained independently. The flowchart of the learning procedure for one tracking branch (based on conv3_4) is illustrated in the upper row of Figure 3, compared with the learning strategy of MD-net [6] (the bottom row). To obtain a complete training circle, the adapted feature in conv3_5 is used to regress the objects' locations and their objectness scores (shown in the dashed block). Please note that the deep learning stage in this work is purely offline and the additional part in the dashed block is abandoned before tracking.

In SSD, a number of "default boxes" are generated for regressing the object rectangles. Furthermore, to accommodate objects of different scales and shapes, the default boxes also vary in size and aspect ratio.
Let m_{i,j} ∈ {0, 1} be an indicator for matching the i-th default box to the j-th ground-truth box. The loss function of SSD writes:

    L(m, c, l, g) = (1/N) (L_conf(m, c) + α L_loc(m, l, g))    (1)

where c is the category of the default box, l is the predicted bounding box, and g is the ground truth of the object box, if applicable. For the j-th default box and the i-th ground truth, the location loss L_loc^{i,j} is calculated as

    L_loc^{i,j}(l, g) = Σ_{u ∈ {x,y,w,h}} m_{i,j} · smooth_L1(l_i^u − ĝ_j^u)    (2)

where ĝ^u, u ∈ {x, y, w, h} is one of the geometry parameters of the normalized ground-truth box.

However, the task of visual tracking differs from detection significantly. We thus tailor the loss function for the KCF algorithm, where both the object size and the KCF window size are fixed. Recalling that the KCF window plays a similar role to the default boxes in SSD [17], we then only need to generate one type of default box, and the location loss L_loc^{i,j}(l, g) is simplified as

    L_loc^{i,j}(l, g) = Σ_{u ∈ {x,y}} m_{i,j} · smooth_L1(l_i^u − g_j^u)    (3)

In other words, only the displacement {x, y} is taken into consideration and there is no need for ground-truth box normalization.

Note that the concept of domain adaptation in this work is different from that defined in MD-net [6], where different video sequences are treated as different domains and thus multiple fully-connected layers are learned to handle them (see Figure 3). This is mainly because MD-net samples the training instances in a sliding-window manner: an object labeled negative in one domain could be selected as a positive sample in another domain. Given that the training video number is C and the dimension of the last convolution layer is d_c, MD-net learns C independent d_c × 2 fully-connected layers alternately, using C soft-max losses, i.e.,

    M_fc^i : R^{d_c} → R^2,  ∀i = 1, 2, ..., C    (4)

where M_fc^i, ∀i ∈ {1, 2, ..., C} denotes the C fully-connected layers that transfer the common visual domain to the individual object domains, as shown in Figure 3.

Fig. 3. The flowcharts of the training processes of MSDAT and MD-net. Note that the network parts inside the dashed blocks are only used for training and are abandoned before tracking. Better viewed in color.

Differing from MD-net, the domain in this work refers to a general visual tracking domain or, more specifically, the KCF domain. It is designed to mimic the KCF input in visual tracking (see Figure 3). In this domain, different tracking targets are treated as one category, i.e., objects. When training, the object's location and confidence (with respect to the objectness) are regressed to minimize the smoothed l_1 loss. Mathematically, we learn a single mapping function M_conv(·) as

    M_msdat : R^{d_c} → R^4    (5)

where the R^4 space is composed of one R^2 space for the displacement {x, y} and one label space R^2.

Compared with Equation 4, the training complexity of Equation 5 decreases and the corresponding convergence becomes more stable. Our experiments prove the validity of the proposed domain adaptation.

3.3. Multi-scale domain adaptation

As introduced above, the domain adaptation in our MSDAT method is essentially a convolution layer. To design the layer, an immediate question is how to select a proper size for the filters. According to Figure 2, the feature maps from different layers vary in size significantly. It is hard to find an optimal filter size for all the feature layers. Inspired by the success of the Inception network [9], we propose to simultaneously learn the adaptation filters at different scales. The response maps produced by the different filter sizes are then concatenated accordingly, as shown in Figure 4. In this way, the input of the KCF tracker involves the deep features from different scales.

In practice, we use 3×3 and 5×5 filters for all three feature layers. Given that the original channel number is K, each type of filter generates K/16 channels and thus the channel reduction ratio is still 8:1.

3.4. Making the tracker real-time

3.4.1. Channel reduction

One important advantage of the proposed domain adaptation is the improvement in tracking speed. It is easy to see that the speed of the KCF tracker drops dramatically as the channel number increases. In this work, after the adaptation, the channel number is shrunk by 8 times, which accelerates the tracker by 2 to 2.5 times.
Fig. 4. Learning the adaptation layer using three different types of filters.

3.4.2. Lazy feed-forward

Another effective way to increase the tracking speed is to reduce the number of feed-forward passes of the VGG-19 network. In HCF, the feed-forward process is conducted twice at each frame, once for prediction and once for the model update [5]. However, we notice that the displacement of the moving object is usually small between two frames. Consequently, if we make the input window slightly larger than the KCF window, one can reuse the feature maps in the updating stage as long as the new KCF window (defined by the predicted location of the object) still resides inside the input window. We thus propose a lazy feed-forward strategy, which is depicted in Figure 5.

Fig. 5. The illustration of the lazy feed-forward strategy. To predict the location of the object (the boy's head), a part of the image (green window) is cropped for generating the network input. Note that the green window is slightly larger than the red block, i.e., the KCF window for predicting the current location. If the predicted location (shown in yellow) still resides inside the green lines, one can reuse the deep features by cropping the corresponding feature maps accordingly.

In this work, we generate the KCF window using the same rules as the HCF tracker [5]; the input window is 10% larger than the KCF window in terms of both width and height. Facilitated by the lazy feed-forward strategy, in the proposed algorithm feed-forward is conducted only once in more than 60% of the video frames. This gives us another 50% speed gain.

4. EXPERIMENT

4.1. Experiment setting

In this section, we report the results of a series of experiments involving the proposed tracker and some state-of-the-art approaches. Our MSDAT method is compared with several well-performing shallow visual trackers, including the KCF tracker [15], TGPR [18], Struck [19], MIL [20], TLD [21] and SCM [22]. Some recently proposed deep trackers, including MD-net [6], HCF [5], GOTURN [7] and the Siamese tracker [8], are also compared. All the experiments are implemented in MATLAB with the MatCaffe [23] deep learning interface, on a computer equipped with an Intel i7 4770K CPU, an NVIDIA GTX 1070 graphics card and 32 GB of RAM.

The code of our algorithm is published on Bitbucket at https://bitbucket.org/xinke_wang/msdat; please refer to the repository for the implementation details.

4.2. Results on OTB-50

Similar to its prototype [24], the Object Tracking Benchmark 50 (OTB-50) [25] consists of 50 video sequences and involves 51 tracking tasks. It has been one of the most popular tracking benchmarks since 2013. The evaluation is based on two metrics: center location error and bounding box overlap ratio. The one-pass evaluation (OPE) is employed to compare our algorithm with HCF [5], GOTURN [7], the Siamese tracker [8] and the aforementioned shallow trackers. The result curves are shown in Figure 6.

From Figure 6 we can see that the proposed MSDAT method beats all the competitors in the overlap evaluation and ranks second in the location error test, with a trivial inferiority (around 1%) to its prototype, the HCF tracker. Recalling that MSDAT beats HCF by a similar margin in the overlap test and runs 3 times faster, one can consider MSDAT a superior variation of HCF that maintains its accuracy at a much higher speed. From the perspective of real-time tracking, our method performs best in both evaluations. To our best knowledge, the proposed MSDAT method is the best-performing real-time tracker on this well-accepted test.

4.3. Results on OTB-100

The Object Tracking Benchmark 100 (OTB-100) is the extension of OTB-50 and contains 100 video sequences. We test our method under the same experimental protocol as OTB-50, comparing with all the aforementioned trackers. The test results are reported in Table 1.
[Figure 6: precision and success plots on OTB-50 (OPE). Precision at the 20-pixel threshold: HCFT (11 fps) 89.07, ours (32 fps) 88.01, DeepTrack (3 fps) 82.60, SiamFC (58 fps) 81.53, TGPR (0.66 fps) 76.61, KCF (245 fps) 74.24, Struck (10 fps) 65.61, SCM (0.37 fps) 64.85, GOTURN (165 fps) 62.51, TLD (22 fps) 60.75, MIL (28 fps) 47.47. Success (AUC): ours 61.41, SiamFC 61.22, HCFT 60.47, DeepTrack 58.92, TGPR 52.94, KCF 51.64, SCM 49.90, Struck 47.37, GOTURN 45.01, TLD 43.75, MIL 35.91.]

Fig. 6. The location error plots and the overlap accuracy plots of the involved trackers, tested on the OTB-50 dataset.
Tracker      | Ours  | HCF   | MD-Net | SiamFC | GOTURN | KCF   | Struck | MIL   | SCM   | TLD
DP rate (%)  | 83.0  | 83.7  | 90.9   | 75.2   | 56.39  | 69.2  | 63.5   | 43.9  | 57.2  | 59.2
OS (AUC)     | 0.567 | 0.562 | 0.678  | 0.561  | 0.424  | 0.475 | 0.459  | 0.331 | 0.445 | 0.424
Speed (FPS)  | 34.8  | 11.0  | 1      | 58     | 165    | 243   | 9.84   | 28.0  | 0.37  | 23.3

Table 1. Tracking accuracies of the compared trackers on OTB-100.
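The two metrics behind these results (the distance-precision rate at the conventional 20-pixel threshold, and the overlap-success score summarized as AUC) can be computed as follows. This is a minimal sketch of the OTB-style evaluation, not the official benchmark toolkit, and the function names are illustrative.

```python
import numpy as np

def center_error(pred, gt):
    """Center location error between two boxes given as (x, y, w, h)."""
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return ((px - gx) ** 2 + (py - gy) ** 2) ** 0.5

def overlap_ratio(pred, gt):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    x1, y1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    x2 = min(pred[0] + pred[2], gt[0] + gt[2])
    y2 = min(pred[1] + pred[3], gt[1] + gt[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0

def dp_rate(errors, threshold=20.0):
    """Distance-precision rate: fraction of frames whose center error
    is within the given pixel threshold."""
    return float(np.mean(np.asarray(errors, float) <= threshold))

def success_auc(overlaps, thresholds=np.linspace(0.0, 1.0, 21)):
    """Success-plot AUC: mean success rate over overlap thresholds in [0, 1]."""
    overlaps = np.asarray(overlaps, float)
    return float(np.mean([(overlaps >= t).mean() for t in thresholds]))
```

Per-frame errors and overlaps are accumulated over a sequence (or the whole benchmark) and then summarized by these two scalars, which correspond to the DP rate and OS (AUC) rows of Table 1.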
As can be seen in the table, the proposed MSDAT algorithm keeps its superiority over all the other real-time trackers and keeps an accuracy similar to HCF. The best-performing MD-net (according to our best knowledge) enjoys a remarkable performance gap over all the other trackers while running at around 1 fps.

4.4. The validity of the domain adaptation

To better verify the proposed domain adaptation, here we run another variation of the HCF tracker. For each feature layer (conv3_4, conv4_4, conv5_4) of VGG-19, one randomly selects one eighth of the channels from this layer. In this way, the input channel numbers to KCF are identical to the proposed MSDAT and thus the algorithmic complexities of the "random HCF" and our method are nearly the same. The comparison of MSDAT, HCF and random HCF on OTB-50 is shown in Figure 7.

From the curves one can see a large gap between the randomized HCF and the other two methods. In other words, the proposed domain adaptation not only reduces the channel number, but also extracts the useful features for the tracking task.

5. CONCLUSION AND FUTURE WORK

In this work, we propose a simple yet effective algorithm to transfer the features of the classification domain to the visual tracking domain. The yielded visual tracker, termed MSDAT, is real-time and achieves tracking accuracies comparable to the state-of-the-art deep trackers. The experiments verify the validity of the proposed domain adaptation.

Admittedly, updating the neural network online can lift the tracking accuracy significantly [2, 6]. However, the existing online updating schemes result in a dramatic speed reduction. One possible future direction could be to simultaneously update the KCF model and a certain part of the neural network (e.g. the last convolution layer). In this way, one could strike a balance between accuracy and efficiency and thus obtain a better tracker.

6. REFERENCES

[1] Naiyan Wang and Dit-Yan Yeung, "Learning a deep compact image representation for visual tracking," in NIPS, pp. 809–817, 2013.

[2] Hanxi Li, Yi Li, and Fatih Porikli, "DeepTrack: Learning discriminative feature representations online for robust visual tracking," IEEE Transactions on Image Processing (TIP), vol. 25, no. 4, pp. 1834–1848, 2016.

[3] Anton Milan, Seyed Hamid Rezatofighi, Anthony Dick, Konrad Schindler, and Ian Reid, "Online multi-target tracking using recurrent neural networks," arXiv preprint arXiv:1604.03635, 2016.
[Figure 7: precision and success plots on OTB-50. Precision at the 20-pixel threshold: HCFT 89.07, ours 88.01, random 72.54. Success (AUC): ours 61.41, HCFT 60.47, random 50.68.]

Fig. 7. The location error plots and the overlap accuracy plots of the three versions of the HCF tracker: the original HCF, MSDAT and the random HCF method. Tested on the OTB-50 dataset; better viewed in color.
[4] Guanghan Ning, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang, "Spatially supervised recurrent convolutional neural networks for visual object tracking," arXiv preprint arXiv:1607.05781, 2016.

[5] Chao Ma, Jia-Bin Huang, Xiaokang Yang, and Ming-Hsuan Yang, "Hierarchical convolutional features for visual tracking," in ICCV, 2015, pp. 3074–3082.

[6] Hyeonseob Nam and Bohyung Han, "Learning multi-domain convolutional neural networks for visual tracking," arXiv preprint arXiv:1510.07945, 2015.

[7] David Held, Sebastian Thrun, and Silvio Savarese, "Learning to track at 100 fps with deep regression networks," arXiv preprint arXiv:1604.01802, 2016.

[8] Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, and Philip H. S. Torr, "Fully-convolutional siamese networks for object tracking," in ECCV, 2016, pp. 850–865.

[9] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich, "Going deeper with convolutions," in CVPR, 2015, pp. 1–9.

[10] Hanxi Li, Yi Li, and Fatih Porikli, "DeepTrack: Learning discriminative feature representations by convolutional neural networks for visual tracking," in BMVC, 2014.

[11] Naiyan Wang, Siyi Li, Abhinav Gupta, and Dit-Yan Yeung, "Transferring rich feature hierarchies for robust visual tracking," arXiv preprint arXiv:1501.04587, 2015.

[12] Seunghoon Hong, Tackgeun You, Suha Kwak, and Bohyung Han, "Online tracking by learning discriminative saliency map with convolutional neural network," in ICML, 2015, pp. 597–606.

[13] Kaihua Zhang, Qingshan Liu, Yi Wu, and Ming-Hsuan Yang, "Robust tracking via convolutional networks without learning," arXiv preprint arXiv:1501.04505, 2015.

[14] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.

[15] João F. Henriques, Rui Caseiro, Pedro Martins, and Jorge Batista, "High-speed tracking with kernelized correlation filters," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 37, no. 3, pp. 583–596, 2015.

[16] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, 2015.

[17] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, and Scott Reed, "SSD: Single shot multibox detector," arXiv preprint arXiv:1512.02325, 2015.

[18] Jin Gao, Haibin Ling, Weiming Hu, and Junliang Xing, "Transfer learning based visual tracking with gaussian processes regression," in ECCV, pp. 188–203, 2014.

[19] Sam Hare, Amir Saffari, and Philip H. S. Torr, "Struck: Structured output tracking with kernels," in ICCV, 2011, pp. 263–270.

[20] Boris Babenko, Ming-Hsuan Yang, and Serge Belongie, "Visual tracking with online multiple instance learning," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), pp. 1619–1632, 2011.

[21] Zdenek Kalal, Jiri Matas, and Krystian Mikolajczyk, "P-N learning: Bootstrapping binary classifiers by structural constraints," in CVPR, 2010, pp. 49–56.

[22] Wei Zhong, Huchuan Lu, and Ming-Hsuan Yang, "Robust object tracking via sparsity-based collaborative model," in CVPR, 2012, pp. 1838–1845.

[23] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell, "Caffe: Convolutional architecture for fast feature embedding," in ACM MM, 2014, pp. 675–678.

[24] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang, "Online object tracking: A benchmark," in CVPR, 2013, pp. 2411–2418.

[25] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang, "Object tracking benchmark," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 37, no. 9, pp. 1834–1848, 2015.