[Figure 3.5: plot of mean flow resistance (λ) against axial distance (Z), with curves for φ = 0.5°, 1.0°, and 1.5°.]
FIGURE 3.5
Variation of mean flow resistance with axial distance for different values of the taper angle.
[Figure 3.6: streamline plots of r/R against z, panels (a) and (b).]
FIGURE 3.6
Streamlines for different values of φ: (a) φ = 0.5°, (b) φ = 2°.
The influence of the taper angle (φ) on the streamline pattern has been analyzed
for given values of n = 1.4, K = 1.2, (cid:5) = 0.1, A = 0.5, (cid:1)² = 0.5, Q_s = 1.0, and
(cid:1)t = 88°, and is shown in Figure 3.6. It is observed that the value of the stream function
decreases as the taper angle (φ) increases.
48 CHAPTER 3 Biological Study on Pulsatile Flow of Herschel-Bulkley Fluid
5 CONCLUSION
The main objective of this investigation was to study the problem of pulsatile flow
of blood (Herschel-Bulkley fluid) through a tapered arterial stenosis. A perturbation
technique was adopted to study the flow. The analytical expressions for velocity,
flow rate, wall shear stress, and mean flow resistance were obtained, and the results
depicted in graphs. Using the finite volume technique, the quasi-steady, nonlinear,
coupled, implicit system of differential equations has been solved numerically and
the axial velocity computed. It is verified that the error between the axial velocities
obtained by the present perturbation method and the numerical technique is
less than 1.052% for values of (cid:1)² between 0.0 and 1.0. Furthermore, the error
exceeds 9.0% when (cid:1)² is greater than 2.10.
When the flow characteristics are expressed in terms of (cid:1)², the present perturbation
results coincide with the results found in other papers (Sankar and Hemalatha, 2006,
2007). Hence, our predictions coincide with theirs and further comparison is unnecessary,
as long as (cid:1)² < 0.1. Further, their predictions are valid when the Reynolds number
is small (<10) since the Strouhal number (cid:1) is unity in their analyses. But our approach is
applicable even to large blood vessels and moderate Reynolds numbers. One of the most
remarkable merits of the present perturbation scheme is that it is very suitable to any
mathematical model of blood flow in tubes with uniform and nonuniform cross sections
compared to the models developed by others (Chaturani and Ponnalagarsamy, 1986;
Aroesty and Gross, 1972a, 1972b; Sankar and Hemalatha, 2006, 2007; Sankar, 2011).
The theoretically computed velocity profiles were compared with the experimental
data, and it was observed that blood behaves like a Herschel-Bulkley fluid rather
than a Power-law or Bingham fluid. The increase in the taper angle (φ) leads to a
decrease in the flow rate, wall shear stress, and resistance to flow. It is evident that
for abnormal hearts, an increase in shear stress on the blood vessel could be very
dangerous, as it can result in paralysis or ultimate death. The resistance to flow is
one of the physiologically important flow variables to be investigated because it indicates
whether the required amount of blood supply to vital organs is ensured
(Ponalagusamy, 2007, 2012; Chaturani and Ponnalagarsamy, 1984).
It is well established that hemodynamic factors (such as wall shear stress and flow
resistance) play a key role in the development and progression of arterial diseases
(Fry, 1973). Caro et al. (1971) experimentally demonstrated that during the initial
stage of arterial disease, there may be an important intercorrelation between atherogenesis
and detailed characteristics of blood flow through the damaged, diseased, or
otherwise affected artery. Keeping in view the importance of hemodynamic and
rheologic factors in the understanding of blood flow and arteriosclerotic diseases,
it may be said that the important results obtained in this analysis could be helpful to
acquire knowledge regarding the characteristics of blood flow. Hence, the present
investigation could be useful for analyzing the blood flow through a tube of nonuniform
cross section, which in turn could lead to the development of new diagnostic
tools for the effective treatment of patients suffering from cancer, hypertension,
myocardial infarction, stroke, and paralysis.
Zamir (2000) pointed out that the oscillatory nature of pulsatile blood flow
prompts other forces, apart from the driving and retarding forces present in steady
flow, and other variables, and that heat and mass transport through the endothelial cells
lying in the inner layer of the vessel wall is very much altered when the viscoelastic
properties of blood and its vessel wall are taken into account. When artery
walls are viscoelastic, a 10% variation in the artery radius over a cardiac cycle is
typically observed, and the shear stress at the wall is primarily affected by the radial
wall motion in comparison with that of rigid arteries. Bugliarello and Sevilla (1970)
and Bugliarello and Hayden (1963) have experimentally observed that there exists a
cell-free plasma layer near the wall when blood flows through arteries. It is well
understood that blood consists of a suspension of a variety of cells. Hookes et al.
(1972) pointed out that the microrotation and spinning velocity of blood cells
increase flow resistance and wall shear stress. In view of their experiments and
the aforementioned arguments, it is preferable, when developing a realistic mathematical
model of blood flow, to represent the flow through viscoelastic arteries by a
two-layered model instead of a single layer, and the rheology of blood as a micropolar
viscoelastic fluid. Hence, a modest effort
will be made to investigate the problem of blood flow by incorporating the factors
mentioned in this chapter (two or three factors at a time, since it is impossible to
consider all the factors simultaneously), and the numerical findings will be published
in the future.
REFERENCES
Aroesty, J., Gross, J.F., 1972a. The mechanics of pulsatile flow in small vessel-I, Casson Theory. Microvasc. Res. 4, 1–12.
Aroesty, J., Gross, J.F., 1972b. Pulsatile flow in small blood vessels-I, Casson Theory. Biorheology 9, 33–43.
Bugliarello, G., Hayden, J.W., 1963. Detailed characteristics of the flow of blood in vitro. J. Rheol. 7, 209–230.
Bugliarello, G., Sevilla, J., 1970. Velocity distribution and other characteristics of steady and pulsatile blood flow in fine glass tubes. Biorheology 17, 85–107.
Caro, C.G., Fitzgerald, J.M., Schroter, R.C., 1971. Atheroma and arterial wall: observation, correlation and proposal of a shear dependent mass transfer mechanism of atherogenesis. Proc. Roy. Soc. Lond. B 177, 109–159.
Chakravarthy, S., Mandal, P.K., 2000. Two dimensional blood flow through tapered arteries under stenotic conditions. Int. J. Non Linear Mech. 35, 779–793.
Chaturani, P., Ponnalagarsamy, R., 1983. Dilatency effects of blood on flow through arterial stenosis. In: Proceedings of the Twenty Eighth Congress of the Indian Society of Theoretical and Applied Mechanics. IIT Kharagpur, India, pp. 87–96.
Chaturani, P., Ponnalagarsamy, R., 1984. Analysis of pulsatile blood flow through stenosed arteries and its applications to cardiovascular diseases. In: Proceedings of 13th National Conference on Fluid Mechanics and Fluid Power (FMFP-1984). REC, Tiruchirappalli, India, pp. 463–468.
Chaturani, P., Ponnalagarsamy, R., 1986. Pulsatile flow of Casson’s fluid through stenosed arteries with applications to blood flow. Biorheology 23, 499–511.
Dash, R.K., Jayaraman, G., Mehta, K.N., 1999. Flow in a catheterized curved artery with stenosis. J. Biomech. 32, 49–61.
Dwivedi, A.P., Pal, T.S., Rakesh, L., 1982. Micropolar fluid model for blood flow through a small tapered tube. Indian J. Techn. 20, 295–299.
Fry, D.L., 1973. Responses of the arterial wall to certain physical factors: in atherogenesis: initiating factors. Ciba Found. Symp. 12, 93–125.
Hookes, L.E., Nerem, R.M., Benson, T.J., 1972. A momentum integral solution for pulsatile flow in a rigid tube with and without longitudinal vibration. Int. J. Eng. Sci. 10, 989–1007.
How, T.V., Black, R.A., 1987. Pressure losses in non-Newtonian flow through rigid wall tapered tubes. Biorheology 24, 337–351.
Mandal, P.K., 2005. An unsteady analysis of non-Newtonian blood flow through tapered arteries with stenosis. Int. J. Non Linear Mech. 40, 151–164.
Oka, S., 1973. Pressure development in a non-Newtonian flow through a tapered tube. Biorheology 10, 207–212.
Oka, S., Murata, T., 1969. Theory of the steady slow motion of non-Newtonian fluids through a tapered tube. Jpn. J. Appl. Phys. 8, 5–8.
Ponalagusamy, R., 1986. Blood Flow Through Stenosed Tube. PhD thesis, IIT Bombay, India.
Ponalagusamy, R., 2007. Blood flow through an artery with mild stenosis: a two-layered model, different shapes of stenoses and slip velocity at the wall. J. Appl. Sci. 7, 1071–1077.
Ponalagusamy, R., 2012. Mathematical analysis on effect of non-Newtonian behavior of blood on optimal geometry of microvascular bifurcation system. J. Franklin Inst. 349, 2861–2874.
Ponnalagarsamy, R., Kawahara, M., 1989. A finite element analysis of unsteady flows of viscoelastic fluids through channels with non-uniform cross-sections. Int. J. Numer. Meth. Fluid. 9, 1487–1501.
Rohlf, K., Tenti, G., 2001. The role of the Womersley number in pulsatile blood flow: a theoretical study of the Casson model. J. Biomech. 34, 141–148.
Sacks, A.H., Raman, K.R., Burnell, J.A., Tickner, E.G., 1963. Auscultatory Versus Direct Pressure Measurements for Newtonian Fluids and for Blood in Simulated Arteries, VIDYA Report #119, Dec. 30.
Sankar, D.S., 2011. Two-phase non-linear model for blood flow in asymmetric and axisymmetric stenosed arteries. Int. J. Non Linear Mech. 46, 296–305.
Sankar, D.S., Hemalatha, K., 2006. Pulsatile flow of Herschel-Bulkley fluid through stenosed arteries – a mathematical model. Int. J. Non Linear Mech. 41, 979–990.
Sankar, D.S., Hemalatha, K., 2007. Pulsatile flow of Herschel-Bulkley fluid through catheterized arteries - a mathematical model. Appl. Math. Model. 31, 1497–1517.
Scott Blair, G.W., Spanner, D.C., 1974. An Introduction to Biorheology. Elsevier Scientific Publishing Company, Amsterdam, Oxford, pp. 1–163.
Womersley, J.R., 1955. Method for the calculation of velocity, rate of flow and viscous drag in the arteries when the pressure gradient is known. J. Physiol. 127, 553–562.
Zamir, A., 2000. The Physics of Pulsatile Flow. Springer-Verlag, New York.
CHAPTER 4
Hierarchical k-Means: A Hybrid Clustering Algorithm and Its Application
to Study Gene Expression in Lung Adenocarcinoma
Mohammad Shabbir Hasan and Zhong-Hui Duan
Department of Computer Science, College of Arts and Sciences, University of Akron,
Akron, USA
1 INTRODUCTION
Gene products such as proteins or RNA are created from the inheritable information
contained in a gene (Hunter and Holm, 1992). Traditional molecular biology
focuses on studying individual genes in isolation for determining gene functions.
However, it is not suitable for determining complex gene interactions or for
explaining the nature of complex biological processes due to the large number
of genes. For this purpose, examining the expression pattern of a large number
of genes in parallel is required (Michaels et al., 1998). With the advancement of
large-scale transcription profiling technology, DNA microarrays have become a
useful tool that allows the analysis of the gene expression pattern at the genome
level (Gresham et al., 2008). In genetic-mapping studies, DNA microarrays have
been widely used on polymorphisms between parental genotypes and have facilitated
the discovery of gene expression markers (Gresham et al., 2008; Wang et al.,
2009). Given the importance of these data, efficient algorithms are necessary to analyze
DNA microarray data sets accurately (Hasan, 2013). Studies have shown that
genes with similar expression patterns are likely to have related
functions (Mount, 2004). Therefore, how to find the genes that share similar
expression patterns across samples is an important question that is frequently asked
in DNA microarray studies (Qin et al., 2014).
Clustering, which is a useful technique to constitute unknown groupings of
objects (Kaufman and Rousseeuw, 2009), has become an important part of gene
expression data analysis (Qin et al., 2014; Eisen et al., 1998). By investigating
the clusters of genes having similar expression patterns across samples, researchers
Emerging Trends in Computational Biology, Bioinformatics, and Systems Biology
© 2015 Elsevier Inc. All rights reserved.
can elucidate gene functions, genetic pathways, and regulatory circuits. Clustering
helps to find a distinct pattern for each cluster, as well as more information about
functional similarities and gene interactions within the cluster (Hasan and Duan,
2014). For clustering DNA microarray data, a good number of algorithms have
been developed, including k-means (Tavazoie et al., 1999), hierarchical clustering
(Eisen et al., 1998; Luo et al., 2003; Wen et al., 1998), self-organizing maps
(Tamayo et al., 1999; Törönen et al., 1999; He et al., 2003), support vector
machines (Brown et al., 2000), Bayesian networks (Friedman et al., 2000), and a
fuzzy logic approach (Woolf and Wang, 2000). In addition to these algorithms,
there are others that use genomic information, along with gene expression data,
to improve clustering efficiency. Algorithms that fall into this category include
an ontology-driven clustering algorithm (Wang et al., 2005) and the ones that
use information about TS2 upstream regions of the coding sequences and gene
expression profiles to get more biologically relevant clusters (Holmes and
Bruno, 2000; Barash and Friedman, 2002; Kasturi et al., 2003).
Among the existing clustering algorithms, k-means and hierarchical clustering
algorithms are the most commonly used. k-means is computationally faster than
hierarchical clustering and produces tighter clusters.
On the other hand, the hierarchical clustering algorithm computes a
complete hierarchy of clusters and hence is more informative than k-means.
Despite these advantages, both algorithms suffer from some limitations.
The performance of k-means clustering depends on how effectively the initial number
of clusters (i.e., the value of k) is determined, and the advantage of hierarchical
clustering comes at the cost of low efficiency. Moreover, the computational
expense of both algorithms impedes their wide use in
gene expression data analysis (Garai and Chaudhuri, 2004; Ushizawa et al.,
2004; Bolshakova et al., 2005). As a solution to this problem, a combined approach
was proposed by Chen et al. (2005), who first applied the k-means algorithm to
determine the k clusters and then fed these clusters into the hierarchical clustering
technique to shorten the cluster-merging time and generate a treelike dendrogram.
However, this solution still suffers from the limitation of determining the initial
value for k (Hasan, 2013; Hasan and Duan, 2014).
In this chapter, we propose a new algorithm, hierarchical k-means, that combines
the advantages of both k-means and the hierarchical clustering algorithm
to overcome their limitations. Combining different algorithms to overcome their
individual limitations and produce better results is a popular approach in research
(Che et al., 2011, 2012; Hasan et al., 2012). In the proposed algorithm, we first
applied the hierarchical clustering algorithm, then used the result to decide
the initial number of clusters, and fed this information into k-means clustering to
obtain the final clusters. Since similar gene expression profiles indicate similarity
in gene functionalities (Azuaje and Dopazo, 2005), after applying the
proposed algorithm to the microarray data set of lung adenocarcinoma, we used gene
ontology (GO) annotations to explore the change in the enrichment of molecular
functionalities of the genes of each cluster for normal tissue and KRAS-positive
tissues. Our results showed that in each cluster, genes were grouped together based
on their expression pattern and molecular functions, which indicates the correctness
of the proposed algorithm.
2 METHODS
k-means clustering algorithm: For clustering genes, k-means clustering, a well-known
method for cluster analysis, partitions the expression levels of n genes into k clusters
so that the total distance between each cluster’s genes and its corresponding centroid,
the representative of the cluster, is minimized. In short, the goal is to partition the n
genes into k sets $S_i$, $i = 1, 2, \ldots, k$, in order to minimize the within-cluster sum of
squares (WCSS), defined as

\[ \mathrm{WCSS} = \sum_{j=1}^{k} \sum_{i=1}^{n} \lVert x_i^{j} - c_j \rVert^{2}, \tag{4.1} \]

where $\lVert x_i^{j} - c_j \rVert^{2}$ provides the squared distance between a gene and the cluster’s centroid.
In this clustering algorithm, the initial cluster centroids are selected randomly.
After that, each gene is assigned to the closest cluster centroid. Then each cluster
centroid is moved to the mean of the points assigned to it. This algorithm converges
when the assignments no longer change. Algorithm 4.1 shows the pseudocode of the
k-means clustering algorithm.
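The steps described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' Algorithm 4.1: it assumes squared Euclidean distance, a fixed random seed for reproducibility, and that an empty cluster simply keeps its previous centroid. The names `kmeans`, `squared_distance`, and `mean_vector` are hypothetical and chosen only for this example.

```python
import random


def squared_distance(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    d = len(points[0])
    return [sum(p[t] for p in points) / len(points) for t in range(d)]


def kmeans(genes, k, seed=0, max_iter=100):
    """Lloyd-style k-means; returns (labels, centroids, wcss).

    genes: list of expression profiles (lists of floats).
    seed and max_iter are assumptions of this sketch, not part of the text.
    """
    rng = random.Random(seed)
    # Initial centroids are k genes picked at random.
    centroids = [list(g) for g in rng.sample(genes, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each gene goes to its closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(g, centroids[j]))
            for g in genes
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned genes.
        for j in range(k):
            members = [g for g, lab in zip(genes, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = mean_vector(members)
    # Within-cluster sum of squares, in the spirit of Eq. (4.1).
    wcss = sum(squared_distance(g, centroids[lab]) for g, lab in zip(genes, labels))
    return labels, centroids, wcss
```

Calling `kmeans(profiles, 3)` on a list of expression profiles would return one cluster label per gene, the three centroids, and the WCSS of the final partition. Because the result depends on the random initial centroids, running the sketch with several seeds and keeping the partition with the lowest WCSS is a common precaution.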
Hierarchical clustering algorithm: In gene clustering, hierarchical clustering is
a method of cluster analysis that builds a hierarchy of clusters (as its name indicates).
This clustering method organizes genes into tree structures based on their relation.
The basic idea is to assemble a set of genes into a tree, where genes are joined by very
short branches if they have very great similarity to each other, and by increasingly
long branches as their similarity decreases.
The approaches for hierarchical clustering can be classified into two groups:
agglomerative and divisive. The agglomerative approach is a “bottom-up” approach,
where each gene starts in its own cluster and pairs of clusters are merged as one
moves up the hierarchy. On the other hand, the divisive approach is a “top-down”
approach, where all genes start in one cluster and splits are performed recursively
as one moves down the hierarchy. In this chapter, we mainly focus on the agglomerative
approach for hierarchical clustering.
The first step in hierarchical clustering is to calculate the distance matrix between
the genes in the data set. The clustering starts once this matrix of distances is computed.
The agglomerative hierarchical clustering technique consists of repeated
cycles where the two closest genes having the smallest distance are joined by a node
known as a pseudonode. The two joined genes are removed from the list of genes
being processed and replaced by the pseudonode that represents the new branch.
The distances between this pseudonode and all other remaining genes are computed,
ALGORITHM 4.1
k-means
and the process is repeated until only one node remains. Note that there are a variety
of ways to compute distances while dealing with a pseudonode: centroid linkage,
single linkage, complete linkage, and average linkage. In this chapter, we use average
linkage, which defines the distance between two clusters as the average pairwise
distance between genes in clusters $C_i$ and $C_j$, calculated using Eq. (4.2):

\[ \delta(C_i, C_j) = \frac{\sum_{x \in C_i} \sum_{y \in C_j} \delta(x, y)}{n_i \, n_j}, \tag{4.2} \]

where $n_i$ and $n_j$ are the numbers of genes in $C_i$ and $C_j$, and $\delta(x, y)$ is typically given by
the Euclidean distance over the $d$ components of the two profiles, calculated using Eq. (4.3):

\[ \delta(x, y) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^{2}}. \tag{4.3} \]
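As a small worked illustration of Eqs. (4.2) and (4.3), using made-up one-dimensional values rather than data from this study, take $C_i = \{0, 1\}$ and $C_j = \{5\}$, so that $n_i = 2$, $n_j = 1$, and $d = 1$:

\[
\delta(0, 5) = \sqrt{(0 - 5)^{2}} = 5, \qquad
\delta(1, 5) = \sqrt{(1 - 5)^{2}} = 4, \qquad
\delta(C_i, C_j) = \frac{5 + 4}{2 \cdot 1} = 4.5 .
\]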
The pseudocode of agglomerative hierarchical clustering using average linkage is
illustrated in Algorithm 4.2.
ALGORITHM 4.2
Hierarchical Clustering
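The agglomerative procedure with average linkage can be sketched in Python as follows. This is a minimal, unoptimized illustration under stated assumptions, not the authors' Algorithm 4.2: it recomputes every average-linkage distance directly from Eq. (4.2) in each cycle instead of maintaining a distance matrix, and the names `euclidean`, `average_linkage`, and `agglomerative` are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two vectors, as in Eq. (4.3)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def average_linkage(ci, cj, genes, dist):
    """Average pairwise distance between two clusters of gene indices, Eq. (4.2)."""
    total = sum(dist(genes[a], genes[b]) for a in ci for b in cj)
    return total / (len(ci) * len(cj))


def agglomerative(genes, dist=euclidean):
    """Merge the two closest clusters until a single one remains.

    Returns the merges in order as (members_a, members_b, height) tuples,
    where members are lists of gene indices and height is the linkage distance.
    """
    clusters = [[i] for i in range(len(genes))]  # every gene starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest average-linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                h = average_linkage(clusters[a], clusters[b], genes, dist)
                if best is None or h < best[0]:
                    best = (h, a, b)
        h, a, b = best
        merges.append((clusters[a], clusters[b], h))
        merged = clusters[a] + clusters[b]  # plays the role of the pseudonode
        # Remove the two joined clusters and add the merged one.
        clusters = [c for t, c in enumerate(clusters) if t not in (a, b)]
        clusters.append(merged)
    return merges
```

The sequence of merge heights returned by `agglomerative` describes the hierarchy from the leaves to the root. A production implementation would normally cache and update the distance matrix rather than recompute it, since the direct approach shown here becomes slow for thousands of genes.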
Hierarchical k-means: In this proposed algorithm, we selected the value of k
(i.e., the number of clusters) in a systematic way. Initially, we used the agglomerative
hierarchical clustering algorithm with average linkage to cluster the data set and
then checked at what level the distance between two consecutive nodes of the hierarchy
was the maximum. Using this information, we determined the value of k, which
was then fed into the k-means clustering algorithm to produce the final clusters. In both
algorithms, the Pearson correlation coefficient (r) was used as the similarity metric
between two samples and 1 − r was used as the distance metric. Algorithm 4.3 shows
the pseudocode of the proposed algorithm.
ALGORITHM 4.3
Hierarchical k-means Clustering
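One possible reading of the selection of k described above is sketched below in Python. It rests on two assumptions that the text does not spell out: that the "distance between two consecutive nodes of the hierarchy" is the jump between successive merge heights of the average-linkage tree, and that the tree is cut at the largest such jump. The names `pearson_distance` and `choose_k` are hypothetical, and the sketch is not the authors' Algorithm 4.3.

```python
import math


def pearson_distance(x, y):
    """Distance 1 - r, where r is the Pearson correlation of two profiles.

    Undefined (division by zero) if either profile is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def choose_k(merge_heights):
    """Pick k from the largest jump between consecutive merge heights.

    merge_heights: heights of the n - 1 merges of an agglomerative
    clustering of n genes, in merge order (at least two merges needed).
    Assumption of this sketch: cutting the tree inside the largest jump
    leaves the clusters that existed just before the higher merge.
    """
    n = len(merge_heights) + 1
    gaps = [
        merge_heights[t + 1] - merge_heights[t]
        for t in range(len(merge_heights) - 1)
    ]
    t = max(range(len(gaps)), key=gaps.__getitem__)  # index of the largest jump
    # After t + 1 merges, n - (t + 1) clusters remain.
    return n - (t + 1)
```

Under these assumptions, the merge heights would come from an average-linkage clustering that uses `pearson_distance` as its distance function, and the value returned by `choose_k` would then be passed as k to a k-means routine that uses the same 1 − r distance.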