Lecture Notes in Computer Science 6910
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Moshe Y. Vardi
Rice University, Houston, TX, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbruecken, Germany
Ngoc Thanh Nguyen (Ed.)
Transactions on
Computational
Collective Intelligence V
Volume Editor
Ngoc Thanh Nguyen
Wroclaw University of Technology
Wyb. Wyspianskiego 27
50-370 Wroclaw, Poland
E-mail: ngoc-thanh.nguyen@pwr.edu.pl
ISSN 0302-9743 (LNCS) e-ISSN 1611-3349 (LNCS)
ISSN 2190-9288 (TCCI)
ISBN 978-3-642-24015-7 e-ISBN 978-3-642-24016-4
DOI 10.1007/978-3-642-24016-4
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2011935943
CR Subject Classification (1998): I.2, C.2.4, I.2.11, H.3-5, D.2, I.5
© Springer-Verlag Berlin Heidelberg 2011
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Welcome to the fifth volume of Springer’s Transactions on Computational
Collective Intelligence (TCCI). It is the third issue in 2011 of this journal, which
is devoted to research in computer-based methods of computational collective
intelligence (CCI) and their applications in a wide range of fields such as group
decision making, knowledge integration, consensus computing, the Semantic Web,
social networks and multi-agent systems. TCCI strives to cover new computational,
methodological, theoretical and practical aspects of collective intelligence
understood as the form of intelligence that emerges from the collaboration and
competition of many individuals (artificial and/or natural).
This volume of TCCI includes ten interesting and original papers. The first
of them, entitled “Improved N-grams Approach for Web Page Language
Identification” by Ali Selamat, presents an improved N-grams approach for Web page
language identification, which is based on a combination of an original N-grams
approach and a modified N-grams approach that has been used for language
identification of Web documents. In the second paper with the title “Image-Edge
Detection Using Variation-Adaptive Ant Colony Optimization” the authors, Jing
Tian, Weiyu Yu, Li Chen, and Lihong Ma, present a novel image-edge detection
approach using ant colony optimization techniques, in which a pheromone matrix
representing edges at pixel positions of an image is built according to the
movements of a number of ants which are dispatched to move on the image. The next
paper, “An Iterative Process for Component-Based Software Development
Centered on Agents” by Yves Wautelet, Sodany Kiv, and Manuel Kolp, includes a
formalization of the process for component-based software development through
the use of the agent paradigm. In the fourth paper entitled “Cellular Gene
Expression Programming Classifier Learning” the authors, Joanna Jędrzejowicz and
Piotr Jędrzejowicz, present a method for integrating two collective computational
intelligence techniques: gene expression programming and cellular evolutionary
algorithms with a view to inducing expression trees. This paper also includes a
discussion of the validating experiment results confirming the high quality of the
proposed ensemble classifiers. The next paper, “A Situation-Aware
Computational Trust Model for Selecting Partners” by Joana Urbano, Ana Paula Rocha,
and Eugenio Oliveira, contains the description of a model for selecting partners
in a society, in which the authors focus on contextual fitness, a component of the
model that adds a contextual dimension to existing trust aggregation engines.
The sixth paper entitled “Using the Perseus System for Modelling Epistemic
Interactions” by Magdalena Kacprzak et al., includes a model for agent
knowledge acquisition, using a logical puzzle in which agents increase their
knowledge about the hats they wear and the software tool named Perseus. In the
seventh paper, “Reduction of Faulty Detected Shot Cuts and Cross-Dissolve
Effects in Video Segmentation Process of Different Categories of Digital Videos,”
the author, Kazimierz Choroś, presents a description of experiments confirming
the effectiveness of four methods of faulty video detection referring to five
different categories of movie: TV talk-show, documentary movie, animal video, action
and adventure, and pop music video. In the next paper, “Using
Knowledge-Integration Techniques for User Profile Adaptation Methods in Document
Retrieval Systems” by Bernadetta Mianowska and Ngoc Thanh Nguyen, a model
for integrating the archival knowledge included in a user profile with the new
knowledge delivered to an information retrieval system, detecting and proving its
properties, is presented. The ninth paper entitled “Modeling Agents and Agent
Systems,” by Theodor Lettmann et al., contains a universal and formal
description for agent systems that can be used as a core model with other existing
models as special cases. The authors show that owing to this core model a clear
specification of agent systems and their properties can be achieved. The last
paper, “Online News Event Extraction for Global Crisis Surveillance” by Jakub
Piskorski et al., presents a real-time and multilingual news event extraction
system developed at the Joint Research Centre of the European Commission. The
authors show that with this system it is possible to accurately and efficiently
extract violent and natural disaster events from online news.
TCCI is a peer-reviewed and authoritative reference dealing with the working
potential of CCI methodologies and applications as well as emerging issues of
interest to academics and practitioners. The research area of CCI has been growing
significantly in recent years and we are very thankful to everyone within the CCI
research community who has supported the TCCI and its affiliated events
including the International Conferences on Computational Collective Intelligence
(ICCCI). The first ICCCI event was held in Wroclaw, Poland, in October 2009.
ICCCI 2010 was held in Kaohsiung, Taiwan, in November 2010 and ICCCI 2011
took place in Gdynia, Poland, in September 2011. For ICCCI 2011 around 300
papers from 25 countries were submitted and only 105 papers were selected for
inclusion in the proceedings published by Springer in the LNCS/LNAI series. We
will invite authors of the ICCCI papers to extend them and submit them for
publication in TCCI.
We are very pleased that TCCI and the ICCCI conferences are strongly
cemented as high-quality platforms for presenting and exchanging the most
important and significant advances in CCI research and development. It is also our
pleasure to announce the creation of the new Technical Committee on
Computational Collective Intelligence within the Systems, Man and Cybernetics Society
(SMC) of IEEE.
We would like to thank all the authors for their contributions to TCCI. This
issue would not have been possible without the great efforts of the editorial board
and many anonymously acting reviewers. We would like to express our sincere
thanks to all of them. Finally, we would also like to express our gratitude to the
LNCS editorial staff of Springer, in particular Alfred Hofmann, Ursula Barth,
Peter Strasser and their team, who supported the TCCI journal.
July 2011 Ngoc Thanh Nguyen
Transactions on Computational Collective
Intelligence
This Springer journal focuses on research in applications of the computer-based
methods of computational collective intelligence (CCI) and their applications in
a wide range of fields such as the Semantic Web, social networks and multi-agent
systems. It aims to provide a forum for the presentation of scientific research and
technological achievements accomplished by the international community.
The topics addressed by this journal include all solutions of real-life problems
for which it is necessary to use computational collective intelligence technologies
to achieve effective results. The emphasis of the papers published is on novel and
original research and technological advancements. Special features on specific
topics are welcome.
Editor-in-Chief
Ngoc Thanh Nguyen Wroclaw University of Technology, Poland
Co-Editor-in-Chief
Ryszard Kowalczyk Swinburne University of Technology, Australia
Editorial Board
John Breslin National University of Ireland, Galway, Ireland
Shi-Kuo Chang University of Pittsburgh, USA
Longbing Cao University of Technology Sydney, Australia
Oscar Cordon European Centre for Soft Computing, Spain
Tzung-Pei Hong National University of Kaohsiung, Taiwan
Gordan Jezic University of Zagreb, Croatia
Piotr Jędrzejowicz Gdynia Maritime University, Poland
Kang-Huyn Jo University of Ulsan, Korea
Radosław Katarzyniak Wroclaw University of Technology, Poland
Jozef Korbicz University of Zielona Gora, Poland
Hoai An Le Thi Metz University, France
Pierre Lévy University of Ottawa, Canada
Tokuro Matsuo Yamagata University, Japan
Kazumi Nakamatsu University of Hyogo, Japan
Toyoaki Nishida Kyoto University, Japan
Manuel Núñez Universidad Complutense de Madrid, Spain
Julian Padget University of Bath, UK
Witold Pedrycz University of Alberta, Canada
Debbie Richards Macquarie University, Australia
Roman Słowiński Poznan University of Technology, Poland
Edward Szczerbicki University of Newcastle, Australia
Kristinn R. Thorisson Reykjavik University, Iceland
Gloria Phillips-Wren Loyola University Maryland, USA
Sławomir Zadrożny Institute of Research Systems, PAS, Poland
Table of Contents
Improved N-grams Approach for Web Page Language Identification .... 1
Ali Selamat
Image Edge Detection Using Variation-Adaptive Ant Colony
Optimization .................................................... 27
Jing Tian, Weiyu Yu, Li Chen, and Lihong Ma
An Iterative Process for Component-Based Software Development
Centered on Agents .............................................. 41
Yves Wautelet, Sodany Kiv, and Manuel Kolp
Cellular Gene Expression Programming Classifier Learning............ 66
Joanna Jędrzejowicz and Piotr Jędrzejowicz
A Situation-Aware Computational Trust Model for Selecting
Partners ........................................................ 84
Joana Urbano, Ana Paula Rocha, and Eugénio Oliveira
Using the Perseus System for Modelling Epistemic Interactions ........ 106
Magdalena Kacprzak, Piotr Kulicki, Robert Trypuz,
Katarzyna Budzynska, Paweł Garbacz, Marek Lechniak, and
Paweł Rembelski
Reduction of Faulty Detected Shot Cuts and Cross Dissolve Effects in
Video Segmentation Process of Different Categories of Digital Videos ... 124
Kazimierz Choroś
Using Knowledge Integration Techniques for User Profile Adaptation
Method in Document Retrieval Systems............................. 140
Bernadetta Mianowska and Ngoc Thanh Nguyen
Modeling Agents and Agent Systems ............................... 157
Theodor Lettmann, Michael Baumann, Markus Eberling, and
Thomas Kemmerich
Online News Event Extraction for Global Crisis Surveillance........... 182
Jakub Piskorski, Hristo Tanev, Martin Atkinson,
Eric van der Goot, and Vanni Zavarella
Author Index.................................................. 213
Improved N-grams Approach for Web Page
Language Identification
Ali Selamat
Software Engineering Research Group,
Faculty of Computer Science & Information Systems,
Universiti Teknologi Malaysia, UTM Johor Baharu Campus,
81310, Johor, Malaysia
aselamat@utm.my
Abstract. Language identification has been widely used for machine
translation and information retrieval. In this paper, an improved
N-grams (ING) approach is proposed for web page language identification.
The improved N-grams approach is based on a combination of the original
N-grams (ONG) approach and a modified N-grams (MNG) approach
that has been used for language identification of web documents. The
features selected from the improved N-grams approach are based on
N-grams frequency and N-grams position. The features selected from the
original N-grams approach are based on a distance measurement and
the features selected from the modified N-grams approach are based
on a Boolean matching rate for language identification of Roman and
Arabic script web pages. A large real-world document collection from the
British Broadcasting Corporation (BBC) website, which is composed of
1000 documents in each of the languages (e.g., Azeri, English, Indonesian,
Serbian, Somali, Spanish, Turkish, Vietnamese, Arabic, Persian,
Urdu, Pashto), has been used for evaluation. The precision, recall and
F1 measures have been used to determine the effectiveness of the
proposed improved N-grams (ING) approach. From the experiments, we
have found that the improved N-grams approach has been able to
improve the language identification of the contents in Roman and Arabic
script web page documents from the available datasets.
Keywords: Monolingual, multilingual, web page language identifica-
tion, N-grams approach.
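As a rough illustration of the two kinds of features named in the abstract, the following Python sketch builds a ranked character N-gram profile and compares a document against per-language profiles in two ways. It is only a sketch under stated assumptions: the abstract does not say which distance is used, so the rank-based "out-of-place" distance shown here is one common choice, and the Boolean matching rate is assumed to be the fraction of document N-grams that also occur in the language profile. All function names and parameters (ngram_profile, n_max, top_k, and so on) are hypothetical and not taken from the paper.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    """Return the top_k most frequent character n-grams (n = 1..n_max),
    most frequent first. Ties are broken alphabetically so the ranking
    is deterministic."""
    text = " ".join(text.lower().split())  # normalise case and whitespace
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place_distance(doc_profile, lang_profile):
    """Assumed distance measure: sum of rank differences between the
    document profile and a language profile. N-grams missing from the
    language profile receive a fixed maximum penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else max_penalty
        for r, gram in enumerate(doc_profile)
    )


def boolean_matching_rate(doc_profile, lang_profile):
    """Assumed Boolean matching rate: share of document n-grams that
    also appear in the language profile (ranks are ignored)."""
    if not doc_profile:
        return 0.0
    lang_set = set(lang_profile)
    return sum(gram in lang_set for gram in doc_profile) / len(doc_profile)


def identify(text, lang_profiles, measure="distance"):
    """Pick the language with the smallest distance, or with the
    highest matching rate when measure is not 'distance'."""
    doc = ngram_profile(text)
    if measure == "distance":
        return min(lang_profiles,
                   key=lambda lang: out_of_place_distance(doc, lang_profiles[lang]))
    return max(lang_profiles,
               key=lambda lang: boolean_matching_rate(doc, lang_profiles[lang]))
```

In use, one profile per language would be built from training text, for example `profiles = {"english": ngram_profile(english_text), "spanish": ngram_profile(spanish_text)}`, and `identify(page_text, profiles)` would return the best-matching key. The distance variant is sensitive to the position (rank) of each N-gram, whereas the matching rate only checks presence.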
1 Introduction
Language identification (LID) is the process of identifying the predefined
language that has been used to write various types of documents. In order to identify
the content of web documents, humans are the most accurate language identifiers.
Within seconds of reading a passage of text, humans can determine whether
it is a language they can understand. If it is a language that they are unfamiliar
with, they often can make subjective judgments as to its similarity to a language
that they already know. In this research, the term “language” is used to refer to
N.T. Nguyen (Ed.): Transactions on CCI V, LNCS 6910, pp. 1–26, 2011.
© Springer-Verlag Berlin Heidelberg 2011