Please use this identifier to cite or link to this item: http://hdl.handle.net/1822/54569

Title: Applying clickstream data mining to real-time Web crawler detection and containment using ClickTips platform
Author(s): Lourenço, Anália Maria Garcia; Belo, Orlando
Issue date: 2007
Publisher: Springer-Verlag Berlin
Journal: Studies in Classification Data Analysis and Knowledge Organization
Abstract(s): The uncontrolled spread of Web crawlers has led to undesired situations of server overload and content misuse. Most programs still have legitimate and useful goals, but standard detection heuristics have not evolved along with Web crawling technology and are now unable to identify most of today's programs. In this paper, we propose an integrated approach to the problem that ensures the generation of up-to-date decision models, targeting both monitoring and clickstream differentiation. The ClickTips platform sustains Web crawler detection and containment mechanisms, and its data webhousing system is responsible for clickstream processing and further data mining. Web crawler detection and monitoring help preserve Web server performance and Web site privacy, and differentiated clickstream analysis provides focused reporting and interpretation of navigational patterns. The generation of up-to-date detection models is based on clickstream data mining and targets not only well-known Web crawlers, but also camouflaging and previously unknown programs. Experiments with different real-world Web sites are encouraging, indicating that the approach is not only feasible but also adequate.
Type: Conference paper
URI: http://hdl.handle.net/1822/54569
ISBN: 9783540709800
ISSN: 1431-8814
Peer-Reviewed: yes
Access: Restricted access (UMinho)
Appears in Collections: CAlg - Artigos em livros de atas/Papers in proceedings

Files in This Item:
2007-BC-GFKL-Lourenco&Belo-CRP.pdf (Restricted access), 174.7 kB, Adobe PDF
