Please use this identifier to cite or link to this item: http://hdl.handle.net/1822/52866

Full metadata record
DC Field | Value | Language
dc.contributor.author | Maia, Francisco | por
dc.contributor.author | Paulo, João | por
dc.contributor.author | Coelho, Fábio | por
dc.contributor.author | Neves, Francisco | por
dc.contributor.author | Pereira, José | por
dc.contributor.author | Oliveira, Rui Carlos Mendes de | por
dc.date.accessioned | 2018-03-19T20:59:31Z | -
dc.date.issued | 2017 | -
dc.identifier.isbn | 9783319596648 | por
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/1822/52866 | -
dc.description.abstract | With the increasing number of connected devices, it becomes essential to find novel data management solutions that can leverage their computational and storage capabilities. However, developing very large scale data management systems requires tackling a number of interesting distributed systems challenges, namely continuous failures and high levels of node churn. In this context, epidemic-based protocols proved suitable and effective and have been successfully used to build DataFlasks, an epidemic data store for massive scale systems. Ensuring resiliency in this data store comes with a significant cost in storage resources and network bandwidth consumption. Deduplication has proven to be an efficient technique to reduce both costs, but applying it to a large-scale distributed storage system is not a trivial task. In fact, achieving significant space savings without compromising the resiliency and decentralized design of these storage systems is a relevant research challenge. In this paper, we extend DataFlasks with deduplication to design DDFlasks. This system is evaluated in a real-world scenario using Wikipedia snapshots, and the results are twofold. We show that deduplication is able to decrease storage consumption by up to 63% and decrease network bandwidth consumption by up to 20%, while maintaining a fully decentralized and resilient design. | por
dc.description.sponsorship | The research leading to these results was part-funded by (1) Project TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER-000020, financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF); (2) the ERDF through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT Portuguese Foundation for Science and Technology as part of project UID/EEA/50014/2013; and (3) the European Union's Horizon 2020 - The EU Framework Programme for Research and Innovation 2014-2020, under grant agreement No. 732051. | por
dc.language.iso | eng | por
dc.publisher | Springer Verlag | por
dc.rights | restrictedAccess | por
dc.title | DDFlasks: Deduplicated very large scale data store | por
dc.type | conferencePaper | por
dc.peerreviewed | yes | por
oaire.citationStartPage | 51 | por
oaire.citationEndPage | 66 | por
oaire.citationVolume | 10320 | por
dc.date.updated | 2018-03-16T12:13:43Z | -
dc.identifier.doi | 10.1007/978-3-319-59665-5_4 | por
dc.description.publicationversion | info:eu-repo/semantics/publishedVersion | por
dc.subject.wos | Science & Technology | por
sdum.export.identifier | 4549 | -
sdum.journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | por
sdum.conferencePublication | DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS, DAIS 2017 | por
Appears in Collections: HASLab - Artigos em atas de conferências internacionais (texto completo)

Files in This Item:
File | Description | Size | Format
P-00M-V91.pdf | Restricted access | 234.86 kB | Adobe PDF
