Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/42615

Title: Processing Annotated TMX Parallel Corpora
Author(s): Brito, Rui Miguel Magalhães
Almeida, J. J.
Simões, Alberto
Keywords: Corpora paralelos
TMX
PLN
Parallel corpora
Annotated corpora
Date: Nov-2014
Citation: Brito, Rui, José João Almeida, and Alberto Simões. 2014. Processing annotated TMX parallel corpora. In IberSpeech 2014: VIII Jornadas en Tecnologías del Habla and IV Iberian SLTech Workshop, pp. 188-197, Las Palmas de Gran Canaria, Spain, November 2014.
Abstract: In recent years the number of freely available multilingual corpora has grown exponentially. Unfortunately, these corpora are made available in very diverse ways, ranging from simple text files or specific XML schemas to supposedly standard formats such as the XML Corpus Encoding Initiative, the Text Encoding Initiative, or even the Translation Memory Exchange format. In this document we defend the use of Translation Memory Exchange documents, but we enrich their structure to support annotating the documents with additional information such as lemmas, multi-words, or entities. To support the adoption of the proposed formats, we present a set of tools to manipulate the different formats in an agile way.
Type: Conference paper
URI: https://hdl.handle.net/1822/42615
ISBN: 978-84-617-2862-6
Publisher version: http://iberspeech2014.ulpgc.es/images/Iberspeech2014_OnlineProceedings.pdf
Peer reviewed: yes
Access: Open access
Appears in collections: CEHUM - Artigos em livros de atas

Files in this record:
File        Description    Size         Format
tmxa.pdf                   373.79 kB    Adobe PDF
