Representations for multi-document event clustering

  • Authors: Wim De Smet, Marie-Francine Moens
  • Source: Data Mining and Knowledge Discovery, ISSN 1384-5810, Vol. 26, No. 3, 2013, pp. 533-558
  • Language: English
  • Abstract
    • We study several techniques for representing, fusing and comparing content representations of news documents. As underlying models we consider the vector space model (both in a term setting and in a latent semantic analysis setting) and probabilistic topic models based on latent Dirichlet allocation. Content terms can be classified as topical terms or named entities, yielding several models for content fusion and comparison. All methods used are completely unsupervised. We find that simple methods can still outperform the current state-of-the-art techniques.
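
To make the idea of fusing topical-term and named-entity representations concrete, here is a minimal sketch in the plain term-vector setting. It is not the authors' method: the document layout (`terms` and `entities` token lists), the function names, and the `entity_weight` parameter are all assumptions made for illustration, and the linear weighting is just one possible fusion scheme.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def fused_similarity(doc_a, doc_b, entity_weight=0.5):
    """Weighted sum of topical-term and named-entity cosine similarities.

    Each document is a dict with 'terms' and 'entities' token lists.
    This layout and the weighting scheme are hypothetical, chosen only
    to illustrate comparing two content representations separately
    and then fusing the scores.
    """
    topical = cosine(Counter(doc_a["terms"]), Counter(doc_b["terms"]))
    entity = cosine(Counter(doc_a["entities"]), Counter(doc_b["entities"]))
    return (1 - entity_weight) * topical + entity_weight * entity


doc1 = {"terms": ["earthquake", "rescue", "aid"], "entities": ["Haiti", "UN"]}
doc2 = {"terms": ["earthquake", "aid", "donors"], "entities": ["Haiti"]}
doc3 = {"terms": ["election", "ballot"], "entities": ["Belgium"]}

same_event = fused_similarity(doc1, doc2)
different_event = fused_similarity(doc1, doc3)
```

In this toy data, `doc1` and `doc2` share both terms and an entity, so `same_event` is larger than `different_event`, which is zero because the two documents have no tokens in common. A clustering step could then group documents by thresholding such pairwise scores.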

