Document Clustering using Self-Organizing Maps

dc.contributor.author: Rafi, Muhammad
dc.contributor.author: Waqar, Muhammad
dc.contributor.author: Ajaz, Hareem
dc.contributor.author: Ayub, Umar
dc.contributor.author: Danish, Muhammad
dc.coverage.issue: 1 [cs]
dc.coverage.volume: 23 [cs]
dc.date.accessioned: 2019-06-26T10:18:09Z
dc.date.available: 2019-06-26T10:18:09Z
dc.date.issued: 2017-06-01 [cs]
dc.description.abstract: Cluster analysis of textual documents is a common technique for better filtering, navigation, understanding and comprehension of large document collections. Document clustering is an autonomous method that separates a large heterogeneous document collection into smaller, more homogeneous sub-collections called clusters. A self-organizing map (SOM) is a type of artificial neural network (ANN) that can be used to perform autonomous self-organization of a high-dimensional feature space into low-dimensional projections called maps. It is considered a good method for clustering, as both require unsupervised processing. In this paper, we propose a multi-layer, multi-feature SOM to cluster documents. The paper implements a SOM using four layers, with lexical terms, phrases and sequences in the bottom layers respectively and all of them combined at the top layer. The documents are processed to extract these features to feed the SOM. The internal weights and interconnections between the features (neurons) of these layers settle automatically through iterations with a small learning rate to discover the actual clusters. We have performed an extensive set of experiments on standard text mining datasets such as NEWS20, Reuters and WebKB, with F-Measure and Purity as evaluation measures. The evaluation gives encouraging results and outperforms some of the existing approaches. We conclude that a SOM with multiple features (lexical terms, phrases and sequences) and multiple layers can be very effective in producing high-quality clusters on large document collections. [en]
dc.format: text [cs]
dc.format.extent: 111-118 [cs]
dc.format.mimetype: application/pdf [en]
dc.identifier.citation: Mendel. 2017 vol. 23, č. 1, s. 111-118. ISSN 1803-3814 [cs]
dc.identifier.doi: 10.13164/mendel.2017.1.111 [en]
dc.identifier.issn: 2571-3701
dc.identifier.issn: 1803-3814
dc.identifier.uri: http://hdl.handle.net/11012/179206
dc.language.iso: en [cs]
dc.publisher: Institute of Automation and Computer Science, Brno University of Technology [cs]
dc.relation.ispartof: Mendel [cs]
dc.relation.uri: https://mendel-journal.org/index.php/mendel/article/view/61 [cs]
dc.rights.access: openAccess [en]
dc.subject: Document Clustering [en]
dc.subject: Text Mining [en]
dc.subject: Neural Network [en]
dc.subject: Unsupervised Learning [en]
dc.subject: Self-Organizing Maps [en]
dc.subject: Layered Approach [en]
dc.title: Document Clustering using Self-Organizing Maps [en]
dc.type.driver: article [en]
dc.type.status: Peer-reviewed [en]
dc.type.version: publishedVersion [en]
eprints.affiliatedInstitution.faculty: Fakulta strojního inženýrství [cs]
Files
Original bundle
Name: 61-Article Text-111-1-10-20190219.pdf
Size: 3.07 MB
Format: Adobe Portable Document Format