Implementation of the Cosine Similarity Algorithm in the Document Archive System at Universitas Islam Sultan Agung
DOI:
https://doi.org/10.26623/transformatika.v17i2.1613
Keywords:
Cosine Similarity, Information Systems, precision, recall
Abstract
Poorly organized university archives cause problems: documents need to be structured and archived properly within a system to meet the standards of a good university. The ease of finding required archives is an important reason to develop an archive search system that facilitates and improves the search for archived documents. Applying the cosine similarity algorithm in an information system is one solution for a university organizing its archived documents. The results of this research show that the system can retrieve relevant documents from the database with a precision of 88.8% and a recall of 76.1% over all data in the database.
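To illustrate the approach described in the abstract, here is a minimal Python sketch of cosine-similarity search over archived documents, followed by a precision and recall calculation. It is not the authors' implementation. The whitespace tokenizer, the raw term-frequency weighting, the 0.1 threshold, and the toy document titles are all assumptions made for illustration; the abstract does not state which weighting or threshold the system uses.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real system would likely
    # add stemming and stop-word removal.
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)


def search(query, documents, threshold=0.1):
    """Return (doc_id, score) pairs scoring at least `threshold`, best first."""
    q = Counter(tokenize(query))
    scored = [
        (doc_id, cosine_similarity(q, Counter(tokenize(text))))
        for doc_id, text in documents.items()
    ]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision = TP / retrieved, recall = TP / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical archive entries, invented for this example only.
documents = {
    "d1": "student admission letter",
    "d2": "budget report 2016",
    "d3": "admission decree",
}
results = search("admission letter", documents)
retrieved_ids = [doc_id for doc_id, _ in results]
precision, recall = precision_recall(retrieved_ids, relevant={"d1"})
print(retrieved_ids, precision, recall)
```

In this toy run, two documents are retrieved but only one is marked relevant, so precision is 0.5 while recall is 1.0. The same two ratios, computed over a full test collection, are what the reported 88.8% and 76.1% measure.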
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
Transformatika is licensed under a Creative Commons Attribution 4.0 International License.