Sensitivity of an Indonesian-Language Article Search System Using the n-gram and Tanimoto Cosine Methods


  • Candra Supriadi Universitas Kristen Satya Wacana
  • Hidriyanto Dwi Purnomo Universitas Kristen Satya Wacana
  • Irwan Sembiring Universitas Kristen Satya Wacana



Search system, Tanimoto cosine, n-gram


The human need for technology and the availability of adequate infrastructure show that technology has become a basic human need. As the number of journals and scientific papers grows, readers must be more selective in choosing and sorting them, even though many online service providers and journal portals already exist. Research on search engines, plagiarism detection, and recommendation systems has applied various methods considered suitable for improving system performance. This paper aims to calculate the similarity between one article and another by implementing n-gram and Tanimoto cosine. The test set consisted of forty-three article titles and abstracts, tested fifty times with randomly selected keywords. Each title and abstract was broken down into sequences of n characters (n = 2 to 8), including spaces and punctuation, and its similarity to the query or keyword used for system testing was then calculated. The tests were conducted using several threshold variations for n = 2 to 8. Across the fifty trials, at a threshold of 0.15 accuracy was highest at n = 4 (0.92), precision was highest at n = 3 (0.42), and recall was highest at n = 2 (0.44).
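
The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "Tanimoto cosine" refers to the Tanimoto coefficient on n-gram count vectors, T(a, b) = a·b / (|a|² + |b|² − a·b), and that text is lowercased before splitting; the abstract confirms neither. The names char_ngrams, tanimoto, and search are hypothetical and chosen only for illustration.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count overlapping character n-grams, keeping spaces and punctuation.

    Lowercasing is an assumption; the abstract does not mention it.
    """
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def tanimoto(a: Counter, b: Counter) -> float:
    """Tanimoto coefficient of two n-gram count vectors (assumed formula)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sum(count * count for count in a.values())
    norm_b = sum(count * count for count in b.values())
    denom = norm_a + norm_b - dot
    return dot / denom if denom else 0.0


def search(query: str, documents: list[str], n: int = 4,
           threshold: float = 0.15) -> list[tuple[float, str]]:
    """Return (score, document) pairs scoring at or above the threshold."""
    query_grams = char_ngrams(query, n)
    scored = [(tanimoto(query_grams, char_ngrams(doc, n)), doc)
              for doc in documents]
    # Highest score first.
    return sorted((pair for pair in scored if pair[0] >= threshold),
                  reverse=True)
```

Under these assumptions, a call such as search("n-gram", titles_and_abstracts, n=4, threshold=0.15) would return the matching texts ordered by similarity, where titles_and_abstracts stands for a list of strings. Accuracy, precision, and recall could then be computed by comparing the returned set with a manually labeled set of relevant articles for each query.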

