Sharapova E.V., Sharapov R.V. System of Fuzzy Duplicates Detection

Sharapova E.V., Sharapov R.V. System of Fuzzy Duplicates Detection // Applied Mechanics and Materials, 2014, Vols. 490-491, pp. 1503-1507.

This paper discusses the problem of fuzzy duplicate detection. We outline the basic approaches to detecting text duplicates and review existing methods of fuzzy duplicate detection. We present a fuzzy duplicate detection algorithm based on the shingle method and describe a modification of it: rather than considering the full text of a document, we propose to consider a processed and filtered copy. We also present the structure of a system for fuzzy duplicate detection. The system checks for text duplication both in an internal database and on the Internet.
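
The general idea of shingle-based comparison on a filtered copy of the text can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the shingle size of 3 words, the use of MD5 hashes, the Jaccard measure, and the function names `normalize`, `shingles`, and `similarity` are all assumptions made for illustration.

```python
import hashlib
import re

# Hypothetical stop-word list; the paper's actual filtering rules are not given here.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "is", "to"}


def normalize(text):
    """Build a filtered copy of the text: lowercase words, no punctuation, no stop words."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def shingles(words, size=3):
    """Return the set of hashed shingles (overlapping runs of `size` words)."""
    if not words:
        return set()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {hashlib.md5(" ".join(words).encode("utf-8")).hexdigest()}
    return {
        hashlib.md5(" ".join(words[i:i + size]).encode("utf-8")).hexdigest()
        for i in range(len(words) - size + 1)
    }


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0 (assumed measure)."""
    a = shingles(normalize(text_a), size)
    b = shingles(normalize(text_b), size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    original = "The quick brown fox jumps over the lazy dog"
    edited = "A quick brown fox jumps over the lazy cat"
    print(similarity(original, edited))
```

Because filtering happens before shingling, texts that differ only in case, punctuation, or stop words produce identical shingle sets, while a single changed word affects only the shingles that contain it. A real system would compare a document's shingles against those stored for every document in its database, typically with a similarity threshold chosen empirically.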
