№1, 2017

ANALYSIS OF METHODS FOR THE IDENTIFICATION AUTHORSHIP OF THE TEXT IN AZERBAIJANI LANGUAGE

Kamil R. Ayda-zade, Sakhavat G. Talibov

The paper analyzes the methods and algorithms used to recognize the authorship of texts. The recognition features are based on n-grams with n = 1 and n = 2. The results of computer experiments on identifying the authorship of texts in the Azerbaijani language are presented (pp. 14-23).

Keywords: identification, author identification, recognition, n-gram, support vector machine.
DOI: 10.25045/jpit.v08.i1.02
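
As an illustration of the n-gram features mentioned in the abstract, the following is a minimal sketch of how relative n-gram frequencies for n = 1 and n = 2 might be computed from a text. It assumes character-level n-grams, which the abstract does not specify; the function names `ngram_frequencies` and `feature_vector` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ngram_frequencies(text: str, n: int) -> dict[str, float]:
    """Relative frequencies of character n-grams in `text`.

    Assumption: n-grams are taken over characters; the paper may use
    a different unit (for example, words).
    """
    # Slide a window of width n over the text.
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


def feature_vector(text: str, vocabulary: list[str], n: int) -> list[float]:
    """Map a text to a fixed-length vector over a chosen n-gram vocabulary."""
    freqs = ngram_frequencies(text, n)
    # N-grams absent from the text get frequency 0.0.
    return [freqs.get(gram, 0.0) for gram in vocabulary]


sample = "abab"
unigrams = ngram_frequencies(sample, 1)  # n = 1
bigrams = ngram_frequencies(sample, 2)   # n = 2
```

Vectors of this kind, one per text, could then be passed to a classifier such as a support vector machine, with the author as the class label.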