Art. 06 – Vol. 25 – No. 3 – 2015

Knowledge Discovery in Databases: Predictive Methods

Cornel LEPĂDATU
cornel_lepadatu@biblacad.ro

Library of the Romanian Academy – Bucharest

Abstract: The main objective of predictive methods is to find optimal models using various modeling techniques: classical (multiple regression, discriminant analysis), less classical (classification and regression trees), or machine learning (neural networks, ensemble methods, support vector machines). The article gives a uniform, synthetic presentation of the supervised learning methods most commonly used for knowledge discovery from (very) large amounts of data (Big data, KDD) for decision support in various fields of application. For each method, a number of specific issues essential for a data prospector are highlighted, as appropriate: the fields of application, the significance of the coefficients, the discriminating power of the characteristics, the methods for variable selection, the fit of the model to the observed data, performance measurement, the separation of the model estimation error from the prediction error, control of over-learning (overfitting), the characterization and interpretation of results, and computational performance.

Keywords: Big data, Classification, Knowledge Discovery in Databases (KDD), Modeling, Prediction, Statistical learning.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.