A study of rule extraction algorithms based on formal concept analysis (original title: Um estudo de algoritmos para extração de regras baseados em análise formal de conceitos)
This work presents a comparative analysis of techniques for extracting rules from databases through Formal Concept Analysis (FCA). The rules considered here are sets of dependencies among database attributes. Specifically, the dependencies are implications, functional dependencies, association rules and classification rules. These rules originate mainly in database theory, where they play a fundamental role: implications, association rules and classification rules support decision making, while functional dependencies support the normalization of logical models. FCA has a mathematical structure that is especially well suited to data analysis. The analysis is carried out through concept lattices, which represent data hierarchically. The objective of this work is therefore to analyze and compare methods that use FCA to discover dependencies among database attributes. Ten representative algorithms for extracting the four types of rules were analyzed. Four of them extract functional dependencies and implications: Next Closure, Find Implications, Impec and Aprem-IR. The remaining six extract association and classification rules. Four algorithms were analyzed for the extraction of association rules: AClose, Frequent Next Neighbours, Titanic and Galicia. Finally, two algorithms were analyzed for the extraction of classification rules: GRAND and Rulearner. The algorithms were implemented and run on real and synthetic databases. The databases were chosen according to two criteria: size (number of entries) and density. These criteria are intended to address a deficiency found in the literature in how databases are chosen for evaluating algorithms. It was observed that the algorithms behave characteristically on different databases.
This work suggests how well each algorithm suits databases of different densities and sizes.
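As a rough illustration of the notions named in the abstract (implications, association rule confidence, and database density), the sketch below works on a tiny formal context. The context, the attribute names and every function name are hypothetical and chosen only for illustration; this is not one of the ten algorithms analyzed in the thesis, and density is assumed here to mean the fraction of filled object-attribute cells.

```python
# Hypothetical toy context: each object maps to the set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"c"},
}
ATTRIBUTES = {"a", "b", "c"}


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return {obj for obj, has in CONTEXT.items() if attrs <= has}


def intent(objs):
    """Attributes shared by every object in objs."""
    shared = set(ATTRIBUTES)
    for obj in objs:
        shared &= CONTEXT[obj]
    return shared


def closure(attrs):
    """Closure of an attribute set: the intent of its extent."""
    return intent(extent(attrs))


def implication_holds(premise, conclusion):
    """premise -> conclusion holds if every object with premise has conclusion."""
    return conclusion <= closure(premise)


def confidence(premise, conclusion):
    """Share of objects with premise that also have conclusion.

    Assumes at least one object has the premise (otherwise division by zero).
    """
    return len(extent(premise | conclusion)) / len(extent(premise))


def density():
    """Assumed definition: filled cells divided by all object-attribute cells."""
    filled = sum(len(has) for has in CONTEXT.values())
    return filled / (len(CONTEXT) * len(ATTRIBUTES))
```

In this toy context, every object with attribute a also has b, so the implication a -> b holds, while c -> a holds only for some of the objects with c and can at best be an association rule with confidence below 1.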
Advisor: Newton Jose Vieira; Luiz Enrique Zarate; Rodolfo Sergio F de Resende
School: Universidade Federal de Minas Gerais
Source Type: Master's Thesis
Keywords: computing (theses); databases (theses); symbolic and mathematical logic (theses); computing, mathematics (theses); information retrieval systems; mining, computing (theses); lattice theory
Date of Publication: 02/16/2007