Sécurisation des entrepôts de données : de la conception à l’exploitation

Les entrepôts des données centralisent des données critiques et sensibles qui sont nécessaires pour l'analyse et la prise de décisions. La centralisation permet une gestion efficace et une administration aisée, par contre de point de vu sécurité, centraliser les données critiques dans un seul...

Full description

Bibliographic Details
Main Author: Triki, Salah
Other Authors: Lyon 2
Language:fr
Published: 2013
Subjects:
Online Access:http://www.theses.fr/2013LYO22026
Description
Summary:Les entrepôts des données centralisent des données critiques et sensibles qui sont nécessaires pour l'analyse et la prise de décisions. La centralisation permet une gestion efficace et une administration aisée, par contre de point de vu sécurité, centraliser les données critiques dans un seul endroit ; l'entrepôt de données, attire la convoitise des pirates. En 2011 par exemple, les entreprises SONY et RSA, ont été victime d'attaques qui ont engendré des pertes considérables. En plus, les entreprises sont de plus en plus dépendantes des entrepôts des données du faite qu'ils génèrent de plus en plus de données. Le cabinet d'analyse IDC indique que les quantités des données générées par les entreprise sont en train d'exploser et que en 2015, la quantité des données atteindra 8 billion TB. La sécurisation des entrepôts de données est donc primordiale. Dans ce contexte, nos travaux de thèse consiste a proposer une architecture pour la sécurisation des entrepôts de données de la conception à l'exploitation. Au niveau conceptuel, nous proposons un profil UML pour la définition des autorisations et les niveaux de sensibilités des données, une méthode pour la prévention des inférences, et des règles pour analyser la cohérence des autorisations. Au niveau exploitation, une méthode pour renforcer les autorisations définis au niveau conception, une méthode pour la prévention des inférences, une méthode pour respecter les contraintes d'additivités.Afin de valider l'architecture que nous proposons et montrer son applicabilité, nous l'avons tester le benchmark Star Schema Benchmark. === Companies have to make strategic decisions that involve competitive advantages. In the context of decision making, the data warehouse concept has emerged in the nineties. A data warehouse is a special kind of database that consolidates and historizes data from the operational information system of a company. Moreover, a company's data are proprietary and sensitive and should not be sold without controls. Indeed, some data are personal and may harm their owners when they are disclosed, for example, medical data, religious or ideological beliefs. Thus, many governments have enacted laws to protect the private lives of their citizens. Faced with these laws, organizations are, therefore, forced to implement strict security measures to comply with these laws. Our work takes place in the context of secure data warehouses that can be addressed at two levels: (i) design that aims to develop a secure data storage level, and (ii) operating level, which aims to strengthen the rights access / user entitlements, and any malicious data to infer prohibited from data it has access to user banned. For securing the design level, we have made three contributions. The first contribution is a specification language for secure storage. This language is a UML profile called SECDW+, which is an extended version of SECDW for consideration of conflicts of interest in design level. SECDW is a UML profile for specifying some concepts of security in a data warehouse by adopting the standard models of RBAC security and MAC. Although SECDW allows the designer to specify what role has access to any part of the data warehouse, it does not take into account conflicts of interest. Thus, through stereotypes and tagged values , we extended SECDW to allow the definition of conflict of interest for the various elements of a multidimensional model. Our second contribution, at this level, is an approach to detect potential inferences from conception. Our approach is based on the class diagram of the power sources to detect inferences conceptual level. Note that prevention inferences at this level reduces the cost of administering the OLAP server used to manage access to a data warehouse. Finally, our third contribution to the design of a secure warehouse consists of rules for analyzing the consistency of authorizations modeled. As for safety operating level, we proposed: an architecture for enhancing the permissions for configuration, a method for the prevention of inferences, and a method to meet the constraints of additive measures. The proposed architecture adds to system access control, typically present in any secure DBMS, a module to prevent inferences. This takes our security methods against inferences and respect for additivity constraints. Our method of preventing inferences operates for both types of inferences: precise and partial. For accurate inferences, our method is based on Bayesian networks. It builds Bayesian networks corresponding to user queries using the MAX and MIN functions, and prohibits those that are likely to generate inferences. We proposed a set of definitions to translate the result of a query in Bayesian networks. Based on these definitions, we have developed algorithms for constructing Bayesian networks to prohibit those that are likely to generate inferences. In addition, to provide a reasonable response time needed to deal with the prevention treatment, we proposed a technique for predicting potential applications to prohibit. The technique is based on the frequency of inheritance queries to determine the most common query that could follow a request being processed. In addition to specific inferences (performed through queries using the MIN and MAX functions), our method is also facing partial inferences made through queries using the SUM function. Inspired by statistical techniques, our method relies on the distribution of data in the warehouse to decide to prohibit or allow the execution of queries ....