This project aims to detect data loss in large environments by distributing policies to the three types of components the project has:

- Host Agent
- Network Agent
- Server
# Project overview

An overview to give an idea of the whole system.

## Introduction
Baraeco is a DLP system that detects and tracks sensitive data across the organization's network. The main components are:
- host-agent: tracks files and data at kernel level. It should monitor/block USB sticks and other drives, the memory of programs, and file access, hooking the most important system calls.
- network-agent: sniffs network data packets to detect/block sensitive data and collaborates with the host-agents.
- server (or multiple servers): collects all the generated alerts and displays this information in a web interface.

Some important features are:
- Policies: policies must be distributed hierarchically, from servers to network-agents and finally to host-agents.
- Targets: the targets to monitor must be configured so that the policies are applied correctly.
- Entity definition: this document talks about sensitive data, but how is it represented in the system? The data is synthesized into fingerprints. A fingerprint can be a literal string, a regexp, or a more complex combination with algorithms to filter false positives. These fingerprints (or patterns) are associated with an Entity, and each of them has a relative value for that Entity. Entities also have an asset value and a threshold; these variables are the key to generating alerts and filtering false positives.

The Engine receives information from the data containers. These data containers can be files, apps (the memory of applications), protocols, or databases. All of them are analyzed with connectors to fetch the important data. The Engine searches for Entities and generates an alert if the score of an Entity is greater than its threshold value. The score can be calculated from different variables; one calculation is that the Entity score is the sum, over its patterns, of each pattern's value multiplied by the number of matches of that pattern. For a file with auto-generated fingerprints, the score is just the number of matches.
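
The scoring rule described above (sum of each pattern's value times its number of matches, compared against the Entity threshold) can be illustrated with a minimal Python sketch. The class names `Fingerprint` and `Entity`, their fields, and the sample values are hypothetical and chosen only for illustration; they are not Baraeco's actual API. How the asset value enters the calculation is not specified here, so the sketch stores it without using it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Fingerprint:
    """Hypothetical pattern with its relative value for the owning Entity."""
    pattern: str
    value: float
    is_regexp: bool = False

    def count_matches(self, data: str) -> int:
        # Literal strings are escaped so both kinds share the same regex path.
        expr = self.pattern if self.is_regexp else re.escape(self.pattern)
        return sum(1 for _ in re.finditer(expr, data))


@dataclass
class Entity:
    """Hypothetical Entity: a set of fingerprints plus the alerting variables."""
    name: str
    asset_value: float  # stored only; its role in scoring is an open assumption
    threshold: float
    fingerprints: list = field(default_factory=list)

    def score(self, data: str) -> float:
        # Sum of (pattern value * number of matches of that pattern).
        return sum(fp.value * fp.count_matches(data) for fp in self.fingerprints)

    def should_alert(self, data: str) -> bool:
        # An alert is generated only when the score exceeds the threshold.
        return self.score(data) > self.threshold


entity = Entity(
    name="credit-cards",
    asset_value=10.0,
    threshold=5.0,
    fingerprints=[
        Fingerprint("CONFIDENTIAL", 2.0),
        Fingerprint(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b", 4.0, is_regexp=True),
    ],
)

sample = "CONFIDENTIAL: card 1234-5678-9012-3456, CONFIDENTIAL copy"
print(entity.score(sample))         # 2.0 * 2 + 4.0 * 1 = 8.0
print(entity.should_alert(sample))  # True, since 8.0 > 5.0
```

Under this sketch, giving every fingerprint a value of 1 reduces the score to the plain number of matches, which is consistent with the auto-generated fingerprint case for files.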