Hadoop Flume was created in the course of incubator Apache project to allow you to flow data from a source into your Hadoop environment. In Flume, the entities you work with are called sources, decorators, and sinks. A source can be any data source, and Flume has many predefined source adapters. A sink is the target of a specific operation (and in Flume, among other paradigms that use this term, the sink of one operation can be the source for the next downstream operation). A decorator is an operation on the stream that can transform the stream in some manner, which could be to compress or uncompress data, modify data by adding or removing pieces of information, and more. Flume allows you a number of different configurations and topologies, allowing you to choose the right setup for your application. Flume is a distributed system which runs across multiple machines. It can collect large volumes of data from many applications and systems. It includes mechanisms for load balancing and failover, and it can be extended and customized in many ways. Flume is a scalable, reliable, configurable and extensible system for management the movement of large volumes of data.