Blog GLOSSARY

What Are Precision and Recall?

https://tensor-flow.com

Once a predictive model has been built, the most important question is: how good is it? Does it predict well? Evaluating the model is one of the most important tasks in a data science project, because it indicates how good the predictions are. Very often for classification problems we look at metrics …
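The excerpt is cut off before it names the metrics, so the sketch below relies on the standard definitions: precision is TP / (TP + FP) and recall is TP / (TP + FN), where TP, FP and FN count true positives, false positives and false negatives. The function name `precision_recall` and the sample labels are hypothetical and chosen only for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for one class, using the standard definitions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    # Guard against division by zero when nothing was predicted or nothing was positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these labels there are 3 true positives, 2 false positives and 1 false negative, so precision is 3/5 and recall is 3/4.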

What is TensorFlow?

TensorFlow is an open source software library for machine learning developed by the Google Brain team at Google. The name TensorFlow derives from the operations that neural networks perform on multidimensional data arrays, often referred to as “tensors”. It uses data flow graphs and is capable of building and training a variety of different …

What is Hadoop YARN?

Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines such as interactive SQL, real-time streaming, data science and batch processing to handle data stored on a single platform, unlocking an entirely new approach to analytics. YARN is the foundation of the new generation of Hadoop and is enabling organizations everywhere …

What is Hadoop Flume?

Hadoop Flume was created as an Apache Incubator project to let you flow data from a source into your Hadoop environment. In Flume, the entities you work with are called sources, decorators, and sinks. A source can be any data source, and Flume has many predefined source adapters. A sink is the …

What is Apache Kafka?

Apache Kafka is an open-source stream processing platform, written in Scala and Java and developed by the Apache Software Foundation. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a “massively scalable pub/sub message queue architected as a distributed transaction log, making it highly valuable …
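To make the "pub/sub message queue architected as a log" idea concrete, here is a toy, single-process sketch in plain Python. It is not Kafka's API and ignores distribution, partitions and durability; the class `TopicLog` and its methods `publish` and `poll` are hypothetical names. It only illustrates the core idea: producers append to an ordered log, and each consumer tracks its own read offset, so several consumers can read the same messages independently.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory append-only log with per-consumer offsets (not Kafka's API)."""

    def __init__(self):
        self._records = []                # ordered, append-only message log
        self._offsets = defaultdict(int)  # next offset to read, per consumer name

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return the next unread messages for this consumer and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = TopicLog()
log.publish("order-created")
log.publish("order-paid")

first_billing = log.poll("billing")           # billing reads both messages
first_audit = log.poll("audit", max_records=1)  # audit reads only the first one

log.publish("order-shipped")
second_billing = log.poll("billing")          # only the new message
second_audit = log.poll("audit")              # the two messages audit has not seen yet
```

Because reading never removes anything from the log, a slow consumer such as `audit` here can catch up later without affecting `billing`.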

What is Hadoop ZooKeeper?

Hadoop ZooKeeper is an open source Apache™ project that provides centralized infrastructure and services that enable synchronization across a cluster. ZooKeeper maintains common objects needed in large cluster environments. Examples of these objects include configuration information, the hierarchical naming space, etc. Applications can leverage these services to coordinate distributed processing across large clusters. Name services, …

What is Hadoop HBase?

Hadoop HBase is a column-oriented database management system that runs on top of HDFS. It is well suited for sparse data sets, which are common in many big data use cases. An HBase system comprises a set of tables. Each table contains rows and columns, much like a traditional database. Each table must have an …

What is Hadoop Sqoop?

Hadoop Sqoop efficiently transfers bulk data between Apache Hadoop and structured datastores such as relational databases. Sqoop helps offload certain tasks (such as ETL processing) from the enterprise data warehouse (EDW) to Hadoop for efficient execution at a much lower cost. Sqoop can also be used to extract data from Hadoop and export it into external structured datastores. …

What is Hadoop Hive?

Hadoop Hive is a runtime Hadoop support structure that allows anyone who is already fluent in SQL (which is commonplace for relational database developers) to leverage the Hadoop platform right out of the gate. Hive allows SQL developers to write Hive Query Language (HQL) statements that are similar to standard SQL statements. HQL is limited …

What is Hadoop Pig?

Hadoop Pig was initially developed at Yahoo to allow people using Hadoop to focus more on analyzing large datasets and spend less time writing mapper and reducer programs. This lets people do what they want to do instead of thinking about mapper and reducer tasks. The name Pig was given to the programming language …
