Redis for Big Data with Hadoop and ELK

Summary: Redis helps enterprises make sense of data by making database scaling more convenient and cost-effective.

We are already living in the era of Big Data. Big Data technologies and products touch every aspect of our lives, and from online banking to smart homes they have proven enormously useful in their respective use cases.

Redis, a high-performance key-value database, has become an essential element of Big Data applications. As a NoSQL database, Redis helps enterprises make sense of data by making database scaling more convenient and cost-effective. Cloud providers across the globe, including Alibaba Cloud, now offer a wide variety of Redis-related products for Big Data applications, such as Alibaba Cloud ApsaraDB for Redis.

This article introduces two methods of combining Redis with other Big Data technologies, specifically Hadoop and ELK.

Redis and Hadoop

Hadoop is a distributed computing platform that is prominent in the world of Big Data. With its high availability, scalability, fault tolerance, and low cost, it has become a standard for Big Data systems. However, Hadoop's HDFS storage system is poorly suited to serving end-user applications directly (such as using a user's browsing history to recommend news articles or products). The common practice is therefore to send the results of offline computation to user-facing storage systems such as Redis and HBase.

Even though it is not suited to serving end users directly, Hadoop is versatile in that it supports custom OutputFormat implementations. If you need customized output, all you have to do is extend OutputFormat with a Redis-specific implementation (a Redis OutputFormat) that maps the job's output records to Redis.
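At its core, a Redis-backed output format turns each (key, value) record of the job output into a Redis write command. The sketch below illustrates only that mapping. It is written in Python rather than against Hadoop's Java API, and it encodes records as SET commands in the Redis wire protocol (RESP). The function names and the tab-separated input layout are assumptions made for illustration, not part of the original setup.

```python
from typing import Iterable, Iterator, Tuple


def encode_command(*args: str) -> bytes:
    """Encode one Redis command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is sent as: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_record(line: str) -> Tuple[str, str]:
    """Split one 'key<TAB>value' text line (assumed job output layout)."""
    key, value = line.rstrip("\n").split("\t", 1)
    return key, value


def records_to_resp(lines: Iterable[str]) -> Iterator[bytes]:
    """Map every output record to a Redis SET command."""
    for line in lines:
        key, value = parse_record(line)
        yield encode_command("SET", key, value)
```

A byte stream produced this way could, for example, be fed to `redis-cli --pipe` for bulk loading. A Java OutputFormat would instead issue the equivalent commands through a Redis client inside its record writer.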


Of course, there are also rare situations where Redis is the input source; fortunately, Hadoop supports custom InputFormat implementations as well.


When you choose Redis, you can decide between the master-slave edition and the cluster edition according to the size of your result set.

Redis and ELK

ELK is a combination of three open-source tools: Elasticsearch, Logstash, and Kibana. It has found widespread use in log processing thanks to its flexible processing model, simple configuration, efficient search performance, and easy-to-use front-end interface.

The basic workflow is as follows:

  • A Logstash agent is deployed to each target machine, where it collects data according to its Logstash configuration and sends it to Elasticsearch.
  • Elasticsearch is responsible for storing and indexing the data received from the Logstash agents.
  • Kibana interacts directly with Elasticsearch and is responsible for visual log analysis.

However, if there are too many Logstash agents or too many indexes, pushing all data directly into Elasticsearch puts it under too much pressure. In that situation, a buffer is commonly placed between Logstash and Elasticsearch, and Redis is the typical choice. This is made easy by ELK's built-in support for Redis integration: the whole setup can be completed by changing a few settings.
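As a hedged sketch of what such settings might look like, the fragment below uses the Logstash `redis` output plugin on the collecting side and the `redis` input plugin on the indexing side, with a Redis list acting as the buffer. The log path, host names, list key `logstash`, and Elasticsearch URL are placeholders chosen for illustration, not values from the original setup.

```
# Agent on each target machine: collect logs and push them into a Redis list.
input {
  file {
    path => "/var/log/app/*.log"        # hypothetical log location
  }
}
output {
  redis {
    host      => ["redis.example.internal"]   # placeholder Redis host
    port      => 6379
    data_type => "list"
    key       => "logstash"             # placeholder list name
  }
}

# Separate indexer instance (its own config file): drain the list into Elasticsearch.
input {
  redis {
    host      => "redis.example.internal"
    port      => 6379
    data_type => "list"
    key       => "logstash"
  }
}
output {
  elasticsearch {
    hosts => ["http://es.example.internal:9200"]   # placeholder Elasticsearch URL
  }
}
```

With this layout the agents only ever talk to Redis, so a burst of log traffic accumulates in the list instead of hitting Elasticsearch directly, and the indexer drains it at its own pace.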


Concluding Remarks

Redis is now a major component of many Big Data applications. Its scalability and broad support across programming languages make it a favorable alternative to traditional relational database services. Alibaba Cloud ApsaraDB for Redis is a key-value database service that offers in-memory caching and high-speed access to applications hosted on the cloud. Try Alibaba Cloud ApsaraDB for Redis for free today with the $300 New User Free Credit.
