Crux is an analytics application built using HBase.

Crux is developed in two branches, master and aggregation. The aggregation branch is the currently active branch, where enhanced querying and reporting capabilities are being added.

Crux has been tested against
  1. Cloudera's distribution CDH4
  2. Cloudera's distribution CDH3 - Hadoop 0.20.2-CDH3u5
  3. Apache HBase 0.92.1
  4. Apache HBase 0.90.3 on Apache Hadoop 0.20.2 with Hadoop append.

Crux features

  • Aggregation of HBase data - min, max etc
  • Functions like ceil, round, uppercase etc
  • Advanced querying and filtering
  • Drag and drop report designer
  • Web based front end
  • Support for various HBase versions
  • Support for various datatypes
  • and lots more. Give it a try.

Crux license

  • Crux license is Apache License v2

Why HBase ?

  • HBase provides low latency columnar storage for big data. HBase fits well with the Hadoop stack, using HDFS for storage and providing out of the box support for MapReduce. Data can be ingested into HBase from a traditional MapReduce application, Pig or Hive, Cascading, Flume, Scribe, Hiho or Sqoop. Data can also be imported using the HBase bulk loader.

Why Crux ?

  • Once you have collected your data in HBase, you need to expose it to the business and technical users of your organization. The size of the data and its unstructured format make it difficult to use a traditional reporting application with it. Crux uses native HBase integration to help you query your data. Crux has a web based report designer and viewer, which makes report creation and sharing easier. Crux comes with built in comparators for the long, short, int, double, float, string and boolean datatypes, and can create tables, graphs, scatter plots and other visuals for your data. Simple as well as composite row keys are supported by mapping Row Key Aliases. You can define filters on the row keys and perform get operations and range scans. Crux works with your schema and your data; there is no predefined schema that your data has to fit into.
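
To make the ideas of composite row keys and typed comparators concrete, here is a minimal sketch in plain Java. It is not Crux code and does not use the HBase API. The key layout (a 4-character symbol followed by an 8-byte long), the class name and all method names are assumptions made up for illustration. It shows how named parts can be read back out of one byte array, and why a typed comparator can give a different answer than the raw byte order in which HBase sorts row keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: a hypothetical composite row key made of a 4-character
// ASCII symbol followed by an 8-byte big-endian long. Not Crux's own code.
class CompositeRowKeySketch {
    static final int SYMBOL_LENGTH = 4;         // assumed fixed width of part 1
    static final int DATE_LENGTH = Long.BYTES;  // part 2 is a long (8 bytes)

    /** Concatenates the two parts into one row key. */
    static byte[] encode(String symbol, long date) {
        byte[] symbolBytes = symbol.getBytes(StandardCharsets.US_ASCII);
        if (symbolBytes.length != SYMBOL_LENGTH) {
            throw new IllegalArgumentException(
                    "symbol must be " + SYMBOL_LENGTH + " ASCII characters");
        }
        return ByteBuffer.allocate(SYMBOL_LENGTH + DATE_LENGTH)
                .put(symbolBytes)
                .putLong(date)
                .array();
    }

    /** Reads the part a hypothetical "symbol" alias would point to. */
    static String symbolOf(byte[] rowKey) {
        return new String(rowKey, 0, SYMBOL_LENGTH, StandardCharsets.US_ASCII);
    }

    /** Reads the part a hypothetical "date" alias would point to. */
    static long dateOf(byte[] rowKey) {
        return ByteBuffer.wrap(rowKey, SYMBOL_LENGTH, DATE_LENGTH).getLong();
    }

    /** Unsigned lexicographic comparison, the order in which HBase sorts row keys. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Typed comparison: decode both values as longs, then compare numerically. */
    static int compareAsLong(byte[] a, byte[] b) {
        return Long.compare(ByteBuffer.wrap(a).getLong(), ByteBuffer.wrap(b).getLong());
    }

    /** Encodes a single long as 8 big-endian bytes. */
    static byte[] longBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        byte[] key = encode("INFY", 20120131L);
        System.out.println(symbolOf(key) + " " + dateOf(key)); // prints "INFY 20120131"

        byte[] minusOne = longBytes(-1L);
        byte[] one = longBytes(1L);
        System.out.println(compareBytes(minusOne, one) > 0);  // true: raw bytes sort -1 after 1
        System.out.println(compareAsLong(minusOne, one) < 0); // true: numerically -1 < 1
    }
}
```

The last two lines of `main` are the reason a datatype-aware comparator is useful: the stored bytes of a negative long sort after those of a positive long, so a numeric filter cannot rely on byte order alone.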

Crux Design

  • Crux uses the HBase Java client API, which is a fully featured way to access HBase. There are other clients available for HBase, for example REST, Thrift and Avro. At the time of writing Crux, these clients did not expose the complete conditional querying capability that Crux needs. Then there are batch clients like MapReduce, the Hive handler, Pig and Cascading. These are great for performing batch analysis on HBase data. However, a reporting application needs faster response times than these batch clients can offer. Crux thus uses the HBase Java client API.

  • Crux also uses MySQL to store the mappings of HBase schemas, connections and reports. The front end is built using Ajax, Dojo and Struts, with Hibernate. Crux uses open source software and is released under the Apache License.

Crux Mailing List, Issue Reporting and Support

Crux Documentation and User Guide

  • Crux features, guides and news are available at http://nubetech.co/category/crux-2. Besides this, Crux has an extensive inbuilt guide on each page to help you create your reports. The mailing list is also a good source of information about Crux.

Using Crux

  • Using Crux, one can query HBase tables and create reports to analyze results.
  • To do this, there are a few simple steps.

Prerequisites:

  • A running HBase
  • A running MySQL instance
  • A servlet container like Tomcat.
  • Java installed. We used JDK 1.6
  • Maven

Once you have the prerequisites in place:

a. Create a database for Crux in MySQL

mysql> create database crux;
mysql> use crux;

Create the schema by running the crux/db/schema.sql file at the MySQL prompt:

mysql> source ${CRUX_HOME}/db/schema.sql

This creates the schema required for saving the report definitions.

b. Build Crux (see the instructions to build Crux with Maven), or download the tar appropriate for your HBase version from the GitHub downloads link.

c. Copy crux.jar to ${HBASE_HOME}/lib, or edit ${HBASE_HOME}/conf/hbase-env.sh and add the jar's location to the file.

For example,

# Extra Java CLASSPATH elements Optional

export HBASE_CLASSPATH=

export HBASE_CLASSPATH="/home/crux/target/crux.jar"

Restart HBase: go to the HBase home bin directory and run start-hbase.sh

$ ${HBASE_HOME}/bin/start-hbase.sh

Then start the HBase shell.

$ ${HBASE_HOME}/bin/hbase shell

Adding crux.jar to the HBase classpath is needed because Crux has built in filters that run on the server side to select the data you choose.

d. Drop the war into tomcat/webapps and start Tomcat by going to the Tomcat home bin directory and running startup.sh

$apache-tomcat-home/bin/startup.sh

Alternatively, just run

CRUX_HOME$ mvn jetty:run 

e. Go to http://localhost:8080/crux and define your connection, mapping and report.

Instructions to build Crux with Maven

  1. Update hibernate.properties (in crux/) with your MySQL host, port, dbname, testDbName, user and password.
  2. Download struts2-fullhibernatecore-plugin-2.2.2-GA.jar from http://code.google.com/p/full-hibernate-plugin-for-struts2/downloads/detail?name=struts2-fullhibernatecore-plugin-2.2.2-GA.jar&can=2&q= and add it to your local repository by executing the command given below.

    mvn install:install-file -DgroupId=com.google.code -DartifactId=struts2-fullhibernatecore-plugin -Dversion=2.2.2-GA \
    -Dpackaging=jar -Dfile=${PATH_TO_struts2-fullhibernatecore-plugin-2.2.2-GA.jar}
  3. Crux can be built against HBase 0.90.3 (default), HBase 0.90.6 or HBase 0.92.1. The Crux artifacts crux.war and crux.jar are created in crux/target/.

To build and create the war against 0.90.3, go to the base directory where pom.xml is located and enter

mvn install -DskipTests

to skip the tests, or

mvn install

to run the tests and create the war.

For CDH4

CRUX_HOME$ mvn -Dcdh4 install

For CDH3

CRUX_HOME$ mvn -Dcdh3 install

Instructions to run test cases of Crux with Maven

CRUX_HOME$ mvn test

(To run the tests against HBase 0.92.1, set umask to 0022 and run the tests with the hbase0.92 profile.)

Instructions to set up the dev environment in Eclipse

Crux Limitations

  • Crux is an HBase application, so the schema and the queries have to be designed accordingly.
  • As far as possible, create row filters with equals, greater than or equal to, or less than conditions, so as to leverage HBase's Get and Range Scan operations.
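
The reason these conditions are cheap is that they can be expressed as a contiguous range of row keys instead of a check on every row. The sketch below is plain Java, not Crux code and not the HBase API, and its class and method names are invented for illustration. It shows one common way to turn "row key starts with this value" into an inclusive start row and an exclusive stop row. It assumes the HBase convention that an empty stop row means "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: derives range scan bounds from a row key prefix.
class RangeScanBoundsSketch {

    /**
     * Returns the exclusive stop row for a prefix: the smallest key that is
     * greater than every key starting with the prefix. An empty array means
     * there is no upper bound (the prefix consists only of 0xFF bytes).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int last = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (last >= 0 && prefix[last] == (byte) 0xFF) {
            last--;
        }
        if (last < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, last + 1);
        stop[last]++; // unsigned increment of the last remaining byte
        return stop;
    }

    /** Unsigned lexicographic comparison, matching HBase row key order. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** True if key lies in [start, stop); an empty stop means unbounded. */
    static boolean inRange(byte[] key, byte[] start, byte[] stop) {
        return compareBytes(key, start) >= 0
                && (stop.length == 0 || compareBytes(key, stop) < 0);
    }

    public static void main(String[] args) {
        byte[] start = "INFY".getBytes(StandardCharsets.US_ASCII);
        byte[] stop = stopRowForPrefix(start);
        System.out.println(new String(stop, StandardCharsets.US_ASCII)); // prints "INFZ"

        byte[] match = "INFY2012".getBytes(StandardCharsets.US_ASCII);
        byte[] other = "TCS 2012".getBytes(StandardCharsets.US_ASCII);
        System.out.println(inRange(match, start, stop)); // prints "true"
        System.out.println(inRange(other, start, stop)); // prints "false"
    }
}
```

An equals condition on the complete row key needs no range at all and corresponds to a single Get. Conditions that cannot be reduced to such bounds generally require every candidate row to be examined, which is why the advice above matters for large tables.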

Sample data

  • Crux comes with sample data. Refer to testData/BseStock/README.txt for instructions on downloading BSE stock data for a given list of scrips and populating HBase with it.

