Welcome to hbspark’s documentation!

Overview:

HbSpark is a pipelining tool for efficiently transferring data from HBase to Spark. It relies on the Thrift API for HBase to convert tables directly into Spark dataframes, which can then be used for distributed computing. Afterwards, results can be piped back to HBase through the same interface.
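
To make the table-to-dataframe idea concrete, here is a minimal sketch of the flattening step only. It is not HbSpark's actual API: the function name `rows_to_records`, the `row_key` field, and the sample data are all hypothetical. It assumes a Thrift-based scan yields pairs of a row key and a mapping from `family:qualifier` column names to values, all as bytes, and that every value is UTF-8 text.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed shape of one scanned HBase row:
# (row key, {b"family:qualifier": b"value"}).
HBaseRow = Tuple[bytes, Dict[bytes, bytes]]


def rows_to_records(rows: Iterable[HBaseRow]) -> List[Dict[str, str]]:
    """Flatten scanned HBase rows into plain dicts, one dict per row.

    Hypothetical helper for illustration; not part of HbSpark.
    """
    records = []
    for row_key, cells in rows:
        record = {"row_key": row_key.decode("utf-8")}
        for column, value in cells.items():
            # "info:name" becomes the flat column name "info_name".
            name = column.decode("utf-8").replace(":", "_")
            record[name] = value.decode("utf-8")
        records.append(record)
    return records


# Stand-in for the result of a table scan over Thrift (made-up data).
sample_scan = [
    (b"user1", {b"info:name": b"Ada", b"info:age": b"36"}),
    (b"user2", {b"info:name": b"Alan"}),
]

records = rows_to_records(sample_scan)
```

Records in this shape could then be handed to Spark, for example through `SparkSession.createDataFrame`, to build a dataframe. Rows that lack a given cell, such as `user2` above, would need their missing columns handled at that point.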

Contents:

The quickstart contains useful information about getting up and running with HbSpark:

QuickStart:
- Installation:
- Examples:

API

Consulting the API documentation is highly recommended while building your own projects with HbSpark.

API Documentation
- Embedded Reference:
- Raw Reference:

See the QuickStart section for further information, including how to install the project.