QuickStart:
Installation:
To use hbspark, first install it using pip:
$ pip install hbspark
Note that version compatibility matters. Review the hbspark page on PyPI to determine which release is compatible with your environment.
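Before comparing against the compatibility notes on PyPI, it can help to see which versions are actually installed. The following is a minimal sketch that uses only the Python standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is hypothetical and not part of hbspark, and checking `pyspark` alongside `hbspark` is an assumption based on the imports used in the examples.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: these are the two packages whose versions need to agree.
    for name in ("hbspark", "pyspark"):
        print(name, installed_version(name) or "not installed")
```

A specific release can then be pinned with the standard pip syntax `pip install hbspark==<version>`, where `<version>` is the release chosen from PyPI.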
Examples:
Initialization and imports:
Connecting requires the appropriate imports and an initialized Spark session. The simplest example is:
import hbspark
from pyspark.sql import SparkSession
spark_session = SparkSession.builder.appName('my-spark-app').master('local[1]').getOrCreate()
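# The HBase host can be given as an IP address...
hbase_host_name = '___.___.___.___'
# ...or as a DNS host name
hbase_host_name2 = 'hostname_at_dns'
# Pass one of the host names together with the Spark session created above
hbspark.connect(hbase_host_name, spark_session)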
Get all of the tables:
# Initialize hbspark as in "Initialization and imports"
all_tables = hbspark.tables()
print(all_tables)
# ['table1', 'table2', ...]
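Because the result shown above is a plain list of table names, ordinary Python list handling applies to it. The sketch below filters the names by prefix. It assumes `hbspark.tables()` returns a list of strings, as the example output suggests. The helper `tables_with_prefix` and the sample names are hypothetical and stand in for a real connection.

```python
def tables_with_prefix(table_names, prefix):
    """Return the table names that start with the given prefix, sorted."""
    return sorted(name for name in table_names if name.startswith(prefix))


# Stand-in for the result of hbspark.tables(); replace with the real call.
all_tables = ['table1', 'table2', 'archive_2020']
print(tables_with_prefix(all_tables, 'table'))
# ['table1', 'table2']
```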
Next Steps:
With your hbspark connection initialized, read the API Documentation to learn how to work with HBase tables.