Happywzy

Distributed storage

Posted on 2020-04-26 in k8s

CephFS

glusterfs

rook

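Rowkey lexicographic ordering

Posted on 2020-04-24 in hadoop

Sorting rules

Rowkeys are compared byte by byte from left to right, following ASCII order; for example, A sorts before a, and a sorts before aa and ab;
If the rowkeys are equal, cells are ordered by column family:qualifier;
If the column family:qualifier is also equal, cells are ordered by timestamp, newest first;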
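
The rowkey rule can be tried out without a cluster. The following is a minimal sketch in plain Java, not HBase's own comparator: the class and method names are made up for illustration, and keys are assumed to be encoded as UTF-8 bytes and compared as unsigned values.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowkeyOrder {
    /** Compares two byte arrays lexicographically, treating each byte as unsigned. */
    public static int compareBytes(byte[] left, byte[] right) {
        int common = Math.min(left.length, right.length);
        for (int i = 0; i < common; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // One key is a prefix of the other: the shorter key sorts first
        return left.length - right.length;
    }

    /** Returns a copy of the keys sorted the way byte-ordered rowkeys would be. */
    public static List<String> sortAsRowkeys(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((x, y) -> compareBytes(
                x.getBytes(StandardCharsets.UTF_8),
                y.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        // prints [10, 9, A, a, aa, ab]
        System.out.println(sortAsRowkeys(Arrays.asList("ab", "a", "aa", "A", "10", "9")));
    }
}
```

Note that "10" sorts before "9": the comparison is on bytes, not on numeric value.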

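Making the most of `rowkey` ordering

If hot rows share the same rowkey prefix, they tend to land on the same RegionServer, which becomes a performance bottleneck for access;
Prefixing the rowkey with a random string spreads rows more evenly across the cluster, but sacrifices the sort order;
Keep rowkeys short: overly long rowkeys increase disk and network I/O overhead.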
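
One way to picture the prefix trade-off is the sketch below. It is an assumption-laden variant, not the approach described above: instead of a truly random string it derives the prefix from a hash of the key, so the same key always gets the same prefix and can be recomputed for a point lookup. The class name, the bucket count of 8 and the `NN_` prefix format are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class SaltedRowkey {
    /** Number of salt buckets; an assumed value chosen for illustration. */
    public static final int BUCKETS = 8;

    /** Maps a key to a bucket in [0, BUCKETS) using a simple polynomial hash. */
    public static int bucketOf(String key) {
        int hash = 0;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash = 31 * hash + (b & 0xff);
        }
        // floorMod keeps the result non-negative even if the hash overflowed
        return Math.floorMod(hash, BUCKETS);
    }

    /** Prepends a two-digit bucket number and an underscore to the key. */
    public static String salt(String key) {
        return String.format("%02d_%s", bucketOf(key), key);
    }

    public static void main(String[] args) {
        // Consecutive dates may end up under different prefixes
        System.out.println(salt("20200101"));
        System.out.println(salt("20200102"));
    }
}
```

With a prefix like this, a range scan over the original keys has to be issued once per bucket, which is the loss of ordering mentioned above.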

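`rowkey` ordering

Rows returned by a scan are sorted by rowkey;
The API lets you set StartRow and StopRow to query the data within a range;

For example, if the rowkey is a date string, the following queries the data for 2020: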

Scan scan = new Scan();
scan.setStartRow(Bytes.toBytes("20200101"));
scan.setStopRow(Bytes.toBytes("20210101"));

Note that the range [StartRow, StopRow) is half-open: the start row is included and the stop row is excluded.
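
The half-open behaviour can be imitated locally. The sketch below uses a `java.util.TreeMap` as a stand-in for a sorted table (an assumption for illustration only; no HBase calls are involved), because `subMap(from, to)` also includes the lower bound and excludes the upper one. The sample keys are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class HalfOpenRange {
    /** Builds a small sorted "table" keyed by date strings (sample data). */
    public static TreeMap<String, String> sampleRows() {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("20191231", "last day of 2019");
        rows.put("20200101", "first day of 2020");
        rows.put("20200615", "mid 2020");
        rows.put("20201231", "last day of 2020");
        rows.put("20210101", "first day of 2021");
        return rows;
    }

    /** Returns the keys in [startRow, stopRow), mirroring the scan range semantics. */
    public static List<String> scan(TreeMap<String, String> rows, String startRow, String stopRow) {
        // subMap includes startRow and excludes stopRow
        return new ArrayList<>(rows.subMap(startRow, stopRow).keySet());
    }

    public static void main(String[] args) {
        // prints [20200101, 20200615, 20201231]
        System.out.println(scan(sampleRows(), "20200101", "20210101"));
    }
}
```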

`ASCII` encoding

hbase-client Java API operations

Posted on 2020-04-23 in hadoop

Integrating hbase-client with Spring Boot

As in the previous post, this uses spring-boot-starter-hbase and RowMapper.

@Autowired
private HbaseTemplate hbaseTemplate;

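Creating a table

/**
 * Creates the table if it does not already exist.
 * @return "tableExists" if the table is already present, otherwise "ok"
 * @throws IOException if the request to HBase fails
 */
public String createTable() throws IOException {
    TableName tableName = TableName.valueOf(table_name);
    // Admin is Closeable, so try-with-resources releases it when done
    try (Admin admin = hbaseTemplate.getConnection().getAdmin()) {
        if (admin.tableExists(tableName)) {
            return "tableExists";
        }
        HTableDescriptor hTableDescriptor = new HTableDescriptor(tableName);
        hTableDescriptor.addFamily(new HColumnDescriptor(column_family));
        admin.createTable(hTableDescriptor);
        return "ok";
    }
}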

Batch insert

/**
 * Inserts a batch of rows.
 * @param i number of rows to insert
 */
public void puts(int i) {
    List<Mutation> puts = new ArrayList<>();
    // Build one Put per row
    while (i > 0) {
        Put put = new Put(Bytes.toBytes(Long.toString(18752038428L - i)));
        put.addColumn(Bytes.toBytes(column_family), Bytes.toBytes("name"), Bytes.toBytes("JThink" + i));
        // age is written as a 4-byte int, not as a string
        put.addColumn(Bytes.toBytes(column_family), Bytes.toBytes("age"), Bytes.toBytes(i));
        puts.add(put);
        i--;
    }
    this.hbaseTemplate.saveOrUpdates(table_name, puts);
}
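
The rowkeys in this batch insert are decimal strings. Such keys sort in numeric order only while they all have the same number of digits, because comparison is character by character. The sketch below shows the effect and one common remedy, zero-padding to a fixed width; the class name and the width of 12 are assumptions for illustration, and only non-negative numbers are considered.

```java
public class PaddedRowkey {
    /** Fixed key width; an assumed value that must cover the largest expected number. */
    public static final int WIDTH = 12;

    /** Left-pads a non-negative number with zeros to WIDTH digits. */
    public static String pad(long n) {
        return String.format("%0" + WIDTH + "d", n);
    }

    public static void main(String[] args) {
        // Unpadded: "9" sorts after "10" because '9' > '1'
        System.out.println("9".compareTo("10") > 0);       // prints true
        // Padded: 000000000009 sorts before 000000000010
        System.out.println(pad(9).compareTo(pad(10)) < 0); // prints true
    }
}
```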

Get by rowkey

/**
 * Fetches a single row by rowkey.
 * @param row the rowkey to look up
 * @return the mapped row
 */
public PeopleDto get(String row) {
    PeopleDto dto = this.hbaseTemplate.get(table_name, row, new PeopleRowMapper());
    return dto;
}

Delete by rowkey

/**
 * Deletes a single row by rowkey.
 * @param rk the rowkey to delete
 */
public void delete(String rk) {
    Mutation delete = new Delete(Bytes.toBytes(rk));
    this.hbaseTemplate.saveOrUpdate(table_name, delete);
}

Range query

/**
 * Range lookup over [startRow, stopRow).
 * @param startRow first rowkey to include
 * @param stopRow rowkey at which to stop (excluded)
 * @return the mapped rows in rowkey order
 */
public List<PeopleDto> query(String startRow, String stopRow) {
    Scan scan = new Scan(Bytes.toBytes(startRow), Bytes.toBytes(stopRow));
    // Number of rows fetched per round trip to the server
    scan.setCaching(5000);
    List<PeopleDto> dtos = this.hbaseTemplate.find(table_name, scan, new PeopleRowMapper());
    return dtos;
}

Note that the results follow the half-open rule: startRow is included and stopRow is excluded.
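
Because stopRow is excluded, fetching every row that starts with a given prefix needs a stop row just past that prefix. The helper below is a minimal sketch of one way to compute it in plain Java; the class and method names are hypothetical. It increments the last byte that is not 0xFF and drops whatever follows. An empty result is meant as "no upper bound", which is how an empty stop row is usually interpreted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixStopRow {
    /**
     * Returns the smallest byte array that sorts after every key starting with
     * the prefix, or an empty array if the prefix consists only of 0xFF bytes.
     */
    public static byte[] stopRowFor(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them
        while (end > 0 && prefix[end - 1] == (byte) 0xff) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowFor("2020".getBytes(StandardCharsets.UTF_8));
        // prints 2021
        System.out.println(new String(stop, StandardCharsets.UTF_8));
    }
}
```

Passing "2020" as startRow and the computed "2021" as stopRow would then cover every key that begins with 2020.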

Filtering

// Table to query
HTable table = new HTable(conf, "table1");
// Columns to return
Scan scan = new Scan();
scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("a"));
scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("b"));
// WHERE conditions
// a = 1
SingleColumnValueFilter a = new SingleColumnValueFilter(Bytes.toBytes("cf"),
        Bytes.toBytes("a"), CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(1)));
// b = 2
SingleColumnValueFilter b = new SingleColumnValueFilter(Bytes.toBytes("cf"),
        Bytes.toBytes("b"), CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(2)));
// AND: a row must pass both filters
FilterList filterList = new FilterList(Operator.MUST_PASS_ALL, a, b);
scan.setFilter(filterList);
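
The filters above compare raw bytes, so the value passed to the comparator has to use the same encoding as the stored cell. The sketch below, in plain Java with hypothetical names, contrasts the two encodings that are easy to mix up: `Bytes.toBytes(1)` yields a 4-byte big-endian int, which `ByteBuffer` reproduces here, while the string "1" is a single ASCII byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ValueEncoding {
    /** Encodes an int as 4 big-endian bytes, the same layout as Bytes.toBytes(int). */
    public static byte[] intBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Encodes a string as UTF-8 bytes. */
    public static byte[] stringBytes(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intBytes(1)));      // prints [0, 0, 0, 1]
        System.out.println(Arrays.toString(stringBytes("1"))); // prints [49]
        // The two encodings differ, so a filter built from one never matches the other
        System.out.println(Arrays.equals(intBytes(1), stringBytes("1"))); // prints false
    }
}
```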

References

HBase namespaces

Posted on 2020-04-23 in hadoop

Listing namespaces

hbase(main):008:0> list_namespace
NAMESPACE            
default                
hbase                 
test
3 row(s)
Took 0.0327 seconds

default: tables created without an explicit namespace are placed under default.

Listing all tables in a namespace

hbase(main):009:0> list_namespace_tables "test"
TABLE
test
user_table
2 row(s)
Took 0.0300 seconds
=> ["test", "user_table"]

Creating a namespace

hbase(main):010:0> create_namespace "test"
Took 0.2781 seconds

hbase(main):018:0> create_namespace "test", {"author"=>"test", "create_time"=>"2020-01-4 17:51:53"}
Took 0.2262 seconds

Viewing namespace details

describe_namespace "test"

Modifying a namespace

alter_namespace "test", {METHOD=>"set", "author"=>"wuzhiyong"}

alter_namespace "test", {METHOD=>"set", "email"=>"1154365135@qq.com"}

alter_namespace "test", {METHOD=>"unset", NAME=>"email"}

Dropping a namespace

drop_namespace "test"

Note that a namespace must be empty, with no tables in it, before it can be dropped; otherwise the drop fails.

Specifying a namespace when creating a table

create "test:user", "f"

References

https://www.cnblogs.com/cc11001100/p/9911730.html

Using RowMapper with JdbcTemplate

Posted on 2020-04-23 in java

RowMapper maps query results to Java objects.

Using RowMapper to populate a bean

class UserRowMapper implements RowMapper<User> {
    public User mapRow(ResultSet rs, int rowNum) throws SQLException {
        User user = new User();
        user.setId(rs.getInt("id"));
        user.setName(rs.getString("name"));
        user.setGender(rs.getString("gender"));
        return user;
    }
}

This completes a RowMapper for the User class. Calling jdbcTemplate.query(sql, new UserRowMapper()) then stores each row of the query result in a Java bean, using the bean's setter methods.
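
To show the pattern without a database, here is a dependency-free sketch. It is not Spring's API: `SimpleRowMapper` is a hypothetical stand-in for `RowMapper`, a row is modelled as a column-name map instead of a `ResultSet`, and `query` only imitates what a template does, which is to call the mapper once per row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowMapperSketch {
    /** Hypothetical stand-in for Spring's RowMapper; a row is a column-name map. */
    public interface SimpleRowMapper<T> {
        T mapRow(Map<String, Object> row, int rowNum);
    }

    /** Minimal bean with the getters and setters the mapper relies on. */
    public static class User {
        private int id;
        private String name;
        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    /** Applies the mapper to every row in order and collects the results. */
    public static <T> List<T> query(List<Map<String, Object>> rows, SimpleRowMapper<T> mapper) {
        List<T> result = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            result.add(mapper.mapRow(rows.get(i), i));
        }
        return result;
    }

    /** Maps two invented rows to User beans. */
    public static List<User> demo() {
        Map<String, Object> first = new HashMap<>();
        first.put("id", 1);
        first.put("name", "alice");
        Map<String, Object> second = new HashMap<>();
        second.put("id", 2);
        second.put("name", "bob");
        return query(Arrays.asList(first, second), (row, rowNum) -> {
            User user = new User();
            user.setId((Integer) row.get("id"));
            user.setName((String) row.get("name"));
            return user;
        });
    }

    public static void main(String[] args) {
        for (User user : demo()) {
            System.out.println(user.getId() + " " + user.getName());
        }
    }
}
```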

References

https://blog.csdn.net/chenyezhou1/article/details/71122570

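Integrating hbase-client with Spring Boot and resolving dependency conflicts

Posted on 2020-04-23 in java

Integrating HBase with Spring Boot

Key configuration

See the source of HbaseProperties and HbaseAutoConfiguration; the application fails to start if these settings are missing.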

spring:
  data:
    hbase:
      quorum: 192.168.41.128:2181
      rootDir: file:///root/hbase/rootdir
      nodeParent: /hbase

Usage

@Autowired
private HbaseTemplate hbaseTemplate;

Problem

The default pom configuration leads to conflicts between the logging and servlet packages:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>com.spring4all</groupId>
        <artifactId>spring-boot-starter-hbase</artifactId>
        <version>1.0.0.RELEASE</version>
    </dependency>
    
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.2.12</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
        <exclusions>
            <exclusion>
                <groupId>org.junit.vintage</groupId>
                <artifactId>junit-vintage-engine</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>

Solution

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>com.spring4all</groupId>
        <artifactId>spring-boot-starter-hbase</artifactId>
        <version>1.0.0.RELEASE</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.2.12</version>
        <exclusions>
            <exclusion>
                <groupId>javax.servlet</groupId>
                <artifactId>servlet-api</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>log4j-over-slf4j</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <!--<exclusion>-->
                <!--<groupId>com.google.guava</groupId>-->
                <!--<artifactId>guava</artifactId>-->
            <!--</exclusion>-->
        </exclusions>
    </dependency>


    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
        <exclusions>
            <exclusion>
                <groupId>org.junit.vintage</groupId>
                <artifactId>junit-vintage-engine</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
