current position:Home>Cloud lesson | introduction to Clickhouse component of basic principles of MRS

Cloud lesson | introduction to Clickhouse component of basic principles of MRS

2021-08-23 02:24:30 Huawei cloud developer community

image.png
Yuezhifengyun is a big cloud information man of Huawei , Good at presenting complex information in multiple ways , A picture it produced ( The theory of cloud chart )、 A simple blog post ( Cloud class ) Or short video ( Cloud vision Hall ) There is always one that can let you quickly start Huawei cloud . For more highlights, please Click here .

Abstract : stay 2016 High performance of open source 、 Open source online analytical column database management system . Its data compression ratio is high , Based on multi-core parallel computing 、 Vectorization execution and SIMD, Excellent performance . At present ClickHouse It is widely used in Internet advertising 、App and Web Traffic 、 telecom 、 Finance 、 Internet of things and many other fields , It is very suitable for business intelligence application scenarios , There are a lot of applications and practices at home and abroad .

This article is shared from Huawei cloud community 《【 Cloud class 】EI The first 22 course MRS Of basic principles ClickHouse Component is introduced 》, author : Review the situation

ClickHouse It is an open source column database for OLAP , It is independent of Hadoop Big data system , The core features are extreme compression rate and fast query performance . meanwhile ,ClickHouse Support SQL Inquire about , And good query performance , In particular, the aggregate analysis query based on large and wide tables has excellent performance , It is one order of magnitude faster than other analytical databases . At present ClickHouse It is widely used in Internet advertising 、App and Web Traffic 、 telecom 、 Finance 、 Internet of things and many other fields , It is very suitable for business intelligence application scenarios , There are a lot of applications and practices at home and abroad .
image.png

ClickHouse Introduction to key features

complete DBMS function

ClickHouse It has complete database management functions , Have a DBMS(Database Management System, Database management system ) Basic functions , As shown below :

DDL ( Data definition language ): You can create... Dynamically 、 Modify or delete a database 、 Tables and views , Without having to restart the service .

DML( Data operation language ): It can be queried dynamically 、 Insert 、 Modify or delete data .

Access control : You can set the operation authority of database or table according to user granularity , Secure data .

Data backup and recovery : Provides data backup export and import recovery mechanism , Meet the requirements of the production environment .

Distributed management : Provide cluster mode , Can automatically manage multiple database nodes .

Column storage and data compression

ClickHouse Is a database using column storage , Data is organized in columns , Data belonging to the same column will be saved together , Columns are also saved separately by different files .

When executing data query , Column storage can reduce the data scanning range and the size of data transmission , It improves the efficiency of data query .

Vectorization execution engine

ClickHouse utilize CPU Of SIMD Instruction implements vectorization execution .SIMD The full name is Single Instruction Multiple Data, That is to operate multiple data with a single instruction , An implementation of data parallelism to improve performance ( Others are instruction level parallelism and thread level parallelism ), Its principle is CPU Parallel operation of data is realized at register level , Compared with similar OLAP More efficient product execution .

Replica mechanism

ClickHouse utilize ZooKeeper, adopt ReplicatedMergeTree engine (Replicated Series engine ) The replica mechanism is implemented . The replica mechanism is a multi master architecture , Can be INSERT Statement to any copy , The rest of the replicas will be asynchronously replicated .
image.png

The functional advantages of the replica mechanism are as follows :

ClickHouse The design of replica mechanism can minimize the network data transmission , To synchronize in different data centers , It can be used to build multiple data centers 、 A cluster architecture with multiple activities in different places .

Replica mechanism is to achieve high availability 、 Load balancing 、 transfer / The basis of upgrading functions .

The system will monitor the synchronization of replica data , Identify the faulty node , And fault recovery when the node returns to normal , Ensure the overall high availability of the service .

Data fragmentation and distributed query

ClickHouse It provides the ability of linear expansion through fragmentation and distributed table mechanism .

Fragmentation mechanism : Used to solve the performance bottleneck of a single node , By slicing the data horizontally , Split the data in a table into multiple nodes , There is no duplication of data between different nodes , In this way, you can add slice pairs ClickHouse Linear expansion .

Distributed table : When querying partitioned data , Query through distributed tables , The distributed table engine itself does not store any data , Just a layer of agents , It can automatically route to each partition node in the cluster to obtain data , That is, distributed tables need to work together with other data tables .

As shown in the figure below , In the query , We need to query distributed tables table_distributed, The distributed table will automatically route query requests to each partition node , And gather the results .
image.png

So much has been said , How to use ClickHouse Well ? Don't worry. , The following is adopted ClickHouse The use of the client and the basic operation of the database take you to get started quickly .

Preparation before operation

  • Created ClickHouse colony .
  • already installed ClickHouse client .

ClickHouse Client side usage

Install the user as a client , Log in to the node where the client is installed .

  1. Execute the following command , Switch to the client installation directory . The following table of contents is an example , Please refer to the actual installation path for the specific client path .

cd /opt/Bigdata/client

  1. Execute the following command to configure environment variables .

source bigdata_env

4. perform ClickHouse Component client command .

clickhouse client --host ClickHouse Example IP --user Login name --password password --port ClickHouse Port number

clickhouse client The command line parameters are described in the following table :
image.png

ClickHouse Basic operation of database

Create database :

Basic grammar

CREATE DATABASE [IF NOT EXISTS] database_name

Examples of use
image.png

Create table :

Basic grammar

Method 1 : At the designated “database_name” Create a database named “table_name ” Table of .

If the table creation statement does not contain “database_name”, The database selected when the client logs in is used as the default database_name”.

CREATE TABLE [IF NOT EXISTS] [database_name.]table_name [ON CLUSTER Cluster name ]

(

name1 [type1] [DEFAULT|materialized|ALIAS expr1],

name2 [type2] [DEFAULT|materialized|ALIAS expr2],

...

) ENGINE = engine

Method 2 : Create a and table_name2 Tables with the same structure , At the same time, you can specify different table engine declarations .

If there is no table engine declaration , The created table will be the same as database_name2.table_name2 Use the same table engine .

CREATE TABLE [IF NOT EXISTS] [database_name.]table_name AS [database_name2.]table_name2 [ENGINE = engine]

Method 3 : Use the specified engine to create a SELECT Clause results in a table with the same structure , And use SELECT The result of the clause fills it .

CREATE TABLE [IF NOT EXISTS] [database_name.]table_name ENGINE = engine AS SELECT ...

Examples of use
image.png

Insert table data :

Basic grammar

Method 1 : Insert data in standard format .

INSERT INTO [database_name.]table [(c1, c2, c3)] VALUES (v11, v12, v13), (v21, v22, v23), ...

Method 2 : Use SELECT Write the result of .

INSERT INTO [database_name.]table [(c1, c2, c3)] SELECT ...

Examples of use
image.png

Query table data :

Basic grammar

SELECT [DISTINCT] expr_list

[FROM [database_name.]table | (subquery) | table_function] [FINAL]

[SAMPLE sample_coeff]

[ARRAY JOIN ...]

[GLOBAL] [ANY|ALL|ASOF] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER|SEMI|ANTI] JOIN (subquery)|table (ON <expr_list>)|(USING <column_list>)

[PREWHERE expr]

[WHERE expr]

[GROUP BY expr_list] [WITH TOTALS]

[HAVING expr]

[ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr]

[LIMIT [offset_value, ]n BY columns]

[LIMIT [n, ]m] [WITH TIES]

[UNION ALL ...]

[INTO OUTFILE filename]

[FORMAT format]

Examples of use
image.png

Modify table structure :

Basic grammar

ALTER TABLE [database_name].name [ON CLUSTER cluster] ADD|DROP|CLEAR|COMMENT|MODIFY COLUMN ...

ALTER Support only *MergeTree ,Merge as well as Distributed Equal engine table .

Examples of use
image.png

Display database and table information

Basic grammar

show databases

show tables

Examples of use
image.png

Query table structure

Basic grammar

DESC|DESCRIBE TABLE [database_name.]table [INTO OUTFILE filename] [FORMAT format]

Examples of use
image.png

Delete table :

Basic grammar

DROP [TEMPORARY] TABLE [IF EXISTS] [database_name.]name [ON CLUSTER cluster]

Examples of use
image.png
image.png

Okay , That's all for this cloud lesson , Experience it MapReduce(MRS) More features ! Jab here

Click to follow , The first time to learn about Huawei's new cloud technology ~

copyright notice
author[Huawei cloud developer community],Please bring the original link to reprint, thank you.
https://en.qdmana.com/2021/08/20210823022426389Y.html

Random recommended