1500字范文,内容丰富有趣,写作好帮手!
1500字范文 > 【Hive】Hive内部表/外部表

【Hive】Hive内部表/外部表

时间:2019-10-02 11:21:31

相关推荐

【Hive】Hive内部表/外部表

1. 建表语句

Hive官网有详细的建表语句。

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)[(col_name data_type [column_constraint_specification] [COMMENT col_comment], ... [constraint_specification])][COMMENT table_comment][PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)][CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS][SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)[STORED AS DIRECTORIES][[ROW FORMAT row_format] [STORED AS file_format]| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)][LOCATION hdfs_path][TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)

2. 创建内部表

2.1 创建dept表

create table myhive.dept (id int,dept_name string)row format delimited fields terminated by ' ';

查看内部表属性,可以发现,内部表为:MANAGED_TABLE

hive (myhive)> describe formatted dept;col_name data_type comment# col_name data_typecomment id int dept_namestring # Detailed Table Information Database:myhive Owner: hadoop CreateTime: Fri Jun 12 05:57:53 CST LastAccessTime: UNKNOWN Protect Mode: None Retention: 0 Location:hdfs://node01:8020/user/hive/warehouse/myhive.db/dept Table Type: **MANAGED_TABLE** Table Parameters:transient_lastDdlTime 1591912673# Storage Information SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: field.delim serialization.format

2.2 导入数据

[hadoop@node03 ~]$ cat data/hiveData/dept.txt 1 sales2 product3 financial##导入数据hive (myhive)> load data local inpath '/home/hadoop/data/hiveData/dept.txt' into table dept;##查询数据hive (myhive)> select * from dept;dept.id dept.dept_name1 sales2 product3 financial

3. 创建外部表

3.1 创建employee表

create external table myhive.employee(id int,name string)row format delimited fields terminated by ' ';

查看内部表属性,可以发现,外部表为:EXTERNAL_TABLE

hive (myhive)> describe formatted employee;col_name data_type comment# col_name data_typecomment id int namestring # Detailed Table Information Database:myhive Owner: hadoop CreateTime: Fri Jun 12 06:05:00 CST LastAccessTime: UNKNOWN Protect Mode: None Retention: 0 Location:hdfs://node01:8020/user/hive/warehouse/myhive.db/employee Table Type: **EXTERNAL_TABLE** Table Parameters:EXTERNALTRUEtransient_lastDdlTime 1591913100# Storage Information SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: field.delim serialization.format

3.2 导入数据

[hadoop@node03 ~]$ cat data/hiveData/employee.txt 1 lisa2 tom3 john##导入数据hive (myhive)> load data local inpath '/home/hadoop/data/hiveData/employee.txt' into table employee;##查询数据hive (myhive)> select * from employee;employee.idemployee.name1 lisa2 tom3 john

4. 内部表和外部表相互转换

4.1 内部表转换为外部表

alter table dept set tblproperties('EXTERNAL'='TRUE');

4.2 外部表转换为内部表

alter table employee set tblproperties ('EXTERNAL'='FALSE');

5. 内部表和外部表区别

建表语法有区别,外部表在建表时有“EXTERNAL” 关键字;删表后数据文件状态有区别,删除内部表后,相关数据也会被删除,删除外部表只是删除了表结构,数据文件还是存在。

6. 内部表和外部表的选择时机

6.1 何时选择使用内部表
数据和表结构有相同的生命周期,也就是删表时也想把数据删除,此时可以选择使用内部表;数据只有本系统一张表使用,如数仓的dw层。
6.2 何时选择使用外部表
有多个系统或集群同时访问一份数据;希望一份数据被多个表加载;删除表结构无需担心数据被删除。

总结

内部表和外部表的建表语句区别不大,实际生产环境中,一般推荐使用外部表多一些。

本内容不代表本网观点和政治立场,如有侵犯你的权益请联系我们处理。
网友评论
网友评论仅供其表达个人看法,并不表明网站立场。