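HiveQL - Select-Group By
This chapter explains the GROUP BY clause of the SELECT statement. GROUP BY groups the records of a result set by the values of one or more columns, so that an aggregate function such as count(*) returns one row per group instead of one row per record.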
Syntax
The syntax of the GROUP BY clause is as follows:
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[HAVING having_condition]
[ORDER BY col_list]
[LIMIT number];
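Example
Let us take an example of the SELECT…GROUP BY clause. Assume the employee table shown below, with the fields Id, Name, Salary, Designation, and Dept. Write a query that retrieves the number of employees in each department.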
+------+--------------+-------------+-------------------+--------+
| ID   | Name         | Salary      | Designation       | Dept   |
+------+--------------+-------------+-------------------+--------+
| 1201 | Gopal        | 45000       | Technical manager | TP     |
| 1202 | Manisha      | 45000       | Proofreader       | PR     |
| 1203 | Masthanvali  | 40000       | Technical writer  | TP     |
| 1204 | Krian        | 45000       | Proofreader       | PR     |
| 1205 | Kranthi      | 30000       | Op Admin          | Admin  |
+------+--------------+-------------+-------------------+--------+
The following query retrieves the number of employees per department for the scenario above.
hive> SELECT Dept, count(*) FROM employee GROUP BY Dept;
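On successful execution of the query, you see the following response. TP has two employees (1201 and 1203), PR has two (1202 and 1204), and Admin has one (1205):
+-------+----------+
| Dept  | Count(*) |
+-------+----------+
| Admin | 1        |
| PR    | 2        |
| TP    | 2        |
+-------+----------+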
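To make the grouping step concrete, here is a minimal in-memory sketch in Java that mimics what the query computes. It does not talk to Hive at all: the class GroupByCountSketch, the nested Employee class, and the helper methods are hypothetical names introduced only for illustration, and the rows are copied from the sample employee table. A TreeMap is used so the groups print in alphabetical order; Hive itself does not guarantee any row order unless the query has an ORDER BY clause.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupByCountSketch {

    // Hypothetical stand-in for one row of the employee table.
    public static final class Employee {
        final int id;
        final String name;
        final int salary;
        final String designation;
        final String dept;

        Employee(int id, String name, int salary, String designation, String dept) {
            this.id = id;
            this.name = name;
            this.salary = salary;
            this.designation = designation;
            this.dept = dept;
        }
    }

    // The five sample rows from the employee table above.
    public static List<Employee> sampleEmployees() {
        return List.of(
            new Employee(1201, "Gopal", 45000, "Technical manager", "TP"),
            new Employee(1202, "Manisha", 45000, "Proofreader", "PR"),
            new Employee(1203, "Masthanvali", 40000, "Technical writer", "TP"),
            new Employee(1204, "Krian", 45000, "Proofreader", "PR"),
            new Employee(1205, "Kranthi", 30000, "Op Admin", "Admin"));
    }

    // In-memory analogue of: SELECT Dept, count(*) FROM employee GROUP BY Dept
    // Rows with the same dept value fall into one group; counting() plays
    // the role of count(*). TreeMap keeps the keys sorted for stable output.
    public static Map<String, Long> countByDept(List<Employee> rows) {
        return rows.stream()
                   .collect(Collectors.groupingBy(
                       e -> e.dept, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        countByDept(sampleEmployees())
            .forEach((dept, n) -> System.out.println(dept + "\t" + n));
    }
}
```

Running main prints one line per department, which is the same one-row-per-group shape that the Hive query returns.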
JDBC Program
Given below is the JDBC program that applies the GROUP BY clause to the given example.
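import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveQLGroupBy {
   // Driver class for the original HiveServer. HiveServer2 uses
   // "org.apache.hive.jdbc.HiveDriver" with a "jdbc:hive2://" URL instead.
   private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

   public static void main(String[] args) throws SQLException, ClassNotFoundException {
      // Load the driver class so that it registers itself with DriverManager
      Class.forName(driverName);

      // Get a connection to the userdb database
      Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", "");

      // Create a statement
      Statement stmt = con.createStatement();

      // Execute the query (no trailing semicolon when sent through JDBC)
      ResultSet res = stmt.executeQuery("SELECT Dept, count(*) "
         + "FROM employee GROUP BY Dept");

      // Print one line per department: name and employee count
      System.out.println("Dept\tcount(*)");
      while (res.next()) {
         System.out.println(res.getString(1) + "\t" + res.getInt(2));
      }

      // Closing the connection also releases the statement and result set
      con.close();
   }
}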
Save the program in a file named HiveQLGroupBy.java. Use the following commands to compile and execute this program.
$ javac HiveQLGroupBy.java
$ java HiveQLGroupBy
Output:
Dept    count(*)
Admin   1
PR      2
TP      2