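Apache Lens 2.7.1 has been released. This version includes features such as cube segmentation, unions across data sets, and source data completeness checks, along with other bug fixes and improvements.
Changes:
Bug fixes: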
Wrong Cost Calculation in case of HIVE Dimension Query
Group by promotion not happening with aggregate dim attributes
Instances which are in waiting state while restart are not getting resumed
Saved query table create failure shouldn't stop lens server from starting
Launch Time should be set before executeAsync is called on selected driver
getUpdatedQueryContext() call is missing from QueryExecutionServiceImpl#executeTimeoutInternal
example-job.xml in example schema isn't up to date with recent xsd changes
Fact column start_time and end_time not getting reflected with update fact command
Queries submissions are not getting rejected on sessions marked for close
Wrong hsql query is created when there are multiple facts and no dimension.
execute_with_timout not timing out after timeout time
User config loader database calls not inserting entries
Lens Client doesn't provide the option to pass query conf while submitting the query
TestRemoteHiveDriver#testMultiThreadClient failing in pre-commit builds
session/sessions API returning no data on GUI/API though there are active sessions
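Download: http://lens.apache.org/releases/download.html
Lens provides a unified analytics interface. It aims to break down data analytics silos by providing a single view of data across multiple data stores, along with an optimized execution environment for analytical queries. It integrates seamlessly with Hadoop to offer functionality similar to a traditional data warehouse.
Key features of the project: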
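A simple metadata layer that provides an abstract view over the data stores
A single shared schema server based on the Hive Metastore. The schema is shared by data pipelines (via HCatalog) and analytics applications
OLAP Cube QL, a high-level SQL-like language for querying and describing data sets stored in different data cubes
A JDBC driver and Java client libraries for issuing queries
Lens application server - a REST server that lets users query data, change the data model, schedule queries, and enforce query quota limits
Driver-based architecture - allows plugging in reporting systems such as Hive, columnar data stores, and Redshift
Cost-based engine selection - optimizes resource usage by automatically selecting the best execution engine based on the complexity of the query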
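To make the idea of cost-based engine selection concrete, here is a minimal Python sketch. It is not Lens code and does not use the Lens API: the `Driver` class, the `select_driver` function, the driver names, and the cost numbers are all hypothetical. It only assumes that each driver can either return an estimated cost for a query or decline it, and that the cheapest accepting driver wins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Driver:
    """Hypothetical stand-in for an execution engine (not a Lens class)."""
    name: str
    # Returns an estimated cost, or None if the driver cannot run the query.
    estimate_cost: Callable[[str], Optional[float]]


def select_driver(query: str, drivers: List[Driver]) -> Driver:
    """Pick the driver with the lowest estimated cost for the query."""
    candidates = []
    for driver in drivers:
        cost = driver.estimate_cost(query)
        if cost is not None:
            candidates.append((cost, driver.name, driver))
    if not candidates:
        raise ValueError("no driver can run this query")
    # Ties on cost are broken by driver name so the choice is deterministic.
    return min(candidates, key=lambda c: (c[0], c[1]))[2]


# Made-up cost models: the batch engine accepts everything at a high cost,
# the fast store is cheap but (in this toy example) rejects joins.
DRIVERS = [
    Driver("hive", lambda q: 100.0),
    Driver("jdbc", lambda q: None if " join " in q.lower() else 10.0),
]

if __name__ == "__main__":
    simple = "select city, sum(sales) from sales_cube group by city"
    joined = "select f.a from fact f join dim d on f.k = d.k"
    print(select_driver(simple, DRIVERS).name)
    print(select_driver(joined, DRIVERS).name)
```

In this sketch the simple aggregate goes to the cheaper driver, while the join falls back to the only driver that accepts it. A real cost model would depend on data volume, storage layout, and engine load rather than fixed constants.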
Release notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315923&version=12336851
Source: OSChina (开源中国社区)