[1]LI Yang, HAO Zhifeng, XIE Guangqiang, et al. Quality-metrics driven multi-dimensional data aggregation and visualization[J]. CAAI Transactions on Intelligent Systems, 2013, 8(04): 299-304. [doi:10.3969/j.issn.1673-4785.201304039]

Quality-metrics driven multi-dimensional data aggregation and visualization

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
8
Issue:
2013, No. 04
Pages:
299-304
Publication date:
2013-08-25

Article Info

Title:
Quality-metrics driven multi-dimensional data aggregation and visualization
Article ID:
1673-4785(2013)04-0299-06
Author(s):
LI Yang (李杨) 1,2; HAO Zhifeng (郝志峰) 2,3; XIE Guangqiang (谢光强) 1,2; YUAN Ganzhao (袁淦钊) 3
1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China; 2. School of Computers, Guangdong University of Technology, Guangzhou 510006, China; 3. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
Keywords:
quality metrics; data space; data aggregation; K-means; multi-dimensional data visualization
CLC number:
TP391
DOI:
10.3969/j.issn.1673-4785.201304039
Document code:
A
Abstract:
This paper studies multi-dimensional data visualization and, within a quality-metrics model, uses data aggregation as the basic means of improving the image quality of the resulting visualizations. Under a quality-metrics driven framework, we propose a data aggregation algorithm called equipartition K-means++, which modifies the conventional K-means algorithm specifically for the purpose of data visualization. The aggregated data it produces preserve most characteristics of the original data while significantly improving the image quality after visualization. Simulation experiments show that, at different data abstraction levels (DAL), the algorithm performs well both in visualization image quality and in the quality metrics histogram difference measure (HDM) and nearest neighbor measure (NNM).
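The abstract does not spell out the equipartition step or the exact metric definitions, so the following is only a rough sketch of the general idea: aggregate the data into a smaller set of representatives and then check how well a histogram of the original data is preserved. It uses plain K-means++ seeding (Arthur and Vassilvitskii, 2007) followed by standard K-means iterations as a stand-in for the paper's algorithm. The function names, the reading of DAL as the ratio of aggregated points to original points, and the one-dimensional histogram comparison are all assumptions made for illustration, not the authors' definitions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng):
    """Standard K-means++ seeding: later seeds are drawn with
    probability proportional to squared distance to the nearest seed."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # all points coincide with existing seeds
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def aggregate(points, dal, iters=20, seed=0):
    """Reduce `points` to about dal * len(points) centroids.
    Assumption: DAL is the ratio of aggregated to original points."""
    rng = random.Random(seed)
    k = max(1, round(dal * len(points)))
    centers = kmeanspp_seeds(points, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def histogram_difference(original, aggregated, dim, bins=10):
    """Simplified HDM-style score on one dimension: sum of absolute
    differences between normalized histograms (0 means identical)."""
    lo = min(p[dim] for p in original)
    hi = max(p[dim] for p in original)
    width = (hi - lo) / bins or 1.0

    def hist(data):
        counts = [0] * bins
        for p in data:
            idx = min(bins - 1, max(0, int((p[dim] - lo) / width)))
            counts[idx] += 1
        return [c / len(data) for c in counts]

    return sum(abs(a - b) for a, b in zip(hist(original), hist(aggregated)))


# Hypothetical usage on synthetic 2-D data.
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(5, 2)) for _ in range(200)]
reps = aggregate(data, dal=0.1)
print(len(reps), histogram_difference(data, reps, dim=0))
```

A lower histogram difference at a given DAL would suggest that the aggregated points follow the original distribution more closely on that dimension; the paper's own HDM and NNM definitions should be consulted for the actual evaluation.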


Memo

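Received: 2013-04-15. Published online: 2013-06-03.
Funding: National Natural Science Foundation of China (61070033); Natural Science Foundation of Guangdong Province (9251009001000005); Guangdong Provincial Science and Technology Program (2010B050400011).
Corresponding author: LI Yang. E-mail: kitty_llyy@163.com.
About the authors:
LI Yang, female, born in 1980, is a lecturer and doctoral candidate. Her main research interests are data visualization and machine learning. She has received one second-class Yunfu City Science and Technology Progress Award, holds one granted invention patent and one utility model patent, and has published 8 academic papers.
HAO Zhifeng, male, born in 1968, is a professor and doctoral supervisor. His main research interests are machine learning, bionic algorithms, and bioinformatics. He has led 22 projects at the provincial or ministerial level and above, funded by, among others, the National Natural Science Foundation of China, the national New Century Excellent Talents fund, the Ministry of Education Excellent Young Teachers fund, the Ministry of Education Fok Ying Tung fund, the Natural Science Foundation of Guangdong Province, the Guangdong key science and technology projects program, the Guangdong provincial-ministerial industry-university-research program, and the Guangdong "Thousand-Hundred-Ten Talents" fund. He has received more than 20 national and provincial or ministerial awards, including the Ding Ying Science and Technology Award, Guangdong Province's highest individual honor in science and technology, and has published more than 60 academic papers.
XIE Guangqiang, male, born in 1979, is an associate professor and master's supervisor. His main research interests are multi-agent systems and intelligent control. He has led 11 research projects, including provincial-ministerial industry-university-research projects, holds 10 patents and software copyrights, has supervised students who won more than 30 national and provincial awards, and has published 11 academic papers, 4 of which are indexed by EI and ISTP.
Last Update: 2013-09-23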