[1] WANG Chunkai, ZHUANG Fuzhen, SHI Zhongzhi. System resource allocation for variable data streams[J]. CAAI Transactions on Intelligent Systems, 2019, 14(6): 1278-1285. [doi:10.11992/tis.201908011]
CAAI Transactions on Intelligent Systems[ISSN 1673-4785/CN 23-1538/TP] Volume:
14
Issue:
2019, No. 6
Page number:
1278-1285
Column:
Academic Papers: Knowledge Engineering
Publication date:
2019-11-05
- Title:
-
System resource allocation for variable data streams
- Author(s):
-
WANG Chunkai1,2; ZHUANG Fuzhen2; SHI Zhongzhi2
-
1. Post-doctoral Research Center, China Reinsurance (Group) Corporation, Beijing 100033, China;
2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
-
- Keywords:
-
large-scale data stream management system; variable data stream; incremental learning; model prediction; parameter configuration; mini-batch processing; system performance; outlier detection
- CLC:
-
TP311
- DOI:
-
10.11992/tis.201908011
- Abstract:
-
A large-scale data stream management system (LSDSMS) usually contains a relational query system (RQS) and a stream processing system (SPS). When users submit queries to the RQS, system parameters often have to be set according to the rate and distribution of the data streams. However, because data streams are variable, changing the resource allocation often reduces the performance of the LSDSMS. To address this problem, we propose a framework for automating the characterization deployment in the LSDSMS OrientStream+. First, based on a user-defined query latency threshold, we design a data stream transmission mechanism for a mini-batch scheme. Then, we introduce a multi-level pipeline cache for processing batch data streams in the same configuration, and obtain accurate query results using the timestamps of the data streams. We also propose an incremental learning technique with outlier detection to improve the prediction accuracy of OrientStream+. Finally, we validate the proposed approach on the open-source SPS Storm. Our experimental results show that OrientStream+ can reduce processing latency and improve the LSDSMS throughput.
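The abstract does not specify the learning model or the outlier rule, so the following is only a minimal sketch of what "incremental learning with outlier detection" could look like, not the authors' method. It assumes a one-feature linear model of query latency updated by stochastic gradient steps, and a z-score test on the residuals. The class name `IncrementalLatencyModel` and the parameters `lr`, `z_threshold`, and `warmup` are all hypothetical.

```python
import math


class IncrementalLatencyModel:
    """Online linear model y ~ w*x + b, updated one sample at a time.

    Illustrative assumption, not taken from the paper: a sample is rejected
    as an outlier when its residual lies more than `z_threshold` standard
    deviations from the running mean of previously accepted residuals.
    """

    def __init__(self, lr=0.01, z_threshold=3.0, warmup=10):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr
        self.z_threshold = z_threshold
        self.warmup = warmup
        # Running statistics of accepted residuals (Welford's algorithm).
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def predict(self, x):
        return self.w * x + self.b

    def _is_outlier(self, residual):
        # Accept everything until enough residuals have been observed;
        # at least two are needed for a sample standard deviation.
        if self.n < max(self.warmup, 2):
            return False
        std = math.sqrt(self.m2 / (self.n - 1))
        if std == 0.0:
            return False
        return abs(residual - self.mean) / std > self.z_threshold

    def update(self, x, y):
        """Learn from (x, y); return False if it was rejected as an outlier."""
        residual = y - self.predict(x)
        if self._is_outlier(residual):
            return False
        # Welford update of the residual mean and sum of squared deviations.
        self.n += 1
        delta = residual - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (residual - self.mean)
        # One stochastic gradient step on the squared error.
        self.w += self.lr * residual * x
        self.b += self.lr * residual
        return True
```

In this sketch `x` could stand for an observed stream rate and `y` for a measured latency. A rejected sample leaves `w` and `b` untouched, so a single anomalous measurement does not distort later predictions.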