
Model-free global likelihood subsampling for massive data

Published: 2023-04-07
Speaker: Prof. Yongdao Zhou (周永道)
DateTime: Friday, April 14, 2023, 10:00-11:30 a.m.
Brief Introduction to Speaker:

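Yongdao Zhou is a professor and doctoral supervisor at the School of Statistics and Data Science, Nankai University. He has been selected for the Young Top-Notch Talent Program of the Organization Department of the CPC Central Committee, and has been named a Tianjin Leading Talent in Innovation, a Tianjin 131 Innovative Talent, and one of Nankai University's 100 Young Academic Leaders. His research interests are experimental design and data mining. He has led four projects funded by the National Natural Science Foundation of China, one key project of the Tianjin Natural Science Foundation, and a number of other government-funded and industry-sponsored projects. He has visited the University of California, Los Angeles, Simon Fraser University, the University of Manchester, and the University of Hong Kong, among other institutions. He has published more than 50 papers in leading journals in China and abroad, including the top statistics journals JASA and Biometrika as well as Science China (中国科学), and has co-authored two monographs (in Chinese and English) and two statistics textbooks. He received the First Prize of the Outstanding Achievement Award in Statistical Science Research from the National Bureau of Statistics of China. He currently serves as Secretary-General of the Uniform Design Association of the Chinese Mathematical Society, is a permanent member of the International Chinese Statistical Association (ICSA), and is a reviewer for Mathematical Reviews (USA).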

Place: Lecture Hall M201, 2nd Floor, Building 6
Abstract: Most existing subsampling studies depend heavily on a specified model. If the assumed model is incorrect, the subsample may perform poorly. This paper focuses on a model-free subsampling method, called global likelihood subsampling, whose subsample is robust to different model choices. It leverages the idea of the global likelihood sampler, an effective and robust method for sampling from a given continuous distribution. Furthermore, we accelerate the algorithm for large-scale datasets and extend it to high-dimensional data at relatively low computational complexity. Simulations and real data studies apply the proposed method to regression and classification problems. The results show that the method is robust across different modeling methods and performs well compared with some existing model-free subsampling methods for data compression.