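Shuyuan Wu (伍书缘), PhD candidate at Peking University: Subsampling and Jackknifing: A Method for Analyzing Large-Scale Datasets with Limited Computational Resources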

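About the Speaker
Shuyuan Wu (伍书缘) is a PhD candidate in the Department of Business Statistics and Econometrics, Guanghua School of Management, Peking University. Main research interests include subsampling methods, statistical optimization algorithms, and statistical modeling of large-scale data. Research papers have been published in journals including the Journal of Business and Economic Statistics and Statistica Sinica.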

Related Materials
Download the paper:
http://www3.stat.sinica.edu.tw/preprint/SS-2021-0257_Preprint.pdf

Abstract
Modern statistical analysis often involves large data sets, for which conventional estimation methods are not suitable, owing to limited computational resources. To solve this problem, we propose a novel subsampling-based method with jackknifing. The key idea is to treat the whole sample as if it were the population. Then, we obtain multiple subsamples with greatly reduced sizes using simple random sampling with replacement. We do not recommend sampling methods without replacement, because this would incur a significant data processing cost when the processing occurs on a hard drive. However, such a cost does not exist if the data are processed in memory. Because subsampled data have relatively small sizes, they can be comfortably read into computer memory and processed. Based on subsampled datasets, jackknife-debiased estimators can be obtained for the target parameter. The resulting estimators are statistically consistent, with an extremely small bias. Finally, the jackknife-debiased estimators from different subsamples are averaged to form the final estimator. We show theoretically that the final estimator is consistent and asymptotically normal. Furthermore, its asymptotic statistical efficiency can be as good as that of the whole sample estimator under very mild conditions. The proposed method is easily implemented on most computer systems, and thus is widely applicable.
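The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the standard delete-one jackknife bias correction, a toy target parameter (the squared mean, whose plug-in estimator is biased), and data that already sit in a NumPy array rather than on a hard drive. The names `jackknife_debiased`, `subsample_jackknife`, and `squared_mean`, as well as the subsample size and the number of subsamples, are hypothetical choices made for illustration.

```python
import numpy as np


def jackknife_debiased(sample, estimator):
    """Standard delete-one jackknife bias correction (assumed form)."""
    n = len(sample)
    full = estimator(sample)
    # Re-estimate with each observation left out in turn.
    leave_one_out = np.array(
        [estimator(np.delete(sample, i)) for i in range(n)]
    )
    return n * full - (n - 1) * leave_one_out.mean()


def subsample_jackknife(data, estimator, n_sub, n_rep, rng):
    """Average jackknife-debiased estimates over small random subsamples."""
    estimates = []
    for _ in range(n_rep):
        # Simple random sampling WITH replacement from the whole sample,
        # which is treated as if it were the population.
        idx = rng.integers(0, len(data), size=n_sub)
        estimates.append(jackknife_debiased(data[idx], estimator))
    return float(np.mean(estimates))


def squared_mean(x):
    """Toy target: plug-in estimator of mu**2 (illustrative assumption)."""
    return float(np.mean(x)) ** 2


rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=100_000)  # synthetic "large" data

whole_sample_estimate = squared_mean(data)
subsample_estimate = subsample_jackknife(
    data, squared_mean, n_sub=200, n_rep=100, rng=rng
)
print(whole_sample_estimate, subsample_estimate)
```

Each subsample here holds only 200 observations, so the per-subsample work stays small regardless of the size of the whole sample. In a disk-based setting, the index draw would be replaced by random reads from the file, which is where the abstract's argument for sampling with replacement applies.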