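I was working on a very specific problem that requires reading hundreds of thousands of files, ranging from a few bytes to a few hundred megabytes. Since most of the work consists of enumerating files and moving data off the disk, I resorted to reusing a single Vec buffer for reading the files, hoping to avoid some memory management.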
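That is when I ran into a surprise: the larger the buffer's capacity, the slower the call file.read_to_end(&mut buffer)? becomes. Reading a 300MB file first and then many 1KB files is much slower than doing it the other way around (as long as we don't truncate the buffer).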
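As a small aside on why the order matters, here is a minimal sketch of how a reused Vec holds on to its capacity. It assumes that "truncating" above means giving capacity back, for example with shrink_to; the helper name capacities_after_reset is made up for illustration.

```rust
/// Hypothetical helper: returns the capacity after `clear()` and then
/// after `shrink_to(keep)` for a buffer allocated with `initial` bytes.
fn capacities_after_reset(initial: usize, keep: usize) -> (usize, usize) {
    let mut buffer: Vec<u8> = Vec::with_capacity(initial);
    buffer.extend_from_slice(&[1, 2, 3]);

    // `clear()` only resets the length; the allocation stays as it was.
    buffer.clear();
    let after_clear = buffer.capacity();

    // `shrink_to` may release memory, but keeps at least `keep` bytes.
    buffer.shrink_to(keep);
    let after_shrink = buffer.capacity();

    (after_clear, after_shrink)
}

fn main() {
    let (after_clear, after_shrink) = capacities_after_reset(1024 * 1024, 4096);
    println!("after clear: {after_clear}, after shrink_to: {after_shrink}");
}
```

So once a large file has grown the buffer, every later read_to_end call sees that large capacity unless the capacity is reduced explicitly.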
Confusingly, the slowdown does not happen if I wrap the file in a Take or use read_exact() instead.
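For readers unfamiliar with Take, here is a minimal sketch of what the wrapper does: it caps how many bytes the inner reader may yield. The helper read_prefix is a made-up name, and an in-memory byte slice stands in for the file, so this shows only the API shape, not the slowdown.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads at most `limit` bytes from `reader`
/// into the reused `buffer` and returns the number of bytes read.
fn read_prefix<R: Read>(reader: R, limit: u64, buffer: &mut Vec<u8>) -> io::Result<usize> {
    buffer.clear();
    // `take` wraps the reader so that it reports EOF after `limit` bytes.
    reader.take(limit).read_to_end(buffer)
}

fn main() -> io::Result<()> {
    let mut buffer = Vec::with_capacity(64);
    let read = read_prefix(&b"hello world"[..], 5, &mut buffer)?;
    println!("read {read} bytes: {:?}", String::from_utf8_lossy(&buffer));
    Ok(())
}
```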
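Does anyone know what is going on here? Is it possible that the entire buffer gets (re)initialized on every call? Is this a Windows-specific quirk? Which (Windows-based) profiling tools would you recommend for tackling this kind of problem?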
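Here is a simple reproduction that demonstrates the huge (50x or more on this machine) performance difference between the approaches, independent of disk speed: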
use std::io::Read;
use std::fs::File;

// with a smaller buffer, there's basically no difference between the methods...
// const BUFFER_SIZE: usize = 2 * 1024;
// ...but the larger the Vec, the bigger the discrepancy.
// for simplicity's sake, let's assume this is a hard upper limit.
const BUFFER_SIZE: usize = 300 * 1024 * 1024;

fn naive() {
    let mut buffer = Vec::with_capacity(BUFFER_SIZE);
    for _ in 0..100 {
        let mut file = File::open("some_1kb_file.txt").expect("opening file");
        let metadata = file.metadata().expect("reading metadata");
        let len = metadata.len();
        assert!(len <= BUFFER_SIZE as u64);
        buffer.clear();
        file.read_to_end(&mut buffer).expect("reading file");
        // do "stuff" with buffer
        let check = buffer.iter().fold(0usize, |acc, x| acc.wrapping_add(*x as usize));
        println!("length: {len}, check: {check}");
    }
}

fn take() {
    let mut buffer = Vec::with_capacity(BUFFER_SIZE);
    for _ in 0..100 {
        let file = File::open("some_1kb_file.txt").expect("opening file");
        let metadata = file.metadata().expect("reading metadata");
        let len = metadata.len();
        assert!(len <= BUFFER_SIZE as u64);
        buffer.clear();
        file.take(len).read_to_end(&mut buffer).expect("reading file");
        // this also behaves like the straight `read_to_end` with a significant slowdown:
        // file.take(BUFFER_SIZE as u64).read_to_end(&mut buffer).expect("reading file");
        // do "stuff" with buffer
        let check = buffer.iter().fold(0usize, |acc, x| acc.wrapping_add(*x as usize));
        println!("length: {len}, check: {check}");
    }
}

fn exact() {
    let mut buffer = vec![0u8; BUFFER_SIZE];
    for _ in 0..100 {
        let mut file = File::open("some_1kb_file.txt").expect("opening file");
        let metadata = file.metadata().expect("reading metadata");
        let len = metadata.len() as usize;
        assert!(len <= BUFFER_SIZE);
        // SAFETY: initialized by `vec!` and within capacity by `assert!`
        unsafe { buffer.set_len(len); }
        file.read_exact(&mut buffer[0..len]).expect("reading file");
        // do "stuff" with buffer
        let check = buffer.iter().fold(0usize, |acc, x| acc.wrapping_add(*x as usize));
        println!("length: {len}, check: {check}");
    }
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    if args.len() < 2 {
        println!("usage: {} <method>", args[0]);
        return;
    }
    match args[1].as_str() {
        "naive" => naive(),
        "take" => take(),
        "exact" => exact(),
        _ => println!("Unknown method: {}", args[1]),
    }
}
I tried several combinations in --release mode, with LTO and even +crt-static, and none of them made a noticeable difference.
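To narrow this down without a profiler, one option is to time only the read_to_end call for different buffer capacities. The following is a rough sketch under stated assumptions, not a rigorous benchmark: time_reads is a made-up helper, the scratch file name read_to_end_probe.bin is hypothetical, and the 64 MiB capacity is an arbitrary choice for comparison.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use std::time::{Duration, Instant};

/// Hypothetical helper: reads `path` `iterations` times into one reused
/// buffer of the given capacity and returns the total time spent inside
/// `read_to_end` (opening the file is not counted).
fn time_reads(path: &Path, capacity: usize, iterations: u32) -> io::Result<Duration> {
    let mut buffer: Vec<u8> = Vec::with_capacity(capacity);
    let mut total = Duration::ZERO;
    for _ in 0..iterations {
        let mut file = File::open(path)?;
        buffer.clear(); // keeps the capacity, drops only the contents
        let start = Instant::now();
        file.read_to_end(&mut buffer)?;
        total += start.elapsed();
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file, created only for this measurement.
    let path = std::env::temp_dir().join("read_to_end_probe.bin");
    std::fs::write(&path, vec![7u8; 1024])?;

    // Compare a small capacity against a large one (sizes are arbitrary).
    for capacity in [2 * 1024, 64 * 1024 * 1024] {
        let elapsed = time_reads(&path, capacity, 100)?;
        println!("capacity {capacity:>10}: {elapsed:?}");
    }
    std::fs::remove_file(&path)
}
```

If the time grows with capacity while the file size stays at 1KB, the cost sits inside read_to_end itself rather than in opening the file or in the checksum loop.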