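My Rust program is meant to read a very large (up to several GB), simple text file line by line. The problem is that the file is too large to read all at once or to collect every line into a Vec<String>.
What is the idiomatic way to handle this?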
You want a buffered reader: specifically, the lines() method that BufReader provides through the BufRead trait:
use std::fs::File;
use std::io::{self, prelude::*, BufReader};

fn main() -> io::Result<()> {
    let file = File::open("foo.txt")?;
    let reader = BufReader::new(file);

    for line in reader.lines() {
        println!("{}", line?);
    }

    Ok(())
}
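Note that, as the documentation states, the strings returned by lines() do not include the trailing line ending (neither \n nor \r\n).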
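To make the line-ending behavior concrete, here is a minimal sketch comparing lines() with read_line(). It reads from an in-memory Cursor rather than a file, purely so the example is self-contained; the helper names collect_with_lines and collect_with_read_line are hypothetical and not part of the standard library.

```rust
use std::io::{BufRead, Cursor};

/// Collects lines via `BufRead::lines()`, which strips `\n` and `\r\n`.
fn collect_with_lines(input: &str) -> Vec<String> {
    Cursor::new(input)
        .lines()
        .map(|line| line.expect("reading from memory should not fail"))
        .collect()
}

/// Collects lines via `BufRead::read_line()`, which keeps the line ending.
fn collect_with_read_line(input: &str) -> Vec<String> {
    let mut reader = Cursor::new(input);
    let mut out = Vec::new();
    loop {
        let mut buf = String::new();
        // `read_line` returns Ok(0) once the end of the input is reached.
        let n = reader
            .read_line(&mut buf)
            .expect("reading from memory should not fail");
        if n == 0 {
            break;
        }
        out.push(buf);
    }
    out
}

fn main() {
    let text = "first\nsecond\r\nthird";
    // prints ["first", "second", "third"]
    println!("{:?}", collect_with_lines(text));
    // prints ["first\n", "second\r\n", "third"]
    println!("{:?}", collect_with_read_line(text));
}
```

If you switch from lines() to read_line(), you usually need to trim the line ending yourself, for example with trim_end().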
If you do not want to allocate a String for each line, here is an example that reuses the same buffer:
fn main() -> std::io::Result<()> {
    let mut reader = my_reader::BufReader::open("Cargo.toml")?;
    let mut buffer = String::new();

    while let Some(line) = reader.read_line(&mut buffer) {
        println!("{}", line?.trim());
    }

    Ok(())
}

mod my_reader {
    use std::{
        fs::File,
        io::{self, prelude::*},
    };

    pub struct BufReader {
        reader: io::BufReader<File>,
    }

    impl BufReader {
        pub fn open(path: impl AsRef<std::path::Path>) -> io::Result<Self> {
            let file = File::open(path)?;
            let reader = io::BufReader::new(file);

            Ok(Self { reader })
        }

        pub fn read_line<'buf>(
            &mut self,
            buffer: &'buf mut String,
        ) -> Option<io::Result<&'buf mut String>> {
            buffer.clear();

            self.reader
                .read_line(buffer)
                .map(|u| if u == 0 { None } else { Some(buffer) })
                .transpose()
        }
    }
}
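The same buffer reuse also works without a wrapper type, by calling BufRead::read_line directly in a loop. The sketch below is one possible shape, not the answer's own code: count_matching_lines is a hypothetical helper, and it reads from an in-memory Cursor only to stay self-contained. Any BufRead, such as io::BufReader<File>, could be passed instead.

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical helper: counts the lines that contain `needle`,
/// reusing a single String buffer for every line.
fn count_matching_lines<R: BufRead>(mut reader: R, needle: &str) -> io::Result<usize> {
    let mut buffer = String::new();
    let mut count = 0;

    loop {
        // `read_line` appends to the buffer, so clear it before each read.
        buffer.clear();
        if reader.read_line(&mut buffer)? == 0 {
            break; // Ok(0) means end of input.
        }
        if buffer.contains(needle) {
            count += 1;
        }
    }

    Ok(count)
}

fn main() -> io::Result<()> {
    let input = Cursor::new("alpha\nbeta\nalphabet\n");
    let matches = count_matching_lines(input, "alpha")?;
    println!("{}", matches); // prints 2
    Ok(())
}
```

Because the buffer is cleared rather than reallocated, its capacity grows to fit the longest line seen so far and is then reused for every later line.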
Or, if you prefer a standard iterator, you can use this Rc trick, which I shamelessly took from Reddit:
fn main() -> std::io::Result<()> {
    for line in my_reader::BufReader::open("Cargo.toml")? {
        println!("{}", line?.trim());
    }

    Ok(())
}

mod my_reader {
    use std::{
        fs::File,
        io::{self, prelude::*},
        rc::Rc,
    };

    pub struct BufReader {
        reader: io::BufReader<File>,
        buf: Rc<String>,
    }

    fn new_buf() -> Rc<String> {
        Rc::new(String::with_capacity(1024)) // Tweakable capacity
    }

    impl BufReader {
        pub fn open(path: impl AsRef<std::path::Path>) -> io::Result<Self> {
            let file = File::open(path)?;
            let reader = io::BufReader::new(file);
            let buf = new_buf();

            Ok(Self { reader, buf })
        }
    }

    impl Iterator for BufReader {
        type Item = io::Result<Rc<String>>;

        fn next(&mut self) -> Option<Self::Item> {
            let buf = match Rc::get_mut(&mut self.buf) {
                Some(buf) => {
                    buf.clear();
                    buf
                }
                None => {
                    self.buf = new_buf();
                    Rc::make_mut(&mut self.buf)
                }
            };

            self.reader
                .read_line(buf)
                .map(|u| if u == 0 { None } else { Some(Rc::clone(&self.buf)) })
                .transpose()
        }
    }
}
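The trick hinges on Rc::get_mut, which returns Some only when no other Rc handle to the same allocation is alive. The small sketch below isolates that check; try_reuse is a hypothetical helper written for illustration, not part of the code above.

```rust
use std::rc::Rc;

/// Hypothetical helper: clears the buffer in place and returns true
/// if it is uniquely owned; returns false if another handle still exists.
fn try_reuse(buf: &mut Rc<String>) -> bool {
    match Rc::get_mut(buf) {
        Some(s) => {
            s.clear();
            true
        }
        None => false,
    }
}

fn main() {
    let mut buf = Rc::new(String::from("line one\n"));

    // No clones exist, so the buffer can be reused in place.
    assert!(try_reuse(&mut buf));

    // The caller keeps a line around: a second handle now exists.
    let kept = Rc::clone(&buf);
    assert!(!try_reuse(&mut buf));

    // Once the caller drops its handle, reuse becomes possible again.
    drop(kept);
    assert!(try_reuse(&mut buf));
}
```

In the iterator above, this means the allocation is reused as long as each line is dropped before the next call to next(); if the caller holds on to a line, the iterator falls back to allocating a fresh buffer instead of overwriting the one still in use.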