
Copying Text Files with Character Streams

Have you ever copied a text file like this?

java
// ❌ The most primitive approach: copy one character at a time
FileReader fr = new FileReader("source.txt");
FileWriter fw = new FileWriter("dest.txt");
int c;
while ((c = fr.read()) != -1) {
    fw.write(c);
}

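1,000 read() calls plus 1,000 write() calls add up to 2,000 method calls. For a file of 100,000 characters, that is 200,000 calls. FileReader and FileWriter do buffer bytes internally, so not every call becomes a system call, but each one still pays per-call overhead such as locking and character conversion.

This isn't programming; it's torture for the CPU.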

From worst to best: four iterations

Version 1: character-by-character copy (very poor)

java
public static void copyTextFileV1(String src, String dst) throws IOException {
    try (
        FileReader fr = new FileReader(src);
        FileWriter fw = new FileWriter(dst)
    ) {
        int c;
        while ((c = fr.read()) != -1) { // read one character per call
            fw.write(c);                  // write one character per call
        }
    }
}

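Performance: 1,000 characters cost about 2,000 method calls (one read() and one write() per character).

Version 2: bulk copy with a char array (good)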

java
public static void copyTextFileV2(String src, String dst) throws IOException {
    try (
        FileReader fr = new FileReader(src);
        FileWriter fw = new FileWriter(dst)
    ) {
        char[] buffer = new char[1024];
        int len;
        while ((len = fr.read(buffer)) != -1) {
            fw.write(buffer, 0, len);
        }
    }
}

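Performance: 1,000 characters fit in the 1,024-char buffer, so the loop needs one read that returns data, one write, and a final read that reports end of stream.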
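To make the difference in call counts concrete, here is a minimal sketch that counts how many read calls each loop issues. ReadCallDemo and its nested CountingReader are hypothetical helpers written for this illustration, and the input is an in-memory string rather than a file, so the numbers show call counts only, not timings.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadCallDemo {
    // Hypothetical helper: counts every read call made through this wrapper.
    static class CountingReader extends FilterReader {
        int calls = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            calls++;
            return super.read(cbuf, off, len);
        }
    }

    // Drains the text one character at a time, as in version 1.
    static int countSingleCharReads(String text) throws IOException {
        try (CountingReader r = new CountingReader(new StringReader(text))) {
            while (r.read() != -1) {
                // discard the character; only the call count matters here
            }
            return r.calls;
        }
    }

    // Drains the text through a char[] buffer, as in version 2.
    static int countBulkReads(String text, int bufferSize) throws IOException {
        try (CountingReader r = new CountingReader(new StringReader(text))) {
            char[] buffer = new char[bufferSize];
            while (r.read(buffer) != -1) {
                // discard the chunk; only the call count matters here
            }
            return r.calls;
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "x".repeat(1000);
        System.out.println(countSingleCharReads(text)); // 1001 (1000 characters + end of stream)
        System.out.println(countBulkReads(text, 1024)); // 2 (one chunk + end of stream)
    }
}
```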

Version 3: readLine + newLine (recommended)

java
public static void copyTextFileV3(String src, String dst) throws IOException {
    try (
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(
                new FileInputStream(src), StandardCharsets.UTF_8));
        BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(
                new FileOutputStream(dst), StandardCharsets.UTF_8))
    ) {
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(line);
            writer.newLine();
        }
    }
}

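This is the standard pattern for copying text: readLine() → write() → newLine(). Note that readLine() strips the line terminator and newLine() writes the platform's line separator, so the copy normalizes line endings and always ends with a line break.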
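One consequence of the readLine() / newLine() pattern is easy to miss: the output is not always byte-for-byte identical to the input. The sketch below runs the same loop over in-memory streams to show this. LineCopyDemo, copyLines, and copyToString are hypothetical names chosen for this illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

class LineCopyDemo {
    // Same loop as copyTextFileV3, but over any Reader/Writer pair.
    static void copyLines(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(line);   // readLine() has already removed \n, \r, or \r\n
            writer.newLine();     // writes System.lineSeparator()
        }
        writer.flush();           // push buffered characters to the target Writer
    }

    // Convenience wrapper for experimenting with strings.
    static String copyToString(String text) throws IOException {
        StringWriter out = new StringWriter();
        copyLines(new StringReader(text), out);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String ls = System.lineSeparator();
        // Input uses \r\n and has no trailing line break.
        String copied = copyToString("a\r\nb");
        System.out.println(copied.equals("a" + ls + "b" + ls)); // true
    }
}
```

If the copy must preserve the original line endings exactly, a chunk-based loop over char[] is the safer choice.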

Version 4: the minimal JDK 11+ form

java
String content = Files.readString(Path.of(src), StandardCharsets.UTF_8);
Files.writeString(Path.of(dst), content, StandardCharsets.UTF_8);

However, this loads the entire file into memory, so it is not suitable for large files.
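One way to keep the convenience of readString() while guarding against accidental use on big files is to check the size first. This is only a sketch: SmallFileCopy, copySmallTextFile, and the maxBytes parameter are hypothetical, and the right limit depends on your heap and workload.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SmallFileCopy {
    // Copies src to dst in one shot, but refuses files larger than maxBytes
    // (a caller-chosen, hypothetical limit).
    static void copySmallTextFile(Path src, Path dst, long maxBytes) throws IOException {
        long size = Files.size(src);
        if (size > maxBytes) {
            throw new IllegalArgumentException(
                "File is " + size + " bytes, above the limit of " + maxBytes);
        }
        String content = Files.readString(src, StandardCharsets.UTF_8);
        Files.writeString(dst, content, StandardCharsets.UTF_8);
    }
}
```

A call such as SmallFileCopy.copySmallTextFile(Path.of("source.txt"), Path.of("dest.txt"), 10 * 1024 * 1024) would accept files up to 10 MiB and reject anything larger.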

Advanced variants

Copying with line numbers

java
public static void copyWithLineNumbers(String src, String dst)
        throws IOException {
    try (
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(
                new FileInputStream(src), StandardCharsets.UTF_8));
        BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(
                new FileOutputStream(dst), StandardCharsets.UTF_8))
    ) {
        String line;
        int lineNum = 0;
        while ((line = reader.readLine()) != null) {
            lineNum++;
            writer.write(String.format("%4d: %s", lineNum, line));
            writer.newLine();
        }
    }
}
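As an alternative to a hand-maintained counter, the JDK's LineNumberReader keeps track of the line number itself. The sketch below is not the method above, just a small in-memory variant to show the "%4d: %s" output format. LineNumberDemo and number are hypothetical names, and it appends '\n' instead of the platform separator to keep the output predictable.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;

class LineNumberDemo {
    // Prefixes each line of text with a right-aligned, 4-wide line number.
    static String number(String text) throws IOException {
        StringBuilder out = new StringBuilder();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // getLineNumber() reports how many lines have been read so far.
                out.append(String.format("%4d: %s", reader.getLineNumber(), line))
                   .append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(number("alpha\nbeta"));
        // prints:
        //    1: alpha
        //    2: beta
    }
}
```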

Filtering out blank lines

java
public static void copyWithoutBlankLines(String src, String dst)
        throws IOException {
    try (
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(
                new FileInputStream(src), StandardCharsets.UTF_8));
        BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(
                new FileOutputStream(dst), StandardCharsets.UTF_8))
    ) {
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) { // skip blank lines
                writer.write(line);
                writer.newLine();
            }
        }
    }
}
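On JDK 11+, String.isBlank() is a possible replacement for trim().isEmpty(). The two differ slightly: trim() only strips characters up to U+0020, while isBlank() uses Character.isWhitespace, so a line containing only an ideographic space (U+3000) counts as blank for isBlank() but not for trim().isEmpty(). The sketch below combines isBlank() with BufferedReader.lines(). BlankLineFilter, copyNonBlank, and filter are hypothetical names, and it works on in-memory streams for easy experimentation.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Iterator;

class BlankLineFilter {
    // Copies only the lines that contain at least one non-whitespace character.
    static void copyNonBlank(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        // lines() is lazy; read errors surface as UncheckedIOException.
        Iterator<String> lines = reader.lines()
                                       .filter(l -> !l.isBlank())
                                       .iterator();
        while (lines.hasNext()) {
            writer.write(lines.next());
            writer.newLine();
        }
        writer.flush();
    }

    // Convenience wrapper for experimenting with strings.
    static String filter(String text) throws IOException {
        StringWriter out = new StringWriter();
        copyNonBlank(new StringReader(text), out);
        return out.toString();
    }
}
```

Using an Iterator rather than forEach keeps the checked IOException from write() out of the lambda.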

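Large files: reading in chunks

For very large files, readLine() can be a poor fit: it loads each whole line into memory, which hurts when a single line is huge (for example, a file with no line breaks at all). Read in fixed-size chunks instead: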

java
public static void copyLargeTextFile(String src, String dst)
        throws IOException {
    try (
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(
                new FileInputStream(src), StandardCharsets.UTF_8));
        BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(
                new FileOutputStream(dst), StandardCharsets.UTF_8))
    ) {
        char[] buffer = new char[8192];
        int len;
        while ((len = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, len);
        }
    }
}
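On JDK 10+, the chunked loop can also be delegated to Reader.transferTo(Writer), which reads and writes through an internal buffer until end of stream. The following is a sketch under that assumption; TransferCopy and copyWithTransferTo are hypothetical names, and it uses Files.newBufferedReader / Files.newBufferedWriter instead of the nested stream constructors.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TransferCopy {
    // Streams src to dst without holding the whole file in memory.
    // Returns the number of characters copied.
    static long copyWithTransferTo(Path src, Path dst) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(src, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(dst, StandardCharsets.UTF_8)) {
            return reader.transferTo(writer);
        }
    }
}
```

Because nothing is split into lines, the original line endings are carried over unchanged.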

Case conversion

java
public static void toUpperCase(String src, String dst) throws IOException {
    try (
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(
                new FileInputStream(src), StandardCharsets.UTF_8));
        BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(
                new FileOutputStream(dst), StandardCharsets.UTF_8))
    ) {
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(line.toUpperCase());
            writer.newLine();
        }
    }
}
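One caveat with the no-argument toUpperCase(): it uses the JVM's default locale, so the result can vary between machines. A common way to make it deterministic is to pass Locale.ROOT. The sketch below illustrates the difference; UpperCaseDemo, upperRoot, and upperIn are hypothetical names.

```java
import java.util.Locale;

class UpperCaseDemo {
    // Locale-independent upper-casing.
    static String upperRoot(String line) {
        return line.toUpperCase(Locale.ROOT);
    }

    // Upper-casing under an explicit locale, to show how results can differ.
    static String upperIn(String line, Locale locale) {
        return line.toUpperCase(locale);
    }

    public static void main(String[] args) {
        System.out.println(upperRoot("title"));                            // TITLE
        // In Turkish, lowercase i maps to a dotted capital I (U+0130).
        System.out.println(upperIn("title", Locale.forLanguageTag("tr"))); // TİTLE
    }
}
```

Inside the copy loop, this would mean writing line.toUpperCase(Locale.ROOT) instead of line.toUpperCase().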

Performance comparison

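| Method | Use case | Performance | Memory usage |
| --- | --- | --- | --- |
| read(), one character at a time | ❌ Never use | Very poor | |
| read(char[]), bulk | Large text files | | |
| readLine(), line by line | Ordinary text files | | |
| Files.readAllLines() | Small files | | Loads the whole file |
| Files.readString() | Small files (JDK 11+) | | Loads the whole file |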
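Files.readAllLines() appears in the comparison but not in the versions above, so here is a minimal sketch of a copy built on it. ReadAllLinesCopy and copyViaReadAllLines are hypothetical names. Like readString(), it holds the whole file in memory, and like the readLine() loop it rewrites every line ending with the platform separator.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ReadAllLinesCopy {
    // Reads every line into a List, then writes them back out.
    // Only suitable for small files.
    static void copyViaReadAllLines(Path src, Path dst) throws IOException {
        List<String> lines = Files.readAllLines(src, StandardCharsets.UTF_8);
        // Files.write appends the platform line separator after each element.
        Files.write(dst, lines, StandardCharsets.UTF_8);
    }
}
```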

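Remember this rhyme

Copying text? Don't be stressed, readLine handles the rest. Write a line, then break the line; with buffering on, you'll be fine.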
