[aiRust] Error Handling Strategies in Software

  • Source: GPT-4o
  • Status: awaiting manual verification and cleanup

General advice

In a large software project, error handling strategies depend on various factors such as maintainability, performance, and user experience. Rust encourages robust error handling through the Result and Option types, which make errors explicit and prevent unnoticed failures.

Here are two approaches:

1. Handling Errors at the Location (Localized Handling):

  • Advantages:
    • Errors are dealt with where they occur, which can simplify debugging since the error context is fresh.
    • Local fixes often provide more detailed, precise responses to the problem.
  • Disadvantages:
    • Can lead to code duplication or inconsistent error handling across different parts of the project.
    • May make the code harder to read if multiple small error handlers clutter logic flow.

2. Centralized Error Handling:

  • Advantages:
    • Centralizing error handling provides consistency and simplifies maintenance, especially when similar actions need to be taken for different errors.
    • A common handler can improve code clarity by separating error logic from business logic.
    • Central logging or reporting of errors becomes easier.
  • Disadvantages:
    • It can reduce the granularity of error reporting, making it harder to understand the exact context of an error.
    • There’s a risk of suppressing or generalizing errors too much, potentially losing valuable debugging information.

In Rust:

Rust textbooks often suggest a centralized approach for handling specific classes of errors because:

  • Rust’s match expressions and the ? operator make it easy to propagate errors to a common handler and to branch on the error kind once they arrive there.
  • Centralizing the error handling can help you standardize how you convert errors into meaningful responses (e.g., mapping them to HTTP error codes in a web service).

However, a hybrid approach can often be a good solution:

  • Handle critical or domain-specific errors locally where recovery is possible or necessary.
  • Centralize logging, error wrapping, or converting to common error types in broader parts of your application, such as a top-level main function or a central error handler in the logic flow.

Best Practices in Rust:

  1. Use Result and Option appropriately for error propagation.
  2. Leverage the ? operator to propagate errors upward until you reach a point where you can handle them centrally.
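  3. Define custom error types that represent the different error cases across your application using enums. The thiserror crate can derive the boilerplate for such enums, while the anyhow crate offers a single catch-all error type for application code that does not need to match on individual cases.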

This hybrid approach lets you balance simplicity, consistency, and error clarity.
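The hybrid approach can be sketched with the standard library alone. Everything named here (AppError, tolerance_or_default, parse_steps, run, and the 1e-6 default) is hypothetical and only illustrates the split between local recovery and central reporting; it is not taken from any particular project.

```rust
use std::fmt;

/// Hypothetical application-wide error type (illustrative only).
#[derive(Debug, PartialEq)]
enum AppError {
    InvalidStepCount(String),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::InvalidStepCount(raw) => write!(f, "invalid step count: {}", raw),
        }
    }
}

impl std::error::Error for AppError {}

/// Local handling: a missing or malformed tolerance is treated as recoverable,
/// so this function falls back to an assumed default instead of failing.
fn tolerance_or_default(raw: Option<&str>) -> f64 {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(1e-6)
}

/// Propagation: a bad step count cannot be repaired here, so it is converted
/// into the shared error type and handed to the caller.
fn parse_steps(raw: &str) -> Result<u32, AppError> {
    raw.trim()
        .parse::<u32>()
        .map_err(|_| AppError::InvalidStepCount(raw.to_string()))
}

/// Central handling: a single place decides how a failure is reported.
fn run(steps: &str, tolerance: Option<&str>) -> String {
    let tol = tolerance_or_default(tolerance);
    match parse_steps(steps) {
        Ok(n) => format!("running {} steps with tolerance {}", n, tol),
        Err(e) => format!("error: {}", e),
    }
}
```

Calling run("abc", None) takes the central error branch, while a missing tolerance never surfaces as an error at all because it was resolved where it occurred.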

Advice for scientific computing software

In a scientific computing project like quantum chemistry or molecular dynamics, balancing centralized error handling with code readability is crucial because these projects often involve complex numerical operations and multiple layers of abstractions.

Here’s how you can centralize error handling without sacrificing readability:

1. Define a Custom Error Type:

Create a custom error type (using enum) that encompasses different types of errors specific to your domain. This allows for centralized and consistent handling of errors while preserving semantic meaning for different error cases.

For example, in Rust:

   use thiserror::Error;

   #[derive(Error, Debug)]
   pub enum SimulationError {
       #[error("File not found: {0}")]
       FileNotFound(String),

       #[error("Invalid input data: {0}")]
       InvalidInput(String),

       #[error("Computation failed at step {0}: {1}")]
       ComputationError(usize, String),

       #[error("Other error: {0}")]
       Other(String),
   }

By doing this, you create a centralized place where all errors in the project are defined and categorized.
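For readers who want to see what the derive saves, here is a hand-written, standard-library-only sketch of the same enum. It is an approximation written for illustration, not the literal code that thiserror generates.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub enum SimulationError {
    FileNotFound(String),
    InvalidInput(String),
    ComputationError(usize, String),
    Other(String),
}

// Display supplies the human-readable message that #[error("...")] would provide.
impl fmt::Display for SimulationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SimulationError::FileNotFound(path) => write!(f, "File not found: {}", path),
            SimulationError::InvalidInput(msg) => write!(f, "Invalid input data: {}", msg),
            SimulationError::ComputationError(step, msg) => {
                write!(f, "Computation failed at step {}: {}", step, msg)
            }
            SimulationError::Other(msg) => write!(f, "Other error: {}", msg),
        }
    }
}

// An empty impl is enough: Error only requires Debug and Display.
impl Error for SimulationError {}
```

Either form gives the rest of the code one type to return and one place to look up what can go wrong.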

2. Error Propagation Using the ? Operator:

In Rust, you can propagate errors using the ? operator to ensure that error handling doesn’t clutter your core computational logic. This allows you to keep the code readable and concise, while still allowing errors to bubble up to higher levels where they can be handled.

Example:

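   // Read the whole input file; any I/O failure is reported as FileNotFound.
   fn read_input_file(file_path: &str) -> Result<String, SimulationError> {
       let content = std::fs::read_to_string(file_path)
           .map_err(|_| SimulationError::FileNotFound(file_path.to_string()))?;
       Ok(content)
   }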

   fn run_simulation(input_file: &str) -> Result<(), SimulationError> {
       let data = read_input_file(input_file)?;
       // Perform computation...
       Ok(())
   }

Here, the ? operator propagates the error upward, maintaining the readability of the computation logic while centralizing error reporting to higher levels.

3. Handling Errors in a Centralized Way at the Top Level:

Once errors are propagated to the higher level, such as the main function, handle them centrally with a consistent strategy. You can catch errors, log them, and clean up resources if necessary.

Example in the main function:

   fn main() {
       if let Err(e) = run_simulation("input.dat") {
           eprintln!("Error encountered: {}", e);
           // Optionally: log the error or perform cleanup
           std::process::exit(1); // report the failure to the calling shell or scheduler
       }
   }

By doing this, your core simulation code remains clean and focused on the scientific logic, while error handling is centralized at the outer layer.
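One thing a central handler can own is the mapping from error class to process exit status, which matters when a batch scheduler or wrapper script checks the result. The sketch below assumes a trimmed-down SimulationError and made-up exit codes (2, 3, 4); both are illustrative choices, not a convention from the source.

```rust
#[derive(Debug)]
enum SimulationError {
    FileNotFound(String),
    InvalidInput(String),
    ComputationError(usize, String),
}

/// Central policy: each error class gets its own (hypothetical) exit code.
fn exit_code_for(err: &SimulationError) -> u8 {
    match err {
        SimulationError::FileNotFound(_) => 2,
        SimulationError::InvalidInput(_) => 3,
        SimulationError::ComputationError(_, _) => 4,
    }
}

/// Report the outcome once, at the top level, and return the exit code.
fn finish(result: Result<(), SimulationError>) -> u8 {
    match result {
        Ok(()) => 0,
        Err(e) => {
            eprintln!("Error encountered: {:?}", e);
            exit_code_for(&e)
        }
    }
}
```

A main declared as fn main() -> std::process::ExitCode could end with ExitCode::from(finish(run_simulation("input.dat"))), so the status seen by the caller reflects what went wrong.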

4. Using Libraries to Simplify Error Handling:

Consider using crates like thiserror for defining custom errors or anyhow for simplifying error propagation when you don’t need granular error handling. This keeps your codebase cleaner.

  • thiserror: A lightweight way to define and derive custom errors.
  • anyhow: Useful for cases where you don’t need to distinguish between error types at every step and want simplified error propagation.

5. Centralized Logging for Errors:

You can maintain readability by centralizing logging, without logging errors in every function. Instead, ensure that the errors are propagated upward (using Result<T, E>) and logged at the highest level where they are managed (often in the main loop or a control module).

   fn main() {
       if let Err(e) = run_simulation("input.dat") {
           log::error!("Simulation failed: {}", e);
       }
   }

Using a centralized logging approach ensures that the computational routines stay clean, while all error-related information is managed in one place.

6. Modularize Error Handling by Subsystem:

If your project is large (e.g., handling molecular dynamics, quantum mechanics calculations, and file I/O separately), you can define error types within each subsystem while still centralizing major error handling at a high level. Each module (e.g., file I/O, integrals, or basis set processing) can handle specific errors locally, then propagate them as generalized errors to the top level.

   mod integrals {
       use thiserror::Error; // imports do not carry over from the parent module

       #[derive(Error, Debug)]
       pub enum IntegralError {
           #[error("Invalid basis set: {0}")]
           InvalidBasisSet(String),
       }

       pub fn compute_integrals() -> Result<(), IntegralError> {
           // Integral computation logic
           Ok(())
       }
   }

When combining everything:

   fn main() {
       if let Err(e) = integrals::compute_integrals() {
           eprintln!("Error during integral computation: {}", e);
       }
   }
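To let a subsystem error flow into a top-level error type through ?, the top-level type needs a From implementation. The sketch below uses only the standard library; the Integral and InvalidInput variants, the function parameters, and the 0.0 placeholder result are assumptions made for illustration. With thiserror, the same conversion is usually derived by marking the wrapped field with #[from].

```rust
#[derive(Debug, PartialEq)]
enum IntegralError {
    InvalidBasisSet(String),
}

/// Top-level error that wraps subsystem errors (hypothetical layout).
#[derive(Debug, PartialEq)]
enum SimulationError {
    Integral(IntegralError),
    InvalidInput(String),
}

// This impl is what allows `?` to convert IntegralError automatically.
impl From<IntegralError> for SimulationError {
    fn from(e: IntegralError) -> Self {
        SimulationError::Integral(e)
    }
}

fn compute_integrals(basis: &str) -> Result<f64, IntegralError> {
    if basis.is_empty() {
        return Err(IntegralError::InvalidBasisSet("empty name".to_string()));
    }
    Ok(0.0) // placeholder value, no real integrals are computed here
}

fn run_simulation(basis: &str, n_steps: usize) -> Result<f64, SimulationError> {
    if n_steps == 0 {
        return Err(SimulationError::InvalidInput("n_steps must be positive".to_string()));
    }
    // The IntegralError is converted to SimulationError by `?` via From.
    let energy = compute_integrals(basis)?;
    Ok(energy)
}
```

The subsystem keeps its own precise error type, and the top level can still match on SimulationError::Integral(..) to see which module failed.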

7. Meaningful Contextual Errors:

Add context to errors as they propagate to help with debugging while keeping the error handling centralized. In Rust, you can chain errors using libraries like anyhow::Context.

   use anyhow::{Context, Result};

   // anyhow::Result<T> is shorthand for Result<T, anyhow::Error>.
   fn read_input_file(file_path: &str) -> Result<String> {
       let content = std::fs::read_to_string(file_path)
           .with_context(|| format!("Failed to read file: {}", file_path))?;
       Ok(content)
   }

This way, even though the error is handled at the top level, the context is preserved to understand the root cause of the error.
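The mechanism underneath this kind of context is the source() chain of the standard Error trait. The following standard-library-only sketch shows the idea; ContextError, parse_atom_count, and error_chain are hypothetical names, and this is not how anyhow is implemented internally.

```rust
use std::error::Error;
use std::fmt;

/// Wraps a lower-level error together with a description of what was being attempted.
#[derive(Debug)]
struct ContextError {
    context: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.context)
    }
}

impl Error for ContextError {
    // Exposing the wrapped error keeps the root cause reachable.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

fn parse_atom_count(line: &str) -> Result<usize, ContextError> {
    line.trim().parse::<usize>().map_err(|e| ContextError {
        context: format!("Failed to parse atom count from {:?}", line),
        source: Box::new(e),
    })
}

/// Collect the message of an error and of every error beneath it.
fn error_chain(err: &dyn Error) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(inner) = current {
        messages.push(inner.to_string());
        current = inner.source();
    }
    messages
}
```

A top-level handler can print every entry of error_chain(&err), so the message added during propagation and the original parse failure both appear in the report.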

Summary of Strategy:

  1. Use a custom error type to define and categorize errors.
  2. Use the ? operator to propagate errors without cluttering logic.
  3. Handle errors at the top level with a centralized strategy.
  4. Consider using libraries like thiserror or anyhow for simplifying error management.
  5. Log errors centrally and avoid logging at every level.
  6. Modularize error types for different parts of the system but centralize reporting at the highest level.
  7. Add contextual information to errors during propagation.

These techniques help keep the core computational logic readable while ensuring consistent and centralized error handling across the codebase.

Related techniques

1. The ? Operator

The ? operator in Rust is used for propagating errors in functions that return a Result or Option type. It helps streamline error handling by allowing errors to be forwarded without needing explicit match expressions. If the operation succeeds, the expression evaluates to the value inside the Result or Option and execution continues; if it fails, the error (or None) is returned early from the current function.

How It Works:

  • The ? operator can be used on any expression that returns a Result<T, E> or Option<T>.
  • If the expression evaluates to Ok(T) or Some(T), the T value is extracted, and the function proceeds.
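  • If the expression evaluates to Err(E) or None, the function returns immediately with the error or None. For a Result, the error is first converted with From into the error type declared in the function’s return type.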
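Most examples of ? use Result, so here is a small sketch of the Option case. The function names and the whitespace-separated line format ("symbol x y z") are assumptions chosen for illustration.

```rust
/// Extract the x coordinate (second field) from a line such as "O 0.5 0.0 0.1".
fn x_coordinate(line: &str) -> Option<f64> {
    // `?` on an Option returns None from the function when the field is missing.
    let field = line.split_whitespace().nth(1)?;
    // `.ok()` turns the parse Result into an Option so `?` can be used again.
    let x = field.parse::<f64>().ok()?;
    Some(x)
}

/// Same lookup, but in a function returning Result: the Option must first be
/// converted with ok_or_else, because `?` does not mix Option and Result.
fn x_coordinate_checked(line: &str) -> Result<f64, String> {
    let field = line
        .split_whitespace()
        .nth(1)
        .ok_or_else(|| "missing x field".to_string())?;
    field.parse::<f64>().map_err(|e| e.to_string())
}
```

The first version reports only that something was absent; the second carries a reason, which is usually what a top-level handler needs.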

Example:

fn parse_number(s: &str) -> Result<i32, std::num::ParseIntError> {
    let num: i32 = s.parse()?; // If parse fails, return the error
    Ok(num)
}

In this example:

  • If s.parse() returns Ok(num), the value is bound to num and then returned as Ok(num).
  • If s.parse() returns Err(e), the parse_number function will return the error e immediately.

Benefits:

  • Reduces boilerplate code.
  • Keeps code more readable by eliminating repetitive error handling (match blocks).
  • Ensures that errors are propagated upwards automatically without requiring manual error handling at every step.

Without ?:

fn parse_number(s: &str) -> Result<i32, std::num::ParseIntError> {
    match s.parse() {
        Ok(num) => Ok(num),
        Err(e) => Err(e),
    }
}

In short, the ? operator simplifies code and reduces the need for explicit match statements by automatically propagating errors.

2. Box<dyn Error>

Box<dyn Error> is used in Rust to represent any type of error that implements the Error trait. The dyn (short for dynamic) indicates that the exact type of the error is not known at compile time, and Box is used to allocate the error on the heap, allowing you to store errors of different types under a common interface.

Key Points:

  • Box<dyn Error> is used for dynamic dispatch of errors, allowing you to handle errors of different types in a uniform way.
  • The Error trait in Rust is the standard trait for error types. Any type that implements the Error trait can be stored inside a Box<dyn Error>.
  • It’s commonly used when you don’t care about the specific type of error but want to be able to return any error from a function.

Example:

use std::error::Error;
use std::fs::File;
use std::io::Read; // brings the read_to_string method into scope

fn read_file_content(path: &str) -> Result<String, Box<dyn Error>> {
    let mut file = File::open(path)?; // Might return io::Error
    let mut content = String::new();
    file.read_to_string(&mut content)?; // Might return io::Error
    Ok(content)
}

Here, read_file_content returns Result<String, Box<dyn Error>>, which means:

  • On success, it returns Ok(String).
  • On failure, it returns Err(Box<dyn Error>), where the actual error could be any type that implements Error (e.g., std::io::Error, std::num::ParseIntError, etc.).

Why Use Box<dyn Error>:

  • Flexibility: You can return different kinds of errors (e.g., I/O errors, parsing errors) from the same function without having to define a custom enum to hold all the error types.
  • Simplification: When you don’t need to distinguish between different types of errors, and all you care about is whether the operation succeeded or failed.
  • Dynamic Error Handling: Useful in projects where the exact error types may not be known ahead of time or when you want to centralize error handling.

Trade-offs:

  • Dynamic Dispatch: Since Box<dyn Error> allocates on the heap and uses dynamic dispatch, it may have a slight performance cost on the error path compared to returning concrete error types.
  • Less Type Information: The concrete error type is hidden behind the Error trait, so callers can no longer match on it directly; recovering it requires a runtime downcast.
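The lost type information can be partly recovered at runtime with downcast_ref, a method the standard library provides on dyn Error + 'static. The sketch below uses hypothetical helper names (parse_steps, is_parse_error) to show a boxed error being checked for a specific concrete type.

```rust
use std::error::Error;
use std::num::ParseIntError;

/// Returns a boxed error that may hide either a ParseIntError or a plain message.
fn parse_steps(s: &str) -> Result<u32, Box<dyn Error>> {
    let n: u32 = s.trim().parse()?; // ParseIntError is boxed by `?`
    if n == 0 {
        // A &str converts into Box<dyn Error> through a standard From impl.
        return Err("step count must be positive".into());
    }
    Ok(n)
}

/// Check at runtime whether the hidden concrete type is ParseIntError.
fn is_parse_error(err: &(dyn Error + 'static)) -> bool {
    err.downcast_ref::<ParseIntError>().is_some()
}
```

If callers need this check in more than one or two places, that is usually a sign that an enum with explicit variants would serve better than Box<dyn Error>.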

Example Combining ? and Box<dyn Error>:

use std::error::Error;

fn run_simulation(path: &str) -> Result<(), Box<dyn Error>> {
    let _input = std::fs::read_to_string(path)?; // Propagate error with `?`
    // Perform other operations...
    Ok(())
}

In this example:

  • The read_to_string(path) function returns a Result<String, io::Error>. Using ?, the error is propagated upwards if it occurs, but it’s converted into a Box<dyn Error> to fit the return type of the function.
  • The Box<dyn Error> allows the function to return any type of error that implements Error, not just io::Error.

Summary:

  • The ? operator is used to propagate errors in a concise way, reducing the need for repetitive error handling code.
  • Box<dyn Error> allows functions to return any error type dynamically, providing flexibility in error handling and enabling more generic error management.

These tools are powerful for managing errors efficiently in Rust, especially in large or complex projects like scientific computation where multiple error types can occur.

Posted in AI Teacher

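[Lunch-break reading] Hans Joachim Störig, A History of World Philosophy (世界哲学史)

I recently raced through A History of World Philosophy without digesting much of it. The opening part on ancient philosophy was a pleasure to read, but as the book went deeper, the knowledge and systems it covered gradually went beyond what I can currently handle. By the time I finished, I found I had remembered almost nothing.
That said, reading in which the knowledge passes straight through the head was not entirely without reward. At the very least, the philosophical questions the book raises, from antiquity to the present, gave my brain a good workout, and seeing different philosophers interpret the same question from different angles made me more tolerant of differing ideas. When I reopen the table of contents, these schools and questions no longer feel so unfamiliar.
Having read the whole book, my impression is that the focus of philosophy starts from the problems of everyday life, gradually becomes more abstract and systematic, and finally returns to the problems of everyday life, which is an interesting trajectory. Philosophy no longer strikes me as rigid and useless, but as lively and closely tied to daily life. Perhaps later I can read some other philosophical works systematically, or take more careful reading notes on this history.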

Posted in Reading

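Using a host as a git server

Using a host as a git server is straightforward.

  1. On the remote host, create a user for the git service, here assumed to be gituser
    sudo useradd -m gituser
  2. On the remote host, set up the SSH public key
    su gituser
    cd ~
    mkdir .ssh
    vim .ssh/authorized_keys # paste the public key here
    chmod 700 .ssh
    chmod 600 .ssh/authorized_keys
  3. On the remote host, initialize an empty repo to sync to (using the --bare flag)
    mkdir -p ~/path_to/some_test_repo.git
    cd ~/path_to/some_test_repo.git
    git init --bare
  4. On the local machine, add the remote and push
    cd some_local/repo_path
    git remote add origin gituser@remote-server-url:path_to/some_test_repo.git # path is relative to gituser's home directory
    git push -u origin <branch> # first push: name the branch and set its upstream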

Posted in Bits and Pieces

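Getting the path of a Python module

Most modules have a __file__ attribute that records where they were loaded from (built-in modules compiled into the interpreter do not). Taking numpy as an example: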

import numpy as np
import os
path = os.path.abspath(np.__file__)

Reference:
https://stackoverflow.com/questions/247770/how-to-retrieve-a-modules-path

Posted in Bits and Pieces

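Running multi-GPU jobs across nodes with MPI

Resources must be allocated correctly. mpirun can be invoked as follows, where $ngpu is the total number of GPUs across all nodes and $ngpu_per_node is the number of GPUs on each node: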

mpirun -np $ngpu -npernode $ngpu_per_node \
    --map-by slot:pe=1 \
    --rank-by slot \
    --bind-to core \
    --report-bindings \
    <command to run>

The job script also needs extra settings to request GPUs across nodes. Taking Slurm as an example, it should look like:

#!/bin/bash
#SBATCH -N 4 # number of nodes
#SBATCH --gres=gpu:8 # GPUs per node, corresponds to $ngpu_per_node
#SBATCH --qos=<GPU QoS name> # site-specific, check your cluster's documentation
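
The relationship between the two variables, and which GPU each rank would use, comes down to simple arithmetic. The sketch below assumes one rank per GPU and that ranks fill one node completely before moving to the next; whether the launcher really places ranks this way depends on its mapping options, so the output of --report-bindings should be checked. The function names are hypothetical.

```rust
/// Total number of MPI ranks for the job, i.e. the value of $ngpu.
fn total_ranks(nodes: usize, ngpu_per_node: usize) -> usize {
    nodes * ngpu_per_node
}

/// (node index, local GPU index) for a rank, assuming ranks fill node 0 first,
/// then node 1, and so on.
fn placement(rank: usize, ngpu_per_node: usize) -> (usize, usize) {
    (rank / ngpu_per_node, rank % ngpu_per_node)
}
```

With the Slurm settings above (4 nodes, 8 GPUs each), $ngpu would be 32, and under this assumption rank 9 would land on the second node and use its second GPU.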

Posted in Bits and Pieces

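[Linux] Pinning a program to specific CPU ids

When running calculations on an HPC system, it is sometimes necessary to pin a program to specific CPU ids to get better performance. The following Linux command does this, here restricting the command to CPUs 0 through 5: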

taskset -c 0-5 <command> [arguments for command]

Reference: https://unix.stackexchange.com/a/635992

Posted in Bits and Pieces

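[Lunch-break reading] Cervantes, Don Quixote

It has been a while since I last updated the blog, but I have not been idle: I have been on a long adventure with Quijana, a gentleman of La Mancha and a good man beyond reproach in both character and learning. On his adventures this gentleman goes by another, better-known name: Don Quixote, the Knight of the Sorrowful Countenance.

Knowing next to nothing about Western culture, I felt little resonance with his learned allusions or with the metaphors hidden in his many adventures. And because my lunch breaks are so fragmented, each adventure proceeded in fits and starts, and I would forget the earlier parts by the time I reached the later ones. Even so, the reading gave me a rare kind of leisure. It let me step away from daily work and life and ride along wherever the road took Don Quixote and Sancho; or enjoy Sancho's teasing proverbs; or learn from Don Quixote's conversation on every subject other than chivalry; or sometimes simply hold the book and daydream, wondering whether our knight was lucid or mad.

Was Don Quixote truly mad on his travels, or was he lucid? Did he believe the stories in the books of chivalry literally, or hold them only as a kind of faith? Were the jokes that the people he met played on his madness appropriate? These questions stayed with me throughout the reading, and I still do not know the answers. Perhaps they will have to wait until I next revisit this great work.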

Posted in Reading

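Building Quantum ESPRESSO with HDF5 support on macOS

When QE is built directly against MacPorts, it cannot find the MacPorts HDF5 library. After trying various flag settings without success, I solved the problem by building HDF5 by hand.

Build and install HDF5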

./configure --prefix=$HOME/opt/hdf5 --enable-fortran --enable-parallel  --enable-hl 
make && make install

Note that configure needs extra flags here, such as the one enabling Fortran support.

Build and install QE

configure

F90=gfortran CC=gcc \
BLAS_LIBS="-L/opt/local/lib -lopenblas" \
LAPACK_LIBS="-L/opt/local/lib -lopenblas" \
./configure --enable-openmp --disable-parallel --enable-debug \
            --with-hdf5=$HOME/opt/hdf5

Remove the unsupported flag

The generated makefile contains an unsupported flag, -lrt, which needs to be removed from make.inc.

make and test

make pw # only pw.x and related modules
cd test-suite
make run-tests-pw

Posted in Notes

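[Lunch-break reading] Orwell, Animal Farm

I actually finished this book before Russell's The Conquest of Happiness, but at the time I had no urge to write anything down, so I put it aside. I am filling that gap now, although overall I do not rate the book very highly.

Exploring human nature or social problems through animal fables tends to resonate easily with readers, and this has been so ever since Aesop's Fables. Through literary shaping and exaggeration, the traits of different animals can be made to hint at the qualities of certain kinds of people in society. Some works of this kind do portray noble qualities such as maternal love or self-sacrifice, but most achieve satire by magnifying the dark, animal side of human nature. Whether it praises or satirizes, such a work needs its story to keep a balance between animal nature and human nature: the animals must stay sufficiently connected to the group of people they stand for, yet must not lose their own animal character, or the reader will feel that they are people in costume, like carnival mascots. Because the animals in such works are after all only humanlike, this delicate balance is hard to sustain over a whole book, which is why many animal fables are short and stop once the point is made.

This is probably also why I dislike Animal Farm. There is nothing wrong with using animals to tell a story of a revolution whose fruits are stolen, or of a dragon slayer who eventually becomes the dragon, and many passages are close enough to reality to strike a chord. But because the author pushes the political allegory too hard, the animals, apart from a few deliberate touches, have largely lost their animal nature. At the end of the story the author's point is that the pigs finally turn into humans, but as I read it, all the animals had turned into humans long before. Or rather, there were never any animals, only people wearing animal name tags performing in this novel.

I do not know whether this design was the author's intention. Perhaps this alienation effect is itself a device for making the reader think?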

Posted in Reading

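[Lunch-break reading] Russell, The Conquest of Happiness

The weather has turned cold lately and I have not been going out for walks, so I started a new lunch-break activity: reading. I am lazy by nature. For me, reading mostly means letting the words pass before my eyes and calling it done once I have raced through the book, without taking excerpts or detailed notes. That attitude is fine for mediocre works, whose authors may not have poured much effort into them either, but for excellent works it is somewhat disrespectful to both author and book. So from now on, whenever I come across a work that I consider first-rate, I will record some thoughts in an essay.

The work I finished today is Russell's The Conquest of Happiness. Recently, because my working hours (unpaid overtime) have shot up, my sense of happiness has dropped sharply, and it was at just this juncture that I came across this book. The author starts from the negative side, analyzing the various personal factors that reduce happiness. Reading this part, I could not help marveling that the causes of unhappiness Russell observed in Britain a hundred years ago are almost identical to what people experience today. People still struggle between the fatigue of work and the boredom of idleness; mainstream society still ties moral elevation to enduring hardship and brands reasonable rest and enjoyment as "lying flat"; other people's opinions and gossip can still bring disaster upon ordinary people. But these vexing external factors must ultimately combine with the seeds of unhappiness within a person before they bear the bitter fruit of unhappiness. Some of these internal causes come from human nature, such as the fear of boredom or envy; some come from early education, such as needless competitiveness or an unrealistic sense of guilt; and some are purely a matter of affecting sorrow for its own sake. In the end it is these internal seeds, prompted by the environment, that produce the feeling of unhappiness. Of course, the book is not discussing cases where the external environment is truly extreme; in extremely unfortunate circumstances, external factors are what determine unhappiness. But for most of us ordinary people, it is indeed the internal factors that decide whether we feel happy.

Although there are so many internal and external sources of unhappiness, we should not give up on seeking happiness. Russell offers practical ways to attain it. As for the internal causes of unhappiness, once we know what they are, it becomes possible, by facing them directly, to remove the soil in which they grow in the mind. Take fear: if we stop forever avoiding it and instead analyze it layer by layer with reason, we find that what we dreaded was not so terrible after all. This method of facing problems with reason rather than evading or postponing them has been expounded by many people from different angles, for example as discipline in The Road Less Traveled and as observing the mind in Buddhist practice. On the other hand, we should also cultivate within ourselves the things that produce happiness, whether love, family, or hobbies. In short, we should keep our zest for life and maintain a proper curiosity and balance. This kind of happiness is an instinct, one that small animals, and we ourselves as children, have fully mastered. Perhaps this echoes the esteem for the infant in the Tao Te Ching.

In short, happiness cannot be gained by sitting and waiting for it; it takes continual practice in one's own life. That is probably why the Chinese edition of this book is titled The Road to Happiness (幸福之路).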

Posted in Reading

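[Lunch-break stroll] Beijing-Zhangjiakou Railway Heritage Park (京张铁路遗址公园)

A pretty nice little park. When I have time I want to walk through it again and read the text panels in detail.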

Posted in Strolls

[Lunch-break stroll] Zhichun Park (知春公园)

A park themed on earthquake relief, with some sculptures and educational display boards.

Posted in Strolls