Do you really want that data passed back down to the caller of the allocation? From the description of the failure state you'd want to log that data instead: what's the caller of the allocation going to do if you tell it it failed with a crazy size? It already knows the size, it's the one who asked for it.
So, suppose it's a rust library -- you're locking me into whatever logging system the library author chooses? Maybe I'd like to consume the relevant data at the entry point and send it to a logging system of my choice.
A Rust library likely wouldn't be returning an opaque Box<dyn Error> to begin with. Errors are part of a library's API—it's what allows consumers to handle them—so you'd define an enum of possible errors your library could produce and return that, which would be stored on the stack.
You can do better than the errors in other languages. You can provide all the relevant information in the enum variant.
enum MyApiBindingCrateError {
    // You didn't provide an API key. Maybe we should
    // design our interface to make this impossible
    ApiKeyMissing,
    // Client was unauthorized to make this request
    AuthorizationError,
    // The entity you requested did not exist (404'd)
    NotFoundError,
    // You're sending too many requests to the server
    TooManyRequests,
    // That specific error with the API
    // Maybe users can't delete folders until they're empty
    // Whatever
    SpecificApiIssue1,
    // Some other specific error with the API
    SpecificApiError2,
    // Server didn't respond the way we expected.
    // Here's what it told us
    UnexpectedHttpResponse {
        // HTTP status code
        status_code: StatusCode,
        // If it had a string-encoded body
        body: Option<String>,
    },
    // Unhandled Issue with IO
    IoError(io::Error),
    // Unhandled Issue with request library
    ReqwestError(reqwest::Error),
}
The beauty with Rust is that you can create really detailed concrete errors at the crate level. Your callers will know exactly what the actual error states are.
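To make the "callers know the error states" point concrete, here is a minimal sketch of what the consuming side might look like. The enum is a trimmed-down, hypothetical version of the one above (no reqwest or io variants, and a plain `u16` status code) so that it compiles on its own; `Action` and `react` are made-up names for illustration.

```rust
use std::fmt;

// Hypothetical, trimmed-down version of the crate error above.
#[derive(Debug)]
pub enum ApiError {
    ApiKeyMissing,
    NotFound,
    TooManyRequests,
    UnexpectedHttpResponse { status_code: u16, body: Option<String> },
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::ApiKeyMissing => write!(f, "no API key was provided"),
            ApiError::NotFound => write!(f, "entity not found"),
            ApiError::TooManyRequests => write!(f, "rate limited"),
            ApiError::UnexpectedHttpResponse { status_code, .. } => {
                write!(f, "unexpected HTTP status {status_code}")
            }
        }
    }
}

impl std::error::Error for ApiError {}

// What the caller decides to do with an error; also hypothetical.
#[derive(Debug, PartialEq)]
pub enum Action {
    Retry,
    TreatAsAbsent,
    GiveUp(String),
}

// The match has no wildcard arm, so adding a variant to ApiError
// makes this function fail to compile until the new case is handled.
pub fn react(err: &ApiError) -> Action {
    match err {
        ApiError::TooManyRequests => Action::Retry,
        ApiError::NotFound => Action::TreatAsAbsent,
        ApiError::ApiKeyMissing => Action::GiveUp(err.to_string()),
        ApiError::UnexpectedHttpResponse { body, .. } => Action::GiveUp(format!(
            "{err}: {}",
            body.as_deref().unwrap_or("<no body>")
        )),
    }
}
```

The caller never has to parse a message string to find out what went wrong; the data it needs is in the variant it matched on.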
Your application can be a little less structured if you want. Though with LLMs, I'm using anyhow and thiserror a lot less.
I think this is a clash of terminology: a Rust enum isn't an integer with pretensions of an identity.
You'd describe it as a tagged union in some languages. So when you say you'd return an error with extra information, what that information is is associated with the specific variant of the enum.
Using yuriks' AllocError as an example, if the error is SizeTooLarge, it has the size field. Other errors may have no additional data, others may have different data.
When you return an error from your allocating function, it's a known size, the size of the largest enum variant + the discriminant (tag).
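You can check the "known size" part yourself. This is a rough sketch with a hypothetical stand-in for AllocError (the real one from upthread may have different variants); the exact number depends on the target and on padding, so treat the bounds in the comments as what you'd typically see rather than a guarantee.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the AllocError discussed above.
#[derive(Debug)]
pub enum AllocError {
    // Carries no data at all.
    OutOfMemory,
    // Carries the size that was asked for.
    SizeTooLarge { size: usize },
}

// Size in bytes of the whole enum: room for the largest variant
// (one usize here) plus the tag, rounded up for alignment.
pub fn alloc_error_size() -> usize {
    size_of::<AllocError>()
}

// The Result wrapper is also a fixed-size value; nothing here
// needs a heap allocation to be returned.
pub fn result_size() -> usize {
    size_of::<Result<(), AllocError>>()
}
```

On a typical 64-bit target `alloc_error_size()` comes out to two machine words: one for the `size` field and one (mostly padding) for the tag.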
There's a few confusing things here. For one, just because the allocator gave you an AllocError, that doesn't mean your function has to return an AllocError: you can return whatever error type you choose. If you want to collect a stack trace at that point, put one in there.
What value would a stack trace that includes internal allocator functions be to you? What do you lose by having to collect the stack trace at the point where your function receives an AllocError?
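Something like this is what I mean by collecting the trace where you receive the error. It's only a sketch: `allocate`, `LoadError` and `load_image` are made-up names, and the AllocError here is a stand-in for the one upthread. The only real API used is `std::backtrace::Backtrace`.

```rust
use std::backtrace::Backtrace;

// Hypothetical stand-in for the allocator's error type.
#[derive(Debug)]
pub enum AllocError {
    OutOfMemory,
    SizeTooLarge { size: usize },
}

// Hypothetical allocator entry point with an arbitrary 1 MiB limit.
pub fn allocate(size: usize) -> Result<Vec<u8>, AllocError> {
    const LIMIT: usize = 1 << 20;
    if size > LIMIT {
        return Err(AllocError::SizeTooLarge { size });
    }
    Ok(vec![0; size])
}

// Your own error type: it does not have to be AllocError.
#[derive(Debug)]
pub struct LoadError {
    pub cause: AllocError,
    // Collected in *your* function, so the trace starts in your code
    // rather than inside allocator internals.
    pub backtrace: Backtrace,
}

pub fn load_image(bytes_needed: usize) -> Result<Vec<u8>, LoadError> {
    allocate(bytes_needed).map_err(|cause| LoadError {
        cause,
        // Only records frames when RUST_BACKTRACE / RUST_LIB_BACKTRACE
        // enable it; capturing may itself allocate.
        backtrace: Backtrace::capture(),
    })
}
```

The caller of `load_image` gets both the original cause and a trace through your code, without the allocator having to know anything about backtraces.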
usually “stdout” is good enough, wrapper/runner routes output to logserver for collation and search. who cares about formats as long as it’s reasonably structured and searchable?
it depends, if the functionality represented by the library is known to require a lot of memory (or simply allocation failures are an expected part of its operation), then it should be pretty much part of the API, probably with some tracing/diagnostics interface to get the required visibility into how much memory goes and where.
but for most libraries I don't expect any fancy logging system on allocation failure. maybe even panic is fine.
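for the "part of the API" case, a minimal sketch using the standard library's fallible reservation (`Vec::try_reserve_exact`, which returns a `TryReserveError`). `zeroed_buffer` is a made-up function name; a real library would probably wrap the error in its own type.

```rust
use std::collections::TryReserveError;

// Returns a zero-filled buffer, or an error value instead of
// aborting/panicking when the reservation cannot be satisfied.
pub fn zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // Fails with an error if `len` overflows the maximum capacity
    // or the allocator reports failure.
    buf.try_reserve_exact(len)?;
    // Capacity is already reserved, so this does not reallocate.
    buf.resize(len, 0);
    Ok(buf)
}
```

everything else in the library can keep using the ordinary infallible collections and just let the allocation failure abort.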
If you let the allocation error panic you will get your stack trace.
You can't have a stack trace on an error in the error path that failed to allocate. If you have a "jumbo sized" error and the error fails to allocate, it won't get reported. The only reporting you will get is that the error failed to allocate and this new allocation error overrides the error that failed to allocate.
If you have an OOM and want to log why, your logging will likely fail as well if it allocates. You could in principle work around this with enough effort, but "properly" handling OOM is typically much more trouble than it's worth.
That's because the types of errors where you want a stack trace are a relatively small subset of all possible errors.
Stack traces are only useful for errors that indicate a bug in the program, i.e. something a programmer has to respond to. They're not useful for the vast class of errors that are a result of wrong input, wrong external state, or infrastructure issues.
Rust projects tend to favor panicking over error handling for programmer bugs (which does indeed give you a stack trace depending on environment variables), or even better encoding the invariants in the type system, but there are cases where an error coming from a library is truly, actually unexpected, so both `anyhow` and `thiserror` do provide support for attaching a stack trace in those situations.
This sounds disingenuous. They explained why the language doesn't force stack traces on all errors, and then explained how to get them if you want them.
I see a very opinionated explanation, not a "this is why the language does not" explanation.
>Stack traces are only useful for errors that indicate a bug in the program, i.e. something a programmer has to respond to. They're not useful for the vast class of errors that are a result of wrong input, wrong external state, or infrastructure issues.
This is a personal opinion, not something you can declare as the objective truth. There is a lot of value in seeing what path the program took before it encountered e.g. a validation error.
>but there are cases where an error coming from a library is truly, actually unexpected, so both `anyhow` and `thiserror` do provide support for attaching a stack trace in those situations.
This is wrong because it's up to the library to attach the stacktrace, not the userland code using the library, so saying "you can get them if you want them" is not true.
If the author of the library did not decide to attach the stacktrace, your only option is wrapping it yourself, which you can only do if you already know up front all the paths that can fail. Also, you are not supposed to expose errors from a library with anyhow, they are only for application/top level code.
I'm curious, what's the value of a stack trace of another person's library functions? As mentioned, you can get a stack trace that includes all of your code, that's what was offered to you.
The only thing a library gathering a stack trace instead of you gives you is that it includes traces through code you didn't write & ostensibly aren't responsible for. If you're going to go to the effort of tracing through a dependency's code, you might as well add the stack trace yourself; it's a single line of code from the standard library to collect it, std::backtrace::Backtrace::capture().
EDIT: capture will only actually grab a trace when env vars say it should, you can use force_capture to ignore those. To get to why this isn't the default for errors you're asking for, here's a line from their documentation:
> Capturing a backtrace can be both memory intensive and slow
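A small sketch of the difference between the two calls, using only `std::backtrace`. The helper names (`env_gated`, `always`, `was_captured`) are made up for the example.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};

// Honors RUST_LIB_BACKTRACE / RUST_BACKTRACE: when neither enables
// backtraces, this returns a cheap placeholder with status Disabled.
pub fn env_gated() -> Backtrace {
    Backtrace::capture()
}

// Ignores the environment variables and always tries to walk the stack.
pub fn always() -> Backtrace {
    Backtrace::force_capture()
}

// True only when frames were actually recorded.
pub fn was_captured(bt: &Backtrace) -> bool {
    bt.status() == BacktraceStatus::Captured
}
```

With the env vars unset, `was_captured(&env_gated())` is false and the call costs next to nothing; `always()` pays the full cost every time it runs.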
Ideally (in my ideal world), it would be Result<T, E> that holds the backtrace. The value is that I don't know up front which method call is going to cause an error that is hard to track down, which is why I don't see how "instrument your calls with backtrace yourself" helps. It requires that I already have some idea about the execution path, otherwise I don't know where to put the backtrace instrumentation.
Since Backtrace::capture() is already tied to an env var, we could have the backtrace on Result without affecting performance, since you would only enable it for debugging. This would allow you to e.g. easily track down a situation where you see in your prod logs that you are encountering a lot of "validation error: string is too long" but you can't tell where it is coming from. Flip the env var, redeploy the application, read the backtrace, turn off the env var, fix the problem.
> track down a situation where you see in your prod logs that you are encountering a lot of "validation error: string is too long" but you can't tell where it is coming from.
Capturing a stack trace is a hefty operation: making it happen on _every_ error creation, which would include creating an error in response to another error (like <failure to allocate> causing <failure to create object>) could easily grind a production server to a halt. Especially if there's correctly handled errors happening: every one of them will pay this cost, every time.
It sounds like a really specific problem here; the log line that's happening is generic enough that it doesn't identify which line of code is emitting the log, so you can't just add `capture` to that line (what logging system even does this? printf logging?).
I feel like we are talking past each other, because you ignored the whole part about "it is already tied to an env var, and it would be still tied to an env var" that you would only enable on demand, so who cares if it's a hefty operation? Also what about other languages that capture stacktraces all the time with exceptions, or scripting languages with type errors, where you can't even turn it off? Rust is somehow different?
It is a specific problem, so what? You see that you are sending 500 from an axum handler, and you are logging "serde deserialization error: line 4 invalid", wouldn't it be nice to see where that came from, without instrumenting all the places you are deserializing something?
Rust errors are not exceptions. Catching exceptions is unbelievably expensive in all languages that support them, compared to handling a Rust error value.
Some languages have exceptions as the only error handling mechanism (C#, Java, scripting languages), and it sounds like that's what you're used to. But this is also broadly agreed to be a severely limiting factor of those languages, resulting from being designed at a time when we didn't know better.
If you want to go fast (and Rust does), you cannot be catching exceptions in the hot path, and you certainly can't be throwing exceptions that carry stack traces, because walking the stack to build up the stack trace is many orders of magnitude slower than returning an error value.
Rust's error handling modes are designed with the benefit of hindsight from all those other languages from the last few decades, and reflect the fact that errors broadly fall in two categories: validation failures and programmer errors. The former should be a cheap error code that can be handled, the latter should terminate the program/thread/task and give you enough information to diagnose the problem.
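A minimal sketch of that split, with made-up names (`validate_username`, `UsernameError`, `slot`): bad input comes back as a plain value the caller can handle, while a violated invariant panics, which is where you get the stack trace if RUST_BACKTRACE is set.

```rust
// Validation failure: a small value, cheap to create and to handle.
#[derive(Debug, PartialEq)]
pub enum UsernameError {
    Empty,
    TooLong { len: usize, max: usize },
}

const MAX_LEN: usize = 16;

pub fn validate_username(input: &str) -> Result<&str, UsernameError> {
    let len = input.chars().count();
    if len == 0 {
        Err(UsernameError::Empty)
    } else if len > MAX_LEN {
        Err(UsernameError::TooLong { len, max: MAX_LEN })
    } else {
        Ok(input)
    }
}

// Programmer error: callers are required to pass a valid index, so an
// out-of-range one is a bug and terminates the thread with a message.
pub fn slot(table: &[u32], index: usize) -> u32 {
    assert!(
        index < table.len(),
        "slot index {index} out of range for table of {}",
        table.len()
    );
    table[index]
}
```

A server can reject a thousand bad usernames a second through the first function without ever walking the stack; the second one should never fire in correct code, so paying for a trace when it does is fine.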
I can't reconcile what you're asking for with the situation you're describing. If every single error everywhere in the program created a stack trace and logged it at creation time, your error would be lost under an avalanche of benign errors that are handled. And if you only want to selectively log _that_ error that's interesting, you need to selectively modify the place that logs it, which you don't want to do (because you don't want to have to find it).
It sounds like what you want is the errors you log to always log stack traces. Which is a fine position, I do something like that. It's just not something that can be the default, because it can't be done everywhere.
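Roughly what I mean, as a sketch and not a recommendation for library APIs: an application-level wrapper that grabs a backtrace whenever `?` converts some other error into it. `Traced` and `parse_port` are hypothetical names; the blanket `From` impl is similar in spirit to what application-error crates do, but this is my own simplified version.

```rust
use std::backtrace::Backtrace;
use std::error::Error;
use std::fmt;

// Deliberately does NOT implement std::error::Error itself, so the
// blanket From impl below cannot overlap with `impl From<T> for T`.
pub struct Traced {
    pub source: Box<dyn Error + Send + Sync + 'static>,
    pub backtrace: Backtrace,
}

impl<E> From<E> for Traced
where
    E: Error + Send + Sync + 'static,
{
    fn from(err: E) -> Self {
        Traced {
            source: Box::new(err),
            // Env-var gated: effectively free unless backtraces are enabled.
            backtrace: Backtrace::capture(),
        }
    }
}

impl fmt::Debug for Traced {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}\n{}", self.source, self.backtrace)
    }
}

pub fn parse_port(text: &str) -> Result<u16, Traced> {
    // `?` turns the ParseIntError into a Traced via the From impl,
    // recording where in *this* code the conversion happened.
    let port: u16 = text.trim().parse()?;
    Ok(port)
}
```

The trace points at the `?` in your code where the library error crossed into your application, which is usually the frame you were looking for anyway. It costs a heap allocation per error, which is exactly why it's something you opt into at the application layer.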