Four limitations of Rust's borrow checker


A pattern in these is code that compiles until you change a small thing. A closure that works until you capture a variable, code that works in the main thread but not in a separate thread, or code that works until you move it into an if/else block.

My experience with Rust is like this: make a seemingly small change, it balloons into a compile error that requires large refactoring to appease the borrow checker and type system. I suppose if you repeat this enough you learn how to write code that Rust is happy with first-time. I think my brain just doesn't like Rust.

Your supposition about Rust is correct.

I’ll add that—having paid that upfront cost—I am happily reaping the rewards even when I write code in other languages. It turns out the way that Rust “wants” to be written is overall a pretty good way for you to organize the relationships between parts of a program. And even though the borrow checker isn’t there looking out for you in other languages, you can code as if it is!

I had a similar experience with Erlang/Elixir. The primary codebase I work with in $DAYJOB is C++ but structured very OTP-like with message passing and threads that can crash (via exceptions, we’re not catching defaults for example) and restart themselves.

Because of the way we’ve set up the message passing, and by ensuring that we don’t share other memory between threads, we’ve virtually eliminated most classes of concurrency bugs.

That's the main use-case of Erlang/Elixir anyway: eliminate concurrency/parallelism bugs by copying data in the dangerous places, and make sure to use message passing instead of locks. These two alone have eliminated most of the bugs I've written in other languages.

So, same experience. And it taught me to be a better Golang and Rust programmer, too.

This is why I think cross-training is so important and I should do more of it. Even something relatively minor, like doing the Advent of Code in Elixir (Nim last year), has made my Python significantly easier to reason about. If you quit mutating state so much, you eliminate whole classes of bugs.

That's also how I tend to program after having read Joe Armstrong's dissertation "Making reliable distributed systems in the presence of software errors" and having programmed in Go for a few years.

I'm fortunate enough not to have to write code in other languages very often anymore, but my experience is that writing code in ways that satisfy the compiler actually ends up being code I prefer anyhow. I was somewhat surprised by the first example because I haven't run into something like that, but it's also not really the style I would use to write that function personally (I'm not a big fan of repetitions like having `Some(x)` repeated both as a capture pattern and a return value), so on a whim I tried the way I'd write that function, and it doesn't trigger the same error:

    fn double_lookup_mut(map: &mut HashMap<String, String>, mut k: String) -> Option<&mut String> {
        map.get_mut(&k)?;
        k.push_str("-default");
        map.get_mut(&k)
    }
I wouldn't have guessed that this happened to be a way around a compiler error that people might run into with other ways of writing it; it just genuinely feels like a cleaner way for me to implement a function like that.

Isn't that the opposite of the intended implementation? I don't write Rust, but I think your implementation will always return either `None` or the "fallback" value with the `"-default"` key. In the article, the crucial part is that if the first `map.get_mut()` succeeds, that is what is returned.

Whoops, you're definitely right. This is why I shouldn't try to be productive in the morning.

A great example of how "if it compiles, it runs correctly" is bullshit.

You're reaching pretty hard there. Your assertion is a massive strawman, the implication seeming to be that "every problem in your logic won't exist if it compiles" - no one thinks you can't write bad logic in any language.

Rather it's about a robust type system that gives you tooling to cover many cases at compile time.

Even if we ignore logic, Rust has plenty of runtime tooling and runtime issues can still happen as a result. A complaint I often have about Bevy (despite loving it!) is that it has a lot of runtime-based plugins/etc which I'd prefer to be compile time. Axum, for example, has a really good UX while still being heavily compile time (in my experience at least).

"If it compiles it works" is still true despite my complaints. Because i don't believe the statement even remotely implies you can't write bad logic or bad runtime code in Rust.

This particular example explicitly dodges compile time checking for some ad-hoc (but likely safe) runtime behavior. It’s not a strawman at all. It’s a classic example of how sometimes the compiler can’t help you, and even worse, how programmers can defeat their ability to help you.

Right, but their statement (as I parsed it) was that the "if it compiles it works" phrase is bullshit. Since there's some cases where it obviously won't be true.

At best it's ignorant of what that phrase means, in my view.

> Since there's some cases where it obviously won't be true.

That's not how I've seen it used, every time that I have seen it used.

> At best it's ignorant of what that phrase means, in my view.

I think your opinion on what the phrase means is a minority opinion.

If it were as broad as the OP said, then errors, panics, and especially unsafe wouldn't exist. Even if we ignore unclean sources (say network errors/etc), this isn't a formally proven language where programs cannot possibly fail.

Besides, it likely is very possible to write programs that cannot fail in Rust. This usually means encoding state into the type system (enums included), but few go through that work. They know what corners they're cutting. Further proving that they know their program can fail.

Hell, Rust itself can fail. I struggle to imagine how this is perceived as some long-con from the Rust PR team to convince people Rust programs cannot be written with incorrect logic.

I think they were highlighting that that phrase is bullshit. It’s trivial to escape many compile time checks.

Yea, but that's my argument - that they're being dense (I imagine on purpose?). The phrase doesn't mean that nothing can fail at runtime. Of course it doesn't.

Rather, it means we have many tools for writing a program with many compile-time checks. For example, many representations of state can be described in compile-time checks via enums, type transitions, etc.
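
For a rough illustration of what that can look like (made-up types, not from the article): each state becomes its own type, and the only way to reach the "open" state is through the transition, so misuse fails to compile rather than at runtime.

    struct Closed;
    struct Open { fd: i32 }

    impl Closed {
        // Consumes the closed handle and returns an open one; there is no
        // other way to construct `Open`.
        fn open(self) -> Open {
            Open { fd: 3 } // placeholder for whatever actually opens the resource
        }
    }

    impl Open {
        fn send(&self, _bytes: &[u8]) {
            // ...
        }
    }

    fn demo() {
        let conn = Closed;
        // conn.send(b"hi");    // does not compile: no `send` on Closed
        let conn = conn.open();
        conn.send(b"hi");       // fine: the state is proven by the type
    }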

> The phrase doesn't mean that nothing can fail at runtime.

That's exactly what that phrase means, you're twisting the actual words in the phrase to be able to arrive at something more acceptable in your own mind. All you're saying is "oh you shouldn't take it literally". I can guarantee most people take that literally and it's bullshit.

So you think everyone using that phrase thinks it's not possible to fail at all at runtime, in any manner?

Unsafe, Panic and Error would like a word.

Honestly, I think the majority of the times I've said that sentence has been after running code that has an obvious mistake (like the code I posted above)!

As someone who is interested in getting more serious with Rust, could you explain the essence of how you should always approach organizing code in Rust as to minimize refactors as the code grows?

In my experience there are two versions of "fighting the borrow checker". The first is where the language has tools it needs you to use that you might not've seen before, like enums, Option::take, Arc/Mutex, channels, etc. The second is where you need to stop using references/lifetimes and start using indexes: https://jacko.io/object_soup.html
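
For anyone unfamiliar with the second one, a minimal sketch of the index style (a made-up example, not from that post): one struct owns a Vec of objects, and relationships between objects are stored as indices rather than references, so nothing borrows from the collection for longer than a single method call.

    struct Person {
        name: String,
        friends: Vec<usize>, // indices into World::people
    }

    struct World {
        people: Vec<Person>,
    }

    impl World {
        fn add_person(&mut self, name: &str) -> usize {
            self.people.push(Person { name: name.to_string(), friends: Vec::new() });
            self.people.len() - 1
        }

        fn befriend(&mut self, a: usize, b: usize) {
            // Indexing takes only short-lived borrows of `people`, so the borrow
            // checker is happy even though two elements get touched.
            self.people[a].friends.push(b);
            self.people[b].friends.push(a);
        }
    }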

> and start using indexes

So basically raw pointers with extra hoops to jump through.

Sort of. But you still get guaranteed-unaliased references when you need them. And generational indexes (SlotMap etc) let you ask "has this pointer been freed" instead of just hoping you never get it wrong.

Yep. The array index pattern is unsafe code without the unsafe keyword. Amazing how much trouble Rust people go through to make code "safe" only to undermine this safety by emulating unsafe code with safe code.

It’s not the same. The term “safe” has a specific meaning in rust: memory safety. As in:

- no buffer overflows - no use after free - no data races

These problems lead to security vulnerabilities whose scope extends beyond your application. Buffer overflows have historically been the primary mechanism for taking over entire machines. If you emulate pointers with Rust indices and don’t use “unsafe”, those types of attacks are impossible.

What you’re referring to here is correctness. Safe Rust still allows you to write programs which can be placed in an invalid state, and that may have security implications for your application.

It would be great if the compiler could guarantee that invalid states are unreachable. But those types of guarantees exist on a continuum and no language can do all the work for you.

"Safe" as a colloquial meaning: free from danger. The whole reason we care about memory safety is that memory errors become security issues. Rust does nothing to prevent memory leaks and deadlocks, but it does prevent memory errors becoming arbitrary code execution.

Rust programs may contain memory errors (e.g. improper use of interior mutability and out of bounds array access), but the runtime guarantees that these errors don't become security issues.

This is good.

When you start using array indices to manage objects, you give up some of the protections built into the Rust type system. Yes, you're still safe from some classes of vulnerability, but other kinds of vulnerabilities, ones you thought you abolished because "Rust provides memory safety!!!", reappear.

Rust is a last resort. Just write managed code. And if you insist on Rust, reach for Arc before using the array index hack.

I tend to agree w.r.t. managed languages.

Still, being free from GC is important in some domains. Beyond being able to attach types to scopes via lifetimes, it also provides runtime array bounds checks, reference-counting shared pointers, tagged unions, etc. These are the techniques used by managed languages to achieve memory-safety and correctness!

For me, Rust occupies an in-between space. It gives you more memory-safe tools to describe your problem domain than C. But it is less colloquially "safe" than managed languages because ownership is hard.

Your larger point with indices is true: using them throws away some benefits of lifetimes. The issue is granularity. The allocation assigned to the collection as a whole is governed by rust ownership. The structures you choose to put inside that allocation are not. In your user ID example, the programmer of that system should have used a generational arena such as:

https://github.com/fitzgen/generational-arena

It solves exactly this problem. When you `free` any index, it bumps a counter which is paired with the next allocated index/slot pair. If you want to avoid having to "free" it manually, you'll have to devise a system using `Drop` and a combination of command queues, reference-counted cells, locks, whatever makes sense. Without a GC you need to address the issue of allocating/freeing slots for objects within an allocation in some way.
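
A stripped-down sketch of that mechanism (hypothetical code illustrating the idea, not the crate's actual API): every slot carries a generation counter, keys carry the generation they were issued with, and a stale key is detected instead of silently hitting whatever reused the slot.

    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    struct Key {
        slot: usize,
        generation: u64,
    }

    struct Arena<T> {
        slots: Vec<(u64, Option<T>)>, // (generation, value)
    }

    impl<T> Arena<T> {
        fn insert(&mut self, value: T) -> Key {
            // Reuse the first free slot if there is one, otherwise grow.
            if let Some(slot) = self.slots.iter().position(|(_, v)| v.is_none()) {
                let (g, v) = &mut self.slots[slot];
                *v = Some(value);
                Key { slot, generation: *g }
            } else {
                self.slots.push((0, Some(value)));
                Key { slot: self.slots.len() - 1, generation: 0 }
            }
        }

        fn remove(&mut self, key: Key) -> Option<T> {
            let (g, v) = self.slots.get_mut(key.slot)?;
            if *g != key.generation {
                return None; // stale key, already freed
            }
            *g += 1; // bump: every key issued before this point stops matching
            v.take()
        }

        fn get(&self, key: Key) -> Option<&T> {
            let (g, v) = self.slots.get(key.slot)?;
            if *g == key.generation { v.as_ref() } else { None }
        }
    }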

Much of the Rust ecosystem is libraries written by people who work hard to think through just these types of problems. They ask: "ok, we've solved memory-safety, now how can we help make code dealing with this other thing more ergonomic and correct by default?".

Absolutely. If I had to use an index model in Rust, I'd use that kind of generational approach. I just worry that people aren't going to be diligent enough to take precautions like this.

'generational-arena' is unmaintained and archived now.

Even when you use array indices, I don't think you give those protections up. Maybe a few, sure, but the situation is still overall improved.

Many of the rules that references have to live by also apply to arrays:

- You cannot have two owners simultaneously hold a mutable reference to a region of the array (unless they are not overlapping)

- The array itself keeps the Sync/Send traits, providing thread safety

- The compiler cannot do provenance-based optimizations, and thus cannot introduce undefined behavior; most other kinds of undefined behavior are still prevented

- Null dereferences still do not exist and other classes of errors related to pointers still do not exist

Logic errors and security issues will still exist of course, but Rust never claimed guarantees against them; only guarantees against undefined behavior.

I'm not going to argue against managed code. If you can afford a GC, you should absolutely use it. But, compared to C++, if you have to make that choice, safety-wise Rust is overall an improvement.

You can still have use-after-free errors when you use array indices. This can happen if you implement a way to "free" elements stored in the vector. "free" should be interpreted in a wide sense. There's no way for Rust to prevent you from marking an array index as free and later using it.

> There's no way for Rust to prevent you from marking an array index as free and later using it.

I 2/3rds disagree with this. There are three different cases:

- Plain Vec<T>. In this case you just can't remove elements. (At least not without screwing up the indexes of other elements, so not in the cases we're talking about here.)

- Vec<Option<T>>. In this case you can make index reuse mistakes. However, this is less efficient and less convenient than...

- SlotMap<T> or similar. This uses generational indexes to solve the reuse problem, and it provides other nice conveniences. The only real downside is that you need to know about it and take a dependency.

The consequences of use-after-free are different for the two languages.

In Rust it is a logic error, which leads to data corruption or program panics within your application. In C it leads to data corruption and is an attack vector for the entire machine.

And yes, while Rust itself doesn’t help you with this type of error, there are plenty of Rust libraries which do.

The difference is that the semantics of your program are still well-defined, even with bugs in index-based arenas.

The semantics of a POSIX program are well-defined under arbitrary memory corruption too --- just at a low level. Even with a busted heap, execution is deterministic and every interaction with the kernel has defined behavior --- even if that behavior is SIGSEGV.

Likewise, safe but buggy Rust might be well-defined at one level of abstraction but not another.

Imagine an array index scheme for logged-in-user objects. Suppose we grab an index to an unprivileged user and stuff it in some data structure, letting it dangle. The user logs out. The index is still around. Now a privileged user logs in and reuses the same slot. We do an access check against the old index stored in the data structure. Boom! Security problems of EXACTLY the sort we have in C.

It doesn't matter that the behavior is well-defined at the Rust level: the application still has an escalation of privilege vulnerability arising from a use-after-free even if no part of the program has the word u-n-s-a-f-e.

Undefined behavior in C/C++ has a different meaning than you're using. If a compiler encounters a piece of code that does something whose behavior is undefined in the spec, it can theoretically emit code that does anything and still be compliant with the standards. This could include things like setting the device on fire and launching missiles, but more typically is something seemingly innocuous like ignoring that part of the code entirely.

An example I've seen in actual code: You checked for null before dereferencing a variable, but there is one code path that bypasses the null check. The compiler knows that dereferencing a null pointer is undefined so it concludes that the pointer can never be null and removes the null checks from all of the code paths as an "optimization".

That's the C/C++ foot-gun of undefined behavior. It's very different from memory safety and correctness that you're conflating it with.

From the kernel's POV, there's no undefined behavior in user code. (If the kernel knew a program had violated C's memory rules, it could kill it and we wouldn't have endemic security vulnerabilities.) Likewise, in safe Rust, the access to that array might be well defined with respect to Rust's view of the world (just like even UB in C programs is well defined from the kernel POV), but it can still cause havoc at a higher level of abstraction --- your application. And it's hard to predict what kind of breakage at the application layer might result.

and you now have unchecked use-after-decommissioning-the-index and double-decommission-the-index errors, which could be security regressions

That's true only if you use Vec<T> instead of a specialized arena, either append-only (maybe growable) or generational, where invalidation is tracked for you on access.

Yeah if you go with Vec, you have to accept that you can't delete anything until you're done with the whole collection. A lot of programs (including basically anything that isn't long running) can accept that. The rest need to use SlotMap or similar, which is an easy transition that you can make as needed.

> So basically raw pointers with extra hoops to jump through.

That's one way to look at it.

The other way is: raw pointers, but with mechanical sympathy. Array based data structures crush pointer based data structures in performance.

> Array based data structures crush pointer based data structures in performance

`array[5]` and `*(array + 5)` generate the same code... Heap-based non-contiguous data structures definitely are slower than stack-based contiguous data structures.

How you index into them is unrelated to performance.

Effectively, pointers are just indexes into the big array which is system memory... I agree with the parent: effectively pointers, without any of the checks that references would give you.

> pointers are just indexes into the big array which is system memory...

I’m sure you are aware but for anyone else reading who might not be, pointers actually index into your very own private array.

On most architectures, the MMU is responsible for mapping pages in your private array to pages in system memory or pages on disk (a page is a subarray of fixed size, usually 4 KiB).

Usually you only get a crash if you access a page that is not currently allocated to your process. Otherwise you get the much more insidious behaviour of silent corruption.

> How you index into them is unrelated to performance.

Not true. If you store u32 indices, that can impose less memory/cache pressure than 64-bit pointers.

Also indices are trivially serializable, which cannot be said for pointers.

I'll happily look at a benchmark which shows that the size of the index has any significant performance implications vs the work done with the data stored at said index, never mind the data actually stored there.

I haven't looked closely at the decompiled code but I wouldn't be surprised if iterating through a contiguous data structure has no cache pressure but is rather just incrementing a register without a load at all other than the first one.

And if you aren't iterating sequentially you are likely blowing the cache regardless purely based on jumping around in memory.

This is an optimisation that may be premature.

EDIT:

> Also indices are trivially serializable, which cannot be said for pointers

Pointers are literally 64bit ints... And converting them to an index is extremely quick if you want to store an offset instead when serialising.

I'm not sure if we are missing each other here. If you want an index then use indices. There is no performance difference when iterating through a data structure, there may be some for other operations but that has nothing to do with the fact they are pointers.

Back to the original parent that spurred this discussion... Replacing a reference (which is basically a pointer with some added sugar) with an index into an array is effectively just using raw pointers to get around the borrow checker.

> Pointers are literally 64bit ints... And converting them to an index is extremely quick if you want to store an offset instead when serialising.

I'm not them, but they're saying pointer based structures are just less trivial to serialize. For example, to serialize a linked list, you basically need to copy them into an array of nodes, replacing each pointer to a node with a local offset into this array. You can't convert them into indices just with pointer arithmetic because each allocation was made individually. Pointer arithmetic assumes that they already exist in some array, which would make the use of pointers instead of indices inefficient and redundant.

I understand that entirely; a linked list is a non-contiguous heap-based data structure.

What I am saying is if you store a reference to an item in a Vec or an index to an item to a Vec it is an implementation detail and looking up the reference or the index generates effectively the same machine code.

Specifically in the case that I'm guessing they are referring to, which is the optimisation used in patterns like ECS. The optimisation there is the fact that it is stored contiguously in memory and therefore it is trivial to use SIMD or a GPU to do operations on the data.

In that case whether you are storing a u32 or size_t doesn't exactly matter and on a 32bit arch is literally equivalent. It's going to be dwarfed by loading the data into cache if you are randomly accessing the items or by the actual operations done to the data or both.

As I said, sure, use an index, but that wasn't the initial discussion. The discussion was doing it to get around the borrow checker, which is effectively just removing the borrow checker from the equation entirely, and you may as well have used a different language.

The main benefit from contiguous storage is it can be a better match to the cache. Modern CPUs read an entire cache line in a burst. So if you're iterating through a contiguous array of items then chances are the data is already in the cache. Also the processor tends to prefetch cache lines when it recognizes a linear access pattern, so it can be fetching the next element in the array while it's working on the one before it.

> Pointers are literally 64bit ints... And converting them to an index is extremely quick if you want to store an offset instead when serialising.

This implies serialisation/deserialisation passes, so you can't really let bigger-than-RAM data live on your disk.

[deleted]

> stop using references/lifetimes and start using indexes

Aren't arenas a nicer suggestion? https://docs.rs/bumpalo/latest/bumpalo/ https://docs.rs/typed-arena/latest/typed_arena/
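
For a sense of the ergonomics, a tiny usage sketch with bumpalo (assuming the crate is added as a dependency): allocations hand back references tied to the arena, and everything is freed in one go when the arena is dropped.

    use bumpalo::Bump;

    fn demo() {
        let bump = Bump::new();

        // Allocations return &mut references that borrow from `bump`, so they
        // all share its lifetime and are freed together when it is dropped.
        let x: &mut i32 = bump.alloc(41);
        *x += 1;
        assert_eq!(*x, 42);
    }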

Depending on the use case, another pattern that plays very nicely with Rust is the EC part of ECS: https://github.com/Ralith/hecs

Yes, Slab and SlotMap are the next stop on this train, and ECS is the last stop. But a simple Vec can get you surprisingly far. Most small programs never really need to delete anything.

> It turns out the way that Rust “wants” to be written is overall a pretty good way for you to organize the relationships between parts of a program

That's what it promised not to do, though! Zero cost abstractions aren't zero cost when they force you into a particular design. Several of the cases in the linked article involve actual runtime and code size overhead vs. the obvious legacy/unchecked idioms.

> vs. the obvious legacy/unchecked idioms

You can go crazy with legacy/unchecked/unsafe stuff if you want to in Rust. It's less convenient and more difficult than C in some ways, but 1) it's also safer and more convenient in other ways, and 2) "this will be safe and convenient" isn't exactly the reason we dive into legacy/unchecked/unsafe stuff.

And of course the greatest strength of the whole Rust language is that folks who want to do crazy unsafe stuff can package it up in a safe interface for the rest of us to use.

> crazy unsafe stuff

The first example in the linked article is checking if a value is stored in a container and doing something different if it's not than if it is. Hardly "crazy unsafe stuff".

I think there's an important distinction here. A systems programming language needs to have all the max speed / minimum overhead / UB-prone stuff available, but that stuff doesn't need to be the default / most convenient way of doing things. Rust heavily (both syntactically and culturally) encourages safe patterns that sometimes involve runtime overhead, like checked indexing, but this isn't the same as "forcing" you into these patterns.
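
Checked indexing is a good concrete example of that split: the safe forms are the default, and the unchecked form is available but has to be asked for explicitly.

    fn lookups(v: &[i32], i: usize) -> Option<i32> {
        // Default: bounds-checked, panics if `i` is out of range.
        // let a = v[i];

        // Also safe: failure becomes a value instead of a panic.
        let b = v.get(i).copied();

        // Max speed, UB if `i` is out of range, and you must opt in with `unsafe`.
        if i < v.len() {
            let _c = unsafe { *v.get_unchecked(i) };
        }

        b
    }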

And I can only repeat verbatim: The first example in the linked article is checking if a value is stored in a container and doing something different if it's not than if it is.

Hardly "max speed / minimum overhead / UB-prone stuff"

It’s sad to see your comment this far down in the thread, after it had to be coaxed out of you, and after all the usual defensive arguments from the Rust crowd. It’s a real issue that it’s so hard to make the Rust community admit that obvious flaws actually exist.

As someone who has spent significant time working on static analysers and provers, it annoys me to no end how most of the Rust community will happily take what the borrow checker imposes on them as some form of gospel and never question the tool. It’s bordering on Stockholm syndrome sometimes.

I think a lot of pushback is because people are talking past each other.

The reality:

- the borrow checker has limitations and doesn't accept some constructs that could be proved safe, given Rust's own rules

- the borrow checker is a net positive as it does push you towards better constructs, and the (rare) times it forbids you from doing something that could be safe, you have safe (sometimes not zero-cost) and unsafe escape hatches

But these are then understood by the intransigent as:

- the borrow checker is always detrimental

- the borrow checker can do nothing wrong

At that point, no one can understand why the other person is being obtuse, and you end up with... well, the comment section under every Rust article.

Rust absolutely forces you into a particular design - that's what the borrow checker is about.

It's definitely possible to write correct code (in unsafe Rust or other languages) that wouldn't satisfy the borrow checker.

Rust restricts you, and in turn gives you guarantees. Zero cost only means that you don't pay extra (CPU/memory) at runtime, whereas in interpreted/GC languages you do.

> Zero cost only means that you don't pay extra (CPU/memory) at runtime, whereas in interpreted/GC languages you do.

But... again, you do. The examples in the linked article have runtime overhead. Likewise, every time you wrap something in Box or Vec or Arc to massage the lifetime analysis, you're incurring heap overhead that is actually going to be worse than what a GC would see (because GCs don't need the extra layer of indirection or to deal with reference counts).

It's fine to explain this as acceptable or good design or better than the alternatives. It's not fine to call it "Zero Cost" and then engage in apologia and semantic arguments when challenged.

We should distinguish code that's correct using the borrowing metaphor but won't pass borrowck in current Rust (such code will inevitably exist thanks to Rice's Theorem) from code that's not correct under this metaphor but would actually work, under some other model for reference or pointer types.

Because Rust is intended for systems programming it is comfortable (in unsafe) expressing ideas which cannot be modelled at all. Miri has no idea how the MMIO registers for the GPIO controller work, so, too bad, Rust can't help you achieve assurance that your GPIO twiddling code is correct in your 1200 byte firmware. But, it can help you when you're talking about things it does have a model for, such as its own data structures, which obey Rust's rules (in this case the strict provenance rule) not "memory" that's actually a physical device register.

It sounds like the ideal, then, would be to detect the problematic patterns earlier so people wouldn't need to bang their heads against it.

Why would you cling to some cockamamie memory management model, where it is not required or enforced?

That's like Stockholm Syndrome.

Maybe I'm just brainwashed, but most of the time for me, these "forced refactors" are actually a good thing in the long run.

The thing is, you can almost always weasel your way around the borrow checker with some unsafe blocks and pointers. I tend to do so pretty regularly when prototyping. And then I'll often keep the weasel code around for longer than I should (as you do), and almost every time it causes a very subtle, hard-to-figure-out bug.

I think the problem isn't that the forced changes are bad, it's that they're lumpy. If you're doing incremental development, you want to be able to quickly make a long sequence of small changes. If some of those changes randomly require you to turn your program inside-out, then incremental development becomes painful.

Some people say that after a while, they learn how to structure their program from the start so that these changes do not become necessary. But that is also partly giving up incremental development.

My concern is slightly different; it's the ease of debugging. And I don't mean debugging the code that I (or somebody else) wrote, but the ability to freely modify the code to kick some ideas around and see what sticks, etc., which I frequently need to do, given my field.

As an example, consider a pointer to a const object as a function param in C++: I can cast it away in a second and modify it as I go on my experiments.

Any thoughts on this? How much of an extra friction would you say is introduced in Rust?

I would say it's pretty easy to do similar stuff in Rust to skirt the borrow checker. e.g. you can cast a mut ref to a mut ptr, then back to a mut ref, and then you're allowed to have multiple of them.

The problem is Rust (and its community) does a very good job at discouraging things like that, and there are no guides on how to do so (you might get lambasted for writing one. maybe I should try)
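
For what it's worth, a minimal sketch of that escape hatch (deliberately discouraged, and easy to turn into undefined behavior if you use the aliases in the wrong order):

    fn example(v: &mut Vec<i32>) {
        let p: *mut Vec<i32> = v;                       // &mut -> *mut, no unsafe needed
        let alias: &mut Vec<i32> = unsafe { &mut *p };  // *mut -> &mut needs unsafe
        alias.push(1);
        // The borrow checker no longer knows `alias` and `v` point at the same
        // Vec, so it will happily let you hold both. The aliasing rules still
        // apply at runtime, though; interleaving uses of the two is undefined
        // behavior, which is exactly why this style gets discouraged.
    }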

I don’t really think it gives up incremental development. I’ve done large and small refactors in multiple Rust code bases, and I’ve never run into one where a tiny change suddenly ballooned into a huge refactor.

Rust definitely forces you to make more deliberate changes in your design. It took me about 6 months to get past hitting that regularly. Once you do get past it, Rust is awesome though.

I suppose you haven't had to refactor a large code base yet just because a lifetime has to change?

Nope.

I have worked professionally for several years on what would now be considered a legacy Rust code base. Probably hundreds of thousands of lines, across multiple mission-critical applications. Few applications need to juggle lifetimes in a way that is that limiting; maybe a module would need some buffing, but not a major code base change.

Most first-pass and even refined "in production" code bases I work on do not have deeply intertwined lifetimes that require immense refactoring to cater to changes. Before someone goes "oh, your team writes bad code!", I would say that we had no noteworthy problems with lifetimes and our implementations far surpassed the performance of other GC languages in the areas that mattered. The company was successfully built by a skeleton crew and its success is owed to an incredibly stable product that scales out really well.

I question how many applications truly "need" that much reference juggling in their designs. A couple allocations or reference counted pointers go a really long way to reducing cognitive complexity. We use arenas and whatever else when we need them, but no I've never dealt with this in a way that was an actual terrible issue.

Actually, newer versions of Rust need these refactors way less often, since more lifetimes can be elided, and when using generic code or impl traits you can basically scrap a ton of it. I still sometimes stumble upon the first example though, but most often it happens because I want to encapsulate everything inside the function instead of doing some work outside of it.

> I suppose if you repeat this enough you learn how to write code that Rust is happy with first-time.

But this assumes that your specifications do not change.

Which we know couldn't be further from the truth in the real world.

Perhaps it's just me, but a language where you can never change your mind about something is __not__ a fun language.

Also, my manager won't accept it if I tell him that he can't change the specs.

Maybe Rust is not for me ...

I genuinely don’t know where you’ve gotten the idea that you can “never change your mind” about anything.

I have changed my mind plenty of times about my Rust programs, both in design and implementation. And the language does a damn good job of holding my hand through the process. I have chosen to go through both huge API redesigns and large refactors of internals and had everything “just work”. It’s really nice.

If Rust were like you think it is, you’re right, it wouldn’t be enjoyable to use. Thankfully it is nothing like that.

"Malum est consilium, quod mutari non potest" you might say.

My recommendation is that you do whatever you feel like with ownership when you first write the code, but then if something forces you to come back and change how ownership works, seriously consider switching to https://jacko.io/object_soup.html.

Isn't that just reinventing the heap, but with indexes in a vector instead of with addresses in memory?

You could look at it that way. But C++ programs often use similar strategies, even though they don't have to. Array/Vec based layouts like this give you the option of doing some very fancy high-performance stuff, and they also happen to play nicely with the borrow checker.

It's very basic and not a general solution because the lifetimes of objects are now set equal. And there's no compaction, so from a space perspective it is worse than a heap where the space of deleted objects can be filled up by new objects. It is nice though that you can delete an entire class of objects in one operation. I have used this type of memory management in the context of web requests, where the space could be freed when the request was done.

> make a seemingly small change, it balloons into a compile error that requires large refactoring to appease the borrow checker and type system

Same experience, but this is actually why I like Rust. In other languages, the same seemingly small change could result in runtime bugs or undefined behavior. After a little thought, it's always obvious that the Rust compiler is 100% correct - it's not a small change after all! And Rust helpfully guides me through its logic and won't let my mistake slide. Thanks!

Yo, everyone's interpreting the parent's comment in the worst way possible: assuming they're trying to do unsound refactorings. There are plenty of places where a refactoring is fine, but the rust analyzer simply can't verify the change (async `FnOnce`, for instance), gives up, and forces the user to work around it.

I love Rust (comparatively) but yes, this is a thing, and it's bad.

Yeah, Rust-analyzer's palette of refactorings is woefully underpowered in comparison to other languages / tooling I've used (e.g. Resharper, IntelliJ). There's a pretty high complexity bar to implementing these too unfortunately. I say this as someone that has contributed to RA and who will contribute more in the future.

This is because programming is not work in a continuous solution space. Think of it this way: you're almost guaranteed to introduce obvious bugs by randomly changing just a single bit/token. Assemblers, compilers, stronger type systems, etc. all try to limit this by bringing a different view that is more coherent to human reasoning. But computation has an inherently emergent property which is hard to predict/prove at compile time (see Rice's theorem), so if you want safety guarantees by construction then this discreteness has to be much more visible.

I don’t know anyone who has gotten Rust first time around. It’s a new paradigm of thinking, so take your time, experiment, and keep at it. Eventually it will just click and you’ll be back to having typos in syntax overtake borrow checker issues

> A pattern in these is code that compiles until you change a small thing.

I think that's a downstream result of the bigger problem with the borrow checker: nothing is actually specified. In most of the issues here, the changed "small thing" is a change in control flow that is (1) obviously correct to a human reader (or author) but (2) undetectable by the checker because of some quirk of its implementation.

Rust set out too lofty a goal: the borrow checker is supposed to be able to prove correct code correct, despite that being a mathematically undecidable problem. So it fails, inevitably. And worse, the community (this article too) regards those little glitches as "just bugs". So we're treated to an endless parade of updates and enhancements and new syntax trying to push the walls of the language out further into the infinite undecidable wilderness.

I've mostly given up on Rust at this point. I was always a skeptic, but... it's gone too far at this point, and the culture of "Just One More Syntax Rule" is too entrenched.

If you are trying overly hard to abstract things or working against the language, then yes, things can be difficult to refactor. Here are a few things I've found:

- Generics

- Too much Send + Sync

- Trying too hard to avoid cloning

- Writing code in an object oriented way instead of a data oriented way

Most of these have to do with optimizing too early. It's better to leave the more complex stuff to library authors or wait until your data model has settled.

"Trying too hard to avoid cloning"

This is the issue I see a certain type of new Rustacean struggle with. People get so used to being able to chuck references around without thinking about what might actually be happening at run-time. They don't realize that they can clone, and even clone more than what might "look good", and that it is super reasonable to intentionally make a clone and still get incredibly acceptable performance.

"Writing code in an object oriented way instead of a data oriented way" The enterprise OOP style code habits also seem to be a struggle for some but usually ends up really liberating people to think about what their application is actually doing instead of focusing on "what is the language to describe what we want it to do".

Yeah I think this becomes more true the closer your type system gets to "formal verification" type systems. It's essentially trying to prove some fact, and a single mistake anywhere means it will say no. The error messages also get worse the further along that scale you go (Prolog is infamous).

Not really unique to Rust though; I imagine you would have the same experience with e.g. Lean. I have a similar experience with a niche language I use that has dependent types. Kind of a puzzle almost.

It is more work, but you get lots of rewards in return (including less work overall in the long term). Ask me how much time I've spent debugging segfaults in C++ and Rust...

That’s not what OP is discussing. OP is discussing corner cases in Rust’s type system that would be sound if the type system were more sophisticated, but are rejected because Rust’s type analysis is insufficiently specific and rejects blanket classes of problems that have possible valid solutions, but would need deeper flow analysis, etc.

Yes I know. You get the same effect with type systems that are closer to formal verification. Something you know is actually fine but the prover isn't quite smart enough to realise until you shift the puzzle pieces around so they are just so.

Ahh, I see what you mean

Lean is far more punishing even for simple imperative code. The following is rejected:

  /- Return the array of forward differences between consecutive
     elements of the input. Return the empty array if the input
     is empty or a singleton.
  -/

  def diffs (numbers : Array Int) : Array Int := Id.run do
    if size_ok : numbers.size > 1 then
      let mut diffs := Array.mkEmpty (numbers.size - 1)
      for index_range : i in [0:numbers.size - 2] do
        diffs := diffs.push (numbers[i+1] - numbers[i])
      return diffs
    else
      return #[]

[deleted]

When this happens to me, it’s mostly because my code is written with too coarse separation of concerns, or I am just mixing layers

And in C++, those changes would likely shoot yourself in the foot without warning. The borrow checker isn't some new weird thing, it's a reification of the rules you need to follow to not end up with obnoxious hard to debug memory/threading issues.

But yeah, as awesome as Rust is in many ways it's not really specialized to be a "default application programming language" as it is a systems language, or a language for thorny things that need to work, as opposed to "work most of the time".

C++ allows both more incorrect programs and more correct programs. That's what can be a little frustrating about the BC. There are correct programs which the BC will block, and that can feel somewhat limiting.

While this is obviously and uncontroversially true in an absolute sense (the borrowck isn’t perfect), I think in the overwhelming majority of real-world cases its concerns are either actual problems with your design or simple and well-known limitations of the checker that have pretty straightforward and idiomatic workarounds.

I haven’t seen a lot of program designs in practice that are sound but fundamentally incompatible with the borrow checker. Every time I’ve thought this I’ve come to realize there was something subtly (or not so subtly) wrong with the design.

I have seen some contrived cases where this is true but they’re inevitably approaches nobody sane would actually want to use anyway.

Here's something not contrived that does pretty frequently come up [1]

The gist of it is that even though tasks can be somewhat guaranteed to live only for the scope of the current method, you can't guarantee that with the Rust type system. The end result is needing to copy immutable data or use Arc needlessly.

It's annoying enough that the standard library added scoped threads (`std::thread::scope`), which work great but can't be translated into Rust async tasks.

[1] https://without.boats/blog/the-scoped-task-trilemma/
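
For comparison, the scoped-thread version looks roughly like this (blocking threads, not async tasks): the scope guarantees both threads are joined before it returns, so they are allowed to borrow local data.

    use std::thread;

    fn sum_halves(data: &[u64]) -> u64 {
        let (left, right) = data.split_at(data.len() / 2);
        thread::scope(|s| {
            // Borrowing `left` and `right` is fine: the scope joins both
            // threads before `data` can go away. Spawned async tasks have no
            // equivalent guarantee, hence the trilemma in the linked post.
            let a = s.spawn(|| left.iter().sum::<u64>());
            let b = s.spawn(|| right.iter().sum::<u64>());
            a.join().unwrap() + b.join().unwrap()
        })
    }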

In most cases, those "correct" C++ programs are also usually buggy in situations the programmer simply hasn't considered. That's why the C++ Core Guidelines ban them and recommend programs track ownership with smart pointers that obey essentially the same rules as Rust. The main difference is that C++ smart pointers have more overhead and a bunch of implicit rules you have to read the docs to know. Rust tells you in (occasionally obscure but) largely helpful compiler errors at the point where you've violated them, rather than with undefined behavior or a runtime sanitizer.

My big complaint about Rust's borrow checking is that back references need to be handled at compile time, somehow.

A common workaround is to put items in a Vec and pass indices around. This doesn't fix the problem. It just escapes lifetime management. Lifetime errors then turn into index errors, referencing the wrong object. I've seen this three times in Rust graphics libraries. Using this approach means writing a reliable storage allocator to allocate array slots. Ad-hoc storage allocators are often not very good.

I'm currently fixing some indexed table code like that in a library crate. It crashes about once an hour, and has been doing that for four years now. I found the bug, and now I have to come up with a conceptually sound fix, which turns out to be a sizable job. This is Not Fun.

Another workaround is Arc<Mutex<Thing>> everywhere. This can result in deadlocks and memory leaks due to circularity. Using strong links forward and weak links back works better, but there's a lot of reference counting going on.

For the non-threaded case, Rc<RefCell<Thing>> with .borrow() and .borrow_mut(), it looks possible to do that analysis at compile time. But that would take extensions to the borrow checker. The general idea is that if the scopes of the .borrow() results for the same object don't nest, they're safe. This requires looking down the call chain, which is often possible to do statically, especially if .borrow() result scopes are made as small as possible. The main objection to this is that checking may have to be done after expanding generics, which Rust does not currently do. Also, it's not clear how to extend this to the Arc multi-threaded case.
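
A rough sketch of the "strong links forward, weak links back" shape in the single-threaded case (swap Rc/RefCell for Arc/Mutex in the threaded one):

    use std::cell::RefCell;
    use std::rc::{Rc, Weak};

    struct Parent {
        children: Vec<Rc<RefCell<Child>>>, // strong links forward
    }

    struct Child {
        parent: Weak<RefCell<Parent>>,     // weak link back, breaks the cycle
    }

    fn demo() {
        let parent = Rc::new(RefCell::new(Parent { children: Vec::new() }));
        let child = Rc::new(RefCell::new(Child { parent: Rc::downgrade(&parent) }));
        parent.borrow_mut().children.push(child.clone());

        // The back reference has to be upgraded before use; upgrade() returns
        // None once the parent is gone, instead of leaving a dangling pointer.
        if let Some(p) = child.borrow().parent.upgrade() {
            assert_eq!(p.borrow().children.len(), 1);
        }
    }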

Then there are unsafe approaches. The "I'm so cool I don't have to write safe code" crowd. Their code tends to be mentioned in bug reports.

> A common workaround is to put items in a Vec and pass indices around. This doesn't fix the problem. It just escapes lifetime management. Lifetime errors then turn into index errors, referencing the wrong object.

That people are seriously doing this is so depressing... if you build what amounts to a VM inside of a safe language so you can do unsafe things, you have at best undermined the point of the safe language and at worst disproved that the safe language is sufficient.

That's a good way to put it. I'll keep that in mind when trying to convince the Rust devs.

This is a common pattern everywhere, not just in Rust. Indices, unlike pointers to elements, survive a vector reallocation or serialization to disk. IDs are used to reference items in an SQL database, etc.

I hope you realize that index buffers are an industry standard in graphics APIs.

I do. I am not saying they are not useful... I am saying they are not safe. In a world where you care about the safety guarantees of Rust, you'd take an API that insists on manual index management and you would build an abstraction over it that allowed the compiler to prove its correctness using a proof system of some form, maybe borrow checking, or maybe something much more sophisticated.

Vertex index buffers are not usually updated dynamically.

Vulkan bindless descriptor tables are updated dynamically, and do have pointers to memory addresses. Those require a slot allocator and a deferred, interlocked deletion system to prevent deleting something from underneath a renderer. Not fun. I've been working on that.

Yes the "fake pointer" pattern is a key survival strategy. Another one I use often is the command pattern. You borrow a struct to grab some piece of data, based on it you want to modify some other piece of the struct, but you can't because you have that first immutable borrow still. So you return a command object that expresses the mutation you want, back up the call stack until you're free to acquire a mutable reference and execute the mutation as the command instructs. Very verbose to use frequently, but often good for overall structure for key elements.

Yes. Workarounds in this area exist, but they are all major headaches.

Neat. It's still run-time checking. A good idea, though. The one-owner, N users case is common. The trick is checking that the users don't outlive the owner.

At least based on the comments on lobste.rs [0] and /r/rust, these seem to be actively worked on and/or will be solved Soon (TM):

1. Checking does not take match and return into account: I think this should be addressed by Polonius? https://rust.godbolt.org/z/8axYEov6E

2. Being async is suffering: I think this is addressed by async closures, due to be stabilized in Rust 2024/Rust 1.85: https://rust.godbolt.org/z/9MWr6Y1Kz

3. FnMut does not allow reborrowing of captures: I think this is also addressed by async closures: https://rust.godbolt.org/z/351Kv3hWM

4. Send checker is not control flow aware: There seems to be (somewhat) active work to address this? No idea if there are major roadblocks, though. https://github.com/rust-lang/rust/pull/128846

[0]: https://lobste.rs/s/4mjnvk/four_limitations_rust_s_borrow_ch...

[1]: https://old.reddit.com/r/rust/comments/1hjo0ds/four_limitati...

(Side note) That's odd, lobste.rs seems to be down for me, and has been like that for a couple of months now -- I literally cannot reach the site.

Is that actually just me??

EDIT: just tried some things, very weird stuff: curl works fine. Firefox works fine. But my usual browser, Brave, does not, and complains that "This site can't be reached (ERR_INVALID_RESPONSE)". Very very very weird, anyone else going through this?

They are throwing a bit of a hissy fit over Brave. Change the user agent or something and view the site.

Reading the facts of the situation, it seems like a warranted "bit of a hissy fit".

Disagree. Regardless of what Brave is doing you shouldn’t block via User Agent like this.

Especially not simply making the site not load like that. If you really think a browser is so bad you don't want people using it, at least have it redirect to a message explaining what your grievance is. Unless the browser is DDoSing webpages it loads successfully, making the site look broken is pretty worthless as a response.

EDIT: Although, it looks like they tried to do that sometimes? No idea why they would switch from that approach.

Eh, pushcx is right to disagree with past bad decisions Brave made, but I think he's conflating a few grievances together. Someone tried to reason with him on that front: https://lobste.rs/s/iopw1d/what_s_up_with_lobste_rs_blocking...

I sense hidden ideology, but it's his community to own, not mine.

It's not just that he disagrees with the things on an object level (what actually happened). He actively reads malice into every misstep to paint the organization as abusive.

Yep.

Hello from Finland, I can reach the site all fine. Hope you get your connection issues sorted :)

One approach to solving item 1 is to think about the default as not being a separate key to the HashMap, but being a part of the value for that key, which allows you to model this a little more explicitly:

    use std::collections::HashMap;
    use std::hash::Hash;

    struct WithDefault<T> {
        value: Option<T>,
        default: Option<T>,
    }

    struct DefaultMap<K, V> {
        map: HashMap<K, WithDefault<V>>,
    }

    impl<K: Eq + Hash, V> DefaultMap<K, V> {
        fn get_mut(&mut self, key: &K) -> Option<&mut V> {
            let item = self.map.get_mut(key)?;
            item.value.as_mut().or_else(|| item.default.as_mut())
        }
    }
Obviously this isn't a generic solution to splitting borrows though (which is covered in https://doc.rust-lang.org/nomicon/borrow-splitting.html)

The article makes the 'default' key with push_str("-default"), and given that, your approach should work. But I think that's a placeholder, and a bit of an odd one - I think it's more likely to see something like (pardon my rusty Rust) k = if let Some((head, _)) = k.split_once("_") { head.to_owned() } else { k } - so for example a lookup for "es_MX" will default to "es". I don't think your approach helps there.

Yeah, true. But that (assuming you're saying give me es_MX if it exists otherwise es) has a similar possible solution. Model your Language and variants hierarchically rather than flat. So languages.get("es_MX") becomes

    let language = languages.get_language("es");
    let variant = language.get_variant("MX");
There are probably other, more general cases where this can't be fixed (but there are some internal changes to the rules mentioned in other parts of this thread, somewhere on here/reddit/lobsters).

[flagged]

Rust's syntax is basically a bog-standard C/C++ descendant, but with a slightly cleaner and more regular grammar.

Isn’t it actually a Standard ML and C/C++ hybrid? The window dressing is C/C++ sure, but the lack of statements, semicolons, the presence of enums, etc are quite reminiscent of SML.

SML is definitely the next biggest influence, but it's an SML frosting on top of a C-family base.

What was the comment context here? It's flagged so I didn't see it.

I think it’s because the type-level expressions (e.g. trait constraints, lifetimes, impl<T> for Trait<U>), ubiquitous refs, and other type annotations have a much more complicated and explicit grammar than what we tend to see in other languages.

It is really not that much like C++ syntax in that respect nor is it cleaner. In other respects, it is more like a cleaner C++, but this is an aspect that is in fact more complicated, and harder to read and write.

Rust people need to be more honest about what Rust is really like.

I didn’t get past the first limitation before my brain started itching.

Wouldn’t the approach there be to avoid mutating the same string (and thus reborrowing Map) in the first place? I’m likely missing something from the use case but why wouldn’t this work?

    // Construct the fallback key separately
    let fallback = format!("{k}-default");

    // Use or_else() to avoid a second explicit `if map.contains_key(...)`
    map.get_mut(k)
       .or_else(|| map.get_mut(&fallback))

I see how that helps with the usual case of inserting a value under the original key if it wasn't there, but I don't see how it helps in this case of checking a different key entirely if it wasn't there.

It definitely needs get_mut(k) changed to get_mut(&k), but even after doing that, it still fails to compile, with an error similar to the one the original code gets.

This creates the fallback before knowing that you’ll need it.

Not necessarily. Since the argument to `.or_else` is a function, the fallback value can be lazily evaluated.

I am pretty sure example 2 doesn't work because of the move, and it should be fixed in the next release when async closures are stabilized (I am soooo looking forward to that one).

The limitation is the borrow checker itself. I think it restricts too much. Clang implements [[clang::lifetimebound]], for example, which is not viral all the way down and solves some typical use cases.

I find that relying on values and restricted references and when not able to do it, in smart pointers, is a good trade-off.

Namely, I find the borrow-checker too restrictive given there are alternatives, even if not zero cost in theory. After all, the 80/20 rule helps here also.

A borrow checker that isn't "viral all the way down" allows use-after-free bugs. Pointers don't stop being dangling just because they're stashed in a deeply nested data structure or passed down in a way that [[lifetimebound]] misses. If a pointer has a lifetime limited to a fixed scope, that limit has to follow it everywhere.

The borrow checker is fine. I usually see novice Rust users create a "viral" mess for themselves by confusing Rust references with general-purpose pointers or reference types in GC languages.

The worst case of that mistake is putting temporary references in structs, like `struct Person<'a>`. This feature is incredibly misunderstood. I've heard people insist it is necessary for performance, even when their code actually returned an address of a local variable (which is a bug in C and C++ too).

People want to avoid copying, so they try to store data "by reference", but Rust's references don't do that! They exist to forbid storing data. Rust has other reference types (smart pointers) like Box and Arc that exist to store by reference, and can be moved to avoid copying.
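
To make the distinction concrete (made-up types): a reference field makes the struct a temporary view that cannot outlive the borrowed data, while owned or smart-pointer fields are what actually store it.

    // A temporary view: this struct can never outlive the string it points into.
    struct PersonView<'a> {
        name: &'a str,
    }

    // These store the data and can be moved around freely; Arc additionally
    // shares it without copying.
    struct PersonOwned {
        name: String,
    }

    struct PersonShared {
        name: std::sync::Arc<str>,
    }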

> Pointers don't stop being dangling just because they're stashed in a deeply nested data structure or passed down in a way that [[lifetimebound]] misses

This is the typical conversation where it is shown what Rust can do by shoehorning: if you want to borrow-borrow-borrow from this data structure and reference-reference-reference from this function, then you need me.

Yes, yes, I know. You can also litter programs with globals if you want. Just avoid those bad practices. FWIW, references break local reasoning in lots of scenarios. But if you really, really need that borrowing, limit it as much as possible and make good use of smart pointers when needed. And you will not have this problem.

It looks to me like Rust is sometimes a language looking for problems to give you its solution. There are patterns that are just bad or not advised in most of your code and hence not a problem in practice. If you code by referencing everything, then the Rust borrow checker might be great. But your program will be a salad of references all around, which is bad in itself. And do not get me started on the refactorings you will need every time you change your mind about a reference deep somewhere. Because Rust is great, yes, you can do that cool thing. But at what cost? Is it even worth it?

I also see all the time people showing off the Send+Sync traits. Yes, very nice, very nice. Magic abilities. And what? I do my concurrent code by sharing as little as possible all the time. So the patterns of code where things can be messed up are quite localized.

Because of this, the borrow checker is basically something that gets in the way a lot but does not add a lot of value. It might have its value in hyper-restricted scenarios where you really need it, but I cannot think of a single scenario where it would be really mandatory and really useful for safety, except probably async programming (for which you can still do structured concurrency and async scopes in C++, and I did it successfully myself).

So no, I would say the borrow checker is a solution looking for problems, because it promotes programming styles that are not clean from the get-go. And it is only in this style that the borrow checker actually shines.

Usually the places where the borrow checker is useful have alternative coding patterns or lifetime techniques, and for the few where you really want something like that, the code spots are probably small and reviewable anyway.

Also, remember that Rust gives you safety through interfaces when you use libraries, except when it doesn't, because it basically hides unsafe underneath, and that makes it as dangerous as any C or C++ code (in theory). However, it should be easier to spot the problems, which leads to more safety in practice. But still, this is not guaranteed safety.

The borrow checker is a big toll in my opinion and it promotes ways of coding that are very unergonomic by default. I'd rather take something like Swift or even Hylo any day, if it ever reaches maturity.

In general I view the borrow checker as a good friend looking over my shoulder so I don't shoot myself in the foot in production. 99 times out of 100 when the borrow checker complains it's because I did something stupid/wrong. 0.99 times out of 100 I think the borrow checker is wrong when I am in fact wrong. 0.01 times out of 100 the borrow checker fumbles on a design pattern it maybe shouldn't so I change my design. Usually my life is way better for changing the design after anyways.

The thing is, you don't need to have refs of refs of refs of refs of refs. You can clone once in a while or even use a smart pointer. You'll find in 99.99% of cases the performance is still great compared to a GC language. That's a common issue for certain types of people learning how to write Rust. I can't think of any application that needs everything to be a reference all the time in Rust.

As far as "mandatory" goes for choosing a language. We can all use ASM, or C, write everything from scratch. It's a choice. Nothing is mandatory. No one is saying you HAVE to use Rust. Lots of people are saying "when I use it my life is way better", that's different. There was a recent post here where people say they don't use IDE's with LSP or autocomplete. A lot of people are going to grimace at that, but no one is saying they can't do that.

> I also see all the time people showing off the Send+Sync traits. Yes, very nice, very nice. Magic abilities. And what? I do my concurrent code by sharing as little as possible all the time. So the patterns of code where things can be messed up are quite localized.

They check whether your code really shares as little as you think, and prevent nasty to debug surprises.

The markers work across any distance, including 3rd party dependencies and dynamic callbacks, so you can use multi-threading in more situations.

You're not limited to basic data-parallel loops. For example, it's immensely useful in web servers that run multi-threaded request handlers that may be calling arbitrarily complex code.

> places where the borrow checker is useful has alternative coding patterns

There's a popular sentiment that smart pointers make the borrow checker unnecessary, but that's false. They're definitely helpful and often necessary, but they're not an alternative to borrow checking.

Rust had smart pointers first, and then added borrowing for all the remaining cases that smart pointers can't handle or would be unreasonable to use.

Borrowing checks stack pointers. Checks interior pointers to data nested inside of types managed by smart pointers (so you don't have to wrap every byte you access in a smart pointer). It allows functions to safely access data inside unique_ptr without moving it away or switching to shared_ptr. Prevents using data protected by a lock after the lock has been unlocked. Prevents referencing implicitly destroyed temporary objects. Makes types like string_view and span not a footgun.
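To make the lock point concrete, a minimal sketch (not from the thread): a reference taken out of a `MutexGuard` cannot outlive the guard, so the commented-out variant below would be rejected by the compiler.

    use std::sync::Mutex;

    fn main() {
        let shared = Mutex::new(String::from("hello"));
        {
            let guard = shared.lock().unwrap();
            println!("{}", *guard); // fine: the lock is still held here
        } // guard dropped here, lock released

        // let guard = shared.lock().unwrap();
        // let s: &String = &*guard;
        // drop(guard);      // lock released...
        // println!("{s}");  // ...so this use of `s` is rejected by the compiler
    }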

> the borrow checker is basically something that gets a lot in the way

This is not the case for experienced Rust users.

Borrow checker is a massive obstacle to learning and becoming fluent in Rust. However, once you "get" it, it mostly gets out of the way.

Once you internalise when you can and can't use borrowing, you know how to write code that won't get you "stuck" on it, and avoid borrow checking compilation errors before they happen. And when something doesn't compile, you can understand why and how to fix it. It's a skill. It's not easy to learn, but IMHO worth learning more than C++'s own rules, Core Guidelines, UB, etc. that aren't easy either, and the compiler can't confirm whether you got them correct.

> Borrowing checks stack pointers. Checks interior pointers to data nested inside of types managed by smart pointers (so you don't have to wrap every byte you access in a smart pointer). It allows functions to safely access data inside unique_ptr without moving it away or switching to shared_ptr. Prevents using data protected by a lock after the lock has been unlocked. Prevents referencing implicitly destroyed temporary objects. Makes types like string_view and span not a footgun.

I understand part of the value the borrow checker brings. Actually, my complaint is more about having a full borrow checker that viralizes everything than about the analysis itself. For example, Swift and Hylo do some borrow-checking analysis, but they do not extend it to data structures, and they use reference counting (with elision, I think) and value semantics.

The problem with the borrow checker is not the analysis. It is the virality. Without the virality you cannot express everything. But with the amount of borrow checking that can be done through other conventions (as in Hylo/Swift), even leaving out part of the story, I think things are much more reasonable IMHO.

There are so many ways to work around the remaining spots, or simply review that code (assuming the cases left are a bunch of those), that I question the value of a fully viral borrow checker just so it can represent so many situations (and, on top of that, one that promotes references everywhere, which breaks local reasoning). It also sets the bar higher for any refactoring in many situations.

> Borrow checker is a massive obstacle to learning and becoming fluent in Rust. However, once you "get" it, it mostly gets out of the way.

This is just not true for many valid patterns of code. For example, data-oriented programming seems to be a nightmare with a borrow checker. Linked structures are also difficult. So it is not only about "getting it"; for certain patterns it is the borrow checker that "gets you", in fact "kidnaps you away" from your valid coding patterns.

> but IMHO worth learning more than C++'s own rules, Core Guidelines, UB, etc

I admit to being more comfortable with C++, so it is my comfort zone. But there are middle solutions like Swift or (the very experimental) Hylo that are worth a try IMHO. A full, embedded borrow checker with lifetime annotations is a big ergonomics problem that brings value if you abuse references, but when you do not, the value of the borrow checker is lower. Same for escaping references several levels up... why do it? I think it is just better to avoid certain coding patterns. Not because of Rust itself; just as general coding style in any language...

> that aren't easy either, and the compiler can't confirm whether you got them correct.

Not all of them as of today, but a subset, yes; there are linters. Also, there is an effort to incrementally increase the value of many analyses. It will never be as complete as Rust's, I am sure of that. But I am not particularly interested either. What I would be more interested in is whether, with what can be fixed and improved, the delivered software ends up with the same defect rates as Rust lifetime-wise. This is counter-intuitive, because it looks like the better the analysis, the better the outcome, but two other factors also come into play IMHO:

  1. Not all defects are evenly distributed. This means that if the things that can be lifetime-checked in C++ cover a big share of the typical lifetime checks, even if not all the kinds Rust can check, the result can get statistically very close.
  2. Once the spots for unsafe code are more localized, I expect the defect rate to decrease more than linearly, since attention is now focused on fewer code spots.

Let us see what comes out of this. I am optimistic that the results will be better than many people predict; those predictions look too academic to me because they do not take into account other factors, such as defect density in clusters and the reduction of the surface humans have to inspect because it cannot be verified as safe.

That is your personal decision to make. I would say this: despite this article pointing out 4 issues with certain design patterns, there is a lot of good commercial software being written in the language every single day.

My personal experience has been that, after spending the time to learn Rust, I am now able to write correct code much faster than in GC-based languages. It's pretty close to replacing python for me these days. I am also very grateful to never have to deal with null/nil again; error handling is so convenient, simple one-offs run super fast and don't need me to go back to fix perf or rewrite them, and my code is way easier to read.

To each their own, but I wouldn't let a niche technical article sway you away from considering Rust as something to learn, use, and enjoy.

My opinion about Rust is not actually based on this article. I might need to give it a bigger try to see how it feels, but I am pretty sure I am never going to like such heavy lifetime annotations, for a ton of reasons.

I think a subset of the borrow checking with almost no annotations and promoting other coding patterns would make me happier.

I respect that. Honestly though, for first-pass designs of code that doesn't need immense performance, don't reference-juggle throughout the entire application. Use smart pointers when needed, clone once in a while, etc. It'll make life way easier, and unless you're in a tight loop or something you'll still gain performance. Arguably you gain some safety and convenience as well, but again, "arguably" :).

The first time I tried Rust I hated it. I couldn't understand why anyone would deal with what I was dealing with. The second time I tried it, I started to get it. After I got it, I struggled to want to write code with anything else. So much of the cognitive complexity is exposed rather than tucked away in personal memories of tracing the code. Sometimes that's scary, and sometimes it's good that it's scary: it means it's time to refactor.

Best wishes.

> The second time I tried it, I started to get it. After I got it, I struggled to want to write code with anything else.

Reminds me of the time when metaprogramming in C++ became popular. Just because you can do it does not mean you should do it all the time. Also, people felt smarter (you said "you got it" once, right?). I mean, do not take it as negative feedback against you. Even I wrote some of that template metaprogramming code at times. But not everyone can get it.

We should not forget that programming is also a social activity. The more elitist we turn it, the more barriers to contribution.

This does not mean the borrow checker is bad in itself, but from that point of view (the social one) can get in the way. It does have a steep learning curve.

That is why some of us are not convinced it is the right path even if it does add value to some niche scenarios.

Best wishes as well.

I hear you, no offense taken. I actually worked with one of the people who contributed to the creation of template metaprogramming once upon a time. It's definitely not for everyone, and definitely abstract at first. I tended to avoid it unless I was working on libraries.

I also agree the social factor is amongst one of the most important things in software engineering, possibly more important than code itself. Most problems are social problems without technical solutions... Anyways, I hear you, but after a bit of toiling with it, it gets way easier.

It's a steep-cliff type of learning experience for most people, especially people who have spent a lot of time in different paradigms. My perspective is, once you get someone up the cliff, it's got a really nice view. That view improves objective communication about software and program flow and reduces some fears about pointers/references in review. It's a trade-off, but I think it's one where you get more out than you put in. Definitely an opinion though.

And full of gotchas; it remains to be seen whether they will ever be fixed, and it has hardly been updated since the initial POC implementation.

Using value types for complex objects will wreck performance. Why not just use a GC'd language at that point?

You usually pass bigger types around via the heap, internally encapsulated in an object with RAII, in C++ for example. I do not think this is low-perf per se.

Yes, in C++ with RAII; Rust value types do not work this way, to my knowledge.

This is not true: the heavy data will be on the heap and you can move the values around. It actually works out very well.
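A small sketch of what "move the values around" means here (the `Document` type is made up): the struct holds a `Vec`, whose contents live on the heap; moving the struct by value copies only the Vec's pointer/length/capacity, not the data.

    struct Document {
        // The bytes live on the heap; the struct itself only stores
        // a small (pointer, length, capacity) header for them.
        body: Vec<u8>,
    }

    fn archive(doc: Document) -> Document {
        // Passing and returning by value moves the small header;
        // the heap buffer stays where it is. No deep copy happens.
        doc
    }

    fn main() {
        let doc = Document { body: vec![0u8; 1_000_000] };
        let doc = archive(doc);
        assert_eq!(doc.body.len(), 1_000_000);
    }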

Given the amount of cloning and Arc's in typical Rust code, it just seems to be an exercise in writing illegible Go.

Ironically Go has pretty clean and straightforward guarantees about when heap allocation happens and how to avoid gc.

The one I run into most frequently: Passing field A mutably, and field B immutably, from the same struct to a function. The naive fix is, unfortunately, a clone. There are usually other ways as well that result in verbosity.

Could you change the function to accept the whole struct and make it mutate itself internally without external mutable references?

Yes. Note that this requires a broader restructure that may make the function unusable in other contexts.

Also only if it is under your control. If it’s in the OS or a third-party library, you can’t change the API.

The borrow checker is smart enough to track disjoint field borrows individually and detect that this is fine, but if you have two methods that each return a borrow of a single field, there's no way of communicating to the compiler that each one isn't borrowing the entire struct. This is called "partial borrows"; the syntax is not decided, and it would likely only work on methods of the type itself and not on traits (because trait analysis doesn't need to account for which impl you're looking at, and partial borrows would break that).

The solution today is to either change the logic to keep the disjoint access in one method, provide a method that returns a tuple of sub-borrows, have a method that takes the fields as arguments, or use interior mutability.

Ah, it wasn't clear from what they wrote that this is what they meant.

The fix is destructuring.
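A minimal sketch of that fix (struct and function names are made up): destructuring a `&mut` borrow of the struct yields separate borrows of its fields, which the compiler tracks independently.

    struct State {
        items: Vec<u32>,
        limit: u32,
    }

    // Takes one field mutably and another immutably.
    fn push_if_below(items: &mut Vec<u32>, limit: &u32) {
        if (items.len() as u32) < *limit {
            items.push(*limit);
        }
    }

    fn main() {
        let mut s = State { items: vec![1, 2], limit: 5 };
        // Destructuring gives disjoint borrows of `s`, so one field can be
        // borrowed mutably while the other is read at the same time.
        let State { items, limit } = &mut s;
        push_if_below(items, limit);
        assert_eq!(s.items, vec![1, 2, 5]);
    }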

The difference is very minor when interoperating with methods, but the performance gains of this dual string system are often worth it.

&str is basically a C string allocated on the stack while String is like a Java string, an object on the heap with a reference to a raw string hidden from plain sight. To avoid unnecessary and unintended allocations and other expensive memory operations, operating on &str is usually preferred for performance reasons.

String almost transparently casts down to &str so in practice you rarely care about the difference when calling library code.

If you're coming from a language that doesn't have a distinction between character arrays and string objects, you're probably fine just using &str.

If you're coming from a higher level language like JS or Python, you're probably used to paying the performance price for heap allocation anyway so you might as well use String in Rust as well and only start caring when performance is affected.

&str doesn’t mean stack-allocated. It’s just a pointer [0] (and a len) to a section of memory that’s (required to be) legal utf-8.

A &str can point at stack memory or heap memory (usually the latter, since it’s common for them to point to a String, which allocates on the heap), or static memory.

But yeah, String keeps things simple, and when in doubt just use it… but if you want to understand it more, it’s better to think of who “owns” the data.

Take a String when you need to build something that needs to own it, like if you’re building a struct out of them, or store them in a hash map or something. Because maybe a caller already “owns” the string and is trying to hand over ownership, and you can avoid the clone if it’s just passed by move.

If you’re only using the string long enough to read it and do something based on it (but don’t want to own it), take a &str, and a caller can be flexible of how it produces that (a &'static str, a String ref, a substring, etc.)
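A rough sketch of that guideline with a made-up `User` type: the constructor takes a `String` because it stores the value, and the read-only check takes a `&str` because it only inspects it.

    struct User {
        name: String,
    }

    impl User {
        // Stores the name, so it takes ownership; a caller that already owns
        // a String can hand it over without cloning.
        fn new(name: String) -> Self {
            User { name }
        }

        // Only reads the value, so &str is enough; callers can pass a
        // literal, a &String, or a slice of a larger buffer.
        fn is_named(&self, name: &str) -> bool {
            self.name == name
        }
    }

    fn main() {
        let u = User::new(String::from("alice"));
        assert!(u.is_named("alice"));
    }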

The example that always works for me as a way to remember is to think of HashMap.

HashMap.get takes a reference for the key (analogous to &str), because it’s only using your reference long enough to compare to its keys and see which one matches.

HashMap.insert takes a value for the key (analogous to String) because it needs to own the key and store it in the table.

HashMap.insert could take a reference, but then it’d have to clone it, which means you’d miss out on the opportunity to move the key more cheaply (a simple memcpy) instead of calling clone() (which often does more clones underneath and can be complicated)… and it would only support cloneable keys.
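Roughly, in code (types chosen just for illustration):

    use std::collections::HashMap;

    fn main() {
        let mut map: HashMap<String, u32> = HashMap::new();

        let key = String::from("answer");
        map.insert(key, 42);        // insert takes the String by value; the map owns it now
        // `key` can no longer be used here: ownership moved into the map.

        let n = map.get("answer");  // get only borrows a &str long enough to compare
        assert_eq!(n, Some(&42));
    }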

[0] yeah yeah, a reference, not a pointer, but the point is it “points to” a place in memory, which may be heap, stack, static, anything.

str can be allocated on the stack, or heap, or static storage.

The difference between a heap allocated string (String), a static string literal embedded in the binary (&str), and a stack allocated string ([char], but this is more common in C than Rust) is the simplest introduction to manually managed memory.

The complications have nothing to do with Rust but with how computers manage and allocate memory. You might as well also skip C, C++, Zig, and every other language which gives you fine-tuned access to the stack and heap, because you'll run into the same concept.

Nit: A &str doesn't mean it has to be static, a &'static str does (which are a subset of &str). A &str can easily point to a dynamic String's heap storage too.

str doesn’t have to be embedded in the binary. It can be that, or it can be on the heap, or it can be on the stack.

Every now and then I worry about the rust ecosystem growing too fast and there being too many JavaScript expats flooding cargo with useless code and poorly thought out abstractions, etc…

Thank you for reminding me that most people don’t have the patience to even learn something that makes them think even the tiniest bit. Most of the JavaScript people won’t even get past a hello world program. I think we’re mostly safe.

rust community hubris at its finest.

Still, I find Scala and Haskell community more elegant and intellectually superior when it comes to gatekeeping.

> Thank you for reminding me that most people don’t have the patience to even learn something that makes them think even the tiniest bit.

You think Rust makes you think "the tiniest bit"?

I mean, sure, there's a lot of excess cognitive burden so that you think more about the language features than about your program logic, but you surely aren't claiming that that is a good thing.

The concept of a string’s primary storage being separate from a view/pointer to its contents is a necessary distinction to draw if you want to minimize copies and still not require garbage collection. Any language that gives you control over your memory is going to need this distinction. Thinking of ownership is a necessary cognitive burden if you want to avoid unnecessary allocations or garbage collection pauses.

People coming from JS or Python look at these burdens and think “rust sucks because I have to worry about string allocation”, without caring that all languages have to deal with this somehow, and if they’re not making the programmer think of it, the language will have to manage it itself in a way that won’t always be optimal.

The analogy I like to make is that if a JS developer thinks it’s bad for a language to make them deal with some abstraction, they should consider that their JavaScript VM itself needs to be written in a language that makes someone care about this stuff. JS only works at all because some engineer writing C++ is thinking deeply about string lifetimes. It can’t just be JavaScript all the way down.

However if you’re using Rust in an environment that could just as easily use a GC’d language, you could definitely make the case that it’s the wrong tool for the job. Not everything needs to be written in a low level language, but for the stuff that does, I’m glad Rust exists.

And I’m also glad that people who don’t understand the tradeoffs are “skipping” rust entirely. It’s not gatekeeping, it’s telling people who don’t want to be here that it’s ok: you don’t need to code in rust, maybe it’s just not for you.

The reason for that is simple though: &String converts to &str, but not the other way around... so you should always take &str so that your code works with either, and note that string literals are &str. I think Rust has lots of warts, but I don't see this as one of them (at least it's something you get irritated at only once, and then never have problems with).
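A tiny sketch of that flexibility (the `shout` function is just for illustration): one `&str` parameter serves both a borrowed `String` and a literal.

    fn shout(s: &str) -> String {
        s.to_uppercase()
    }

    fn main() {
        let owned = String::from("hello");
        assert_eq!(shout(&owned), "HELLO");  // &String coerces to &str
        assert_eq!(shout("hello"), "HELLO"); // string literals are already &str
    }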

I’m barely familiar with rust and forgot about this aspect, if I ever knew it.

Seems pretty sensible though. String is dynamic data on the heap that you own and can modify. str is some data somewhere that you can’t modify.

C has this distinction as well. Of course, in typical C fashion, the distinction isn’t expressed in the type system in any way. Instead, you just have to know that this char* is something you own and can modify and that char* just a reference to some data.

Higher level languages typically unify these ideas and handle the details for you, but that’s not rust’s niche.

>String is dynamic data on the heap that you own and can modify. str is some data somewhere that you can’t modify.

This is not the definition. You can modify both. Being able to modify something depends on whether you can do something with a &mut reference to it, and both &mut String and &mut str provide methods for modifying them.

The difference between the two types is just that String owns its allocation while str doesn't. So modifying a String is allowed to change its bytes as well as add and remove bytes, the latter because the String owns its allocation. Modifying a str only allows changing its bytes.
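A short sketch of that difference: a `&mut str` can change existing bytes in place (as long as the result stays valid UTF-8), but only the owning `String` can grow or shrink.

    fn main() {
        let mut s = String::from("hello");

        {
            let slice: &mut str = s.as_mut_str();
            slice.make_ascii_uppercase(); // changing existing bytes in place is fine
            // slice.push('!');           // no such method: a &mut str cannot grow
        }

        s.push('!');                      // the owning String can add (and remove) bytes
        assert_eq!(s, "HELLO!");
    }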

If you thought that was confusing, you’ll definitely want to skip C++ too!

Why? str and String are different things, why shouldn’t they be different types?

Easy to write bugs in unsafe languages like C / C++.

Rust makes memory management explicit, hence eliminating those bugs. But it also shows how hard memory management actually is.

Systems programming languages like this should be used sparingly, only for stuff like device drivers, OSs and VMs.

Any general purpose programming language should be garbage collected.