
Security advisory for the Rust programming language (with a nice explanation): https://blog.rust-lang.org/2021/11/01/cve-2021-42574.html

Rust 1.56.1 will be released later today.

> To assess the security of the ecosystem we analyzed all crate versions ever published on crates.io (as of 2021-10-17), and only 5 crates have the affected codepoints in their source code, with none of the occurrences being malicious.

Preview of the new helpful error: https://i.imgur.com/pGpZOnr.png



Their advisory is well-written and explains the problem well. The example code they use:

  if access_level != "user" { // Check if admin
opens up a whole can of worms, though. You don't need cunning invisible control codes to break that line; you could just replace any of the letters in 'user' with a different, but almost-identical looking unicode symbol and you'd still have an exploit. Even better, this would be a completely deniable attack (the "oops, I must have accidentally pressed alt-R while typing that letter" excuse), whereas explaining away why you checked in some magical RTL/LTR encodings and hacked up a comment is impossible. Plus, it would render well in far more apps, terminals, command-line programs, etc.


> you could just replace any of the letters in 'user' with a different, but almost-identical looking unicode symbol and you'd still have an exploit.

The post mentions that exploit (and Rust's already existing defense) in the appendix.

Here are the details, as explained in a previous post:

> The compiler will warn about potentially confusing situations involving different scripts. For example, using identifiers that look very similar will result in a warning.

    warning: identifier pair considered confusable between `s` and `s`
https://blog.rust-lang.org/2021/06/17/Rust-1.53.0.html


> warning: identifier pair considered confusable

Note that the lint you mention is about identifiers, while "user" is a literal. The lint does not fire for literals. String literals have supported non-ASCII characters since 1.0.0, and there has never been a lint for them until now, with the 1.56.1 release.


Also worth noting that the homoglyph attack isn't linted for in literals or comments, only the bidi codepoints are.
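
For anyone who wants to check their own sources outside the compiler, here is a minimal sketch of a scan for the bidi control codepoints. The codepoint list is the one I understand the advisory to name (U+202A to U+202E and U+2066 to U+2069); the function name `find_bidi_controls` and the sample strings are made up for illustration, and this is not how rustc implements its lint.

```rust
/// Bidirectional control codepoints, as listed in the CVE-2021-42574 advisory
/// (assumption: this list matches the advisory; verify before relying on it).
const BIDI_CONTROLS: [char; 9] = [
    '\u{202A}', '\u{202B}', '\u{202C}', '\u{202D}', '\u{202E}',
    '\u{2066}', '\u{2067}', '\u{2068}', '\u{2069}',
];

/// Returns (line, column, codepoint) for each bidi control character found.
/// Lines and columns are 1-based; columns count `char`s, not bytes.
fn find_bidi_controls(source: &str) -> Vec<(usize, usize, char)> {
    let mut hits = Vec::new();
    for (line_idx, line) in source.lines().enumerate() {
        for (col_idx, ch) in line.chars().enumerate() {
            if BIDI_CONTROLS.contains(&ch) {
                hits.push((line_idx + 1, col_idx + 1, ch));
            }
        }
    }
    hits
}

fn main() {
    // The control character is written as an escape so it stays visible here.
    let clean = "if access_level != \"user\" { // Check if admin";
    let suspicious = "if access_level != \"user\u{202E} // Check if admin";

    assert!(find_bidi_controls(clean).is_empty());
    for (line, col, ch) in find_bidi_controls(suspicious) {
        println!("line {}, column {}: U+{:04X}", line, col, ch as u32);
    }
}
```

Note that this only catches the bidi codepoints; a homoglyph inside a literal or comment passes straight through, which is the gap described above.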


> The compiler will warn about potentially confusing situations involving different scripts. For example, using identifiers that look very similar will result in a warning.

Unfortunately, I've little experience of Rust, so I don't have experience of that warning. It would certainly help catch a one-liner exploit, but wouldn't it be excessively noisy for code written in non-English languages?


It only warns if there actually are two identifiers that look similar. Even if it's not malicious it's still confusing and is worth renaming.

But if you want to, turning off specific warnings for a file or block of code is really simple in Rust: just add "#[allow(confusable_idents)]"


The Unicode homoglyph lint will only trigger if there are multiple identifiers that can look the same; it's not a blanket warning on anything that isn't ASCII. It's close to what browsers do with domain names. And you can always allow lints.


Am I missing something here? The spacing around these homoglyphs is almost always noticeably wider than it should be, such that I don't understand how you could ever miss it in any half-decent code review.

      if access_level != "user" { // Check if admin

      if access_level != "user" { // Check if admin
Come on, that looks obviously off.


If you were really reviewing that code, Rust has algebraic data types, and access level should be an Enum, not a String.

But it's their example. The problem isn't with homoglyphs, though. It's with bidi control characters, which are invisible to a human but not to the compiler, which is how generated code can end up semantically different from source code, which is the actual problem here. What you see in code review would be the first line, even though that isn't actually what is in the source, because an editor that is bidi-aware would show it that way.
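
The enum suggestion above could look something like this. It is only a sketch: the `AccessLevel` type, its variants, and `is_admin` are hypothetical names, not anything from the advisory.

```rust
/// Hypothetical access levels; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AccessLevel {
    User,
    Admin,
}

/// Returns true only for the admin variant. The match is exhaustive, so
/// adding a new variant later forces this function to be revisited.
fn is_admin(level: AccessLevel) -> bool {
    match level {
        AccessLevel::Admin => true,
        AccessLevel::User => false,
    }
}

fn main() {
    assert!(is_admin(AccessLevel::Admin));
    assert!(!is_admin(AccessLevel::User));
}
```

The point of the design is that variants are identifiers rather than string contents: a look-alike spelling of `User` would fail to resolve at compile time instead of silently comparing unequal the way a doctored string literal does.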


> But it's their example

It's the example that the researchers provided to us, to be clear about it.


> It's with bidi control characters

Sure.. in the original HN submission. I was referring to Rust's built-in homoglyph detection though, which is what the parent comment (and its parent) was about.


I thіnk thаt іt іs possіblе thаt you аre missing а fаіrly important point.

... And that point is that none of the vowels in my previous sentence are latin, I guess.


I think you missed some. I can't seem to paste your fake "i"s back in, but here's what I see:

  $ xxd
  I thіnk thаt іt іs possіblе thаt you аre missing а fаіrly important point.
  00000000: 4920 7468 d196 6e6b 2074 68d0 b074 20d1  I th..nk th..t .
  00000010: 9674 20d1 9673 2070 6f73 73d1 9662 6cd0  .t ..s poss..bl.
  00000020: b520 7468 d0b0 7420 796f 7520 d0b0 7265  . th..t you ..re
  00000030: 206d 6973 7369 6e67 20d0 b020 66d0 b0d1   missing .. f...
  00000040: 9672 6c79 2069 6d70 6f72 7461 6e74 2070  .rly important p
  00000050: 6f69 6e74 2e0a                           oint..


Made you look. :)

I also skipped a bunch of the "I"s.


Yes. What browser did you use to make the comment? I can't get all those characters to paste in.


Firefox 93.0 on Windows 11. Characters copied & pasted from charmap.exe

a: U+0430 "Cyrillic small letter a"

e: U+0435 "Cyrillic small letter e"

i: U+0456 "Cyrillic small letter Byelorussian-Ukrainian i"
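
The same check the xxd dump does by eye can be sketched in a few lines of Rust. This simply flags every non-ASCII character in a string; the helper name `non_ascii_chars` is made up, and a real review tool would presumably want a proper confusables table rather than a blanket non-ASCII test.

```rust
/// Returns (byte offset, char) for every non-ASCII character in `text`.
fn non_ascii_chars(text: &str) -> Vec<(usize, char)> {
    text.char_indices()
        .filter(|(_, ch)| !ch.is_ascii())
        .collect()
}

fn main() {
    // The word "that" with its vowel replaced by U+0430 (Cyrillic small
    // letter a), written as an escape so the substitution is visible.
    let sample = "th\u{0430}t";
    for (offset, ch) in non_ascii_chars(sample) {
        println!("byte {}: U+{:04X}", offset, ch as u32);
    }
    // A plain ASCII spelling produces no hits.
    assert!(non_ascii_chars("that").is_empty());
}
```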


Ooh, or you could just put in the cyrillic 'а' and even have it look like it's legit :)


This stuff has always been there. Consider this code:

if (uid = NULL) { // Check if root

And if you’re using clang: if ((uid = NULL)) { // Check if root

I'd venture that this is far more dangerous than unicode in strings...

or how about:

strcpy()

or #include anything with a #DEFINE


> if (uid = NULL) { // Check if root

That's not the same class of error, since here a programmer can see the issue by simple inspection.

> or #include anything with a #DEFINE

This one perhaps is closer to the mark, although not based on unicode.


To me it's the same class of error: convincing humans and automated tests that your code is OK when it isn't.

I dealt with a bug that only appeared in release builds, and never in debug. The offending code looked roughly like this:

  if (blah)
    #ifdef DEBUG
    baz();
    #endif
  bar();
The systemic problem was that it was a project created by interns, and they'd review each other's code. By the time the bug got to me, the interns had left and a Sr Dev had spent a day looking for the bug. It took me an hour to find it. In isolation it's easy to see, but in the mess of all the other code, you really have to look for these things.


Well, if you generalize the statement enough, indeed it's the same class of issue.

In the situation you described:

* you have a fairly easy way to detect the problem

* the interns still have plausible deniability as to whether they intended to leave a defect or not

The discussed problem with Unicode is clearly meant to be used as an exploit; its likelihood of occurring by accident seems very close to zero.


Rust doesn't allow assignment in conditionals.

https://locka99.gitbooks.io/a-guide-to-porting-c-to-rust/con...


It does; in fact, the article you posted shows you exactly when Rust allows assignment in conditionals.

As long as you're initializing a variable, it's allowed; if you're not initializing, you'll have to use a block expression.


I should have just used this sentence, which also directly covers the parent's case.

"Rust does not allow assignment within simple expressions so they will fail to compile. This is done to prevent subtle errors with = being used instead of ==."

Better?
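
For what it's worth, here is a small sketch of how I understand this to behave. The names `is_root` and `maybe_uid` are made up for illustration; the rejected form is left in a comment because it would not compile.

```rust
/// Hypothetical root check, mirroring the C example upthread.
fn is_root(uid: u32) -> bool {
    // Writing `if uid = 0 { ... }` here is rejected at compile time with a
    // mismatched-types error: an assignment expression has type `()`,
    // and `if` requires a `bool`.
    uid == 0
}

fn main() {
    let mut uid = 1000;
    assert!(!is_root(uid));

    // Assignment is still an expression, but its value is the unit type,
    // so it cannot stand in for a condition.
    let unit: () = { uid = 0 };
    assert_eq!(unit, ());
    assert!(is_root(uid));

    // `if let` is the form that binds a new variable inside a condition.
    let maybe_uid: Option<u32> = Some(0);
    if let Some(bound) = maybe_uid {
        assert!(is_root(bound));
    }
}
```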


Those are all detectable by a programmer's eyes. Unicode attacks are not.


That’s a really impressively written error message.


That's one of Rust's selling points. In all the time I've used the Rust compiler, not once have I not known what error it was pointing out: its error messages are incredibly helpful. Occasionally I am unsure why it's an error, but I always know what it's referring to and what I could do to fix it.


I've had the same experience with C#. The error messages always state exactly what's wrong and where in the code it's wrong. Many of them (especially compiler _warnings_ intended to point out syntax that is almost certainly a bug) also tell you how to fix it (e.g. “consider using ‘new’ keyword if hiding was intended”).


Personally, I don't know why the last one (“consider using ‘new’ keyword if hiding was intended”) isn't an error by default in C#. Not overriding the base method is almost always a mistake, and if it's not a mistake, better to be explicit about it, anyways. My $.02...



