digitalmars.D - Example of the perils of binding rvalues to const ref

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
http://www.slideshare.net/yandex/rust-c

C++ code:

std::string get_url() {
     return "http://yandex.ru";
}

string_view get_scheme_from_url(string_view url) {
     unsigned colon = url.find(':');
     return url.substr(0, colon);
}

int main() {
     auto scheme = get_scheme_from_url(get_url());
     std::cout << scheme << "\n";
     return 0;
}

string_view has an implicit constructor from const string& (see 
"basic_string_view(const basic_string<charT, traits, Allocator>& str) 
noexcept;" in https://isocpp.org/files/papers/N3762.html). The function 
get_url() returns an rvalue, which in turn gets bound to a reference to 
const and implicitly passed to string_view's constructor. The obtained 
view refers to a dead string.


Andrei
Sep 16 2014
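A minimal sketch of one way to avoid the dangling view described in the post above, assuming C++17's std::string_view in place of the proposal's string_view. The helper name scheme_of_owned_url is hypothetical; the idea is only that the owning std::string is given a name so that it outlives every view into it.

```cpp
#include <cassert>
#include <string>
#include <string_view>

std::string get_url() {
    return "http://yandex.ru";
}

std::string_view get_scheme_from_url(std::string_view url) {
    const auto colon = url.find(':');  // npos if there is no colon
    return url.substr(0, colon);       // substr clamps the count to the end
}

// Hypothetical helper: the owning string outlives the view.
std::string scheme_of_owned_url() {
    const std::string url = get_url();
    const std::string_view scheme = get_scheme_from_url(url);
    assert(scheme.data() == url.data());  // the view aliases url's buffer
    return std::string(scheme);           // copy out before url is destroyed
}
```

The view is only ever used while url is alive, and the result leaves the function as an owning std::string rather than as a view.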
next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 16 September 2014 at 15:30:49 UTC, Andrei 
Alexandrescu wrote:
 http://www.slideshare.net/yandex/rust-c

 C++ code:

 std::string get_url() {
     return "http://yandex.ru";
 }

 string_view get_scheme_from_url(string_view url) {
     unsigned colon = url.find(':');
     return url.substr(0, colon);
 }

 int main() {
     auto scheme = get_scheme_from_url(get_url());
     std::cout << scheme << "\n";
     return 0;
 }

 string_view has an implicit constructor from const string& (see 
 "basic_string_view(const basic_string<charT, traits, 
 Allocator>& str) noexcept;" in 
 https://isocpp.org/files/papers/N3762.html). The function 
 get_url() returns an rvalue, which in turn gets bound to a 
 reference to const and implicitly passed to string_view's 
 constructor. The obtained view refers to a dead string.


 Andrei
Arguably, the issue is not const ref binding to an rvalue itself, but rather taking (and *holding*) the address of a parameter that is passed by const ref. If you want to *hold* that reference, it should be explicitly passed by pointer.

That and having the whole thing neatly packaged in an implicit constructor. If you are doing something that dangerous, at the very least, make it explicit.

I mean, the example might as well just be:

std::string_view get_scheme() {
    std::string myString = get_url();
    return myString; //Boom
}

Exact same undefined result, without binding to rvalues.

I preferred your smoking gun of:

const int& a = max(1, 2);

But again, part of the issue here is the passing of references. If we made "auto ref" mean "pass either an existing object, or bind to an rvalue (at call site, not via template overload)" and, in the implementation, treated the passed-in argument as "a local variable, as if passed by value, that you may not escape", then I'm pretty sure we can have our cake and eat it.

Proper escape analysis would help too.
Sep 16 2014
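A short sketch of the max(1, 2) case mentioned above, assuming std::max from <algorithm>. std::max takes and returns const T&, so its result aliases one of its arguments. The function names here are hypothetical; they contrast copying the result with binding a reference to arguments that are known to live long enough.

```cpp
#include <algorithm>
#include <cassert>

// Copying the result is safe: the copy is made while the temporaries
// holding 1 and 2 are still alive.
int max_copied() {
    // const int& a = std::max(1, 2);  // would dangle after this statement
    const int a = std::max(1, 2);
    return a;
}

// Binding a reference is fine when the arguments outlive the reference.
int max_of_named(const int& x, const int& y) {
    const int& r = std::max(x, y);
    assert(&r == &x || &r == &y);  // r aliases one of the arguments
    return r;
}
```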
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/16/14, 11:26 AM, monarch_dodra wrote:
 Arguably, the issue is not const ref binding to an rvalue itself, but
 rather taking (and *holding*) the address of a parameter that is passed
 by const ref
It doesn't do that. It holds pointer/length to the slice, which is different and distinct from the pointer to the string. -- Andrei
Sep 16 2014
prev sibling next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Tuesday, 16 September 2014 at 15:30:49 UTC, Andrei
Alexandrescu wrote:
 http://www.slideshare.net/yandex/rust-c

 C++ code:

 std::string get_url() {
     return "http://yandex.ru";
 }

 string_view get_scheme_from_url(string_view url) {
     unsigned colon = url.find(':');
     return url.substr(0, colon);
 }

 int main() {
     auto scheme = get_scheme_from_url(get_url());
     std::cout << scheme << "\n";
     return 0;
 }

 string_view has an implicit constructor from const string& (see 
 "basic_string_view(const basic_string<charT, traits, 
 Allocator>& str) noexcept;" in 
 https://isocpp.org/files/papers/N3762.html). The function 
 get_url() returns an rvalue, which in turn gets bound to a 
 reference to const and implicitly passed to string_view's 
 constructor. The obtained view refers to a dead string.
That's a design mistake on someone's part. It shouldn't be legal to construct a view of an implicitly created temporary. Or at least not to escape it.

As a more straightforward example of the same problem, I ran into a similar issue when creating constructor functions for a slice class I use for my own work. Those functions can't accept a const ref to a std::string, because you can end up with this:

slice<char const> wrap(std::string const &s) {
    return slice<char const>(s.c_str(), s.length());
}

slice<char const> s = wrap("abc");
// s now aliases a discarded temporary

I had to choose my overloads very carefully to avoid these issues. In short, these problems crop up the moment you start aliasing objects in C++. To be honest, I'm amazed that the STL contains something like string_view. What kind of guarantees is it supposed to provide?
Sep 16 2014
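One possible overload set for the wrap problem described above, as a sketch only. The slice template here is a hypothetical stand-in for the poster's class (its real definition is not shown in the thread), and first_char is an illustrative helper. The sketch assumes that any char array passed in is a null-terminated literal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the poster's slice class: a non-owning
// pointer/length pair.
template <typename T>
struct slice {
    T* ptr;
    std::size_t len;
};

// Lvalue strings are accepted: the caller owns the storage.
inline slice<char const> wrap(std::string const& s) {
    return {s.data(), s.size()};
}

// Temporary strings are rejected at compile time.
slice<char const> wrap(std::string&&) = delete;

// String literals have static storage, so viewing them directly is safe.
// Being an exact match, this overload is preferred over converting the
// literal to a temporary std::string.
template <std::size_t N>
slice<char const> wrap(char const (&literal)[N]) {
    return {literal, N - 1};  // drop the trailing '\0'
}

// Illustrative accessor.
inline char first_char(slice<char const> s) {
    assert(s.len > 0);
    return s.ptr[0];
}
```

With these overloads, wrap("abc") views the literal itself, and wrap(std::string("abc")) fails to compile instead of producing a slice into a destroyed temporary.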
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Tuesday, 16 September 2014 at 15:30:49 UTC, Andrei 
Alexandrescu wrote:
 The function get_url() returns an rvalue, which in turn gets 
 bound to a reference to const and implicitly passed to 
 string_view's constructor. The obtained view refers to a dead 
 string.
When I saw this idiom used in D, I asked the same question: how long should the temporary survive? Does it work more reliably in D? It would be OK if it survived until the end of the scope; the optimization opportunities lost this way should be small enough.
Sep 17 2014
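For comparison, here is a sketch of how long temporaries live in C++ (this says nothing about D). It assumes C++17. Tracer, make, and pass_through are hypothetical names. A temporary bound directly to a local const reference lives until the end of the scope, while a temporary that only passes through a function parameter dies at the end of the statement.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records its own destruction in a log owned by the caller.
struct Tracer {
    std::string name;
    std::vector<std::string>* log;
    ~Tracer() { log->push_back(name + " destroyed"); }
};

Tracer make(std::string name, std::vector<std::string>& log) {
    return Tracer{std::move(name), &log};
}

const Tracer& pass_through(const Tracer& t) { return t; }

std::vector<std::string> trace_lifetimes() {
    std::vector<std::string> log;
    {
        // Bound directly to a local reference: lifetime is extended to
        // the end of this block.
        const Tracer& bound = make("direct", log);

        // Bound only to a function parameter: destroyed at the end of
        // this statement, so the returned reference must not be kept.
        pass_through(make("indirect", log));

        log.push_back("still alive: " + bound.name);
    }
    assert(log.size() == 3);
    return log;
}
```

The log ends up ordered as: the indirect temporary's destruction, the "still alive" entry, then the direct temporary's destruction.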
prev sibling next sibling parent reply "Arjan" <arjan ask.me.to> writes:
On Tuesday, 16 September 2014 at 15:30:49 UTC, Andrei 
Alexandrescu wrote:
 http://www.slideshare.net/yandex/rust-c

 C++ code:

 std::string get_url() {
     return "http://yandex.ru";
 }

 string_view get_scheme_from_url(string_view url) {
     unsigned colon = url.find(':');
     return url.substr(0, colon);
 }

 int main() {
     auto scheme = get_scheme_from_url(get_url());
     std::cout << scheme << "\n";
     return 0;
 }

 string_view has an implicit constructor from const string& (see 
 "basic_string_view(const basic_string<charT, traits, 
 Allocator>& str) noexcept;" in 
 https://isocpp.org/files/papers/N3762.html). The function 
 get_url() returns an rvalue, which in turn gets bound to a
 reference to const and implicitly passed to string_view's 
 constructor. The obtained view refers to a dead string.


 Andrei
Forgive my ignorance, but get_url() returns a std::string (on the stack), so its address can be taken. And IIRC Scott Meyers once explained: "if you can take its address, it's not an rvalue, it's an lvalue". So isn't get_scheme_from_url() simply holding a const ref to a temporary (which most compilers warn about)? ...Or am I missing the point?
Sep 17 2014
parent reply "Szymon Gatner" <noemail gmail.com> writes:
On Wednesday, 17 September 2014 at 08:57:36 UTC, Szymon Gatner 
wrote:
 On Wednesday, 17 September 2014 at 08:52:58 UTC, Arjan wrote:
 On Tuesday, 16 September 2014 at 15:30:49 UTC, Andrei 
 Alexandrescu wrote:
 http://www.slideshare.net/yandex/rust-c

 C++ code:

 std::string get_url() {
   return "http://yandex.ru";
 }

 string_view get_scheme_from_url(string_view url) {
   unsigned colon = url.find(':');
   return url.substr(0, colon);
 }

 int main() {
   auto scheme = get_scheme_from_url(get_url());
   std::cout << scheme << "\n";
   return 0;
 }

 string_view has an implicit constructor from const string& 
 (see "basic_string_view(const basic_string<charT, traits, 
 Allocator>& str) noexcept;" in 
 https://isocpp.org/files/papers/N3762.html). The function 
 get_url() returns an rvalue, which in turn gets bound to a
 reference to const and implicitly passed to string_view's 
 constructor. The obtained view refers to a dead string.


 Andrei
 Forgive my ignorance, but get_url() returns a std::string (on the stack), so its address can be taken. And IIRC Scott Meyers once explained: "if you can take its address, it's not an rvalue, it's an lvalue". So isn't get_scheme_from_url() simply holding a const ref to a temporary (which most compilers warn about)? ...Or am I missing the point?
[ Sorry for double posting, I must have double-clicked the "reply" button accidentally. ]

The std::string returned from get_url() is a temporary and hence an "rvalue". In fact its address cannot be taken. It is often helpful to think of lvalues as things that can appear on the left side of an assignment expression.
Sep 17 2014
next sibling parent reply "IgorStepanov" <wazar mail.ru> writes:
BTW. About r-values:

void fun(S s)
{
     fun2(s); //pass s by value.
}

Currently the compiler inserts a postblit call before the fun2 call and a dtor call 
before the end of fun. However, it could skip both, because 
s isn't used after the fun2 call.
Thus, we can implement a "last postblit" optimization:
if the compiler wants to insert a postblit and the object isn't used after 
this postblit, the compiler may omit both the postblit and the 
final dtor call.
Are there any limitations to this optimization?
It may solve 90 percent of the rvalue ref problem.
Sep 17 2014
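A rough C++ analogue of the "last use" idea above, offered only as an illustration and not as a description of D semantics. It assumes C++17. The copy constructor plays the role of the postblit, and an explicit std::move on the last use stands in for what the proposed optimization would do automatically. copies_made_by and last_use_demo are hypothetical helpers.

```cpp
#include <cassert>
#include <utility>

struct S {
    static inline int copies = 0;
    S() = default;
    S(const S&) { ++copies; }  // stands in for D's postblit
    S(S&&) noexcept {}         // moving is not counted as a copy
};

void fun2(S) {}

// s is copied into fun2 even though it is never used again.
void fun_copy(S s) { fun2(s); }

// The last use of s is marked explicitly, so no copy is made.
void fun_move(S s) { fun2(std::move(s)); }

// Counts the copies made while calling f with a fresh rvalue.
int copies_made_by(void (*f)(S)) {
    S::copies = 0;
    f(S{});
    return S::copies;
}

void last_use_demo() {
    assert(copies_made_by(fun_copy) == 1);
    assert(copies_made_by(fun_move) == 0);
}
```

The difference is that C++ requires the programmer to write std::move, whereas the post proposes that the D compiler detect the last use itself.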
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/17/14, 3:03 AM, IgorStepanov wrote:
 BTW. About r-values:

 void fun(S s)
 {
      fun2(s); //pass s by value.
 }

 Currently the compiler inserts a postblit call before the fun2 call and a dtor call before
 the end of fun.
Is the call to dtor part of fun(), or part of fun()'s call sequence? I've always meant to look at that. LMK if you know for sure.
 However, it could skip both, because s isn't used after the fun2
 call.
 Thus, we can implement a "last postblit" optimization:
 if the compiler wants to insert a postblit and the object isn't used after this
 postblit, the compiler may omit both the postblit and the final dtor
 call.
 Are there any limitations to this optimization?
 It may solve 90 percent of the rvalue ref problem.
I think rvalues are already not postblitted into functions. Indeed more postblits can be optimized away for variables that are last used in a function call.

Andrei
Sep 17 2014
parent reply "IgorStepanov" <wazar mail.ru> writes:
On Wednesday, 17 September 2014 at 15:03:12 UTC, Andrei 
Alexandrescu wrote:
 On 9/17/14, 3:03 AM, IgorStepanov wrote:
 BTW. About r-values:

 void fun(S s)
 {
     fun2(s); //pass s by value.
 }

 Currently the compiler inserts a postblit call before the fun2 call and a dtor call 
 before the end of fun.
 Is the call to dtor part of fun(), or part of fun()'s call sequence? I've always meant to look at that. LMK if you know for sure.
The dtor call is part of fun(). I've written test code...

struct Foo
{
    this(int i) { }
    this(this) { }
    ~this() { }
}

void calledFunc(Foo probe)
{
}

void main()
{
    calledFunc(Foo(42));
}

...added trace output to Statement_toIR::visit(ExpStatement *s)...

printf("ExpStatement::toIR(), exp = %s in %s\n", s->exp ? s->exp->toChars() : "", irs->symbol ? irs->symbol->toChars() : "NULL");

...and got:

ExpStatement::toIR(), exp = probe.~this() in calledFunc
 However, it could skip both, because s isn't used after the fun2
 call.
 Thus, we can implement a "last postblit" optimization:
 if the compiler wants to insert a postblit and the object isn't used 
 after this postblit, the compiler may omit both the postblit and the 
 final dtor call.
 Are there any limitations to this optimization?
 It may solve 90 percent of the rvalue ref problem.
 I think rvalues are already not postblitted into functions. Indeed more postblits can be optimized away for variables that are last used in a function call. Andrei
Yes, rvalues aren't postblitted. However, what I want, and currently can't do, is to deliver an rvalue without postblits from the first call (a constructor or factory function) through intermediate calls to its final resting place.

struct AA(Key, Value)
{
    this(T...)(T args)
    {
        buckets.length = T.length;
        foreach (i; Step2Tuple!(T.length))
        {
            alias key = args[i];
            alias value = args[i+1];
            size_t key_hash = hashOf(key);
            size_t idx = key_hash % T.length;
            auto e = new Entry(key_hash, key, value, impl.buckets[idx]);
            buckets[idx] = e;
        }
    }
    ...
    private static struct Entry
    {
        size_t hash;
        Key key;
        Value value;
        Entry* next;
    }
    Entry*[] buckets;
}

AA!(Key, Value) aaLiteral(Key, Value, T...)(auto ref T args)
{
    return AA!(Key, Value)(args);
}

....
//somewhere in user code:
auto aa = aaLiteral!(Foo, int)(Foo(1), 1, Foo(2), 2); //postblits aren't called.

I want to place Foo(1) into buckets[nn].key without a postblit call. The compiler can't help me now; however, I think it could do so without a language change.
Sep 17 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/17/14, 12:28 PM, IgorStepanov wrote:
 I want to place Foo(1) to buckets[nn].key without postblit call.
 Compiler can't help me now, however, I think, It can do it without
 language change.
File an enhancement request with explanations and sample code, the works. This will be good. Thanks! -- Andrei
Sep 17 2014
next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 18 September 2014 at 00:53:40 UTC, Andrei 
Alexandrescu wrote:
 On 9/17/14, 12:28 PM, IgorStepanov wrote:
 I want to place Foo(1) to buckets[nn].key without postblit 
 call.
 Compiler can't help me now, however, I think, It can do it 
 without
 language change.
File an enhancement request with explanations and sample code, the works. This will be good. Thanks! -- Andrei
I think it's this one:
https://issues.dlang.org/show_bug.cgi?id=12684

Kind of required when you requested emplace from rvalues:
https://issues.dlang.org/show_bug.cgi?id=12628
Sep 18 2014
parent "IgorStepanov" <wazar mail.ru> writes:
On Thursday, 18 September 2014 at 10:50:31 UTC, monarch_dodra 
wrote:
 On Thursday, 18 September 2014 at 00:53:40 UTC, Andrei 
 Alexandrescu wrote:
 On 9/17/14, 12:28 PM, IgorStepanov wrote:
 I want to place Foo(1) to buckets[nn].key without postblit 
 call.
 Compiler can't help me now, however, I think, It can do it 
 without
 language change.
File an enhancement request with explanations and sample code, the works. This will be good. Thanks! -- Andrei
I think it's this one: https://issues.dlang.org/show_bug.cgi?id=12684 Kind of required when you requested emplace from rvalues: https://issues.dlang.org/show_bug.cgi?id=12628
Yes, your issue 12684 is a special case of the new issue 13492. I've tried to classify the problem and describe the dangerous cases.
Sep 18 2014
prev sibling parent "IgorStepanov" <wazar mail.ru> writes:
On Thursday, 18 September 2014 at 00:53:40 UTC, Andrei 
Alexandrescu wrote:
 On 9/17/14, 12:28 PM, IgorStepanov wrote:
 I want to place Foo(1) to buckets[nn].key without postblit 
 call.
 Compiler can't help me now, however, I think, It can do it 
 without
 language change.
File an enhancement request with explanations and sample code, the works. This will be good. Thanks! -- Andrei
Done: https://issues.dlang.org/show_bug.cgi?id=13492
Sep 18 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/17/14, 2:01 AM, Szymon Gatner wrote:
 std::string returned from get_url() is a temporary and hence a "rvalue".
 In fact it's address cannot be taken.
It can in C++, after being bound (implicitly!) to a const &. "That was the joke." That's why I'm opposed to adding the same conversion to D without restriction. -- Andrei
Sep 17 2014
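A small sketch of the point made in the reply above: in C++ the address-of operator cannot be applied to the temporary directly, but once the temporary is implicitly bound to a const reference parameter, it has a name and therefore an address. address_of and url_length_via_pointer are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::string get_url() {
    return "http://yandex.ru";
}

// Inside this function the temporary is an ordinary named object.
const std::string* address_of(const std::string& s) { return &s; }

std::size_t url_length_via_pointer() {
    // &get_url() would not compile: '&' requires an lvalue operand.
    // Going through the const& parameter compiles without complaint.
    // The pointer is valid only until the end of this full-expression,
    // so it is dereferenced immediately rather than stored.
    const std::size_t n = address_of(get_url())->size();
    assert(n == 16);
    return n;
}
```

Storing the returned pointer in a variable and using it in a later statement would compile just as quietly, and that is exactly the dangling case the thread is about.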
prev sibling parent "Olivier Grant" <olivier.grant gmail.com> writes:
On Tuesday, 16 September 2014 at 15:30:49 UTC, Andrei
Alexandrescu wrote:
 http://www.slideshare.net/yandex/rust-c

 C++ code:

 std::string get_url() {
     return "http://yandex.ru";
 }

 string_view get_scheme_from_url(string_view url) {
     unsigned colon = url.find(':');
     return url.substr(0, colon);
 }

 int main() {
     auto scheme = get_scheme_from_url(get_url());
     std::cout << scheme << "\n";
     return 0;
 }

 string_view has an implicit constructor from const string& (see 
 "basic_string_view(const basic_string<charT, traits, 
 Allocator>& str) noexcept;" in 
 https://isocpp.org/files/papers/N3762.html). The function 
 get_url() returns an rvalue, which in turn gets bound to a 
 reference to const and implicitly passed to string_view's 
 constructor. The obtained view refers to a dead string.


 Andrei
I would say the problem with this code is not so much in this usage example but in the implementation of string_view. Given that string_view aims to be a non-owning reference to a string, it should prevent such assignments, which is pretty straightforward to ensure in C++11. Just delete the corresponding constructor:

string_view( std::string && ) = delete;

This example, as stated in subsequent answers to your post, is no different from returning a const reference to a temporary. In this example, it just happens to be nicely hidden in the string_view implementation instead of being explicit:

std::string const & get_url( ) { return "foo"; }

This has always been a no-no, and C++11 at least now adds the possibility to refuse such code via deleted functions.

O.
Sep 17 2014
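A sketch of the deleted-constructor idea from the post above, assuming C++17 for the type-trait checks. safe_view is a hypothetical class written for this illustration; it is not the string_view from the proposal or from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical non-owning view that refuses temporary strings.
class safe_view {
public:
    safe_view(const std::string& s) noexcept
        : data_(s.data()), size_(s.size()) {}

    // Selected for rvalue arguments, so viewing a temporary is a
    // compile-time error instead of a dangling pointer.
    safe_view(std::string&&) = delete;

    const char* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }
    char front() const {
        assert(size_ > 0);
        return data_[0];
    }

private:
    const char* data_;
    std::size_t size_;
};

static_assert(std::is_constructible_v<safe_view, const std::string&>,
              "named strings are accepted");
static_assert(!std::is_constructible_v<safe_view, std::string>,
              "temporary strings are rejected");
```

One trade-off worth noting: this also rejects calls such as f(get_url()) where f takes a safe_view and only uses it during the call, which would have been safe. That cost may be part of why a view type would choose not to delete this constructor.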