
quick C++ quiz

Posted: Tue Nov 11, 2003 5:58 pm
by Zhuge Liang
<div style='font: ; text-align: left; '>If you needed to declare a string constant and you were given the following two choices, which one would you pick, and why?

1) #define MY_CONST "There is no spoon"

2) const std::string MY_CONST = "There is no spoon";

3) there is no option 3. For this quiz, pick 1 or 2

Both options have global scope. This may seem like a trivial quiz, but if you are a programmer, please indulge me. I promise it will get interesting(er).

Regards,
Zhuge Liang</div>

Option 2.

Posted: Tue Nov 11, 2003 7:32 pm
by Kupek
<div style='font: 10pt verdana; text-align: left; padding: 0% 10% 0% 10%; '>Option 1 is C; Option 2 is "proper" C++. That's how I look at it. I avoid #defines as much as possible. They always scream "hack" to me in C++. (I realize that they are a necessity in C.)

When I'm doing OO code, I want objects. If I want a string, I want a string object, not a string literal. There are so many more things you can do with a string object. When I'm doing C++, I expect my constants to be const variables, not macros.</div>

I'm not a very competent C++ programmer, but: ditto Kupek, for the same reasons.

Posted: Tue Nov 11, 2003 7:55 pm
by Andrew, Killer Bee
<div style='font: 10pt georgia; text-align: left; '>But we are a little context-less here. If we have a lot of constants and memory is an issue (and the object-orientedness of constants isn't), maybe it'd be better to use the preprocessor defines?</div>

quick addendum...

Posted: Tue Nov 11, 2003 8:41 pm
by Zhuge Liang
<div style='font: ; text-align: left; '>If option 1 were changed from

#define MY_CONST "There is no spoon"

to

const char MY_CONST[] = "There is no spoon";

would you still prefer option 2?

Indulge me once more please. I promise my next post will be interesting (to programmers anyway).

Oh, one more thing. Your development environment is MS VC6. =)

Regards,
Zhuge Liang</div>

Posted: Tue Nov 11, 2003 9:00 pm
by Kupek
<div style='font: 10pt verdana; text-align: left; padding: 0% 10% 0% 10%; '>Yes. I want my objects.</div>

Posted: Tue Nov 11, 2003 11:06 pm
by Kupek
<div style='font: 10pt verdana; text-align: left; padding: 0% 10% 0% 10%; '>Oh, and I think I know where you're going with this. Constant replacement in optimization?</div>

Posted: Tue Nov 11, 2003 11:07 pm
by Andrew, Killer Bee
<div style='font: 10pt georgia; text-align: left; '>Optimally yes, but again it depends on the context. Also: MS VC6? Consider my answers irrelevant, I've got no experience with it :).</div>

no, but it does have to do with optimization. answers inside...

Posted: Wed Nov 12, 2003 3:09 am
by Zhuge Liang
<div style='font: ; text-align: left; '>For MSVC6, option 1 is threadsafe, whereas option 2 is not. In fact, this particular "feature" had me debugging this for a week. I combed every inch of my code and largely failed to find the cause of the random server crashes.

The problem is this. The std::string class has an internal buffer to store the characters for the string. So we have 1 global constant, with its corresponding 1 buffer. So what happens when you make an assignment, or pass the constant into a function?

string test = MY_CONST;

If test were a primitive type, you'd expect test to contain a copy of the constant. Even here, std::string defines its own copy constructor and assignment operator, so normally you'd expect test to contain a copy of the string, internal buffer included. Obviously, if this were so, I would not have brought the issue up. Microsoft's implementation of std::string (as well as other implementations, such as gcc's) contains an "optimization" where a new buffer is not allocated until necessary. That is, test will point to the same character buffer as MY_CONST until you make a modification to test. Only then will it allocate a new buffer and copy the MY_CONST buffer over. So

string test = MY_CONST; // test and MY_CONST point to the same buffer
test[0] = 'a'; // now test points to a different buffer.

Obviously, the intention here is saving memory space. One more thing to note. If test is assigned MY_CONST and no changes are made to test, then test will never have to allocate a buffer. It will just point to the MY_CONST buffer. So what happens when test exits the current scope? Obviously it can't deallocate the buffer, since that would invalidate MY_CONST. So how do MY_CONST and test keep track of the buffer memory and when to deallocate it? The answer is reference counting. Whenever MY_CONST is assigned to another string variable, MY_CONST's reference counter is incremented. When a variable exits scope or when it allocates its own buffer, the reference counter is decremented. When the reference counter reaches zero, the string object will know to deallocate the memory.
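
To make the mechanism concrete, here is a minimal sketch of a reference-counted, copy-on-write string. CowString, Rep, at() and use_count() are made-up names for illustration; this is not the actual VC6 or gcc source, only the general shape of the technique described above. Note that copying from a const object still writes to the shared counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, simplified copy-on-write string (illustration only).
class CowString {
    struct Rep {
        std::size_t refs;  // how many CowString objects share this buffer
        std::size_t len;
        char* data;
    };
    Rep* rep_;

    // Allocate a fresh buffer holding a copy of s, owned by one string.
    static Rep* make(const char* s, std::size_t len) {
        Rep* r = new Rep;
        r->refs = 1;
        r->len = len;
        r->data = new char[len + 1];
        std::memcpy(r->data, s, len);
        r->data[len] = '\0';
        return r;
    }
    // Drop our share; the last owner frees the buffer.
    void release() {
        if (--rep_->refs == 0) {
            delete[] rep_->data;
            delete rep_;
        }
    }

public:
    CowString(const char* s) : rep_(make(s, std::strlen(s))) {}
    // Copying shares the buffer and bumps the counter, even if other is const.
    CowString(const CowString& other) : rep_(other.rep_) { ++rep_->refs; }
    CowString& operator=(const CowString& other) {
        ++other.rep_->refs;  // increment first so self-assignment is safe
        release();
        rep_ = other.rep_;
        return *this;
    }
    ~CowString() { release(); }

    // Write access: detach into a private buffer first if the buffer is shared.
    char& at(std::size_t i) {
        if (rep_->refs > 1) {
            Rep* mine = make(rep_->data, rep_->len);
            --rep_->refs;
            rep_ = mine;
        }
        return rep_->data[i];
    }
    const char* c_str() const { return rep_->data; }
    std::size_t use_count() const { return rep_->refs; }
};
```

The plain ++ and -- on refs are exactly the kind of unsynchronized updates that are harmless in one thread and unsafe when several threads copy the same constant.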

Normally, this is fine and peachy, in single threaded applications. When we talk multithreaded, things get hairy. Suppose multiple threads pass in MY_CONST to some function or assign it to some variable, at the same time. Well there would need to be a reference counter increment for each one of those assignments. Since we are talking multithreaded programming, that increment operation MUST BE ATOMIC. Microsoft uses a simple ++ref_count or something like that. ++ref_count IS NOT ATOMIC. Therefore you violate thread safety. Therefore I spent a week and a half debugging server code trying to isolate seemingly completely random server crashes.
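
For comparison, here is a rough sketch of what an atomic counter update looks like in modern C++. It assumes C++11's std::atomic and std::thread, neither of which exists in VC6 (on that compiler the Win32 InterlockedIncrement call would have been the usual choice). count_shared_copies and its parameters are hypothetical names; the loop just stands in for "many threads copying the same constant".

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: each thread "copies the constant" copies_per_thread
// times, which for a reference-counted string boils down to bumping one
// shared counter.
long count_shared_copies(int threads, int copies_per_thread) {
    // With a plain long and ++ref_count this would be a data race:
    // the read-modify-write of two threads can interleave and lose updates.
    std::atomic<long> ref_count(0);

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&ref_count, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                // fetch_add is a single indivisible read-modify-write.
                ref_count.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return ref_count.load();
}
```

Because every increment is indivisible, the final count always equals threads times copies_per_thread; with a non-atomic counter nothing guarantees that.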

That is why given the two options, option 1 would be preferable, if there's any conceivable chance the constant will be used in multithreaded environment.

But this is just a Microsoft issue, you say? Well, no. It turns out that the "lazy allocation" optimization is a fairly common implementation of std::string. As I've said, gcc (at least at one time, I don't know about now) has the same problem. But even if they make the reference counting increments and decrements thread safe by using mutexes, the locking may incur significant overhead, depending on how often you assign the constant to a variable or pass it into another function in the application. So even in non-MS environments, I would now tend to shy away from option 2.

So the lesson learned today is that "const type MY_CONST = value;" is still preferable to "#define MY_CONST value", but only if type is either 1) a primitive type, or 2) you have control of the "type" class now and for the foreseeable future. If, like in this case, you don't have control of what the type is doing internally, it can be very dangerous to use an object as a constant.

Regards,
Zhuge Liang</div>

Posted: Wed Nov 12, 2003 1:08 pm
by Kupek
<div style='font: 10pt verdana; text-align: left; padding: 0% 10% 0% 10%; '>I think the flip lesson is "Throw out all assumptions when you need threadsafe code."</div>

Option 2...

Posted: Thu Nov 13, 2003 2:44 am
by Ishamael
<div style='font: 14pt "Sans Serif"; text-align: justify; padding: 0% 15% 0% 15%; '><b>Link:</b> <a href="http://Option 2 still...">Option 2 still...</a>

Preprocessor directives/macros = BOOO! Like Kupek said, objects are better than literals and/or arrays of characters.</div>

Still prefer option 2...

Posted: Thu Nov 13, 2003 3:07 am
by Ishamael
<div style='font: 14pt "Sans Serif"; text-align: justify; padding: 0% 15% 0% 15%; '>By choosing option 1, you're giving up a lot of the power that a String class gives you. You should never assume thread safety. Class documentation should tell you if something is thread safe or not...

I say this, but I almost always take the lazy way out myself (ie, not actually reading the API docs). :)

From the way the problem is described, I wonder if your code would be threadsafe, whether the initial string assignment is a shallow copy or not. The shallow copy definitely muddies the water, but I don't know that an upfront deep copy (ie, internal buffer reallocation and copy upon assignment) would automatically make it any more threadsafe.

Ideally, this would be Java code and I'd just make the method synchronized and be done with it, but nobody said the world was perfect. :)</div>

Posted: Thu Nov 13, 2003 3:09 am
by Ishamael
<div style='font: 14pt "Sans Serif"; text-align: justify; padding: 0% 15% 0% 15%; '>Errr...did I just write that whole post in a URL space? Doh! Anyway, it went something like "Preprocessors/macros = BOOOO! Objects = yippee! Almost always."</div>

Posted: Thu Nov 13, 2003 3:10 am
by Ishamael
<div style='font: 14pt "Sans Serif"; text-align: justify; padding: 0% 15% 0% 15%; '>Of course, this begs the question of whether you should be taking multithreaded programming advice from someone who can't even post on a messageboard correctly...but let's all agree to avoid that question, shall we? :)</div>

Posted: Thu Nov 13, 2003 3:58 am
by Oracle
<div style='font: bold 10pt ; text-align: left; '>This actually strikes close to the product I'm working on at the moment... Not a lot of experience with thread coding, so I had a lot of fun debugging the thread-unsafe code in a program using threads to encode audio/video ><</div>

Posted: Thu Nov 13, 2003 11:13 am
by Kupek
<div style='font: 10pt verdana; text-align: left; padding: 0% 10% 0% 10%; '>Particularly in languages which do not explicitly have thread support. (Like, say, C++.)</div>

that's cos you're a poofter...

Posted: Thu Nov 13, 2003 3:59 pm
by Zhuge Liang
<div style='font: ; text-align: left; '>If you want the power of the string class, you can still have thread safe code by first defining the string constant as a primitive, THEN assigning the constant to a string object within the thread you plan to use it in.
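
A minimal sketch of that workaround, assuming the worker body lives in a function called worker_greeting (a made-up name, as is the "!" it appends). The only thing shared between threads is the character array, which has no hidden state; the std::string is built inside the function that uses it.

```cpp
#include <cassert>
#include <string>

// The shared constant is a plain character array: nothing to reference count.
const char MY_CONST[] = "There is no spoon";

// Hypothetical per-thread worker body. The std::string is constructed from the
// array here, so its buffer and any internal bookkeeping belong to this call.
std::string worker_greeting() {
    std::string local(MY_CONST);  // copies the characters into a new string
    local += "!";                 // full string API is available from here on
    return local;
}
```

Each thread pays for one construction from the array, but no two threads ever touch the same string object's internals.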

And no, Microsoft's documentation makes no mention of std::string's thread un-safety. In fact the whole reference counting implementation is internal and normally you're not supposed to know about it. I actually learned of the implementation from another book (More Exceptional C++) and the non-thread safe Microsoft side of it from an obscure message board post.

It's not even a problem of assuming thread safety. It's assuming a const is immutable. When you declare something to be a const, you don't expect its state to change. If the string object state indeed did not change, whether or not the string class itself was thread safe, there would be no bug (in my context at least). Unfortunately its state did change (due to the reference counting) and doubly unfortunately, the state change was not thread safe.

And synchronization incurs a heavy penalty in multithreaded code, especially in this context.

Zhuge Liang</div>

Posted: Fri Nov 14, 2003 1:12 am
by Andrew, Killer Bee
<div style='font: 10pt georgia; text-align: left; '>Bwah</div>