The Case for case sensitivity
July 2007
Jeff Atwood’s post on the evils of case sensitivity is the longest-lived blog post I’ve ever seen. He wrote it at the end of 2005, but people are still commenting now, a year and a half later. Wow!
His main point is that case sensitivity is evil, and that all programming languages should instead be case-insensitive but case-preserving. He argues that case sensitivity wastes time, and he’s never seen any evidence that case sensitivity saves time.
Well, here we go: I like Python’s case sensitivity, and I’ve rarely had a bug caused by it. Plus, I’m convinced case sensitivity saves time. How? In our code we do this sort of thing all the time:
user = User('ben')
Which creates a new instance of the User class and fills it with ben’s data. Some people think this kind of naming is evil, but for us it’s clear, concise, and Pythonic. I mean, what are the alternatives?
usr = user('ben')
user = user_t('ben')
user = user_class('ben')
user = get_user('ben')
All icky, if you ask me. The last one’s the best, but then get_user no longer makes much sense as a class name, such as when using class methods like count = get_user.count().
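For illustration only, here is a minimal sketch of how the capitalised-class, lower-case-instance naming reads in practice. The User class, its username attribute, the in-memory registry and the count class method are all made up for the example; they are assumptions, not the real code behind this post.

```python
class User:
    """Hypothetical stand-in for the User class discussed above."""

    _registry = {}  # assumed in-memory store, keyed by username

    def __init__(self, username):
        self.username = username
        User._registry[username] = self

    @classmethod
    def count(cls):
        """Return how many users have been created so far."""
        return len(cls._registry)


user = User('ben')     # lower-case name: the instance
count = User.count()   # capitalised name: the class
```

With this spelling the class method call reads as a question asked of the class, which is the point lost with a name like get_user.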
The English language does something similar, except it has the opposite convention — a class is always lower case and an instance capitalised. For example, “Of all the cars I know, the Porsche 911 remains The Car.”
English has other case sensitivities as well:
- a Pole on a pole — what Daniel Baraniuk does
- a pole on a Pole — something you’d see in a circus
- a pole on a pole — to hang your flags on in Antarctica
- a Pole on the Pole — someone you’re likely to meet here
But back to programming. Programmers are “sensitive” people — sensitive enough to care about case. Non-geeks might be tempted to say we’re members of the Upper Case. But I believe it boils down to this:
Case sensitivity is great, but you need a convention.
In fact, you need good conventions whether you’re case-sensitive or not, but that’s another story.
It pays for your in-house convention to be similar to other people’s. And let’s face it, in any given language there are only one or two conventions to choose from. The one that’s worked for us is pretty simple:
- classes capitalised, instances lower, so user = User('ben')
- short variable names lower, like root and session
- longer variable names lowerCamelCase, like initPage
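One way to pin those three rules down is to write them as patterns. This is only a sketch: the regular expressions and the classify helper are assumptions made for illustration, one possible reading of the rules rather than an actual checker.

```python
import re

# One possible reading of the three rules above (illustrative only).
CLASS_NAME = re.compile(r'[A-Z][A-Za-z0-9]*')                   # e.g. User
SHORT_NAME = re.compile(r'[a-z][a-z0-9]*')                      # e.g. root, session
CAMEL_NAME = re.compile(r'[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)+')   # e.g. initPage


def classify(name):
    """Say which rule a name follows, or 'other' if it follows none."""
    if CLASS_NAME.fullmatch(name):
        return 'class'
    if SHORT_NAME.fullmatch(name):
        return 'short'
    if CAMEL_NAME.fullmatch(name):
        return 'camel'
    return 'other'
```

Under this reading, classify('User') gives 'class', classify('session') gives 'short', classify('initPage') gives 'camel', and a name like user_t falls outside the convention.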
We made those decisions early on. They’ve proved simple and efficient. And, strangely enough, we don’t have any of Jeff’s “I just spent the last hour…” horror stories to tell.
Comments
J Vincent Toups 18 Jul 2007, 14:46
I think you are right, but you forgot one other solution to the problem: separate namespaces for functions and variables (this is how Common Lisp “deals with” the problem). While this approach has its advantages, I think it may be more trouble than it’s worth. As in everything, though, I reserve final judgment.
Jonathan Allen 18 Jul 2007, 15:18
I mean, what are the alternatives?
I don’t know… maybe trying to give your variable a better name than “user”? It is usually a bad sign when your variable name is the same as your class name.
Ben 18 Jul 2007, 15:36
Jonathan, that’s the standard response (“it is usually a bad sign…”). But why is it a bad sign? I agree that something like obj would be a bad variable name, but why is user bad, if that’s exactly what it is? Can you give a specific example that’s not icky like user_object = user('ben')?
Kent Boogaart 18 Jul 2007, 15:37
It is usually a bad sign when your variable name is the same as your class name.
Why? If I am declaring a variable to represent the user, then ‘user’ is a perfect name. What would you suggest? ‘theUser’, ‘aUser’, ‘theCurrentUser’? It only makes sense to qualify the variable name if there is something distinguishing about it with which you can qualify. For example:
User loggedOnUser = …; User impersonatingUser = …;
It is often desirable to name method parameters in a similar fashion:
public void Logout(User user)
Again, what would you suggest?
public void Logout(User theUserToLogout)?
If it wasn’t obvious, I love case sensitivity. Another reason for it is consistency. I’d hate reading C# code where some devs use ‘String’ and others use ‘string’. I wish there was only one choice.
Gordon Mohr 18 Jul 2007, 19:25
Java, too, resolves this issue with separate namespaces. That is,
User user = new User("bob");
works. (In a context where there’s only one User, I wouldn’t even say that’s bad style.) Capitalization of classes is a convention rather than a requirement, so
user user = new user("bob");
would be legal, albeit unwise.
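For contrast, a rough sketch in Python, which keeps classes and variables in a single namespace. The User class and the make_two_users function are hypothetical, and the first assignment merely simulates a spelling where class and instance share one name. Once the instance is bound, the class is no longer reachable under that name:

```python
class User:
    """Hypothetical class, used only to illustrate name shadowing."""

    def __init__(self, name):
        self.name = name


def make_two_users():
    # Simulate a spelling that ignores case: class and instance share one name.
    user = User          # 'user' starts out referring to the class
    user = user('bob')   # ...and is now rebound to an instance
    try:
        user('maria')    # calls the instance, which is not callable
    except TypeError:
        return 'shadowed'
    return 'ok'
```

In Python, then, it is the difference in case that keeps User and user apart, where Java relies on separate namespaces.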
Another factor: I suspect case sensitivity makes programming somewhat harder for people whose mother script lacks casing. Is that a bug or a feature?
Ishmaeel 18 Jul 2007, 22:19
If you are writing..
user = User('ben')
what you are doing is
1) Error prone (invites typos)
2) Uninformative
3) Wrong (or not correct enough)
My approach is this: “User” is the name of the class. An object instantiated with it is not “User” as in “The User”. The object has a very specific use and calling it “user” is simply omitting information. Try this:
currentUser = User('ben')
Benefits?
1) Your code is self commenting.
2) A 1-letter-typo will cause less havoc.
3) Next developer who will be maintaining / extending your code will not be tempted to write something like user2 = User('maria')
lm 18 Jul 2007, 22:47
Excellent article. I also like case-sensitive languages more because casing conveys additional information and reduces ambiguity while keeping the code compact. It’s a pity that even experienced coders can’t agree on such a tiny issue.
cw 19 Jul 2007, 01:40
Ishmaeel:
I’ve never once heard of somebody making that mistake. U and u are pretty distinguishable, as are most other capital vs. lowercase pairs. Unless you’re using size 6 font, in which case you should consider increasing it.
Stephen 19 Jul 2007, 01:51
Atwood is an idiot anyway (or just writes consistently half-baked posts). Good post, and excellent points that I hadn’t thought of.
Maybe… 19 Jul 2007, 02:32
Odd/camel casing tends to slow me down when typing–is this a consideration?
I do question the rationale of case insensitivity–the idea that a competent programmer might not realize he used a stupid capitalization. dOES ANYONE EVER NOT NOTICE CASE CHANGES? tHEY ARE ONE OF THE FIRST THINGS THAT JUMP OUT AT YOU–OR ME, AT LEAST. mAYBE OTHER PEOPLE ARE DIFFERENT. :) I cannot recall searching hours for a bug only to discover that it was a case problem. However, I do recall searching for a bug in VB code for a very long time only to discover that the problem was an O in the place of a 0. (“Why can’t it find the file? It’s right there! Network?”) And VB doesn’t even have case sensitivity.
True, case sensitivity is a mess if you don’t use a convention. But absolutely everything in programming is a mess if you don’t follow conventions. Great blog post.
Luke 19 Jul 2007, 02:41
Is it really that frequent that variables are declared and used in such a way that a static analysis algorithm could not determine if a variable is assigned to before it is used? I should think that it would be a good idea to make code simple enough such that this sort of analysis could be done. Would this not make the problem being discussed disappear into the ether?
I guess there is also metaprogramming of methods with method_missing type things, but I still wonder whether it is a good idea, for human brains, to write code that a decent static analysis algorithm could not check for correctness of using methods/variables/etc. that have already been defined/assigned to. I can see the odd case where the extra complexity is warranted, but what about for most situations?
If you’re going to throw the Halting problem at me, be prepared to argue for a low upper bound. The Halting problem is irrelevant if halting for 99.99% of programs can be determined.
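A rough sketch of such a check, using Python’s standard ast module. The case_mismatches function is hypothetical, and it makes simplifying assumptions: it ignores scopes and imports, and only reports names that are read but never bound when some bound name differs from them in case alone.

```python
import ast
import builtins


def case_mismatches(source):
    """Return sorted (used, defined) pairs where `used` is read but never
    bound, and `defined` is a bound name differing only in case."""
    bound, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)

    # Map each lower-cased spelling to one bound name.
    folded = {name.lower(): name for name in sorted(bound)}
    known = bound | set(dir(builtins))
    return sorted(
        (name, folded[name.lower()])
        for name in used
        if name not in known and name.lower() in folded
    )
```

For example, case_mismatches("session = 'abc'\nprint(Session)\n") reports the pair ('Session', 'session'), while code that defines both User and user deliberately is left alone.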
Ishmaeel 19 Jul 2007, 03:14
@cw: Distinction between u and U is not the real issue here. You are adopting a convention and it applies to all letters. Yes, there are specially crafted programming fonts out there that make it a point that every single glyph is easily recognized and distinguished, but it’s only half the problem and even so, it’s not as trivial an issue as you make it out to be. You are compromising readability, maintainability and searchability (is that a word?).
Yes, there are solutions to all these problems, but why create the problems in the first place?
If you allow distinction by case only, all kinds of things can go wrong. The shift key could bail out on you and you might never notice until strange bugs start to show up. (Yes, it is possible to mix & match cases without triggering a compilation error.)
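A minimal Python sketch of that failure mode, with made-up function names. No error is raised, because assigning to a name that does not exist yet simply creates it:

```python
def total_broken(values):
    total = 0
    for value in values:
        Total = total + value   # case typo: binds a new name, Total
    return total                # 'total' was never updated


def total_fixed(values):
    total = 0
    for value in values:
        total = total + value   # same name on both sides
    return total
```

Here total_broken([1, 2, 3]) returns 0 and total_fixed([1, 2, 3]) returns 6.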
@Kent Boogaart: Even if you declare only one user object in that method, distinguishing it from the User class still makes sense.
As to method parameters, I haven’t yet made up my mind about them and in fact, the convention I follow has been like your Logout(User user) example. My reasoning is that its context (=its very specific function) is distinguished by the fact that it is a parameter to this very specific method. That said, I’m still not comfortable about being unable to distinguish between a parameter and a local variable.
@Maybe…: If you are writing code as fast as you can type, I sure don’t want to maintain your code after you are fired. pROGRAMMING aNd WrItinG pRoSe are different things, after all.
Dave Grossman 3 Dec 2008, 18:33
Thanks for being the voice of reason. How anybody can argue for the case insensitivity of a compiled language because they had to debug a problem in an interpreted language is beyond my comprehension. I can’t remember the last time I ever made a mistake that had to do with the case of a variable. It is a complete non-issue.
I see it all the time where someone keeps getting burned by some simple thing in a language then they rail against that feature instead of just blaming the person whose fault it really was.
Michael Mouse 30 Sep 2009, 06:49
Completely agree with this article on case conventions in code being sane and preferable to insensitivity, however for file systems case sensitivity is stupid. You don’t write essays about Poles on poles IRL, ever, and you definitely don’t put such things in file names :)
But there is one crime greater than all this, camelCaseWasTheBiggestErrorEver. Why did they do it? It’s bad_enough_we_have_to_do_this – unless we use a grownups-language-like-lisp but thisShouldNeverHaveBeenAllowedToHappen.
What is wrong with you crazy people!
WhAtevEr 6 Mar 2012, 08:20
Sorry, but I absolutely disagree – what you need is a convention for your object naming; that doesn’t mean you need case-sensitive object names.
Actually, you already made a case AGAINST case sensitivity.
- I think your way of programming is very, very bad – having the same name for different instances, differing only in case – sorry, this will lead to confusion once more than one person codes, or when you have to read your own code 1 or 2 years later.
Second – even worse – it’s quite possible, if someone uses your style, that somewhere you have a typo which never leads to a compile error but produces bad data somewhere at runtime. A new security leak is born :)
For example, you want to use sessions instead of a real username, but at the point where the session gets associated you make a typo – so the program doesn’t use a session, it uses the real username. It’s quite possible that many if not all parts work for a time until you run into a condition where it no longer will – or worse, it works but you’ve got a big security hole open.
no sorry naming – but case insensitive args are the natural way for computer – they are NOT the natural human way
Think about how many times you’ve had a little time penalty yourself because of case-sensitive letters – just because of a typo on the shell – or bug-searching a typo in your object/instance names…. I’m not talking about one or 2 big searches – think about all the small ones – just the time it took you to retype the command and correct your spelling somewhere in the code – or thinking about conventions…
Now multiply it by every coder, every *nix operator, every scripter….
Now let’s assume a 2 dollar/hour rate for correcting these. Now we take the money and solve world hunger…
I’m wondering how many years we can feed the world with those savings.
.. from me come 10 bucks this week alone – damn fkn typo led to a cascade of errors where the source was covered and not so easy to find – 5 fkn hours
PPS: whenever I have to code something in Visual Basic I love it – not because of that ugly VB but the IDE and the reading of that code – it autocorrects you so you have wonderfully readable code everywhere – same letters for xxx no matter how you wrote it…. at least it helps with the pain that usually comes with VB
WhAtevEr 6 Mar 2012, 08:32
PPS: faults and errors are human – just because one person CLAIMS (and I don’t believe you, and I bet 15 grand you lie) to never make a case typo in his coding doesn’t mean no one will.
The point is – the benefit is nothing because there are lots of other ways to do it – and an example like the one in the article should not be used anyway, for many reasons.
It just raises the chance of error – which leads to wasting time correcting it, only because someone never thought about real naming conventions.
You talk about stupid people making typos and blame others? I talk about stupid people without a real naming convention.
The argument that case-sensitive things happen in natural languages like English doesn’t count because it’s an absolutely different thing.
Even there a typo usually does not lead to confusion, because the sense of a possibly misspelled word comes from the context and – in the case of spoken words – the pronunciation.
Also interesting that no newer words have different meanings with big and small letters, but mostly old to very old words have – coming from a different culture you can’t even compare with a modern programming language.
And now guys, think twice – let’s say you make a case-sensitive convention in a coder team with 200 people working somehow on the code – maybe even an open-source project – imagine how much effort you have to make so that new people learn and understand AND use it the right way.
It’s way easier to say hey, the convention is varname = variable, objname = object,…. than name = object, Name = variable, namE = function :))
No, really, it just makes things more difficult more often and raises the chance of error – which will happen once enough lines of code are written…