May 19, 2010

Clear Cycling

Filed under: programming — Joshua @ 4:09 am

So, sometimes you need a cycle function. You know, something that takes a list like [a, b, c] and gives you back an a the first time you call it, then a b, and then a c, and then an a again, and so on indefinitely. Don’t ask - it’s just useful. I’ve frequently needed something like this.

Well, Python’s itertools module has one. I’m not sure it needed to be built in - it’s the kind of thing you can easily write for yourself - but it’s nice having it around. I found it yesterday while reading the itertools documentation. No need for it now, but now I know it’s there.

The interesting thing is their implementation. From their site:

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element

Since I’m kinda new to yield, it took me a minute to reassure myself that the first section (the for element in iterable: bit) would only execute as long as there were still elements in the iterable. That is, it weirded me out to see those TWO yield statements, each of which seems reachable in the same cycle. But whatever, of course it works, I was just in new territory.
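To reassure myself, here is a minimal sketch that tags each value with the loop that produced it. It is my own instrumented variant, not the documentation's code; the name traced_cycle and the labels are made up for illustration.

```python
from itertools import islice


def traced_cycle(iterable):
    """Like the documented cycle(), but tags each value with the loop that yielded it."""
    saved = []
    # First loop: walks the original iterable exactly once.
    for element in iterable:
        yield ("first pass", element)
        saved.append(element)
    # Second loop: only reached once the first loop is exhausted.
    while saved:
        for element in saved:
            yield ("replay", element)


# Take five values from a two-element input and show where each came from.
trace = list(islice(traced_cycle("AB"), 5))
for phase, element in trace:
    print(phase, element)
```

Running it prints two "first pass" lines and then nothing but "replay" lines.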

It still seems slightly roundabout to me. I mean, why have the first section at all? At first it looked like, once you’ve exhausted the original iterable argument, this would require an unnecessary test each time through. It doesn’t: a generator picks up right after the yield where it left off, so once that first loop is finished, it’s never visited again. But the more intuitive way to write it, for me, was this:

def cycle(iterable):
    saved = list(iterable)
    while saved:
        for element in saved:
            yield element

That is, you copy the iterable entirely on first execution into saved, and thereafter the lazy calls all behave in exactly the same way: yield the next element from saved, and start over from the front when you reach the end. This is much more readable to me. It isn’t actually any more efficient, since there was never a repeated test to get rid of in the first place. Well, readability is important too.

I guess the point is that generators are evaluated lazily in Python, so if you’re passing a generator then the official version is MUCH more efficient than my “intuitive” version, since the generator could represent an infinite sequence. That’s gotta be the answer. That way, if you happen to pass something that’s infinite, you don’t hang forever (and eventually run out of memory) trying to copy it up front, like you would in my “intuitive” version. All of which means - I’ve gotta spend more time learning about lazy evaluation! Good thing I’m learning (and probably switching to) Haskell soon…
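Here is a small sketch of that difference. The names lazy_cycle, eager_cycle and logged are mine, chosen for illustration; logged is a hypothetical helper that records every item pulled from its source, so you can see how much input each version consumes before handing anything back.

```python
from itertools import count, islice


def lazy_cycle(iterable):
    """Same shape as the documented cycle(): save elements as they are yielded."""
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element


def eager_cycle(iterable):
    """The "intuitive" variant: copy everything before yielding anything."""
    saved = list(iterable)  # never finishes if the iterable is infinite
    while saved:
        for element in saved:
            yield element


def logged(source, log):
    """Hypothetical helper: record each item as it is pulled from source."""
    for item in source:
        log.append(item)
        yield item


# Lazy version on an infinite source: it only pulls what it hands out.
lazy_log = []
lazy_first = list(islice(lazy_cycle(logged(count(), lazy_log)), 3))
print(lazy_first, lazy_log)

# Eager version on a finite source: it drains everything before the first value.
eager_log = []
eager_first = next(eager_cycle(logged(range(5), eager_log)))
print(eager_first, eager_log)
```

Note that the lazy version still stores everything it has seen, so its memory use keeps growing on an infinite input. The difference is only that it never has to finish copying before it starts yielding.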

May 15, 2010

How not to Fight a Language War

Filed under: programming — Joshua @ 5:38 am

Here’s a quote from Abstract Heresies that is so dope I thought I’d steal and repost it:

Lisp doesn’t do anything the other languages cannot, but it does what the other ones do not.

Yes, right! That’s the key number one missing ingredient in, I think, about 99% of pointless language wars. He’s making this point in a post about one of “those people” (and I have certainly been one in the past!) who argue that C++ is “not that bad” because if you’re just willing to put in a bit of extra work you can make it do whatever it is that you need it to do. Well, right, you can. But why - and this is the point - should you have to? There are very few things that one language can do that another language simply can’t. The point in choosing that ever-elusive “right tool” is picking the language that does what you need it to with the minimum amount of fuss. That will never, ever be C++ for anything, so if you’re using C++ then it has to be because there’s some other advantage (memory efficiency, usually) that makes all the other fuss worth it. Since there are such cases, I’m definitely not arguing that C++ should just go off and die. I like it, I use it, and I’m OK with saying that in public. I just think we could save a lot of bandwidth if people reminded themselves, before engaging in a language flame war, that it’s not what it can do, it’s what it does do.

The only problem that I have with this line of argument is that it blunts the OTHER common language war pitfall: that of mistaking a great ecosystem of libraries for a great language. This has been what’s kept Perl on life-support these many long years: there’s a staggering number of easily-callable library and module convenience tools that you can just plug in. You barely even have to know how to program at all to use Perl, really. But while that makes it a highly convenient tool, it does nothing for arguments about language design. Ideally, we’d want a language with a good design to have the kind of dedicated user base that Perl has. The trouble is, once I’ve said that it’s not about what a language can do but what it does do, I weaken my defenses against this other line of attack.

I’m still not really sure how to respond to both at the same time, but I think it might have something to do with having readable library code. That is, if you have a dedicated user base that contributes a lot of useful code, that’s wonderful, but what would be even more wonderful than that would be if you had that user base working in a language where the code they contributed could be read and understood with a minimum amount of effort. This has the advantage that the code can easily be modified to fit your needs. It also makes it easier to get various libraries contributed by various people with differing priorities to work together - and to track down the problems when they don’t. It’s like having a great car that can easily be modified and repaired versus a great car that’s expensive and complicated to repair.

Which, in an odd way, really does unify the two arguments: a great language is about practicality in the sense of minimizing programmer effort. It isn’t about whether you can add features to the language, because you can. It’s about how much fuss is required to add them. And while it’s more about how many programs people have already written in this language, it’s better if it’s easy to adapt those programs to your own use.

An interesting corollary of the second is the main reason why you won’t see Lisp near the top of the TIOBE index any time soon, though. For whatever reason, Lisp programmers don’t tend to write a lot of programs that they then make generally available for all users of the language. The library-and-application ecosystem just isn’t there. Not really sure why that is - though the obvious guess that springs to mind is cultural. Haskell doesn’t seem to have the same problem, and so its star is rising.

January 26, 2010

Argdo me, baby!

Filed under: programming, vim — Joshua @ 2:16 pm

I’ve been a Vim user for a long time, and the trouble with being a long-term Vim user is that it’s frequently not too different from being a short-term Vim user. It’s an incredibly powerful text editor with an incredibly steep learning curve, and the trouble is that a lot of us, once we know enough to feel the massive Vim advantage, just never really get around to doing all the truly advanced things.

Like, for example, search-and-replace across multiple files.

Honestly, Vim has added at least 14 hours to my life over the years with all the time it’s saved me on search-and-replace. But even though I knew it could do this across multiple files - have known for almost a year now, actually - I never bothered to go and learn it.

Fortunately, Ibrahim Ahmed’s got the goods. Apparently the trick is that you pass all your related filenames to args:

:args *shtml

And then you process them all through argdo, careful to pipe it to update so that all the files get saved. (Yes, undo seems to work across files for this command as well!)

:argdo %s/one/two/g | update
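For anyone who would rather see the logic spelled out, here is a rough Python sketch of what that pair of commands amounts to. It is an approximation, not what Vim does internally: replace_in_files is a name I made up, and it does a literal string replacement where :s takes a regex pattern. The demo runs in a throwaway temporary directory with two invented files.

```python
import tempfile
from pathlib import Path


def replace_in_files(paths, old, new):
    """Replace every occurrence of old with new, rewriting only files that change."""
    changed = []
    for path in paths:
        text = path.read_text(encoding="utf-8")
        updated = text.replace(old, new)  # literal match, not a regex like :s
        if updated != text:  # like :update, skip files with nothing to save
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed


# Demo in a throwaway directory so nothing real gets touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.shtml").write_text("one fish, one pond\n", encoding="utf-8")
    (root / "about.shtml").write_text("nothing to see\n", encoding="utf-8")
    changed = replace_in_files(sorted(root.glob("*shtml")), "one", "two")
    result = (root / "index.shtml").read_text(encoding="utf-8")

print(changed, result)
```

The "only write if something changed" check plays the role of update: files the substitution didn't touch are left alone.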

This was a huge help today adding a link to the header logo of every page in a conference site I’m making for the Linguistics Department. Yes, yes, I know I’m supposed to have done this with server side includes, or Javascript, or to have done the whole project in Rails to begin with. Honestly, it’s not such a large-scale venture that I feel the need to get it absolutely perfect.

The Department Website itself, on the other hand, could look a lot more professional. Fixing that is one of my (serious) goals for this semester - since I think it will probably be the last semester I am webmaster for IU’s Linguistics Department (moving on to graduate soon).

Anyway, thanks, Ibrahim!

December 9, 2009

Code Completion for Qt4 in Vim - Mac OS X Version

Filed under: programming — Joshua @ 4:06 pm

Vim makes everything better. But there are two areas where the Emacs gang really has us beat. One is in getting applications to use Emacs key bindings for their native editors. Open just about any IDE, and you can use it like it’s Emacs. Not with the full featureset, of course, but at least an Emacs guy can get from 0 to 60 in a reasonable number of seconds. Hell, even your basic web browser supports them. Vi keys get more and more support as the years roll by, but basically it’s still nothing you can count on, and when it does happen, it’s usually a pathetically small subset of Vim functionality. I assume that this is because implementing escape mode requires something over and above a macro language. Now, we could get around this by tricking out Vim to act like a proper IDE, but … well, that’s the second and by far more important area where the Emacs crowd has us beat. I admit I don’t have anything to go on here but my hugely unscientific impressions, but it seems to me that Emacs has more of a tinkering culture than Vim does. Vim is, of course, fully customizable, and large numbers of people do customize it. But it seems like given a roomful of hackers, half of whom primarily use Vim and the other half of whom primarily use Emacs, you’re likely to get a much larger show of hands for the question “Who knows Emacs Lisp?” than you are for “Who knows vimscript?” And some of the Vim users might even be those who know Emacs Lisp! Although, now that you can compile Python interpreters into Vim, maybe we’ll start to see more interest.

Vim users just seem to be a more satisfied crowd - and satisfaction is arguably a programmer’s vice.

No, I didn’t just quote Larry Wall. Erm. You never saw that.

Anyway - I’m a case in point. I’ve been using Vim since 2003-4-ish, and I’ve constantly wanted Intellisense. And only today did I really do something about it. And, unsurprisingly where Vim is concerned, the feature is definitely there and easy to install, it’s just that most of us never get around to doing it because there isn’t really anything like a Vim community to speak of, and if your Vim-using friends know about it, they just never get around to telling you. Presumably because another characteristic of Vim users is humility - arguably another programmer’s vice.

My particular problem was wanting code completion for Qt, which I’ve had to buckle down and admit is my favorite programming environment. (I’ve tried everything else in the last couple of days - Objective-C, Java, Haskell, C#.) And of course, while you can get Emacs keys natively in Qt Creator, it comes with a Vim mode that is frustratingly limited. I just want Vim. And I want it to do my lookups. Is that so much to ask?

On Linux, no. There’s a great page to tell you how to do it here. Unfortunately, it doesn’t quite work for Mac. So here are the steps to getting code completion for Qt4 in MacVim.

(1) Download the OmniCppComplete plugin.

(2) In your ~/.vim directory (make this if you don’t have it - or substitute with your other distinguished plugins directory if you have another location you prefer), unzip it.

(3) Make a distinguished directory for holding tags files (I used ~/.vim/tags, as suggested on the linked Wiki article).

(4) Download the cpp standard library headers to this directory, and unpack them.

(5) You’ll need to install a new version of ctags, because the system version on the Mac doesn’t work in quite the way you need. Get it here. Doing the normal

./configure
make
sudo make install

thing will install it in /usr/local/bin - so that it doesn’t collide with the system ctags (in /usr/bin).

(6) Now use this ctags to create the files you need in your tags directory:

/usr/local/bin/ctags -R --c++-kinds=+p --fields=+iaS --extra=+q \
  --language-force=C++ cpp_src
mv tags cpp

(7) In your .vimrc file, add the lines:

set nocp
filetype plugin on
set tags+=~/.vim/tags/cpp

Now the next time you fire up vim with a file that has a .h or a .cpp extension, you should get code completion support for the C++ standard libraries. To test it, type something like:

std::v

- and if it doesn’t automatically offer to complete for you, hit Ctrl-x, Ctrl-o to force it. You should get a popup list that you can scroll in. And if you hit a couple more keys, it will narrow it down some.

(8) Another wrinkle in the instructions given on the page is that Qt4 doesn’t install to the same place on Mac as it does on Linux. So, to create the tags for Qt4, cd back to your tags directory, and run the special ctags again, but this time against the Qt headers:

/usr/local/bin/ctags -R --c++-kinds=+p --fields=+iaS --extra=+q \
  --language-force=C++ /Library/Frameworks/QtCore.framework/Headers
mv tags qt4

(9) And then, of course, make Vim aware of the new tags file by adding this to .vimrc:

set tags+=~/.vim/tags/qt4

(9.1) If you’ve had vim open this whole time, you’ll of course need to re-source your .vimrc file to get it to notice the new tags:

:source ~/.vimrc

And, voila! Code completion for Qt! You will never need another toy.

Oh - well, maybe one more. While looking this stuff up today, I also came across this plugin that no C coder should be without. It allows, among other things, rapid switching between header and source files. OK, and maybe two. On the linked wiki article, there’s also a nifty macro for automatically adding tags for your very own project directories. Here’s the obvious Mac mod of that suggestion:

map <C-F12> :!/usr/local/bin/ctags -R --c++-kinds=+p --fields=+iaS --extra=+q .<CR>
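If you would rather script steps (6) and (8) than retype them, something like the following Python sketch could work. It is only a sketch under assumptions: the paths are the ones used in this post, the function names are mine, and I have swapped the "mv tags ..." step for ctags's -f option, which writes the tags file under the given name directly. By default it only prints the commands; nothing is executed unless you call build_all(run=True).

```python
import subprocess
from pathlib import Path

CTAGS = "/usr/local/bin/ctags"  # the newly installed ctags from step (5)
TAGS_DIR = Path.home() / ".vim" / "tags"  # the tags directory from step (3)

# Tags file name -> header directory to scan, as in steps (6) and (8).
SOURCES = {
    "cpp": "cpp_src",
    "qt4": "/Library/Frameworks/QtCore.framework/Headers",
}


def ctags_command(name, source):
    """Build the ctags invocation from the post; -f names the output file."""
    return [
        CTAGS, "-R", "--c++-kinds=+p", "--fields=+iaS", "--extra=+q",
        "--language-force=C++", "-f", name, source,
    ]


def build_all(run=False):
    """Return the commands; only execute them inside TAGS_DIR when run=True."""
    commands = [ctags_command(name, source) for name, source in SOURCES.items()]
    if run:
        for command in commands:
            subprocess.run(command, cwd=TAGS_DIR, check=True)
    return commands


# Dry run: just show what would be executed.
for command in build_all():
    print(" ".join(command))
```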

Happy Vimming!

November 24, 2009

When do Bloggers Jump the Shark?

Filed under: programming — Joshua @ 8:50 pm

…in which the title is a reference to Jeff Atwood’s greatest hit.

OK, you all know the story. Or, in case you don’t, here’s a bit of expository dialogue for ya. Joel Spolsky is an internet celebrity who has been blogging since about 20 seconds before there were blogs - usually on business-of-software-type themes. He’s an important read for people new (and old) to the development world for the three best reasons in the world: (1) he can write, (2) he has insightful things to say and (3) he’s confident without being arrogant - which is to say, he’s confident enough that he can see when the emperor isn’t wearing any clothes, and says so, but not so confident that he tries to sell you on solutions that are out of the reach of the ordinary programmer.

Sometime in 2006 he angered a lot of Ruby programmers by suggesting that Ruby was not quite ready for prime-time - because its ecosystem of libraries, third-party apps and expert advice from the community is almost, but not quite, up to snuff for real-world deployment. To make it all that much worse, in a throwaway line at the end of that entry he let slip that his own flagship application was written in a proprietary, private, in-house programming language called Wasabi. And if that weren’t bad enough, that Wasabi is a dialect of Basic that compiles down to VBScript and/or PHP, and has Ruby on Rails-like active records. For someone claiming that Ruby itself was not quite ready for prime time, it seemed hugely hypocritical.

There was an internet outcry, and Jeff Atwood joined the chorus with this entry’s namesake. Jeff and Joel have since buried what minor differences they ever had and now collaborate on the excellent StackOverflow.com, water under the bridge, etc. etc.

Well, the incident is over, so it’s way too late for pertinent comment, but I don’t mind saying that I think Jeff was wrong about Joel jumping the shark - for a number of reasons. First, Joel was right - both about Ruby and Wasabi. Ruby is indeed not quite ready for prime time. That’s as true now as it was then - and though industrial applications do get written in Ruby, I’m with Joel in not quite being willing to trust any mission-critical code to it. There are just too many holes - and intimations of holes - there that I couldn’t be sure one of them wouldn’t pop up right as it was time to ship. And he was right about Wasabi too. Writing your own compiler and having your own in-house language is an excellent solution to the portability problem because it means you can target as many new platforms as you like with exactly the same codebase. You can even target platforms that have yet to be invented with the same codebase. People misunderstood Joel here - he wasn’t compiling Wasabi down to machine code - but rather down to existing tools that had the requisite ecosystems he’d talked about.

Second, Jeff was misusing the term jump the shark. It doesn’t mean “produced something silly” exactly. Or if it does, then it’s a special kind of “produced something silly.” It’s “produced something that’s unintentionally silly as a side effect of being more concerned with attracting attention than saying something of substance.” Applied to TV programs (its original domain), to say a show has “jumped the shark” is just to say that it’s inadvertently let slip that it’s no longer interesting or vital because it had to step outside of its normal bounds to get attention. You know, the whole “tonight, on a very special episode of…” thing. Which is why the categories on the original jumptheshark.com website (now defunct) included things like “they had sex” or “they got married” or “they went to college.” These are the kinds of plot devices that producers come up with rather than the show’s original writers, because where to the writers the show is an object of affection, to the producers it’s just a cash cow, and they’re trying to keep it going indefinitely, without any regard for its organic stopping point. “They went to college” is jumping the shark when it’s on a high school show - like Saved by the Bell. “They had sex” is jumping the shark when it happens on shows like the X-Files, where the unresolved sexual tension between the main characters was one of the drivers. Ditto “they got married” (cough Cheers cough). Jumping the shark isn’t just “running your course,” it’s refusing to admit that you’ve run your course, and failing to recognize that the boundaries you were never supposed to cross were there for a reason. When you’ve run out of chances for Mulder and Scully to just miss going to bed and you find them actually in bed, it’s time to cancel the show - because however logical that development is in character terms, it’s beyond the confines of what makes the show interesting.

It’s not clear that “jumping the shark” works for bloggers. It’s a television term, and I’m not sure how well it translates. I’m sure it doesn’t mean what Jeff was using it to mean - which was just “blogger x said something I think is silly.” But here’s what it might mean, if I can get enough people to agree with me: “blogger x produced a column that manufactures controversy for the purpose of drawing attention to himself long after he’s run out of important things to say.”

If that’s what it would mean to say that a blogger has “jumped the shark,” then I think Joel might plausibly (finally) have done it in September with the Duct Tape Programmer entry.

This one caused its share of stir on the internet too - though not as much as the other one, because people were already well aware of how inimical Joel can be to unit tests. The basic premise is that there is a tension in the programming community between people who get things up and working quickly and those who spend a lot of time planning before getting to work. People who get things done quickly - the “duct tape programmers” - tend to rely on old technologies that they know really well, so that even if the tool isn’t quite up to the job, they can make it work for now, at least long enough to get the application shipped; they’ll go back and fix the leaks later. The people who spend a lot of time planning before getting to work run the risk of never doing anything, because even if they DO understand all these hyperpowered tools they’re recommending (which they don’t necessarily have to, and frequently don’t, since they’re often recommending them just because they’re trendy buzzwords), they’re still wasting time learning how to use them. Or, more accurately, they’re learning how to use them at inappropriate times: best to learn the new tools after the app has shipped and apply them to making a better v2.0 than to delay releasing v1.0 for lack of training.

The entry seems almost calculated to piss off the Test Driven Development crowd. It’s the kind of thing that you write knowing you’ll get mail. And that’s OK if what you have to say is important. But once you start combing through Joel’s prose, you find very little that you can actually pin him down on. Case in point:

A 50%-good solution that people actually have solves more problems and survives longer than a 99% solution that nobody has because it’s in your lab where you’re endlessly polishing the damn thing. Shipping is a feature. A really important feature. Your product must have it.

Who’s he arguing against here? There simply isn’t anyone anywhere who will continue talking about code past the point when it’s supposed to have shipped - not in industry anyway. Or, if there are such people, they’re obvious candidates for firing. Done! No one wants those people on their team. It’s as if someone had written an interview advice book that said “the secret to a successful job interview is remembering to offer the job to the people who will succeed at it and passing on those who won’t.” Well, sure. Sure we want coders who write code that gets shipped and people use. What we need you for, Mr. Spolsky, is to tell us how to spot those people.

But the advice here is unhelpful too.

One principle duct tape programmers understand well is that any kind of coding technique that’s even slightly complicated is going to doom your project.

Is that true? I don’t know - because I don’t know what is meant by “complicated.” “Complicated” here seems to mean “any technology the developers don’t already know very well.” But that doesn’t square well with this:

Duct tape programmers tend to avoid C++, templates, multiple inheritance, multithreading, COM, CORBA, and a host of other technologies that are all totally reasonable, when you think long and hard about them, but are, honestly, just a little bit too hard for the human brain.

Do they avoid these things? I don’t think they do. I think people who are brilliant at C++ because they’ve been writing multithreaded code in it for years are going to do just fine writing multithreaded code in C++, and you want those people on your team. There can be such a thing as a C++ “duct tape programmer” if that guy is someone for whom C++ is the tool of choice, the language he writes everything in without blinking. It isn’t really the complexity of a technology that’s at issue, if I understand this correctly, but rather the coder’s familiarity with it. It’s all just a fancy way of saying “don’t get distracted by shiny things when it’s time to sit down and work.”

Or maybe it’s really “don’t spend more time thinking about what you’re going to do than you do actually doing it.” Or, honestly, it seems to be a little of both. But whichever one - or both - of these it really is, it’s the kind of advice that makes itself. It doesn’t help us much to point out that these issues are there since everyone is well aware of them - and not just in programming, but any field, really. There’re aphorisms for both sides that predate the computer by quite a bit: “a stitch in time saves nine” vs. “if it ain’t broke, don’t fix it.” Sometimes it’s good advice to plan ahead carefully, and sometimes it’s good advice to just get down to work and stick with what you know. How to know which is which? That would be a subject for an essay. But this one seems to say “don’t plan too much if planning too much would be a bad thing.” Uh-huh.

It’s manufactured controversy. It’s an essay that says something that seems controversial without actually saying much of anything at all. I don’t know whether there’s such a thing as “jumping the shark” for bloggers - but if there is, it might be something exactly like this. Someone who had useful and insightful things to say for most of his career finds himself stirring up non-controversies so that people can argue with him about something you can’t prove he said.

But then again, “jumping the shark” for bloggers might be more along the lines of what happened to the Arc Language Blog, which used to be about the Arc Programming Language but has since become a repository for - admittedly cool! - stories about the neat things you can do with an Arduino. Arduino’s great, but even if everything he shows is implemented in Arc, the blog has strayed from its purpose.

Well, maybe jumping the shark for bloggers is either of those things, and I’m doing what Joel’s doing by saying “hey, we all know it when we see it, so let’s call it out!”

So fair enough - I do indeed know what is meant by a “duct tape programmer” just from the clever name, even if Joel doesn’t do the best job outlining exactly what it is (or, probably more importantly, what the evil alternative is) for the rest of the screenspace. And I guess this is a term that will now get adopted for common use in the hacking blogosphere. And if that happens, then I guess it’s not fair to say that Joel has jumped the shark, since any blog that is creating useful jargon is ipso facto still relevant. And certainly Joel still says insightful and useful things on the Stack Overflow Podcast. In fact, the latest one I heard was the one that was a collection of clips from DevDays, and I definitely think Joel was on the right side of that brief argument with Scott Hanselman about Wasabi and type inferencing. The larger point that I took away from that, in fact, was that Joel was hinting that a lot of “new” technologies - such as compile-time type inferencing - aren’t really new at all, they’re just “recently popular.” A lot of what counts as progress these days goes back to new programmers not really knowing their field as well as they should - which was actually the subject of one of his better essays from back in the day. It’s like that Asimov short story about the guy who’s considered a genius because he rediscovers long division in a society where all math is done by machines. There really is a cost associated with innovation on the educational level, even if innovation makes you more productive overall.

And actually, I think that would have been a better way to express what a “duct tape programmer” is. It isn’t someone who eschews planning for practice, since both planning and practice are important. Rather, it’s someone who, by virtue of being really good at one thing, is in fact good at all things, because the time he saves not learning every new tool that comes along allows him to think about more general techniques. So it’s the difference between the person who spends his time learning general strategies for Magic: the Gathering rather than buying endless expansion packs hoping to get the ultimate trump card. Or the kid who builds what he needs out of basic Lego sets rather than assembling a different special kit for each highly specific thing he wants to build.

If THAT’s Joel’s advice, then I completely agree. But then I wouldn’t call it “duct tape programming,” since duct tape, in the end, isn’t a production-level tool, but just a stopgap for the time being. Python is that for me - the language I can get things done in quickly that I then go back and port to C++ later. The better advice might be to just get really good at C++, so that I can skip the initial Python step altogether. Which is to say that I think this “duct tape” thing is a red herring. In the absence of real skill and practice (and I am a Linguist first, so programming is a secondary thing - though I admit I find myself wishing more and more that programming was my fulltime job instead), duct tape is the way to go: it keeps your plumbing working until you can read up on water pressure. But duct tape is never better than the real fix, and if you’re a professional, you should start with the real fix. So if the advice Joel is giving is “don’t let shiny new technologies distract you from getting things done,” then some good advice about how to actually accomplish that is “pick one thing and get really good at that thing - learn it exhaustively, in fact, and practice at it daily - and then you won’t be distracted by the shiny new technologies in the first place because you won’t need them so much.” But that’s the kind of advice Donald Knuth would give, and not Jamie Zawinski. Of course, Jamie Zawinski was the first chapter in the book that inspired Joel to write this essay, and Knuth, as luck would have it, is the last. Maybe Joel just hasn’t finished the book yet.

November 17, 2009

DLL Hell Now in Mac!

Filed under: programming, technology — Joshua @ 5:24 pm

The ongoing saga of trying out .NET on Mac OS X via Mono. Remember that libiconv thing, where installing Mono briefly and mysteriously broke Mutt? Well, it turns out it breaks MonoDevelop too! More proof that the Mono Development Team isn’t exactly Mac-friendly. No - that’s NOT an accusation of sneakiness! The Mono Team does great work, and I’m grateful that Mono exists at all. That it runs as well as it does is a testament to their skill and dedication. No, I’m just taking this as more evidence that Mono doesn’t get tested as thoroughly on the Mac platform as it does on Linux and Windows - owing, I speculate, to the preferences of the development team. Fair enough.

In any case, the issue is that you (meaning some subset of us Mac users MAY) have to update libiconv to get Mono to work in the first place. Interestingly, though, the official beta release of MonoDevelop seems to depend on the same outdated version of libiconv that crashed Mono for me in the first place (at least, that’s what the error message I got implied…). Then, in a final and somewhat bizarre twist, it turns out that just grabbing MonoDevelop from a link on Miguel de Icaza’s blog directly (download link here: looks like it’s an early one that M J Hutchinson put up and never took back down) resolves the whole issue. Which is strange, because it makes it look like MonoDevelop for the Mac first used the new improved libiconv, then switched back to the old one inexplicably. Maybe in response to bug reports? But then, why does Mono itself continue to rely on the new version?

Clearly I’m missing something here, but I’m not complaining. I’m happy to have both Mono and MonoDevelop working and am looking forward to trying out the ASP.NET MVC here on the Mac. Thanks again to the Mono team for bringing .NET to the Mac platform - it’s much appreciated!

November 13, 2009

First Impression of a C# Latecomer

Filed under: programming — Joshua @ 5:30 am

As mentioned, I’ve been thinking seriously about porting Boltun (I really should add some more content to that link soon, eh?) to .NET. As a consequence, I’ve been delving into C# recently - reading both Andrew Troelsen’s excellent book Pro C# with .NET 3.0 and the Annotated Standard. So far, I’m really liking the language. Here are some initial thoughts.

Well, actually only one initial thought, and that’s that C# is to C++ what Objective-C was to C, and not what Java was to C++ or C++ was to C.

Alright, my analogy is a little bit contrived since C# really is Java-like in being explicitly targeted to a platform; it doesn’t make the same claim to universality that C++ did. But all things considered, it really does feel like an incremental improvement to C++ in a way that Java just wasn’t.

We all know the story: back in the dark days of malloc Bjarne Stroustrup endeavored to bring an object-oriented layer to what was then the “high-level” language of C. C was and is the logical favorite of systems programmers because it’s just enough abstraction over Assembler that you don’t have to use too much machine-level boilerplate code, but it’s not so far abstracted that you can’t interact with the machine directly. It mimics machine code really well, and as such is, as Kent Dybvig put it in class once, “kind of a beautiful thing.” Dybvig said that in the context of contrasting it with C++, which he and most others think of as an inconsistent, haphazard mess. And so it is. What started out (literally) as C with Classes ended up as a different language entirely, though it does support a mostly-comprehensive subset of C. But in a real sense, C++ was a kind of lie: it wasn’t an incremental improvement to C as the name implied; it was in fact its own thing.

So it’s not too radical to say Objective-C has the better claim to being “C++.” For one thing, it really is C with object-oriented capabilities bolted on. For another thing, the object-oriented capabilities look and act like Smalltalk, the prototypical OOP language at the time, so even the object layer wasn’t that much of a leap. But Stroustrup called his language “C++” first, and so there you go.

Something similar happened with Java with regard to C++ in the mid-90s. Java was a language born largely out of frustration with C++. It wasn’t just a convenience tool for C++ programmers tired of writing their own string classes and garbage collectors, it was a calculated attempt to replace C++, so that people wouldn’t have to make the case to their employers for migrating away from C-world. The argument was that memory capacity and processor speed improve so quickly that there is no longer a need to optimize at these levels; technology improvements deliver the optimizations faster and less expensively than programmers can. For that reason, you should optimize your programmer resources by letting them work in an environment that abstracts away from the machine entirely, so that code is easier to update and maintain. If you find you need a performance boost, just upgrade your hardware. And they meant that seriously: the mantra was “write once, run anywhere,” and Java really was the first “language” that solved the cross-platform problem. By compiling to an intermediate language of instructions to a virtual machine, it ran on any computer running the virtual machine. It was neither interpreted nor compiled - actually it was a little bit of both.

I put “language” in scare quotes because I happen to agree with Paul Graham that Java isn’t a language that runs on a platform, Java is a platform. Paul Graham is a well-known Java-hater, so of course in context that was a snarky comment, but it needn’t be (a non-prejudiced explanation of the same concept here), and in any case it’s true. Java doesn’t compile down to anything that runs on your machine, it compiles down to something that runs on something that runs on your machine. Whether or not this was a good idea (probably it’s safe to say, with hindsight, that it was, though Paul Graham and others will have some not-unintelligent objections), Java was also a kind of lie. It was advertised largely as a cleaning up of C++, but in fact it was something quite different - just the way that C++ had been advertised as an incremental improvement to C when it was, in fact, a completely different language that just happened to look enough like C to lure that crowd into a false sense of security.

.NET, of course, is the same idea, but not wedded to any particular language. In that sense, it’s a kind of Java++: it specifies a virtual machine code that languages should target, and languages that target that bytecode can run on it. It’s “write once, run anywhere” for a fleet of languages rather than one. And the flagship of this fleet is C#.

I guess it would be fair to say that C# is Java whose time has actually come. When Java came out, it was deceptive to tell the C++ programmers that it was meant to make their lives simpler because it was in fact a whole new programming paradigm. It freed them from memory management, yes, and it came with a standard library that was actually useful, yes, but it also pulled out from under them all the things that they liked about C++: its expressiveness, its ability to manage machine-level things directly. Any surface syntactic resemblance to C++ was an illusion.

C#, by contrast, can make an actual claim to being an incremental improvement to C++. It won’t replace it exactly. C++ still gives a closer coupling to the machine in that C# doesn’t allow stack-allocated class instances. All class instances in C# are reference types, and so are allocated, at run-time, on the heap. But this is really just C# learning the obvious lesson from C++. The truth is that most C++ programmers who are worried about efficiency steer clear of objects anyway, preferring to use structs. structs are retained in C#, and they’re value types, typically allocated on the stack when used as locals, so they’re just as efficient (in theory) as they were in C++. (Reality, of course, is that you’re still running in the .NET environment, so even the stack isn’t exactly machine code. There’s an inevitable performance hit, though unlikely to be noticeable in 99.9% of cases.) C# also gives the same reference semantics that C++ does - you can decorate argument parameters to a function with ref (& in C++), which gives the callee full access to the variable in the calling environment. C# retains C++’s obsession with annotating everything. They even add their own pointless annotations - for example there’s an out decorator for parameters passed by reference that are specifically intended to collect return values. Why not just use ref, you ask? Why not indeed. out simply lets the caller pass a variable that hasn’t been initialized yet, and makes the compiler check that the callee assigns it before returning. No careful programmer actually needs this, but it’s very much in the spirit of C++’s “let the compiler do your debugging for you” philosophy.

It’s probably not really off the mark to say that C++ was two languages in one. There was the mouldy old systems language for the REAL men, and then there was this completely different object-oriented language for the boys, and the nice thing about C++ was that the systems programmers could do object-oriented programming on those rare occasions they felt the need, and the object-oriented people could write efficient code on those rare occasions they needed to. C# is also two languages in one. On the one hand, it retains the C++ dual systems/OOP language, but with much more emphasis on the abstracted half. On the other hand, it’s evolving toward allowing you to do static and dynamic programming all at the same time. The upcoming version will even include a dynamic keyword that allows you to mix late-bound code in with compile-time-bound code. A very basic example of this here: a method that returns a string if its input is an odd number, an int if even - and it can be called from within a static context. It’s sort of the obvious step, and allows closer interoperability with the increasingly-popular dynamic languages on .NET. Point being it makes C# a kind of hybrid beast along the current programming paradigm faultline (dynamic vs. static) in the way that C++ was a hybrid beast along the faultline of its day.
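For comparison, here is roughly what such a method looks like in a language where everything is late-bound. This is a Python sketch of the idea only, not the linked C# example; the name describe and the particular return values are made up for illustration.

```python
def describe(n):
    """Return a str for odd input and an int for even input."""
    if n % 2:
        # Odd: hand back a string.
        return "odd: %d" % n
    # Even: hand back an integer (here, half the input).
    return n // 2


# The caller does not know the return type until run time.
results = [describe(n) for n in (3, 4)]
```

In Python nobody blinks at this. The point of the dynamic keyword, as I understand it, is that a statically typed caller can opt into the same late binding for one value without giving up compile-time checks everywhere else.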

C# is the new C++. It has been for years, of course, and I’m just majorly late to the table, but I’m really enjoying learning it and I think I’ll end up really liking it. Of course, the day after I start seriously diving into C#, I read that Google has released a language called Go that’s described on TechCrunch as “Python Meets C++.” Just the kind of largely-similar thing I needed to distract me from the task at hand.

November 11, 2009

Mutt Redux

Filed under: programming, technology — Joshua @ 2:18 pm

Well, wouldn’t you know it. My two big log pages for “how-to-install-Mac-unfriendly-shit-on-Leopard” turn out to be incompatible! I KNEW that installing Mono would break SOMETHING, and that something turns out to be Mutt.

The problem is that the library that Mutt calls to do its unicode conversions is libiconv.2.dylib, which was of course one of those things that had to be upgraded to get Mono to work. Identifying libiconv as the problem wasn’t the hard part; what had me guessing was that trying to reinstall Mutt via MacPorts mysteriously didn’t work.

It’s not a black mark against MacPorts. It turns out that I have been using a devel version of Mutt, which means I should be prepared for things to break while they iron out the kinks. So recompiling it from source with the new version solves everything since the new version doesn’t choke on libiconv.

OK, that’s not quite true. The Mutt team seems to have removed a bunch of features, so I had to comment out a lot in my config file to get it to load without a spew of pointless error messages. And it does this weird thing now where it doesn’t play all the headers back to me as I’m composing an email. Somehow the signals haven’t quite been reconnected in some parts. I don’t know if that’s my configuration or some stuff that’s still “under construction.” I’m guessing it’s my configuration somehow, but I’m sort of unwilling to start from scratch on that with an official config file. Maybe this weekend.

Anyway, it’s good to have the world’s best email client back up and running. I’m pleased to have Mono, but life is seriously hard for me without Mutt!

November 10, 2009

ASP.NET on Mac OS X

Filed under: programming — Joshua @ 6:10 pm

OK, so I mentioned that I really should learn PHP. So last night and this morning I started doing that again, and the same thing happened that always happens. Namely: PHP comes across as really convenient, lots of fun in a kid-on-Christmas kind of way, but there’s just something really dinky about it. It gives off the same lame vibe that Perl does - and for the same reason: too many subtly incompatible magic spells that definitely accomplish their limited little goals, but don’t really work as building blocks. I guess I’m spoiled by Python. In that world, there’s one right way to do things, and even if you don’t know what it is off the top of your head, your best guess is really darn close. Learning PHP is just a bunch of annoying memorizing.

Basically, I wanted to learn PHP ahead of open-sourcing Boltun so that I could port Boltun to PHP for maximum platform-independence. The one thing that PHP very definitely has going for it is that it’s ubiquitous … and interpreted. If you can get your application written in pure PHP, you really can provide a one-click install button for people that works on 99.99% of hosting services out of the box. That’s Very Nice. Just ask Wordpress. And Drupal. ET. CETERA.

Even so, it’s not worth turning into one of THEM, so I guess I’ll put off getting good at PHP a little longer.

That said, the current state of things in Boltun is that it’s written in a Pylons (Python) backend with a Javascript (JQuery, to be precise) frontend using Mako for a templating language. By the time you’re at the end of that sentence, you already have an idea what my complaint is. Basically, I’m tired of maintaining what amounts to two separate codebases. There’s Python, and there’s Javascript - and actually I guess you could count Mako as about a third of a codebase because although it’s Python-based, the inheritance structure can be a little confusing.

To make a long story short, for the first time in my life I find myself envying the Microsoft coders with their single-stack C#/ASP web framework. Partly because I’ve been wanting to learn C# for a long time anyway, and partly just because that’s the Way Things Ought to Be: most of the functionality in the backend with a templating language that really does interact with your underlying codebase.

I DO have a Vista machine (soon to be Windows 7), but unfortunately for my ASP dreams, Boltun has to run on a Mac server.

No matter: there’s Mono - an open-source implementation of .NET that includes ASP.

I got Mono up and running yesterday (it works out of the box on Leopard - download it here). But as you MIGHT be able to tell by listening to the Mono-honcho on Joel and Jeff, Mono isn’t exactly Mac-friendly. In particular, getting the ASP.NET parts of it to work on Leopard is a PAIN IN THE ASS. But I did it, so if you’re currently pulling out your hair as much as I did all day today, here’s the solution in brief with hat-tips and references noted. (A heartfelt thanks to everyone at those links who posted their solutions, of course. It is MUCH appreciated!)

(1) I did lie a little when I said that Mono works out of the box. It actually almost works out of the box. There’s one little niggling irritant, which is that for some reason my system couldn’t find the pkg-config executable that Mono installed in the right place. Leopard looks for it in /usr/bin, Mono installs it in /Library/Frameworks/Mono.framework/Commands. So I threw up a symlink:

sudo ln -s /Library/Frameworks/Mono.framework/Commands/pkg-config /usr/bin/pkg-config

There may be some others you have to do that way, I can’t really remember. Basically, run the basic tests the Mono guys give you, and it’ll be pretty obvious from the error messages which programs are missing from /usr/bin.
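If you would rather see the candidates up front than wait for error messages, something like the following lists every name in Mono's Commands directory that has no counterpart in /usr/bin. It is a minimal sketch: the function name is mine, and the two paths are simply the ones from this post, so adjust them if your install differs.

```python
import os


def missing_commands(source_dir, target_dir):
    """List names present in source_dir that have no entry in target_dir."""
    if not os.path.isdir(source_dir):
        return []
    have = set(os.listdir(target_dir)) if os.path.isdir(target_dir) else set()
    return sorted(name for name in os.listdir(source_dir) if name not in have)


if __name__ == "__main__":
    # Paths as described in this post; they are assumptions about your system.
    for name in missing_commands(
            "/Library/Frameworks/Mono.framework/Commands", "/usr/bin"):
        print(name)
```

Not everything it prints necessarily needs a symlink. It just narrows down where to look when a test complains that a program can't be found.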

Now on to ASP.NET.

(2) You’ll probably need to clean up MacPorts. Fortunately, I’ve been jittery about MacPorts owing to bad experiences in the ghastly DarwinPorts days, so I could start over without any fear of breaking anything. One of the problems you’ll encounter is that you’ll need something called glib2, and it’s VERY IMPORTANT that it be compiled for 64-bit architecture. I couldn’t find a way to install it without MacPorts, which is why I bothered with MacPorts at all. If you don’t have MacPorts, grab the install package and do the normal thing. If I remember correctly, it just works, which is a HUGE improvement over the Dark Days of DarwinPorts right there. Unfortunately, what “works out of the box” isn’t really optimized for Leopard (yes, even now that we’re up to Snow Leopard), so you need to make a tweak or two. This page has got the goods. In a nutshell, you need to make sure that everything that you ever get from MacPorts is both universal and 64-bit, and you do that by editing /opt/local/etc/macports/macports.conf and /opt/local/etc/macports/variants.conf. In the first, you add/change this line:

universal_archs i386 x86_64

In the second add:

+universal

It is obvious from the comments in the files where you do it. Weird sidenote - vim would only open them in readonly mode, so you have to do a :w! rather than a :w. I have no idea about The Other Editor.
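Since a forced write into a readonly file is easy to get wrong, it's worth confirming that both edits actually landed. A minimal Python sketch follows. The helper names are mine, and it assumes that lines starting with # are comments in both files.

```python
def active_lines(conf_text):
    """Return non-blank, non-comment lines with surrounding whitespace removed."""
    lines = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def has_setting(conf_text, expected):
    """True if expected appears as an active line, ignoring runs of whitespace."""
    want = expected.split()
    return any(line.split() == want for line in active_lines(conf_text))


if __name__ == "__main__":
    # File locations and settings are the ones given in this post.
    checks = [
        ("/opt/local/etc/macports/macports.conf", "universal_archs i386 x86_64"),
        ("/opt/local/etc/macports/variants.conf", "+universal"),
    ]
    for path, expected in checks:
        try:
            with open(path) as handle:
                ok = has_setting(handle.read(), expected)
        except IOError:
            ok = False  # file missing or unreadable counts as "not set"
        print("%s: %s" % (path, "ok" if ok else "missing"))
```

A commented-out copy of the line doesn't count, which is the mistake this is meant to catch.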

Now, in my case I took the drastic step of uninstalling everything that had previously been installed by MacPorts, just to make sure there were no nuanced dependency issues. I can do that because of my aforementioned MacPorts phobia. If you’re hipper to MacPorts than I am, you’ll want to be more careful so as not to inadvertently break anything. The cleanwipe command is:

sudo port -f uninstall installed

Then you install glib2 and libtool:

sudo port install glib2 libtool

(NOTE: the MacPorts install won’t have made your system aware of the port command, so you’ll either have to update your path, symlink it to /usr/bin, or do what I did and just execute the command from within the /opt/local/bin, where it’s stored.)

(3) Now the hard part - getting and installing mod_mono for Apache. You can probably get it from MacPorts, but as mentioned I do as little through MacPorts as possible, so I got the sourcecode directly from Novell (here: at the time of writing the latest version is 2.4.2), put it on my desktop and extracted it. Then, of course, you cd into the directory that contains it and there’s a script there called configure that sets it up for you - all the usual stuff. Here are the flags that I used (with many thanks to Geoff Norton via jgeerdes - though note that my suggested flags differ subtly from theirs since I installed Mono via package rather than via MacPorts).

./configure CFLAGS="-m64" --prefix=/usr --with-apxs=/usr/sbin/apxs \
    --with-apr-config=/usr/bin/apr-1-config \
    --with-apu-config=/usr/bin/apu-1-config \
    --enable-debug

Probably the --prefix bit isn’t necessary; I was being cautious. If all goes well, this will get you set up. THEN…

(4) The usual make and then sudo make install. Now here comes another quirk. For some reason, the install script creates the files you need and then deletes some of them. What worked for me is replacing them by hand. They’re in [MOD_MONO_DIRECTORY]/src/.libs (in my case /Users/joshua/Desktop/mono-2.4.2/src/.libs). Copy mod_mono.so and mod_mono.0.0.0.so to /usr/libexec/apache2/.
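The copy-by-hand step can be scripted if you end up rebuilding more than once. This is a sketch under the assumptions stated in this post: the two file names and the destination are the ones I used, and the function name is made up.

```python
import os
import shutil

# The two files the install step removed, per this post.
MODULES = ("mod_mono.so", "mod_mono.0.0.0.so")


def copy_modules(libs_dir, dest_dir, names=MODULES):
    """Copy each named file from libs_dir to dest_dir; return names not found."""
    missing = []
    for name in names:
        source = os.path.join(libs_dir, name)
        if os.path.isfile(source):
            # copy2 keeps timestamps and permission bits where it can.
            shutil.copy2(source, os.path.join(dest_dir, name))
        else:
            missing.append(name)
    return missing
```

Call it with your own src/.libs directory and /usr/libexec/apache2 as the destination. Writing there needs root, so run it under sudo. An empty list back means both files were copied.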

And that worked for me. Basically, I spent most of my time today first finding where these files were, and then banging my head against the wall that they didn’t work, which turned out to be because my glib2 wasn’t 64-bit, hence all the MacPorts stuff.

After that, following the instructions in the INSTALL file included with mod_mono to run the packaged tests just works. (Of course, the package tests are in a different place on Mac, so instead of using their paths, you should use /Library/Frameworks/Mono.framework/Libraries/xsp/test!)

(4.5) In theory, you’re ready to go. And in reality, you might even be. In my case, there was one remaining minor annoyance. My version of libiconv.2.dylib, on which mod_mono apparently depends (run otool -L /usr/libexec/apache2/mod_mono.so to find all the dependencies), was out of date. You wouldn’t know it from the output of otool, though, because the correct version of libiconv.2.dylib IS, in fact, installed on my system; it just sits later in the library search order than the incorrect version. A subtle dependency issue that some might miss, so I thought it worth pointing out. This kind of thing comes up all the time with MacPorts (a legitimate, non-prejudiced reason I try to avoid it - not MacPorts’ fault, just inevitable when you have more than one locus for library storage on a system), and sure enough the correct version, the one the system wasn’t picking up, was in /opt/local/lib. By now you know the solution:

sudo mv /usr/lib/libiconv.2.dylib /usr/lib/libiconv.2.dylib.correct
sudo ln -s /opt/local/lib/libiconv.2.dylib /usr/lib/libiconv.2.dylib

In other words, move the old one out of the way (but keep it in the same place named something sensible in case you ever need it back!), and symlink to the “correct” one on your system courtesy of MacPorts. (For newer Macs, this shouldn’t be an issue - mine just happens to be almost 2 years old.)
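To spot this kind of duplicate before it bites, you can pull the library paths out of the otool -L text and then look for other copies of the same file name elsewhere. A rough Python sketch follows. The function names are mine, and the parsing assumes the usual layout of that output: a first line naming the inspected file, then one indented path per line followed by version information in parentheses.

```python
import os


def parse_otool(output):
    """Extract library paths from the text printed by otool -L."""
    paths = []
    for line in output.splitlines()[1:]:  # first line names the inspected file
        line = line.strip()
        if line:
            # Keep only the path, dropping the trailing "(compatibility ...)".
            paths.append(line.split(" (", 1)[0])
    return paths


def candidates(lib_name, search_dirs):
    """Return every directory in search_dirs that contains lib_name, in order."""
    return [d for d in search_dirs
            if os.path.exists(os.path.join(d, lib_name))]
```

If candidates("libiconv.2.dylib", ["/usr/lib", "/opt/local/lib"]) comes back with more than one directory, you have two copies, and it's worth checking which one mod_mono is actually linked against.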

(5) OH ONE OTHER THING. In a bit of poetic justice perhaps, Mono seems to break PHP on my machine. No kidding! I had to comment out this line:

LoadModule php5_module        libexec/apache2/libphp5.so

in /private/etc/apache2/httpd.conf to get it to work. Which is fine by me since the whole point of this exercise in the first place was to put off working in PHP just a little bit longer…
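If you'd rather not hand-edit httpd.conf, the same change can be made mechanically. This is a minimal sketch with a made-up function name; it works on the file's text, so you stay in control of reading and writing the file (and of keeping a backup).

```python
def comment_out(conf_text, needle):
    """Prefix '#' to every active line containing needle; leave the rest alone."""
    result = []
    for line in conf_text.splitlines(True):  # True keeps the line endings
        if needle in line and not line.lstrip().startswith("#"):
            line = "#" + line
        result.append(line)
    return "".join(result)
```

Passing "php5_module" as the needle disables the line above. Running it a second time changes nothing, because lines that are already commented are skipped.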

Looking forward to working with you Mono!

November 8, 2009

Closure Notes

Filed under: programming — Joshua @ 4:48 pm

Not news anymore: Google released its Closure Library for Javascript - yes, the magic tools that the Google Wizards Themselves use for their very own company applications.

Somewhat newsy: I’m intrigued and downloaded it. I’m thinking about rewriting the Boltun Project’s Javascript in it. OK, excuse the crappy webpage. OK, excuse the sparse webpage. Or, rather, at the time of writing perform these excusing operations because of course there are all the best intentions in the world of sprucing it up. Boltun is the ICALL project for Russian (“Boltun” is “chatterbox”) I have spent the last three semesters building. It’s not ready for primetime, but I’m tasked with making it so by the end of the semester, when the grant runs out and I go back to teaching. Owing to wide-ranging differences of opinion and vision with the supervisor on this project, I’ll be starting my own cloned and language-independent version in January or thereabouts, so stay tuned.

In any case, Boltun’s Javascript is all JQuery, which is a very nice library in terms of functionality. Pretty incredible, actually, and I would recommend it but that things are sort of “sticky” for strange reasons. There are a lot of - not bugs, exactly, since they function correctly … but a lot of things that don’t function as smoothly as I would like. Google’s library may hold the answers, and in any case it gives me an excuse to completely rewrite the codebase - which of course is something that you should never ever ever do - but I think in this case it’s OK since (a) the Javascript is inconsistent (there are two “types” of lesson, which I wrote at different times and have never gotten around to streamlining), (b) the project is not yet that big, (c) there are no users depending on it yet, (d) I’ve never gotten clear instructions about what the project is even supposed to do, really (hence the desire to do this on my own after the grant runs out), (e) I’m going to open source it in a couple of months which means OTHER PEOPLE WILL SEE IT and most importantly (f) I’m hugely underpaid to do this anyway (it’s an academic grant - I get $1000 a month for what is properly a three-man, $4000/month/each job; they don’t even pay my tuition), and as I recently learned on EconTalk, when competition is restricted at one margin, it will invariably happen at another. From which I take as a corollary that when you underpay someone for something, they find a way to make up the difference themselves, as I’m currently doing by acquiring as wide a range of marketable web programming skills as possible on the clock!

One of which should probably be learning PHP, actually, so I’m thinking of rewriting Boltun in PHP (the backend is currently in Python/Pylons, about which I actually do have a few complaints to air in an upcoming post).

But another of which could easily be Closure. Since it’s Google’s baby, there’s good reason to expect it to rapidly overtake JQuery.

Which brings me to what I wanted to talk about. Closure comes with a neat testing tool that you can open up in a browser. It will attempt to load a dizzying array of pages (apparently the library is HUGE - not that you would really expect any less) all of which test the functionality of some subsection of the libraries, and it then reports on all the crashes it gets.

INTERESTINGLY: there are 90 of these for Opera 10.1, and only 7 for Firefox 3.5.5. I’m guessing there are exactly ZERO for Google Chrome, but Chrome isn’t exactly Mac-friendly yet, so… Oh, and speaking of Macs, wasn’t it Safari that supposedly has the spiffiest Javascript interpreter EVUH? Well, it turns out to be kinda true if Closure is any indication. It also gets only 7 errors, but it completes the suite in less than half the time it takes Firefox. 96s for Firefox, 44s for Safari 4.0.3. Opera finished in 76s for what it’s worth, but then, it failed 90 tests so I’m not sure it gets a gold star for speed…

I didn’t do a close check, but the Safari and Firefox errors seem to be the same ones, so it looks like Closure was written for Chrome, which has some nonstandard implementation-y things in its Javascript interpreter here and there … would seem to be the correct interpretation. And … Safari’s rumored-to-be-blazingly-fast Javascript compiler really is blazingly fast. Go WebKit!

Actually, I’ve never been that much of an Apple loyalist - always kept one foot in the Linux world. So I probably won’t make the switch to Safari. But since Boltun targets Firefox, the fact that Closure seems to run pretty well on it is encouraging. I’ll let you know how the rewrite goes.