Archive for October, 2006

Revision, revision, revision

On Wednesday, we finally finished revisions to the TOPLAS version of AspectML: A Polymorphic Aspect-oriented Functional Programming Language. While somewhat tedious, I think the revisions made the paper more approachable, not to mention fixing a number of typos, omissions, and minor bugs. Assuming the reviewers don't request another round of revisions, this will hopefully be about the last time I need to touch this paper.

Also, in conjunction with finishing these revisions, we made an α-quality release of the AspectML interpreter that corresponds to the language as described in the paper. It doesn't come with any real documentation, but if you're interested in trying it and can't figure it out, let me know. You should only need to have a recent version of SML/NJ handy. However, I highly recommend using rlwrap if you aren't running SML from emacs or some such.


Bug squashed at last. Kinda.

One thing about WordPress that has been annoying me for quite some time is that it mangles UTF-8 in the subject of e-mail messages it sends out. Consequently, because my site uses the ∃ character, the subject lines of the messages I get about it are garbage. Apparently this has been a bug in WordPress for quite some time, but I did find that there is a plugin that will fix the problem. Of course, I'll need to wait for someone to comment or what not before I'll know whether it truly works.

Also somewhat strange: the other day I received a spam comment after having installed WordPress Hash Cash. This leads me to believe that it is either broken or there are actually people out there spamming from an actual web browser. I suppose in theory it wouldn't be out of the question to robotize a web browser. The arms race continues. I still haven't come up with an effective way of filtering image-based spam, though I sometimes suspect that some of it gets through because one of my spam filters fails to run when it cannot acquire a lock.


Interesting problem

Is it possible to compute a regular expression over byte sequences to match against specific ranges of Unicode glyphs encoded using UTF-8? This comes up when you're stuck using lexer generators that only understand byte sequences.

I have to assume that it must be possible, because at worst you can just union together the UTF-8 byte sequences for every glyph in the range. However, given how large some ranges can be, this would probably be very inefficient. The better question is whether or not there is a smarter way of constructing your regular expression or DFA. Right now, though, I'm not inclined to think about it too deeply, as I know that if it really comes down to doing this, I can just do it the blindingly stupid way and write some DFA minimization code.
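For what it's worth, here is a minimal sketch of one smarter construction, written in Python purely for illustration. The idea is to keep splitting a code point range until every byte position of the encoding can vary independently, so each piece becomes a plain sequence of byte classes. The names utf8_sequences, _split, and to_pattern are made up for this sketch, and I am assuming that surrogates (U+D800 to U+DFFF) should simply be skipped, since they have no valid UTF-8 encoding.

```python
# Largest code point that fits in a 1-, 2-, and 3-byte UTF-8 encoding.
MAX_BY_LENGTH = (0x7F, 0x7FF, 0xFFFF)


def utf8_sequences(lo, hi):
    """Cover the code points lo..hi (inclusive) with byte-range sequences.

    Each sequence is a list of (min_byte, max_byte) pairs, one pair per
    encoded byte. A byte string matches a sequence when it has the same
    length and each byte falls inside the corresponding pair.
    """
    out = []
    _split(lo, hi, out)
    return out


def _split(lo, hi, out):
    if lo > hi:
        return
    # Assumption: surrogates are not encodable, so step around them.
    if lo <= 0xDFFF and hi >= 0xD800:
        _split(lo, 0xD7FF, out)
        _split(0xE000, hi, out)
        return
    # Split wherever the encoded length changes.
    for limit in MAX_BY_LENGTH:
        if lo <= limit < hi:
            _split(lo, limit, out)
            _split(limit + 1, hi, out)
            return
    if hi <= 0x7F:
        out.append([(lo, hi)])
        return
    # Each continuation byte carries 6 bits. Keep splitting until, at every
    # level, lo and hi either share their high bits or lo ends in all zeros
    # and hi ends in all ones.
    for i in (1, 2, 3):
        mask = (1 << (6 * i)) - 1
        if (lo & ~mask) != (hi & ~mask):
            if lo & mask != 0:
                _split(lo, lo | mask, out)
                _split((lo | mask) + 1, hi, out)
                return
            if hi & mask != mask:
                _split(lo, (hi & ~mask) - 1, out)
                _split(hi & ~mask, hi, out)
                return
    # Now the bytes vary independently, so pair up the two encodings.
    first = chr(lo).encode("utf-8")
    last = chr(hi).encode("utf-8")
    out.append(list(zip(first, last)))


def to_pattern(sequences):
    """Render the sequences as a byte-oriented regular expression string."""
    def cls(a, b):
        return f"\\x{a:02X}" if a == b else f"[\\x{a:02X}-\\x{b:02X}]"
    return "|".join("".join(cls(a, b) for a, b in seq) for seq in sequences)
```

For example, to_pattern(utf8_sequences(0x0370, 0x03FF)) covers the Greek block with just two alternatives, \xCD[\xB0-\xBF]|[\xCE-\xCF][\x80-\xBF], rather than a union of 144 individual encodings. The same output could be fed to a byte-oriented lexer generator, adjusting the escape syntax to whatever that tool expects.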


Seems like there should be a story behind this one.

Danger Helvetica

Though maybe this is just a "Photoshop" job.



Over the past few weeks I've been starting to feel that perhaps it was time to reconsider my choice of editors again. There are many things to like about vim, but it seems like an evolutionary dead-end. The basics are simple, but the overall design doesn't seem very unified. I did a little programming in vim's "internal" language (although you have the option to link vim against a number of other languages), and it seemed quite clumsy, not to mention quite imperative. Plus, many projects I work on, like AspectML/InforML, are written in languages that do not currently have adequate support in vim. Alan may be a little disappointed with me, but I gather he is using TextMate for most things these days anyway.

So I have been considering my options. I did look at TextMate, as Alan had mentioned it, but it didn't seem too compelling to me. To me it seems somewhat like a reimplementation of emacs, without the Lisp. Not to say that wouldn't be a good thing, but it was also entirely Mac OS X specific. I'm a very cross-platform fellow, so I need something that works essentially everywhere.

Next I took a look at jEdit. It seemed fairly nice, and had some interesting plugins. Unfortunately, it was just far too slow on my laptop for regular use. So I looked around a bit more, but I didn't find anything that seemed promising. Then on Friday, during a discussion with some of the PLClub folks, the question of antialiased fonts in emacs came up. The lack of antialiased text support in (X)emacs was one reason I decided to switch to vim; the other was vim's superior support for UTF-8.

Currently, the "release" version of emacs, 21.x, still lacks both. But the "emacs-unicode-2" branch in CVS does have them. However, I gather that this branch is intended to become emacs 23.x, which implies that either they plan on skipping version 22.x altogether, or at the current release rate it will be more than a few years before it is released. Still, I checked it out of CVS and was able to build it without too much trouble. Despite the stern warning in the startup blurb about it being "alpha quality", so far it has been pretty stable for my needs.

So I'm back to emacs again after a number of years. I still think there is a need for a new cross-platform editor. I did find Eclipse quite promising, but I think it currently suffers from being perceived as just for editing Java, and from a seemingly steep learning curve when it comes to developing new editing modes (at least I couldn't find any tutorials or documentation).
