## More notes on Featherweight Scala

As I mentioned the other day, Featherweight Scala has a typing rule that allows assigning a path (x.l1...ln) a singleton type:

Γ ⊦ p : T
-------------
Γ ⊦ p : p.type

(I should note that for some stupid reason WordPress seems to prevent MathML from working, making the rather pleasant <mfrac> tag useless). This is essentially the rule that everyone else uses, but in all the other literature I've looked at (other than νObj) the singleton is labeled with T (something like ST(p)) or T must have a specific type (int, base, etc.).

As I pointed out, Featherweight Scala's version of the rule makes it difficult to write meaningful inversion lemmas. Secondly, the naïve subject reduction theorem that could be obtained would be something like:

If Γ ⊦ t : T and t → t' then exists some T' such that Γ ⊦ t' : T'.

Which feels rather unsatisfactory. Given the subsumption rule for singletons

Γ ⊦ p : T
---------------
Γ ⊦ p.type <: T

It is plausible that a variant of the theorem where T is a subtype of T' would hold, but that would run counter to the expectations that reduction will only allow us to more precisely characterize the type of result. Thinking about it I suppose it might be reasonable to state the theorem as

If Γ ⊦ t : T and t → t' then there exists some T' such that Γ ⊦ t' : T' and
if there exists some p such that T = p.type then Γ ⊦ p : T', otherwise Γ ⊦ T' <: T.

Which I am starting to believe is the best that could be proven. This theorem nicely captures the discontinuities in subject reduction introduced by singleton types. You keep making steps getting a more and more precise type, and then suddenly you "jump" from one type to another, followed by some period where the type becomes more refined, then another "jump", and so forth.

These discontinuities are quite a problem, because it means that after taking a step it may not be possible in some cases to put the resulting term back into an evaluation context and remain well-typed.

νObj [1], the predecessor to Featherweight Scala, has nearly the same problem, but the language has been constructed just right so that typing always becomes more precise. The same changes cannot be made to Featherweight Scala and retain the property that the language is a subset of Scala. Aspinall's original work [2] on on singleton types makes use of an erasure predicate, but his singletons are labeled with a type making its definition trivial. Stone and Harper's work on singleton types [3] solves the problem by using a subject reduction theorem that looks roughly like

If Γ ⊦ t : T and t → t' then that Γ ⊦ t = t' : T.

but I am not sure there is a pleasant notion of term equality in Featherweight Scala, because all reductions either consult the heap or extend it. Singletons are also slightly more tractable in their setting because they require any term being used as a singleton type to have the base type "b".

[1] Martin Odersky, Vincent Cremet, Christine Röckl, and Matthias Zenger. A Nominal Theory of Objects with Dependent Types. Tech report 2002.
[2] David Aspinall. Subtyping with singleton types. CSL 1994
[3] Christopher Stone and Robert Harper. Extensional equivalence and singleton types. ACM TOCL 2006.

## On Arc

Over the years my appreciation for lists has increased. In exploratory programming, the fact that it's unclear what a list represents is an advantage, because you yourself are unclear about what type of program you're trying to write. The most important thing is not to constrain the evolution of your ideas. So the less you commit yourself in writing to what your data structures represent, the better.

Clearly Paul has been living in a cave for the past thirty-five years and did not hear about those quaint ideas, abstract data types and representation independence.

He seems to miss the obvious that using a list to represent a point is quite constraining because you are forced to require one coordinate be the first element of the list and the other coordinate be the second element of the list, and if you ever want to change it, good luck. I suppose his rebuttal would be »oh, it just takes too long or too many tokens to define an ADT for points«.

I really wonder how long it is going to take people to truly learn that telling the compiler about the structure of the data you work with is just as important as telling it about the control flow.

## Principia Narcissus finished

I just finished the last revisions on it today. If you like, you can purchase a bound copy from Lulu at cost. You can also download a copy from them, but it does not look like it will have the cover (or at least as part of a single download). I am hoping I can put together a PDF that contains the front cover for download.

One of the things that rather surprised me when I was making the final tweaks to the cover was that the width of the printed book Lulu sent me is actually 209mm rather than the 210mm you would expect for A4. However, this did not seem to impact anything, so I will not worry about it. My best guess is that a millimeter or so of the page width is lost due to the binding process. I was surprised to find that I had managed to quite accurately center the large orange text on the cover quite well by eyeballing it.

This final version contains a number of typographical fixes and small wording changes. Chapter 5 received the greatest number of edits as I decided to proofread the entire chapter again. It was the last chapter written and had received the least amount of scrutiny.

I really hope that somehow I did not introduce some kind of terrible printing problem or mistake in the process of fixing all the things I noticed that were wrong with the draft printing I received. If I did, I expect most of the recipients will probably not tell me.

## Type safety and singletons

I am continuing to make progress on the proof of soundness for Featherweight Scala, but the most significant difficulty I am encountering involves the rule that allows paths, expressions of the form `x.l1.l2...ln`, to be given singleton types:

 Γ ⊦ p: T Γ ⊦ p: p.type

It seems to make any defining any sort of useful inversion principle impossible. For example, the term `x.l1` where `x` has a field `val l1 : Int = 3` will reduce to `3`. In the proof of preservation, you would expect to be able to use inversion to show that if `x.l1` is well typed with some type `T` that `x` has a field `l1` with type `T`, or at least some subtype of `T`. However, with singletons it is possible that the type of `x.l1` is `x.l1.type`. In that case, the only thing you can determine by inversions is that `x` has a field called `l1` and it has some type. Many similar problems arise determining what would be seemingly straightforward facts because it is impossible to relate two types in a meaningful way.

There must be a resolution to the problem, because I know there are other type systems that allow singletons in this fashion, but I have not been able to hit upon the lemma to prove that will allow smoothing over the discontinuities singletons introduce. For now I think I will try proving safety with the above typing rule omitted, and see whether there are any other problematic features of the language.

## Research continues

My wrist are still not doing so well, so I am trying out using handwriting recognition in the evenings. Blogging is obviously falling by the wayside. So for now, I am mostly focusing on continuing my proof of soundness for Featherweight Scala on paper, which should be helpful if I ever get back to the Coq version of the proof.

This afternoon, I wrote a follow-up to a thread on the TYPES list concerning the applicability of my research on generalizing parametricity. I hope it made some sense. At least I have not yet received a serious criticism of the idea.

## Draft print of Principia Narcissus from Lulu

It just arrived this morning:

It looks pretty good, but there are definitely a few things I need to fine tune. For example, I need to center the text on the spine a little better.

Update: If you clicked through to Flickr, you would have seen my comment on it, but just in case, I should thank Tim Lindenbaum for kindly permitting me to use his daffodil photograph as the basis for the cover.

## Tablet computing

Writing about RSI reminded me that I had never gotten around to talking about the ThinkPad X61 tablet I purchased at the beginning of August. It is a pretty solid when used as a laptop.

As far as being a tablet goes, it works reasonably well in that domain too, with some exceptions. Firstly, for now if you want to use the tablet capabilities to their fullest, you need to run Windows Vista. The tablet is supposed to be supported under Linux, but there is really only one program that supports handwriting recognition program available, Cellwriter. It looks promising, particularly because it can be trained to generate any Unicode glyph – with Vista you are limited to the system's configured language. However, I do not think it would be difficult for Gnome or KDE to catch up in this area if they put a little effort into it.

My initial solution to this was that I would just run Linux under VMWare. Except I soon found that while I can use the tablet as a mouse, VMWare will not accept the input events the handwriting recognition subsystem generates. When I filed this as a bug they did not seem to think this was a problem.

While working with standard Windows applications, anything with a input field can accept handwriting recognition input. I almost wrote my entire defense presentation this way, but near the end I gave in and used the keyboard to do most of the last minute tweaking.

Of course, there is the question of whether writing by hand is any easier on my wrists than typing. It is difficult to say, for one, it becomes basically impossible to use emacs, unless you can do everything from pull-down menus, because of the chording necessary to activate some functionality. And it is definitely slower than typing, even with practice I expect. However, part of the problem could perhaps be resolved by rethinking various applications with tablets in mind.

## Research update

I have been making reasonably steady progress on my formalization of Featherweight Scala in Coq. I have just about seven hundred lines of definitions, with only few more things to make explicit. At that point I should be ready to start on actually proving progress and preservation.

I volunteered to triage issues submitted to the Scala bugtracking system this month, so that has been taking a little bit of my time. Mostly this just involves checking that what was submit is truly a bug and reproducible, before assigning it to the relevant person.

I am also responsible for the interpreter and the ScalaDoc, so I have also spent the time fixing a few bugs that have shown up in the interpreter. The big one of the past week is that for 2.6.2 the plan is to finally interoperate with Java generics. As a consequence, the compiler now defaults to generating class files in the format used in version 1.5 of the JVM and above. However, this broke the nightly tests run against the 1.4 JVM. The interpreter would compile and run fine until it tried to load one of the class files itself had compiled, which had an incompatible version number. So I added a check that if the interpreter is running on a pre-1.5 JVM, to force it to generate compatible class files.

An undergrad e-mailed me to ask about working on the Literate Scala semester project that I had posted. I met with him briefly last week for mostly administrative purposes. That should be interesting.

I should be moving forward on the implementation of virtual classes, but I have been avoiding it because I am not yet sure how to best modify the AST to represent them. I will hopefully have chance to discuss it with Martin this week.

I have also been spending time thinking about the next major project, an effect system for Scala. I have been thinking hard about how to add one while still maintaining reasonable backward compatibility and so that it is not exceedingly verbose. It is a very difficult problem. Furthermore, one of the most widely used effect systems that exists in programming languages, Java's throws clauses, is generally reviled by programmers. One thought I had was that maybe a whiz-bang effect system with effect polymorphism, a rich language of effects, and hooks for adding user defined effects is just overkill. Maybe programmers would be happy with being able to write an annotation like `@pure` on their methods and closures, and leave it at that. Enforcing purity would not be too difficult I think (ignoring the well known problem of the termination effect), and be much easier for programmers to reason about.

My wrists have also started acting up again, so I am not entirely sure how to deal with that, yet continue pushing my current research projects forward.  It has been suggested to me to look into voice recognition software, but I am not sure how compatible that would be with my current workflow.  Not to mention that I can only imagine how much it would annoy the people in the adjoining offices given that they have already expressed displeasure at all the beeping caused by emacs on my desktop computer.

## Prognostication

There is a current thread on Lambda the Ultimate at the moment on predictions for 2008. I will have to remember to look at it again later this year.

Most of the predictions concerning Scala are positive.

I would have to agree with Sean that there have been many years I've heard that concurrency will be the big thing, but I do not feel there has been too many big developments in the area recently. Robin Milner has recently released a draft book on his theory of bigraphs, his newer generic model for concurrency. There is actually an even newer draft of the book, but that is not presently linked from his website, so I am not sure whether I should publish the link.

## Implementing my dissertation

Okay, I am finally getting around to writing something, as requested, about how I implemented my dissertation.

I used pdfLaTeX for several reasons:

• I needed to integrate material from existing papers written in LaTeX.
• Using standard LaTeX means that you need to fiddle `dvips`/`dvipdf` so that that they will find the appropriate fonts. Using pdfLaTeX requires some configuration too, but it all happens in a single step.
• It has support for microtypography, which generally produces better looking text with fewer bad breaks.

I had thought about trying luatex to allow me to automate a few things better, but I decided against it because it might have slowed me down.

I then used the `memoir` class. I was able to use that with relatively little configuration. The only major change I made from the defaults involved the configuration for the chapter titles. I did however wind up needing to use a slightly modified version of the class file itself. This was necessary so that I could use raised dots (·) instead of periods as separators; there was simply no way to configure this for some parts of the document otherwise.

For the bibliography I used Chung-chieh Shan's McBride BibTeX style. I made a slight change to that so that the three emdashes for repeated authors did not have space in between them.

I use Didier Rémy's mathpartir package for typesetting inference rules.

I used my tool otftofd to generate all the LaTeX infrastructure for the OpenType fonts I used. I hacked up some custom OMakefiles for the TrueType fonts. The bits I used for Dejvu Sans Mono are now part of the package I have made available.

I wrote a small sed script for converting my examples written in InforML with some special markup in comments into a form that could be accepted by LaTeX.  This mostly involved escaping some characters and inserting uses of my macro for highlighting text.

That is everything I can think of from a high level.  If there is something specific I did in my dissertation that you would like to know about, let me know.