Even better than the real thing

Scala Classic

Several weeks ago I decided that instead of calling the core calculus that I have been working on Featherweight Scala, which could be confusing given that there is already a different calculus with that name, that I would call it Scala Classic.

I had hoped to submit a paper on Scala Classic to POPL 2009, but it has been taking too long to work out the metatheory. Partly because I am developing a mechanized proof of type soundness in Twelf, and on my first attempt at representing heaps I think tried to be too clever.  However, a more traditional treatment of heaps leads to the need to explicitly represent typing contexts, rather than the implicit treatment of contexts more commonly used in Twelf.

This afternoon I finally finished the proof of the theorem that is traditionally called "substitution".  However, the proof of type preservation will also require another theorem formalizing a substitution-like property for singleton types.  I am hoping that I can prove that theorem more quickly now that I've built up all the explicit context theory.  I have not thought much about whether there are any unusual theorems needed to prove progress.

In any event, I would ideally have the proofs finished by the beginning of August.  My current plan is to try submitting the paper to PLDI (which is even in Europe this year), unless some obviously better venue comes to my attention.

One of the reasons I would really like to have the proofs done by August is that I will be teaching a PhD level course here at EPFL this fall on mechanized sepcifications and proofs, with a strong emphasis on deductive systems. It also looks like I may fill in some for a masters level course based on TAPL.  So between those two things, I am not going to have nearly as much time for research as would be ideal.

Comments

Featherweight Scala lives

With everything else I've been having to deal with the past couple months, I have not made nearly as much progress on Featherweight Scala as I would have liked. Fortunately, in the past two weeks I have finally started to get back on track.

Probably what makes working out the design of Featherweight Scala so difficult is that I am trying my best to commit to it being a valid subset of Scala. It seems like whenever I try to simplify one part of the language, it just introduces a complication elsewhere. Still, I think so far that I have been able to ensure that it is a subset with two exceptions.

The first exception is that some programs in Featherweight Scala will diverge in cases where the Scala compiler will generate a terminating program. This is because in programs that attempt to access an uninitialized field, Scala will return null, while Featherweight Scala diverges. Since the program many never attempt to use the null value, it may simple terminate without raising a NullPointerException.

The second exception arises because type checking in Featherweight Scala's type system is most likely undecidable. Given that the design is in flux, I haven't precisely verified this, but it seems quite plausible given its expressive power is similar to languages where type checking is known to be undecidable. Because type checking is undecidable, and the Scala language implementation attempts to have its type checking phase be terminating, there should be some programs that can be shown to be well-typed in Featherweight Scala that the Scala compiler will reject.

I think the other thing I've determined is that I do not understand F-bounded quantification as well as I thought I did. At least, I find myself worried about the cases where the typing rules presently require  templates, that have not yet been verified to be well-formed, to be introduced as hypotheses while verifying their own correctness. So I need to do some additional reading to see how some other languages deal with this issue. Certainly I can put my mind at ease by making the type system more restrictive, but I do not want to be so restrictive that it is no longer reasonable to call the language Featherweight Scala.

Update: Okay, so after looking at Featherweight Generic Java and νObj again more closely, they both do essentially the same kind of thing. In FGJ, the class table is well-formed if all its classes are well-formed, but checking a class's well-formedness may require presupposing that that class tale is well-formed. In νObj, the well-formedness of recursive record types is checked under the assumption that the record's self-name has the type of the record. Certainly the same sort of things come up when verifying the well-formedness of recursive functions and recursive types, but at least there what is being assumed is at a higher level of abstraction: when checking a recursive function you assume that the function has some well-formed type, or when checking a recursive type you assume that the type has some well-formed kind.

Comments

Research update

I finished a draft of "Generalizing Parametricity Using Information-flow" for LMCS this past weekend. That should leave me with fewer distractions to work on something for ICFP.

My work on Featherweight Scala is moving along, sometimes more slowly than others. Other than the issues with singletons I have already discussed, one problem with the original definition of Featherweight Scala is that it did not truly define a subset of Scala proper – that is, there were many valid Featherweight Scala programs that were not valid Scala programs.

So I have been working on trying to get Featherweight Scala to be as close to a subset of Scala proper as is possible without making it far too complicated to be a useful core calculus. Some of this has involved changing the definition of Featherweight Scala. Some of this has involved lobbying for changes to Scala proper to make it more uniform (so far I've succeed with extending traits to allow value declarations to be overridden).

Through the whole process I've also been spending a lot of time typing things into the Scala interpreter to figure out how Scala proper treats them. For better or worse, I've actually managed to uncover a fair number of bugs in Scala proper doing this. I've almost reached the point where I treat it as a bit of a game: can I from thinking about the properties of the Scala type system come up with some corner case that will exhibit an unsoundness in the language (a ClassCastException, a NoSuchMethodException, etc.), or at least crash the compiler?

Last week I came up with a pretty nice one where I used the fact that Scala should not allow inheritance from singleton types to launder types in an unsafe fashion (now fixed in trunk!). Prior to that I came up with something fun where you could trick the compiler into letting you inherit from Nothing (which is supposed to be the bottom of the subtyping hierarchy). Today I got to thinking about Scala's requirements that paths in singleton types must be stable – all components of a path must be immutable. So for example in

  1.  
  2. var x = 3
  3. val y : x.type = x // ill-typed

The path x is not stable because x is a mutable variable. However, Scala does allow lazy value declarations to be stable.

  1.  
  2. var x = 3
  3. lazy val y = x
  4. val z : y.type = y // allowed, at present

Note that y is not evaluated until it is first needed, and in this case its evaluation involves a mutable reference. Furthermore, add into the mix that Scala also provides call-by-name method arguments (lazy values are call-by-need). So I started riffing on the idea of whether I could violate stability by changing the mutable reference between the definition of the lazy value, or a call-by-name argument, and its use. In retrospect, I am leaning at present towards the belief that there is no way this should be exploitable from a theoretical standpoint. That does not mean that the implementation is necessarily in alignment with theory, however. I did manage to hit upon a combination in my experiments that resulted in a NoSuchMethodException, so the exercise was not a complete letdown.

I should point out that these things do not really reflect on the quality of the Scala language as whole. It is a rich language that does not yet have a complete formal model, and as such, in the implementation process it can be easy to overlook some particularly perverse abuses of the language.

Comments

More notes on Featherweight Scala

As I mentioned the other day, Featherweight Scala has a typing rule that allows assigning a path (x.l1...ln) a singleton type:

Γ ⊦ p : T
-------------
Γ ⊦ p : p.type

(I should note that for some stupid reason WordPress seems to prevent MathML from working, making the rather pleasant <mfrac> tag useless). This is essentially the rule that everyone else uses, but in all the other literature I've looked at (other than νObj) the singleton is labeled with T (something like ST(p)) or T must have a specific type (int, base, etc.).

As I pointed out, Featherweight Scala's version of the rule makes it difficult to write meaningful inversion lemmas. Secondly, the naïve subject reduction theorem that could be obtained would be something like:

If Γ ⊦ t : T and t → t' then exists some T' such that Γ ⊦ t' : T'.

Which feels rather unsatisfactory. Given the subsumption rule for singletons

Γ ⊦ p : T
---------------
Γ ⊦ p.type <: T

It is plausible that a variant of the theorem where T is a subtype of T' would hold, but that would run counter to the expectations that reduction will only allow us to more precisely characterize the type of result. Thinking about it I suppose it might be reasonable to state the theorem as

If Γ ⊦ t : T and t → t' then there exists some T' such that Γ ⊦ t' : T' and
if there exists some p such that T = p.type then Γ ⊦ p : T', otherwise Γ ⊦ T' <: T.

Which I am starting to believe is the best that could be proven. This theorem nicely captures the discontinuities in subject reduction introduced by singleton types. You keep making steps getting a more and more precise type, and then suddenly you "jump" from one type to another, followed by some period where the type becomes more refined, then another "jump", and so forth.

These discontinuities are quite a problem, because it means that after taking a step it may not be possible in some cases to put the resulting term back into an evaluation context and remain well-typed.

νObj [1], the predecessor to Featherweight Scala, has nearly the same problem, but the language has been constructed just right so that typing always becomes more precise. The same changes cannot be made to Featherweight Scala and retain the property that the language is a subset of Scala. Aspinall's original work [2] on on singleton types makes use of an erasure predicate, but his singletons are labeled with a type making its definition trivial. Stone and Harper's work on singleton types [3] solves the problem by using a subject reduction theorem that looks roughly like

If Γ ⊦ t : T and t → t' then that Γ ⊦ t = t' : T.

but I am not sure there is a pleasant notion of term equality in Featherweight Scala, because all reductions either consult the heap or extend it. Singletons are also slightly more tractable in their setting because they require any term being used as a singleton type to have the base type "b".

[1] Martin Odersky, Vincent Cremet, Christine Röckl, and Matthias Zenger. A Nominal Theory of Objects with Dependent Types. Tech report 2002.
[2] David Aspinall. Subtyping with singleton types. CSL 1994
[3] Christopher Stone and Robert Harper. Extensional equivalence and singleton types. ACM TOCL 2006.

Comments (1)

Type safety and singletons

I am continuing to make progress on the proof of soundness for Featherweight Scala, but the most significant difficulty I am encountering involves the rule that allows paths, expressions of the form x.l1.l2...ln, to be given singleton types:

Γ ⊦ p: T


Γ ⊦ p: p.type

It seems to make any defining any sort of useful inversion principle impossible. For example, the term x.l1 where x has a field val l1 : Int = 3 will reduce to 3. In the proof of preservation, you would expect to be able to use inversion to show that if x.l1 is well typed with some type T that x has a field l1 with type T, or at least some subtype of T. However, with singletons it is possible that the type of x.l1 is x.l1.type. In that case, the only thing you can determine by inversions is that x has a field called l1 and it has some type. Many similar problems arise determining what would be seemingly straightforward facts because it is impossible to relate two types in a meaningful way.

There must be a resolution to the problem, because I know there are other type systems that allow singletons in this fashion, but I have not been able to hit upon the lemma to prove that will allow smoothing over the discontinuities singletons introduce. For now I think I will try proving safety with the above typing rule omitted, and see whether there are any other problematic features of the language.

Comments (4)

Research update

I have been making reasonably steady progress on my formalization of Featherweight Scala in Coq. I have just about seven hundred lines of definitions, with only few more things to make explicit. At that point I should be ready to start on actually proving progress and preservation.

I volunteered to triage issues submitted to the Scala bugtracking system this month, so that has been taking a little bit of my time. Mostly this just involves checking that what was submit is truly a bug and reproducible, before assigning it to the relevant person.

I am also responsible for the interpreter and the ScalaDoc, so I have also spent the time fixing a few bugs that have shown up in the interpreter. The big one of the past week is that for 2.6.2 the plan is to finally interoperate with Java generics. As a consequence, the compiler now defaults to generating class files in the format used in version 1.5 of the JVM and above. However, this broke the nightly tests run against the 1.4 JVM. The interpreter would compile and run fine until it tried to load one of the class files itself had compiled, which had an incompatible version number. So I added a check that if the interpreter is running on a pre-1.5 JVM, to force it to generate compatible class files.

An undergrad e-mailed me to ask about working on the Literate Scala semester project that I had posted. I met with him briefly last week for mostly administrative purposes. That should be interesting.

I should be moving forward on the implementation of virtual classes, but I have been avoiding it because I am not yet sure how to best modify the AST to represent them. I will hopefully have chance to discuss it with Martin this week.

I have also been spending time thinking about the next major project, an effect system for Scala. I have been thinking hard about how to add one while still maintaining reasonable backward compatibility and so that it is not exceedingly verbose. It is a very difficult problem. Furthermore, one of the most widely used effect systems that exists in programming languages, Java's throws clauses, is generally reviled by programmers. One thought I had was that maybe a whiz-bang effect system with effect polymorphism, a rich language of effects, and hooks for adding user defined effects is just overkill. Maybe programmers would be happy with being able to write an annotation like @pure on their methods and closures, and leave it at that. Enforcing purity would not be too difficult I think (ignoring the well known problem of the termination effect), and be much easier for programmers to reason about.

My wrists have also started acting up again, so I am not entirely sure how to deal with that, yet continue pushing my current research projects forward.  It has been suggested to me to look into voice recognition software, but I am not sure how compatible that would be with my current workflow.  Not to mention that I can only imagine how much it would annoy the people in the adjoining offices given that they have already expressed displeasure at all the beeping caused by emacs on my desktop computer.

Comments (4)