Archive for types

Higher-rank, impredicative polymorphism in Scala

One project that I spent some time thinking about prior arriving in Lausanne was whether I might be able to put together a Scala library for manipulating data-structures with binding based around the use of parametricity to help enforce the adequacy of function spaces (ala Boxes Go Bananas). However, one problem is that Scala does not have higher-rank polymorphism like GHC. The initial thought was, "well, Scala has higher-rank existential types, so adding the dual case should be pretty straightforward to add to the typechecker". However, it turns out it is not so straightforward to add universal quantification to the implementation of the subtyping algorithm and ensure that constraints can be solved unambiguously. It still may make sense to try and add them in some fashion.

Later, while working on Featherweight Scala, it occurred to me that subtyping is very likely to be undecidable in Scala because its language of types is essentially as expressive as full F<:. Benjamin agreed that there existed a straightforward reduction to languages with only bounded existential quantification rather than bounded universal quantification, but did not know offhand if there is a paper that gives it.

So I spent a bit of time thinking about encoding universal quantification using existential quantification, and as alluded to in TAPL (the answer to exercise 24.3.3), it can be done, but there is no completely local encoding of universals into existentials. To obtain the classical equivalences between the negation of universal quantification and existential quantification (∀α.τ implies ¬∃α.¬τ) in a constructive setting you have to use the double-negation encoding, which essentially means CPS converting your program. The other classical equivalence under negation, ∃α.τ implies ¬∀α.¬τ, is constructively valid which is why it is existential quantification is macro-expressible in languages with universal quantification.

Since I am not aware of any papers that illustrate the encoding of universals into existentials, I figured I would explain it here. The encoding of the universal type pretty much follows from the equivalence:

|∀α.τ| = (∃α.|τ|→0)→0

Here I am using 0 for the void or empty type and using the standard constructive encoding of negation:

|¬τ| = |τ| → 0

The term level encodings are not so difficult, but does require a little thought:

|Λα.e : ∀α.τ|k = k (λx:∃α.|τ|→0.let (α,k') = unpack x in |e : τ|k')

The term encoding has the form |e : τ|k, where τ is the type of the term and k is a continuation threaded through the encoding. The first thing to note is that because type abstractions are values, we always immediately pass them to the continuation. Given that,
if we look at the type we need to give the encoding of type abstraction, it is a function that that takes an existentially quantified continuation, and never returns. The only way to never return is to use the continuation. This tells us just about everything else we need to know about the encoding.

Because the continuation is existentially quantified, it is not possible to invoke it as is, so we must unpack it first. This is how we simulate the introduction of a type variable without using type abstraction. The next problem is that in order to call the continuation, we need to apply it to something with the correct type. The continuation takes values of type |τ|, possibly referencing a free type variable α, which is fortunately what the encoding of the type abstraction's body gives us. Therefore, we can use the unpacked continuation k' as the continuation argument of the recursive call to the encoding.

Given that we know how to encode type abstractions, encoding type applications is straightforward:

|e[τ] : τ'|k = |e : ∀α.τ'{α/τ}|(λv:(∃α.τ'{α/τ}→0)→0.v (pack(τ,k) as ∃α.(τ'{α/τ}→0)))

We encode the term to be applied and pass it a new continuation, that packs up the original continuation and applies it to function to which the term evaluates. I am writing {α/τ} to mean replace τ with α. I am using curly braces rather than the usual square brackets to avoid confusion with the square brackets that is commonly used for type application.

While I am almost certain that the encoding above is correct, it would be good at some point to prove it. This is the sort of proof for which Twelf is very well suited, so that is probably the route I would take.

Okay, so that is the theoretical half out of the way. Now, given the above, can we implement higher-rank impredicative polymoprhism in Scala? Yes, with a minor caveat.

First, I'll define an abstract class that defines the type of bounded universally quantified values, and methods for creating and destructing them.

abstract class Universal {
  type Univ[Bound,Body[_]]
 
  def createUniv[Bound,Body[_]]
      (impl : ((Body[A] => Nothing) forSome { type A <: Bound }) => Nothing) :
        Univ[Bound,Body]
 
  def applyUniv[Bound,Body[_],A<:Bound](arg : Univ[Bound,Body]) : Body[A]
}

This should be fairly straightforward given the encoding described above. The only tricky bit is the use of a higher-kinded type parameter (Body[_]) to give the body of the universal type. The other important point to note is that it is not necessary to completely CPS convert the implementation because Scala provides primitives for non-local control transfers.

One possible implementation of the above abstract class is the following:

class UniversalImpl extends Universal {
  type Univ[Bound,Body[_]] =
     ((Body[A] => Nothing) forSome { type A <: Bound }) => Nothing
 
  def createUniv[Bound,Body[_]]
      (impl : ((Body[A] => Nothing) forSome { type A <: Bound }) => Nothing) :
        Univ[Bound,Body] = impl
 
  def applyUniv [Bound,Body[_],A<:Bound](univ : Univ[Bound,Body]) : Body[A] = {
    case class Control(arg : Body[A]) extends Throwable
    val res = try {
      univ((arg:Body[A]) => throw new Control(arg))
    } catch {
      case Control(arg) => arg
      case e => throw e
    }
    res
  }
}

The implementation of the abstract type and the code for createUniv are trivial. The implementation of applyUniv is a little more interesting. Here, we create a local case class Control, extending Throwable so that it may be thrown as an exception, that will hold a value of the desired result type. We then just pass the representation of the type abstraction a continuation that throws a new instance of Control. We immediately catch it and return the value stored within. If we happen to catch some other sort of Throwable we just re-throw it.

And that's it. It is worth looking at a few examples of how this might be used. The first thing that is necessary is to create a concrete implementation of Universal:

val u : Universal = new UniversalImpl

Given u, we can create an instance of the polymorphic identity function:

type Id[T] = T => T
val id =
    u.createUniv[AnyRef,Id](
      (k: (Id[A] => Nothing) forSome { type A }) =>
         k(x => x))
val idString = u.applyUniv[AnyRef,Id,String](id)
val idStringList = u.applyUniv[AnyRef,Id,List[String]](id)
println(idString("Foo"))
println(idStringList(List("Foo", "Bar", "Baz")))

The first thing that needs to be done is to define a type function for representation the body of the universally quantified type. For the polymorphic identity function we call it Id[T]. The rest is pretty straightforward. It ought to be possible to make the above code a little less verbose if Scala allowed type parameters to methods to be curried: it should be possible for it to infer that the first two arguments to u.applyUniv are AnyRef and Id from the argument id. However, because it will not be able to infer the last type parameter, we have to give all three. It might also be desirable in some case to not have to give a type definition to define the body of the a universal type; this could be achieve by extending Scala with support for anonymous type functions.

Now for that caveat I mentioned. When we defined the type abstraction,

u.createUniv[AnyRef,Id](
  (k: (Id[A] => Nothing) forSome { type A }) =>
    k(x => x))

a significant difference between how this code is written, and how it would have been written in terms of my original explanation of the encoding, is that there is no explicit unpacking of the existential. All existential unpacking in Scala is implicit. This is nice in some ways, but it means that we have no way to give a name to type that the existential is hiding. Therefore, when constructing a function to pass to the continuation,

k(x => x)

it must be the case that Scala can infer the domain of this function's type, because it is not possible to write a type annotation for x without having a name for the existentially hidden type. I think in most cases this should not be possible, as Scala knows the type that k is expecting as an argument, but there may be some corner cases I have not considered.

(Does anyone know how to prevent WordPress from completely screwing up less than and greater than inside of <pre> tags?)

Comments (4)

Research update

I finished a draft of "Generalizing Parametricity Using Information-flow" for LMCS this past weekend. That should leave me with fewer distractions to work on something for ICFP.

My work on Featherweight Scala is moving along, sometimes more slowly than others. Other than the issues with singletons I have already discussed, one problem with the original definition of Featherweight Scala is that it did not truly define a subset of Scala proper – that is, there were many valid Featherweight Scala programs that were not valid Scala programs.

So I have been working on trying to get Featherweight Scala to be as close to a subset of Scala proper as is possible without making it far too complicated to be a useful core calculus. Some of this has involved changing the definition of Featherweight Scala. Some of this has involved lobbying for changes to Scala proper to make it more uniform (so far I've succeed with extending traits to allow value declarations to be overridden).

Through the whole process I've also been spending a lot of time typing things into the Scala interpreter to figure out how Scala proper treats them. For better or worse, I've actually managed to uncover a fair number of bugs in Scala proper doing this. I've almost reached the point where I treat it as a bit of a game: can I from thinking about the properties of the Scala type system come up with some corner case that will exhibit an unsoundness in the language (a ClassCastException, a NoSuchMethodException, etc.), or at least crash the compiler?

Last week I came up with a pretty nice one where I used the fact that Scala should not allow inheritance from singleton types to launder types in an unsafe fashion (now fixed in trunk!). Prior to that I came up with something fun where you could trick the compiler into letting you inherit from Nothing (which is supposed to be the bottom of the subtyping hierarchy). Today I got to thinking about Scala's requirements that paths in singleton types must be stable – all components of a path must be immutable. So for example in

 
var x = 3
val y : x.type = x // ill-typed

The path x is not stable because x is a mutable variable. However, Scala does allow lazy value declarations to be stable.

 
var x = 3
lazy val y = x
val z : y.type = y // allowed, at present

Note that y is not evaluated until it is first needed, and in this case its evaluation involves a mutable reference. Furthermore, add into the mix that Scala also provides call-by-name method arguments (lazy values are call-by-need). So I started riffing on the idea of whether I could violate stability by changing the mutable reference between the definition of the lazy value, or a call-by-name argument, and its use. In retrospect, I am leaning at present towards the belief that there is no way this should be exploitable from a theoretical standpoint. That does not mean that the implementation is necessarily in alignment with theory, however. I did manage to hit upon a combination in my experiments that resulted in a NoSuchMethodException, so the exercise was not a complete letdown.

I should point out that these things do not really reflect on the quality of the Scala language as whole. It is a rich language that does not yet have a complete formal model, and as such, in the implementation process it can be easy to overlook some particularly perverse abuses of the language.

Comments

Focusing and higher-order abstract syntax

I figured I would take a few minutes to point out that Noam Zeilberger had an excellent paper in POPL this year on Focusing and higher-order abstract syntax. I've spent some time in the past trying to better understand Andreoli's material on focusing and Girard's* Ludics from the judgmental perspective that I became accustomed to in my formative years, but I never really came up with anything. I found Noam's computational interpertation of focusing quite pleasing.

One low tech solution that has really been helping my wrists lately is that I obtained adjustable height desk for my office. The desks that came with my office were definitely far too tall for me (even when my chair at its maximum height).

* Girard seems to have redecorated his website since I last visited.

Comments

More notes on Featherweight Scala

As I mentioned the other day, Featherweight Scala has a typing rule that allows assigning a path (x.l1...ln) a singleton type:

Γ ⊦ p : T
-------------
Γ ⊦ p : p.type

(I should note that for some stupid reason WordPress seems to prevent MathML from working, making the rather pleasant <mfrac> tag useless). This is essentially the rule that everyone else uses, but in all the other literature I've looked at (other than νObj) the singleton is labeled with T (something like ST(p)) or T must have a specific type (int, base, etc.).

As I pointed out, Featherweight Scala's version of the rule makes it difficult to write meaningful inversion lemmas. Secondly, the naïve subject reduction theorem that could be obtained would be something like:

If Γ ⊦ t : T and t → t' then exists some T' such that Γ ⊦ t' : T'.

Which feels rather unsatisfactory. Given the subsumption rule for singletons

Γ ⊦ p : T
---------------
Γ ⊦ p.type <: T

It is plausible that a variant of the theorem where T is a subtype of T' would hold, but that would run counter to the expectations that reduction will only allow us to more precisely characterize the type of result. Thinking about it I suppose it might be reasonable to state the theorem as

If Γ ⊦ t : T and t → t' then there exists some T' such that Γ ⊦ t' : T' and
if there exists some p such that T = p.type then Γ ⊦ p : T', otherwise Γ ⊦ T' <: T.

Which I am starting to believe is the best that could be proven. This theorem nicely captures the discontinuities in subject reduction introduced by singleton types. You keep making steps getting a more and more precise type, and then suddenly you "jump" from one type to another, followed by some period where the type becomes more refined, then another "jump", and so forth.

These discontinuities are quite a problem, because it means that after taking a step it may not be possible in some cases to put the resulting term back into an evaluation context and remain well-typed.

νObj [1], the predecessor to Featherweight Scala, has nearly the same problem, but the language has been constructed just right so that typing always becomes more precise. The same changes cannot be made to Featherweight Scala and retain the property that the language is a subset of Scala. Aspinall's original work [2] on on singleton types makes use of an erasure predicate, but his singletons are labeled with a type making its definition trivial. Stone and Harper's work on singleton types [3] solves the problem by using a subject reduction theorem that looks roughly like

If Γ ⊦ t : T and t → t' then that Γ ⊦ t = t' : T.

but I am not sure there is a pleasant notion of term equality in Featherweight Scala, because all reductions either consult the heap or extend it. Singletons are also slightly more tractable in their setting because they require any term being used as a singleton type to have the base type "b".

[1] Martin Odersky, Vincent Cremet, Christine Röckl, and Matthias Zenger. A Nominal Theory of Objects with Dependent Types. Tech report 2002.
[2] David Aspinall. Subtyping with singleton types. CSL 1994
[3] Christopher Stone and Robert Harper. Extensional equivalence and singleton types. ACM TOCL 2006.

Comments (1)

On Arc

From Paul Graham's Arc release annoucement:

Over the years my appreciation for lists has increased. In exploratory programming, the fact that it's unclear what a list represents is an advantage, because you yourself are unclear about what type of program you're trying to write. The most important thing is not to constrain the evolution of your ideas. So the less you commit yourself in writing to what your data structures represent, the better.

Clearly Paul has been living in a cave for the past thirty-five years and did not hear about those quaint ideas, abstract data types and representation independence.

He seems to miss the obvious that using a list to represent a point is quite constraining because you are forced to require one coordinate be the first element of the list and the other coordinate be the second element of the list, and if you ever want to change it, good luck. I suppose his rebuttal would be »oh, it just takes too long or too many tokens to define an ADT for points«.

I really wonder how long it is going to take people to truly learn that telling the compiler about the structure of the data you work with is just as important as telling it about the control flow.

Comments (7)

Type safety and singletons

I am continuing to make progress on the proof of soundness for Featherweight Scala, but the most significant difficulty I am encountering involves the rule that allows paths, expressions of the form x.l1.l2...ln, to be given singleton types:

Γ ⊦ p: T

Γ ⊦ p: p.type

It seems to make any defining any sort of useful inversion principle impossible. For example, the term x.l1 where x has a field val l1 : Int = 3 will reduce to 3. In the proof of preservation, you would expect to be able to use inversion to show that if x.l1 is well typed with some type T that x has a field l1 with type T, or at least some subtype of T. However, with singletons it is possible that the type of x.l1 is x.l1.type. In that case, the only thing you can determine by inversions is that x has a field called l1 and it has some type. Many similar problems arise determining what would be seemingly straightforward facts because it is impossible to relate two types in a meaningful way.

There must be a resolution to the problem, because I know there are other type systems that allow singletons in this fashion, but I have not been able to hit upon the lemma to prove that will allow smoothing over the discontinuities singletons introduce. For now I think I will try proving safety with the above typing rule omitted, and see whether there are any other problematic features of the language.

Comments (4)

Research continues

My wrist are still not doing so well, so I am trying out using handwriting recognition in the evenings. Blogging is obviously falling by the wayside. So for now, I am mostly focusing on continuing my proof of soundness for Featherweight Scala on paper, which should be helpful if I ever get back to the Coq version of the proof.

This afternoon, I wrote a follow-up to a thread on the TYPES list concerning the applicability of my research on generalizing parametricity. I hope it made some sense. At least I have not yet received a serious criticism of the idea.

Comments (5)

Research update

I have been making reasonably steady progress on my formalization of Featherweight Scala in Coq. I have just about seven hundred lines of definitions, with only few more things to make explicit. At that point I should be ready to start on actually proving progress and preservation.

I volunteered to triage issues submitted to the Scala bugtracking system this month, so that has been taking a little bit of my time. Mostly this just involves checking that what was submit is truly a bug and reproducible, before assigning it to the relevant person.

I am also responsible for the interpreter and the ScalaDoc, so I have also spent the time fixing a few bugs that have shown up in the interpreter. The big one of the past week is that for 2.6.2 the plan is to finally interoperate with Java generics. As a consequence, the compiler now defaults to generating class files in the format used in version 1.5 of the JVM and above. However, this broke the nightly tests run against the 1.4 JVM. The interpreter would compile and run fine until it tried to load one of the class files itself had compiled, which had an incompatible version number. So I added a check that if the interpreter is running on a pre-1.5 JVM, to force it to generate compatible class files.

An undergrad e-mailed me to ask about working on the Literate Scala semester project that I had posted. I met with him briefly last week for mostly administrative purposes. That should be interesting.

I should be moving forward on the implementation of virtual classes, but I have been avoiding it because I am not yet sure how to best modify the AST to represent them. I will hopefully have chance to discuss it with Martin this week.

I have also been spending time thinking about the next major project, an effect system for Scala. I have been thinking hard about how to add one while still maintaining reasonable backward compatibility and so that it is not exceedingly verbose. It is a very difficult problem. Furthermore, one of the most widely used effect systems that exists in programming languages, Java's throws clauses, is generally reviled by programmers. One thought I had was that maybe a whiz-bang effect system with effect polymorphism, a rich language of effects, and hooks for adding user defined effects is just overkill. Maybe programmers would be happy with being able to write an annotation like @pure on their methods and closures, and leave it at that. Enforcing purity would not be too difficult I think (ignoring the well known problem of the termination effect), and be much easier for programmers to reason about.

My wrists have also started acting up again, so I am not entirely sure how to deal with that, yet continue pushing my current research projects forward.  It has been suggested to me to look into voice recognition software, but I am not sure how compatible that would be with my current workflow.  Not to mention that I can only imagine how much it would annoy the people in the adjoining offices given that they have already expressed displeasure at all the beeping caused by emacs on my desktop computer.

Comments (4)

Prognostication

There is a current thread on Lambda the Ultimate at the moment on predictions for 2008. I will have to remember to look at it again later this year.

Most of the predictions concerning Scala are positive.

I would have to agree with Sean that there have been many years I've heard that concurrency will be the big thing, but I do not feel there has been too many big developments in the area recently. Robin Milner has recently released a draft book on his theory of bigraphs, his newer generic model for concurrency. There is actually an even newer draft of the book, but that is not presently linked from his website, so I am not sure whether I should publish the link.

Comments

Retraction on Coq and lists

Okay, so everything I said the other day about mutually defining your ASTs in Coq with custom lists, and then using Coq's implicit coercions to move back and forth between lists as defined in Coq's standard library, does not work.  Firstly, the coercions will not always be inserted when you will need them, and even inserting the coercions manually does not work out well as Coq will then not be able to determine that recursive functions you define on your ASTs will necessarily terminate.

So I have switched back to using lists from the standard library and hoping that defining custom induction principles based upon Yves Bertot's suggestion in the discussion here will work for everything I need to do.

Comments (2)

« Previous Page« Previous entries « Previous Page · Next Page » Next entries »Next Page »