Correction on existential unpacking

(WordPress ate my first draft, grrr.)

So, after thinking about it further, I was incorrect: it is possible to explicitly unpack existentials in Scala. As in some other languages, it is done via pattern matching. The following example illustrates how:

  val x : List[T] forSome { type T } = List(42)
  val w = x match { case y : List[u] => ((z : u) => z)(y.head) }

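Incidentally, the type variable bound by the pattern can also be used to annotate local definitions inside the case body. A minimal sketch of the same idiom (my own illustration, extrapolating from the example above):

  val xs : List[T] forSome { type T } = List(1, 2, 3)
  val p = xs match {
    case ys : List[u] =>
      // u names the hidden element type within this case body
      val dup : u => (u, u) = z => (z, z)
      dup(ys.head)
  }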
However, in practice this functionality seems to be rather fragile. For example, the following two variations are rejected:

  val x : T forSome { type T } = 42
  val w = x match { case y : u => ((z : u) => z)(y) }

and

  val x : List[List[T]] forSome { type T } = List(List(42))
  val w = x match { case y : List[List[u]] => ((z : u) => z)(y.head.head) }

In both cases the compiler reports that it cannot find the type u. In the second case, this could be attributed to erasure, as there is no way to guarantee dynamically that the contents of the outer List are in turn also Lists. However, the first case seems reasonable, so it should probably be reported as a bug.


Higher-rank, impredicative polymorphism in Scala

One project I spent some time thinking about prior to arriving in Lausanne was whether I might be able to put together a Scala library for manipulating data structures with binding, based around the use of parametricity to help enforce the adequacy of function spaces (à la Boxes Go Bananas). However, one problem is that Scala does not have higher-rank polymorphism like GHC. The initial thought was, "well, Scala has higher-rank existential types, so adding the dual case to the typechecker should be pretty straightforward". However, it turns out it is not so straightforward to add universal quantification to the implementation of the subtyping algorithm and ensure that constraints can be solved unambiguously. It still may make sense to try to add them in some fashion.

Later, while working on Featherweight Scala, it occurred to me that subtyping is very likely to be undecidable in Scala because its language of types is essentially as expressive as full F<:. Benjamin agreed that there existed a straightforward reduction to languages with only bounded existential quantification rather than bounded universal quantification, but did not know offhand if there is a paper that gives it.

So I spent a bit of time thinking about encoding universal quantification using existential quantification, and, as alluded to in TAPL (the answer to exercise 24.3.3), it can be done, but there is no completely local encoding of universals into existentials. To obtain the classical equivalence between universal quantification and negated existential quantification (∀α.τ ⟺ ¬∃α.¬τ) in a constructive setting, you have to use the double-negation encoding, which essentially means CPS converting your program. The other classical equivalence under negation, ∃α.τ ⟹ ¬∀α.¬τ, is constructively valid, which is why existential quantification is macro-expressible in languages with universal quantification.

Since I am not aware of any papers that illustrate the encoding of universals into existentials, I figured I would explain it here. The encoding of the universal type pretty much follows from the equivalence:

|∀α.τ| = (∃α.(|τ|→0))→0

Here I am using 0 for the void or empty type and using the standard constructive encoding of negation:

|¬τ| = |τ| → 0

The term-level encodings are not so difficult, but do require a little thought:

|Λα.e : ∀α.τ|k = k (λx:∃α.(|τ|→0). let (α,k') = unpack x in |e : τ|k')

The term encoding has the form |e : τ|k, where τ is the type of the term and k is a continuation threaded through the encoding. The first thing to note is that because type abstractions are values, we always immediately pass them to the continuation. Given that, if we look at the type we need to give the encoding of a type abstraction, it is a function that takes an existentially quantified continuation and never returns. The only way to never return is to use the continuation. This tells us just about everything else we need to know about the encoding.

Because the continuation is existentially quantified, it is not possible to invoke it as is, so we must unpack it first. This is how we simulate the introduction of a type variable without using type abstraction. The next problem is that in order to call the continuation, we need to apply it to something with the correct type. The continuation takes values of type |τ|, possibly referencing a free type variable α, which is fortunately what the encoding of the type abstraction's body gives us. Therefore, we can use the unpacked continuation k' as the continuation argument of the recursive call to the encoding.

Given that we know how to encode type abstractions, encoding type applications is straightforward:

|e[τ] : τ'|k = |e : ∀α.τ'{α/τ}| (λv:(∃α.(|τ'{α/τ}|→0))→0. v (pack(|τ|, k) as ∃α.(|τ'{α/τ}|→0)))

We encode the term being applied and pass it a new continuation that packs up the original continuation and hands the package to the function to which the term evaluates. I am writing {α/τ} to mean "replace τ with α". I am using curly braces rather than the usual square brackets to avoid confusion with the square brackets commonly used for type application.

While I am almost certain that the encoding above is correct, it would be good at some point to prove it. This is the sort of proof for which Twelf is very well suited, so that is probably the route I would take.

Okay, so that is the theoretical half out of the way. Now, given the above, can we implement higher-rank impredicative polymorphism in Scala? Yes, with a minor caveat.

First, I'll define an abstract class that defines the type of bounded universally quantified values, and methods for creating and destructing them.

  abstract class Universal {
    type Univ[Bound, Body[_]]

    def createUniv[Bound, Body[_]]
      (impl : ((Body[A] => Nothing) forSome { type A <: Bound }) => Nothing) :
      Univ[Bound, Body]

    def applyUniv[Bound, Body[_], A <: Bound](arg : Univ[Bound, Body]) : Body[A]
  }

This should be fairly straightforward given the encoding described above. The only tricky bit is the use of a higher-kinded type parameter (Body[_]) to give the body of the universal type. The other important point to note is that it is not necessary to completely CPS convert the implementation, because Scala provides primitives for non-local control transfers.
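As a reminder of what such a non-local transfer looks like, here is a minimal sketch of the exception-based idiom (a hypothetical helper, not part of the library below): a local Throwable subclass carries a result out of a computation without CPS converting it.

  def firstPositive(xs : List[Int]) : Option[Int] = {
    // Local exception class used purely for control flow.
    case class Found(n : Int) extends Throwable
    try {
      // Escape from the traversal as soon as a witness is found.
      xs.foreach(n => if (n > 0) throw Found(n))
      None
    } catch {
      case Found(n) => Some(n)
    }
  }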

One possible implementation of the above abstract class is the following:

  class UniversalImpl extends Universal {
    type Univ[Bound, Body[_]] =
      ((Body[A] => Nothing) forSome { type A <: Bound }) => Nothing

    def createUniv[Bound, Body[_]]
      (impl : ((Body[A] => Nothing) forSome { type A <: Bound }) => Nothing) :
      Univ[Bound, Body] = impl

    def applyUniv[Bound, Body[_], A <: Bound](univ : Univ[Bound, Body]) : Body[A] = {
      case class Control(arg : Body[A]) extends Throwable
      val res = try {
        univ((arg : Body[A]) => throw new Control(arg))
      } catch {
        case Control(arg) => arg
        case e => throw e
      }
      res
    }
  }

The implementation of the abstract type and the code for createUniv are trivial. The implementation of applyUniv is a little more interesting. Here, we create a local case class Control, extending Throwable so that it may be thrown as an exception, that will hold a value of the desired result type. We then pass the representation of the type abstraction a continuation that throws a new instance of Control. We immediately catch it and return the value stored within. If we happen to catch some other sort of Throwable, we just re-throw it.

And that's it. It is worth looking at a few examples of how this might be used. The first thing that is necessary is to create a concrete implementation of Universal:

  val u : Universal = new UniversalImpl

Given u, we can create an instance of the polymorphic identity function:

  type Id[T] = T => T
  val id =
    u.createUniv[AnyRef,Id](
      (k : (Id[A] => Nothing) forSome { type A }) =>
        k(x => x))
  val idString = u.applyUniv[AnyRef,Id,String](id)
  val idStringList = u.applyUniv[AnyRef,Id,List[String]](id)
  println(idString("Foo"))
  println(idStringList(List("Foo", "Bar", "Baz")))

The first thing that needs to be done is to define a type function representing the body of the universally quantified type. For the polymorphic identity function we call it Id[T]. The rest is pretty straightforward. It ought to be possible to make the above code a little less verbose if Scala allowed the type parameters of methods to be curried: the compiler should be able to infer that the first two arguments to u.applyUniv are AnyRef and Id from the argument id. However, because it will not be able to infer the last type parameter, we have to give all three. It might also be desirable in some cases not to have to give a type definition for the body of a universal type; this could be achieved by extending Scala with support for anonymous type functions.
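As a second example of the same machinery, here is a sketch (assuming the same u as above) of a universally quantified list-reversal function, whose body is Rev[T] = List[T] => List[T]:

  type Rev[T] = List[T] => List[T]
  val rev =
    u.createUniv[AnyRef,Rev](
      (k : (Rev[A] => Nothing) forSome { type A }) =>
        k(xs => xs.reverse))
  val revString = u.applyUniv[AnyRef,Rev,String](rev)
  println(revString(List("Foo", "Bar", "Baz")))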

Now for that caveat I mentioned. When we defined the type abstraction,

  u.createUniv[AnyRef,Id](
    (k : (Id[A] => Nothing) forSome { type A }) =>
      k(x => x))

a significant difference between how this code is written and how it would have been written in terms of my original explanation of the encoding is that there is no explicit unpacking of the existential. All existential unpacking in Scala is implicit. This is nice in some ways, but it means that we have no way to give a name to the type that the existential is hiding. Therefore, when constructing a function to pass to the continuation,

  k(x => x)

it must be the case that Scala can infer the domain of this function's type, because it is not possible to write a type annotation for x without having a name for the existentially hidden type. I think in most cases this should not be a problem, as Scala knows the type that k is expecting as an argument, but there may be some corner cases I have not considered.

(Does anyone know how to prevent WordPress from completely screwing up less than and greater than inside of <pre> tags?)


Lazy environment variables

I was just thinking that it would be really useful if command-line shells supported lazy environment variables. Lately, because of my work on Scala, I often find myself entering a line something like

  export PATH=/home/linuxsoft/apps/java-ibm-1.5/bin:$PATH
  ...

This is because, despite the write-once promises of Java (well, JVM bytecode), Scala will fail to build, or a test will fail, on specific implementations of the Java runtime and VM. I have been doing this so frequently that I finally decided to write some ZSH shell scripts to make it a little less work.

Just having a short macro that does the above for all the various Java runtimes is not ideal, because then my PATH keeps getting longer and longer. ZSH might be smart about this when caching lookups, but it is inelegant. Another solution is to write something that does a search and replace on my PATH as a string. However, the most elegant solution would simply be to not perform expansion on the contents of PATH until it must be passed as part of an exec.

ZSH can do a lot, so maybe it already has some feature that approximates this, but it would be nice if I could just write something like

  lazy export PATH=$JAVA_BIN:$PATH
  export JAVA_BIN=/home/linuxsoft/apps/java-ibm-1.5/bin
  ...

And then my scripts can just operate on JAVA_BIN rather than having to modify PATH directly.

Update: I just noticed that setting the variable JAVACMD is enough for most purposes, but the above concept still seems reasonable.


Research update

I finished a draft of "Generalizing Parametricity Using Information-flow" for LMCS this past weekend. That should leave me with fewer distractions to work on something for ICFP.

My work on Featherweight Scala is moving along, sometimes more slowly than others. Other than the issues with singletons I have already discussed, one problem with the original definition of Featherweight Scala is that it did not truly define a subset of Scala proper – that is, there were many valid Featherweight Scala programs that were not valid Scala programs.

So I have been working on trying to get Featherweight Scala to be as close to a subset of Scala proper as is possible without making it far too complicated to be a useful core calculus. Some of this has involved changing the definition of Featherweight Scala. Some of this has involved lobbying for changes to Scala proper to make it more uniform (so far I've succeeded in extending traits to allow value declarations to be overridden).

Through the whole process I've also been spending a lot of time typing things into the Scala interpreter to figure out how Scala proper treats them. For better or worse, I've actually managed to uncover a fair number of bugs in Scala proper doing this. I've almost reached the point where I treat it as a bit of a game: can I, by thinking about the properties of the Scala type system, come up with some corner case that will exhibit an unsoundness in the language (a ClassCastException, a NoSuchMethodException, etc.), or at least crash the compiler?

Last week I came up with a pretty nice one where I exploited the fact that Scala allowed inheritance from singleton types, which it should not, to launder types in an unsafe fashion (now fixed in trunk!). Prior to that I came up with something fun where you could trick the compiler into letting you inherit from Nothing (which is supposed to be the bottom of the subtyping hierarchy). Today I got to thinking about Scala's requirement that paths in singleton types must be stable – all components of a path must be immutable. So for example in

  var x = 3
  val y : x.type = x // ill-typed
The path x is not stable because x is a mutable variable. However, Scala does allow lazy value declarations to be stable.

  var x = 3
  lazy val y = x
  val z : y.type = y // allowed, at present
Note that y is not evaluated until it is first needed, and in this case its evaluation involves a mutable reference. Furthermore, add into the mix that Scala also provides call-by-name method arguments (lazy values are call-by-need). So I started riffing on the idea of whether I could violate stability by changing the mutable reference between the definition of the lazy value, or a call-by-name argument, and its use. In retrospect, I am leaning at present towards the belief that there is no way this should be exploitable from a theoretical standpoint. That does not mean that the implementation is necessarily in alignment with theory, however. I did manage to hit upon a combination in my experiments that resulted in a NoSuchMethodException, so the exercise was not a complete letdown.
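To make the kind of experiment concrete, here is a minimal sketch (my own reconstruction, not the exact combination that produced the NoSuchMethodException): mutate the variable between defining the lazy value and first forcing it.

  var x = 3
  lazy val y = x     // not evaluated yet
  x = 4              // mutate before y is first forced
  val z : y.type = y // forces y, which is 4 from now on
  println(z)         // prints 4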

I should point out that these things do not really reflect on the quality of the Scala language as a whole. It is a rich language that does not yet have a complete formal model, and as such, in the implementation process it can be easy to overlook some particularly perverse abuses of the language.


When sending messages by post would be faster

We have been experiencing a massive delay on all the scala-* mailing lists at EPFL, both the public and private lists. The central admins are so worried about open relays that the firewall prevents our lab's admin from running the mailing lists on one of our servers. Of course, despite our telling the central admins that this is a serious problem, it has just gone from bad to worse. Last week we were seeing delays of about an hour. The past day or two the delay seems to have hit something like at least twelve hours. That means if someone does a Subversion commit, we do not see the commit e-mail for at least twelve hours. The same goes for submissions to the bug tracking system, or even internal queries.

Arguably, it has been good for productivity in some ways.


Type safety and singletons

I am continuing to make progress on the proof of soundness for Featherweight Scala, but the most significant difficulty I am encountering involves the rule that allows paths, expressions of the form x.l1.l2...ln, to be given singleton types:

  Γ ⊦ p : T
  ─────────────
  Γ ⊦ p : p.type

It seems to make defining any sort of useful inversion principle impossible. For example, the term x.l1, where x has a field val l1 : Int = 3, will reduce to 3. In the proof of preservation, you would expect to be able to use inversion to show that if x.l1 is well typed with some type T, then x has a field l1 with type T, or at least some subtype of T. However, with singletons it is possible that the type of x.l1 is x.l1.type. In that case, the only thing you can determine by inversion is that x has a field called l1 and that it has some type. Many similar problems arise when determining what would seem to be straightforward facts, because it is impossible to relate two types in a meaningful way.
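To make the two possible typings concrete, here is a hypothetical snippet (my own illustration, not from the calculus itself):

  class C { val l1 : Int = 3 }
  val x = new C
  val a : Int = x.l1         // typed with the declared type of l1
  val b : x.l1.type = x.l1   // or typed with its singleton type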

There must be a resolution to the problem, because I know there are other type systems that allow singletons in this fashion, but I have not been able to hit upon the right lemma to prove that would allow smoothing over the discontinuities singletons introduce. For now I think I will try proving safety with the above typing rule omitted, and see whether there are any other problematic features of the language.


Prognostication

There is a thread on Lambda the Ultimate at the moment on predictions for 2008. I will have to remember to look at it again later this year.

Most of the predictions concerning Scala are positive.

I would have to agree with Sean that there have been many years in which I've heard that concurrency will be the big thing, but I do not feel there have been too many big developments in the area recently. Robin Milner has recently released a draft book on his theory of bigraphs, his newer generic model for concurrency. There is actually an even newer draft of the book, but it is not presently linked from his website, so I am not sure whether I should publish the link.

