Archive for Uncategorized

The Great Confusion

In the past forty-eight hours, I've written code in C++, F#, and Scala.  On top of that, XCode, Visual Studio, and Emacs all have different default keybindings for all the most common operations.  Then add in switching between US and Danish physical keyboard layouts.


Algorithmic puzzle

I was assigned the task of fixing a bug in the Scala standard library involving the indexOf, which given a receiver object that is a sequence and another sequence of the correct type, checks whether the latter is contained within the latter and returns the index. The current version does not correctly handle the case when a suffix of the receiver object matches a strict prefix of the argument (for example, List(1,2,3).indexOf(List(3,4)) will report a match at the index of 2). This should be fixed for the upcoming 2.7.2-RC2 release.

As soon as I started rewriting the code, I wondered why the original author hadn't just used an off the shelf searching algorithm. However, a quick search reminded me why: algorithms like Knuth-Morris-Pratt and Boyer-Moor construct a table based upon the sequence to search for. However, Scala sequences may be infinite so it is not possible to blindly go ahead and attempt to construct a table, because doing so may diverge.

Furthermore, there is no way to test whether a sequence is finite without potentially diverging. So it is not possible to first test whether the argument is finite, because if the receiver object is finite then indexOf should return that there is no match. Alternately, testing whether the receiver object is finite would be incorrect because it is possible the argument is finite an could potentially match.

However, it seems like it should still be possible to do better than O(nm), where n is the length of the receiver and m the length of the argument. For example if you start out with the sequence 1, 2, 3, 1 ... and the pattern 1, 3, 4 ... it seems like it should be possible to exploit the fact that you've looked ahead and know that there is no point and comparing 2 with 1. Alternately it seems like it might be possible to lazily build a table from the argument, but I would need to think longer to see whether it is always possible, in Knuth-Morris-Pratt for example, to fill in a complete prefix of the table without having processed the entire pattern.

In any event, searching with combinations of keywords like "string", "searching", "lazy", "infinite", etc. did not really turn anything up. One possible direction might be to look at "incremental" search algorithms like those used in text editors, etc. However, I expect that because they are geared to interactive use that the pattern will usually be quite small and therefore much thought has not been put into optimizing them.

Comments (6)

Even more modules in Scala

In response to Rossberg's challenge to encode the following:

  2. signature S =
  3. sig
  4. structure A : sig type t; val x : t end
  5. val f : A.t -> int
  6. end
  8. structure M =
  9. struct
  10. structure A = struct type t = int; val x = 666 end
  11. fun f x = x
  12. end
  14. structure N = M :> S

In Scala, the above becomes:

  2. type S = {
  3. val A: { type T; val x: T }
  4. def f(arg: A.T): Int
  5. }
  7. val M = new {
  8. val A = new { type T = Int; val x = 666 }
  9. def f(arg: Int) = arg
  10. }
  12. val N = M : S

And using

  1. N.f
  1. N.A.x
is even well-typed, but still encounters the same problems with evaluation I mentioned earlier. And yes, the type
  1. N.A.T
is hidden:

  2. scala> val x : Int = M.A.x
  3. x: Int = 666
  4. scala> val x : Int = N.A.x
  5. <console>:7: error: type mismatch;
  6. found : N.A.T
  7. required: Int
  8. val x : Int = N.A.x
  9. ^
  11. scala> val x : N.A.T = 42
  12. <console>:7: error: type mismatch;
  13. found : Int(42)
  14. required: N.A.T
  15. val x : N.A.T = 42
  16. ^

As for how Scala, solves the double vision problem, I would look at the work on νObj, and the forthcoming paper on Featherweight Scala.

Update: Looking more closely at the definition of "double vision" as defined by Dreyer, the Scala implementation cannot currently handle his motivating example. I will need to think about how the example works in Featherweight Scala, which to my knowledge shouldn't have a problem with at least defining the type signatures. I think the short answer is that νObj and Featherweight Scala (the non-algorithmic version) solve the problem in a vacuous fashion, as the type checking problem for them is undecidable. Scala proper possibly could resolve the problem by using a less restrictive cycle detection algorithm.