Adventures in Scala Pseudo-Abuse

So, what have I been doing with programming recently?

Scala is an amazing multiparadigm programming language that runs on the Java Virtual Machine and interoperates with Java. I learned about it last time reading random articles on Twitter.

When I say “amazing” I mean “This is a language in which my code gives me nerdgasms every time I read it.” Wheeee.

Okay, it’s not perfect. People say it’s too academic. It has a notoriously complicated type system (which is Turing-Complete at compile time). Its documentation is a bit patchy too. For a serious introduction, the Scala website has plenty of links under documentation, and a tour of features. Somebody wrote another tour that explains things a bit more. So here, instead of introducing it seriously, I’m just going to screw with its features.

Example of freedom. Scala lets names consist of symbols, and treats one-parameter methods and infix operators exactly the same. The full tokenization rules are a bit detailed and I put them at the bottom of this post for the interested. This lets you create classes with arithmetic and domain-specific languages easily, but it also creates some silly opportunities:
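A minimal sketch of what "symbolic names plus infix calls" buys you on the sensible side. The `Vec` class and its methods are hypothetical names made up for illustration, not anything from my actual code:

```scala
// Hypothetical 2D vector class; Vec, +, * and dot are illustrative names only.
case class Vec(x: Int, y: Int) {
  def +(other: Vec): Vec = Vec(x + other.x, y + other.y) // method named with a symbol
  def *(k: Int): Vec = Vec(x * k, y * k)                 // another symbolic name
  def dot(other: Vec): Int = x * other.x + y * other.y   // ordinary alphanumeric name
}

object InfixDemo {
  val a = Vec(1, 2)
  val b = Vec(3, 4)

  val sum1 = a + b         // infix "operator" notation
  val sum2 = a.+(b)        // exactly the same call, written as a plain method call
  val d1 = a dot b         // alphanumeric one-parameter methods work infix too
  val scaled = (a + b) * 2 // symbolic methods chain like built-in arithmetic
}
```

The two spellings `a + b` and `a.+(b)` compile to the same method call; which one reads better is purely a style choice. (Newer Scala versions may nudge you toward dot notation for alphanumeric methods like `dot`.)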

If you’re in a code-golf mood, this lets you shave a few characters by sticking variable names right next to keywords. Here’s a 52-character quine in script mode. (Adapted from this post from Code Commit)

Functions are first-class objects! All the folding and mapping of functional programming (not to mention monadic operations) are there! No more five-line boilerplate anonymous ActionListeners! Applied (no pun intended) to quines again, here’s one that doesn’t use printf:
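Quines aside, here is a rough sketch of the first-class functions, mapping, folding and monadic chaining mentioned above. All names (`square`, `twice`, `half`) are hypothetical and only standard-library calls are used:

```scala
object FunctionsDemo {
  // A function stored in a value, like any other object.
  val square: Int => Int = n => n * n

  // A higher-order function: takes a function and returns it applied twice.
  def twice(f: Int => Int): Int => Int = f andThen f

  val squares = List(1, 2, 3, 4).map(square) // apply square to every element
  val total = squares.foldLeft(0)(_ + _)     // fold the list into a single sum
  val fourthPower = twice(square)(3)         // square applied to 3, then to the result

  // Option is one of the monadic types: flatMap chains steps that may fail.
  def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
  val halvedTwice = Some(12).flatMap(half).flatMap(half)
  val oddStops = Some(6).flatMap(half).flatMap(half) // 3 is odd, so the chain ends in None
}
```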

…Okay, on to serious programming, sort of. While coding I was trying to figure out how to run my Scala program from a shell script outside Eclipse for convenience, which was failing because apparently its classpath for the basic Scala stuff is confused. Out of curiosity and concern for the unlikely hypothetical in which somebody else wants to use my program, I decided to try using Simple Build Tool (SBT).

Well, it’s in Scala and it helps you build your code. Previously I wrote mostly single-file scripts and left it to Eclipse to abstract everything away from big projects until there’s only a button called “Run”, and now it’s taking its toll — I looked up “build” on Wikipedia, just to be sure. “The process of converting source code files into standalone software artifact(s) that can be run on a computer, or the result of doing so.” Okay, I guess it’s more than compiling because of the “standalone” bit, maybe because you can add resources like images or whatever? Well, I can try to use it anyway.

The install page suggests a Homebrew install. Ooh. I installed Homebrew (“The missing package manager for OS X”) after some blogger whom I linked to last time I was debugging C++ commented about it, but then I never brewed anything because I didn’t dare try to disentangle my gdb from the rest of my system and reinstall it and I didn’t have anything to install. Partly it was also due to stray library files that Homebrew didn’t like, but I decided it was too risky to delete them, so after finding enough support on StackOverflow I forged ahead.

$ brew install sbt

🍺

What the heck is that!? How the heck did it get into my terminal!? Okay, a little research reveals it’s actually a Unicode character, U+1F37A BEER MUG. What a useful character. /sarc

Onwards… I had already gotten a Scala installation, but I didn’t know where to put it so it had been living under ~/scala, linked up with a PATH setting in my .profile. Wait, if Homebrew could do SBT then surely I could make it responsible for my Scala too. Yes, it can. Yay now I don’t have to keep anything under my home folder!

Then, the errors. Commence the error-message-pasting-into-Google!

My code won’t compile. For a long time it couldn’t find a main class, but that was because I got confused about the SBT directory structure. The Scala sources should be placed three directories deep, at src/main/scala/whatever.scala. Fixed! But then…
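For reference, a minimal sketch of the kind of main object SBT can pick up once the file sits under src/main/scala. The file name `Main.scala`, the object name and the greeting are all hypothetical:

```scala
// Hypothetical file: src/main/scala/Main.scala
object Main {
  // Kept separate from main so it can be checked without capturing stdout.
  def greeting: String = "Hello from sbt run"

  // An object with a main(Array[String]) method counts as a runnable main class.
  def main(args: Array[String]): Unit = println(greeting)
}
```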

60 compile errors! More testing showed that it was not finding any of the scala.swing libraries.

Trying to diagnose the problem, I cut-and-pasted a Scala GUI example into a single file and tried to compile it manually, outside SBT… yup, it couldn’t find SimpleGUIApplication, so it wasn’t finding swing… right? Nope, the sample code I had chosen was too old. The Scala folks deprecated SimpleGUIApplication in 2.8.0 and removed it in 2.10.0; I had to use SimpleSwingApplication in 2.10. Luckily, basic usage of the two is the same.

Okay. Therefore, it’s not Scala’s problem but SBT’s. What is it doing, grabbing its own installation and not knowing where anything else lives? Gah. Poking around in the Cellar revealed that there were symlinks everywhere, and all the libraries and binaries were under their respective folders one level deeper, under another odd little folder called libexec. I couldn’t really figure out what libexec meant, but quoth the Homebrew Cookbook:

Note that in the context of Homebrew, libexec is reserved for private use by the formula and therefore is not symlinked into HOMEBREW_PREFIX.

So, I’m still not sure what it means, but it would probably be shady to feed libexec into SBT as the default build path and I’d break something else later.

Then I realized I was being stupid. scala.swing is an extra library; it’s not part of the “base classes” although it comes by default with installations. The whole point of using SBT was to organize things like these dependencies. Blargh.

Flipping through the SBT docs to figure out what to add was pretty nasty, so I ended up Googling. This extra line in build.sbt will do the trick:

libraryDependencies += "org.scala-lang" % "scala-swing" % "2.10.0"
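In context, a whole build.sbt might look roughly like the sketch below. The project name comes from the log output further down; the exact `scalaVersion` value is my assumption (the log only shows a scala-2.10 target directory). Older SBT versions wanted a blank line between settings, so they are kept here:

```scala
// build.sbt: a minimal sketch, not necessarily the exact file in the repo

name := "gridderface"

// Assumed: any 2.10.x release matching the scala-swing version below
scalaVersion := "2.10.0"

// scala.swing ships separately from the core library, so declare it explicitly
libraryDependencies += "org.scala-lang" % "scala-swing" % "2.10.0"
```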

Okay, here goes…

glacier:gfb glacier$ sbt run
[info] Set current project to gridderface (in build file:/Users/glacier/gfb/)
[info] Updating {file:/Users/glacier/gfb/}default-32fcf2...
[info] Resolving org.scala-lang#scala-swing;2.10.0 ...
[info] downloading http://repo1.maven.org/maven2/org/scala-lang/scala-swing/2.10.0/scala-swing-2.10.0.jar ...
[info]  [SUCCESSFUL ] org.scala-lang#scala-swing;2.10.0!scala-swing.jar (3969ms)
[info] Done updating.
[info] Compiling 60 Scala sources to /Users/glacier/gfb/target/scala-2.10/classes...
[error] java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen space
[error] Use 'last' for the full log.

BOOM!

OutOfMemoryError. “PermGen Space.” What?

I ran it again. This time, my GUI popped up but it had no grid, and the terminal started hexdumping frantically. Same mysterious error.

Okay, what the heck is PermGen space? Apparently it has to do with Java’s garbage collection. Objects are divided into a bunch of levels based on how often they’re used; the block where objects are created frequently is also garbage-collected more frequently. There’s a special “permanent” block set aside for classes or something that’s never garbage collected, and the size of the block is itself limited so if you have too many classes, bad things happen.
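If you want to see these blocks for yourself, the JVM will list its memory pools through the standard `java.lang.management` API. A small sketch (the object name `MemoryPools` is made up); on older HotSpot JVMs one of the names typically contains "Perm Gen", while Java 8 and later replaced that area with Metaspace, so what gets printed depends on your JVM and garbage collector:

```scala
import java.lang.management.ManagementFactory

object MemoryPools {
  // Names of the memory pools the running JVM reports (young, old, permanent, ...).
  def poolNames: List[String] = {
    val beans = ManagementFactory.getMemoryPoolMXBeans // a java.util.List of pool beans
    (0 until beans.size).map(i => beans.get(i).getName).toList
  }

  def main(args: Array[String]): Unit = poolNames.foreach(println)
}
```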

Of course others have bumped into this error, although most seem to be doing web apps, and it can (probably) be fixed by increasing the PermGen size, as described here. The post suggests that it’s due to SBT not knowing it can garbage-collect certain closures. Oh well, so Scala creates even more overhead on top of the JVM; too bad but I can live with that because, as I’ve said before, functions are first-class objects omgwtfbbq.

Edit 2014/12/30: While clicking random links I noted that that domain has expired. The post is still accessible at wordpress.com, but to be safe, the fix is to put this line in the file ~/.sbtconfig :
SBT_OPTS="-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:PermSize=256M -XX:MaxPermSize=512M"

What I ended up doing, though, was just running it for the third time. The heisenbug vanished.

A few from-scratch recompiles later (I was trying to get git to recognize what I did to the files, git mv src/gridderface src/main/scala, instead of deleting and re-adding dozens of files), the first sbt compile always threw this error, while the second one succeeded. I’m not sure why that should be; I guess some closures or classes or whatever are compiled the first time and cached for the second, so then they don’t take up space.

What this means is, it is now apparently possible to clone my Gridderface repo and let sbt run it by itself. Yay. Of course, now the set of people who would consider using this program is the intersection of the sets of (a) people interested in logic puzzles (b) people who solve them on computer (c) people who like vi-style keyboarding (d) people who are willing to install Scala and mess with the command line. I think the cardinality of this set is most likely one.

Oh well, programming this was fun enough by itself. :D


Because I’m bored, here are the Scala identifier parsing rules! (from the spec (PDF))

“Letters” are the normal lowercase letters and uppercase letters, as well as $ and _ which are considered uppercase letters, and lots of Unicode letters. “Digits” are the normal ten digits. “Parentheses” are ()[]{}; “Delimiter characters” are `'".;, (backquote, single quote, double quote, period, semicolon, comma). Every other printable non-whitespace ASCII character is an “operator character”, as are Unicode mathematical symbols and other symbols.

An identifier may consist of:

1. a nonempty sequence of operator characters
2. a nonempty sequence of letters and digits which starts with a letter
3. a nonempty sequence of letters and digits which starts with a letter, then an underscore, then a sequence of operator characters
4. virtually anything surrounded by backquotes (needed to call a Java library method named with a Scala keyword, e.g. Thread.`yield`)
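One example of each of the four forms, as a quick sketch. Every name here is made up for illustration:

```scala
object IdentifierDemo {
  // 1. Operator characters only.
  def <=>(a: Int, b: Int): Int = a compare b

  // 2. Letters and digits, starting with a letter.
  val x1 = 1

  // 3. Letters and digits, then an underscore, then operator characters.
  //    The space before = matters: "sum_+=" would be read as one identifier.
  val sum_+ = 2

  // 4. Backquotes make almost anything a legal name, including keywords.
  val `class` = "keyword as a name"

  // Calling the Java method Thread.yield, whose name is a Scala keyword.
  def pause(): Unit = Thread.`yield`()
}
```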

Variable identifiers are those which start with a lowercase letter. Others are constant identifiers.
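One place this distinction bites is pattern matching: a lowercase name in a pattern binds a fresh variable, while an uppercase name (or a backquoted one) is compared against an existing value. A small sketch with hypothetical names:

```scala
object PatternIds {
  val Limit = 10
  val limit = 10

  def describe(n: Int): String = n match {
    case Limit => "equals Limit"        // constant identifier: compared with the val above
    case other => s"bound to $other"    // variable identifier: matches anything and binds it
  }

  def describeBackquoted(n: Int): String = n match {
    case `limit` => "equals limit"      // backquotes force a comparison with the existing val
    case _       => "something else"
  }
}
```

Writing `case limit =>` without the backquotes would quietly match every input, which is the usual way people discover this rule.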

(note: the commenting setup here is experimental and I may not check my comments often; if you want to tell me something instead of the world, email me!)