The future of Emacs, Guile, and Emacs Lisp
GNU Emacs is one of the longest continuously developed applications in the free-software world; at just over 30 years old it just qualifies, by some definitions, as a multi-generational project. But such longevity brings its own challenges. Many in the Emacs community have concluded that it may soon be time to replace the editor's internal Lisp interpreter with a faster and more modern alternative. But replacing the underlying virtual machine on which so much of Emacs runs has far-reaching implications—including the impact it would have on Emacs's own flavor of Lisp.
Although the subject has come up before, it was raised most recently on September 11, when Chris Webber wrote to the Emacs development list asking about the status of Robin Templeton's work on Guile-Emacs. As the name implies, Guile-Emacs replaces the internal Emacs Lisp engine with the interpreter from GNU Guile. The Guile interpreter was originally written to support the Scheme language (itself a dialect of Lisp), but today it supports multiple other languages as well, including Emacs Lisp.
Guile's engine is reported
to be faster than Emacs's internal Lisp interpreter, but it also
offers several other valuable features, like concurrency and the
prospect of supporting Emacs extensions that are written in other
Guile-supported languages. Templeton, who has been working on
Guile-Emacs for the past five years in a series of Google Summer of
Code projects, listed quite a few other
benefits to rebasing Emacs on the Guile engine, including "a
full numeric tower, structure types, CLOS-based object orientation, a
foreign function interface, delimited continuations, a module system,
hygienic macros, multiple values, and threads." As of now,
Templeton reports that the vast majority of modules in GNU Emacs, as
well as a significant set of popular external extensions, run reliably
on Guile-Emacs.
When it comes to formally grafting Emacs onto the Guile interpreter, however, there are much stricter requirements to consider: users expect all of their Emacs extensions—including both third-party extensions and their own personal code—to continue to run without incident. That is a tall order indeed. And, as Eli Zaretskii pointed out, Guile is not as stable on systems lacking GNU utilities. Furthermore, while Guile is the official extension language for GNU projects, at the moment it is a considerably smaller project than Emacs itself. Suddenly imposing the larger project's needs on the Guile developers (not to mention imposing the bug reports of the entire Emacs community) might place a considerable strain on Guile project resources. David Kastrup, among others, worried whether the Guile team would be able to handle the load. Suddenly gaining a large set of new users could be good for the Guile project, of course, and attract new contributors, but that outcome is not guaranteed.
Moreover, there is the question of how tightly Guile and Emacs should be coupled. For one thing, the two projects currently use different internal string representations, which means that text must be decoded and encoded every time it passes in or out of the Guile interpreter. That inefficiency is certainly not ideal, but as Kastrup noted, attempting to unify the string representations is risky. Since Emacs is primarily a text editor, historically it has been forgiving about incorrectly encoded characters, in the interest of letting users get work done—it will happily convert invalid sequences into raw bytes for display purposes, then write them back as-is when saving a file.
But Guile has other use cases to worry about, such as executing programs which ought to raise an error when an invalid character sequence is encountered. Guile developer Mark H. Weaver cited passing strings into an SQL query as an example situation in which preserving "raw byte" code points could be exploited detrimentally. Weaver also expressed a desire to change Guile's internal string representation to UTF-8, as Emacs uses, but listed several unresolved sticking points that he said warranted further thought before proceeding.
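The two policies described above can be sketched in a few lines. This is only an analogy, assuming Python's `surrogateescape` error handler as a stand-in for the "raw byte" code points; it does not show how Emacs or Guile actually represent strings, and the sample bytes and helper names are hypothetical.

```python
# Hypothetical input: valid UTF-8 text with one invalid byte (0xFF) in the middle.
raw = b"caf\xc3\xa9 \xff end"


def decode_strict(data: bytes) -> str:
    """Strict policy: refuse any input that is not valid UTF-8."""
    return data.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes


def decode_lenient(data: bytes) -> str:
    """Editor-style policy: keep each invalid byte as a placeholder code point."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_lenient(text: str) -> bytes:
    """Turn the placeholder code points back into the original raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


try:
    decode_strict(raw)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

text = decode_lenient(raw)            # the 0xFF byte survives as U+DCFF
round_tripped = encode_lenient(text)  # identical to the bytes that were read
```

The lenient path lets a file be loaded and saved without altering the bad byte, while the strict path is the safer default for code that forwards strings to something like an SQL query.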
Finding a common string representation might be doable, then, but it is not the only hurdle to integrating Emacs and Guile. Some, like Stefan Monnier, floated the possibility that Emacs Lisp should be intentionally "evolved" into another, more standardized Lisp dialect. Such an evolution would not be easy in any case, but especially not toward Guile's native language, Scheme. It might be more feasible to adapt Emacs Lisp into Common Lisp, but it would still be far from trivial. While there are many areas of compatibility between the two languages, even a few incompatibilities can result in a world of pain for developers. Templeton, for example, noted that Common Lisp has no feature that corresponds to Emacs's buffer-local variables, which are often used in Emacs extensions. But Monnier pointed out an even thornier problem—while Emacs Lisp regards a boolean "false" and an empty list as being equal, other Lisp dialects do not.
Redefining such a fundamental language construct would have
far-reaching ramifications, of course—for the decades' worth of
existing code and for the community of Emacs developers. But even if
Guile's Emacs Lisp interpreter were solid, Monnier added, its Scheme
underpinnings would likely appear in inconvenient places. "I
suspect some errors signals coming from Guile's runtime will end up
using Scheme-style data and will end up spilling into the Elisp side
if we're not extra careful."
A related question is whether or not Emacs Lisp should remain the sole language for developing Emacs. As long as Guile supports other languages, including Scheme as well as entirely unrelated options like JavaScript, there is a case to be made that a Guile-based Emacs should support a true foreign function interface and allow users to write Emacs extensions in their language of choice. Opinion on that topic seemed to be sharply divided. Richard Stallman expressed support for the idea of supporting Scheme extensions in Emacs, while several others (including Phillip Lord) are open to unrelated languages like JavaScript as well. Others, such as Monnier, doubted whether real integration with other languages would be possible. Some, like Daniel Colascione, argued that attempting to support outside languages popular at any given moment would be a waste of effort.
In the short and medium term, adding support for Emacs extensions written in languages other than Emacs Lisp does not seem to have much traction. There are too many more pressing issues to consider, especially if the project decides to pursue rebasing Emacs on the Guile interpreter. Alternative Lisp interpreters have been discussed, such as Kristian Nygaard Jensen's suggestion of Embeddable Common-Lisp. At best, though, the other alternatives come with the same integration challenges as Guile; there are simply not all that many GPL3-compatible, actively developed Lisp interpreters to choose from. Guile, at least, is an official GNU project, and several members of its development team have expressed support for the possibility of working with Emacs.
The way things stand now, there has not been a formal consensus as
to the eventual replacement of Emacs's internal Lisp interpreter with
Guile. Templeton's Guile-Emacs work has amassed quite a few admirers
in the Emacs development community, but it will likely need further
testing, followed by some firm decisions on the part of key Guile and
Emacs maintainers about exactly what integration would look like.
Given that those conversations (such as the string-handling
discussion) are happening now, there is reason to be hopeful—but
expecting any sort of timeline or roadmap remains premature.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 8:23 UTC (Thu) by ncm (guest, #165) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 22:34 UTC (Thu) by smurf (subscriber, #17840) [Link]
However, it has the distinct advantage that your large ASCII text does not suddenly need eight times the storage space just because you insert a character with a smiling kitty face.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 22:49 UTC (Thu) by mjg59 (subscriber, #23239) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 10, 2014 16:08 UTC (Fri) by lambda (subscriber, #40735) [Link]
Except, when pattern matching UTF-8, you can generally just match on the bytes (code units) directly, rather than on the characters (codepoints); the algorithms that need to skip ahead by a fixed n characters are generally the exact string matching algorithms like Boyer-Moore and Knuth-Morris-Pratt. There's no reason to require that those be run on the codepoints instead of on the bytes.
If you're doing regular expression matching with Unicode data, even if you use UTF-32, you will need to consume variable length strings as single characters, as you can have decomposed characters that need to match as a single character.
People always bring up lack of constant codepoint indexing when UTF-8 is mentioned, but I have never seen an example in which you actually need to index by codepoint, that doesn't either break in the face of other issues like combining sequences, or can't be solved by just using code unit indexing.
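A minimal sketch of the two points made in this comment, using Python only for illustration: an exact substring search can run on the UTF-8 bytes, and code-point indexing still does not treat a decomposed character as one unit. The sample strings and helper names are hypothetical.

```python
import unicodedata


def find_byte_offset(haystack: str, needle: str) -> int:
    """Search the encoded bytes directly, with no per-character decoding."""
    return haystack.encode("utf-8").find(needle.encode("utf-8"))


def byte_offset_to_codepoint_index(haystack: str, offset: int) -> int:
    """Convert a byte offset to a code-point index, only if a caller needs one."""
    return len(haystack.encode("utf-8")[:offset].decode("utf-8"))


text = "na\u00efve caf\u00e9"  # contains two 2-byte characters
needle = "caf\u00e9"

byte_offset = find_byte_offset(text, needle)
codepoint_index = byte_offset_to_codepoint_index(text, byte_offset)

# A decomposed "e" + combining acute accent is two code points, so fixed-width
# code points alone do not give one index per user-visible character.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
```

The byte search cannot report a match that starts in the middle of a character, because UTF-8 lead bytes and continuation bytes use disjoint bit patterns.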
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 12, 2014 6:12 UTC (Sun) by k8to (guest, #15413) [Link]
It's a little more tedious to CUT a UTF8 string safely based on a size computed in bytes than in some other encodings, but not much more, and that's very rarely a fast path.
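The extra tedium amounts to backing up past continuation bytes. The following is a sketch, assuming the input is already valid UTF-8; `truncate_utf8` is a hypothetical helper name, and it does not try to keep combining sequences together.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut valid UTF-8 at or before max_bytes without splitting a code point."""
    if max_bytes >= len(data):
        return data
    end = max(max_bytes, 0)
    # Continuation bytes have the bit pattern 10xxxxxx. If the cut would land
    # on one, move it back until it sits at the start of a character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


sample = "h\u00e9llo".encode("utf-8")  # b"h\xc3\xa9llo", 6 bytes
cut_inside = truncate_utf8(sample, 2)   # would split the 2-byte character
cut_after = truncate_utf8(sample, 3)    # lands on a character boundary
```

Because a UTF-8 character is at most four bytes long, the loop moves the cut back by at most three bytes.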
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 14, 2014 17:29 UTC (Tue) by Trelane (subscriber, #56877) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 13:53 UTC (Thu) by rsidd (subscriber, #2582) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 10, 2014 16:14 UTC (Fri) by lambda (subscriber, #40735) [Link]
Speed, and concurrency, are absolutely an issue for me. When I use some of the more advanced modes which provide things like completion and M-. (go to definition) for Python, on large source files with lots of imports, it sometimes gets unbearably slow, as a lot of operations are synchronously waiting for RPC calls to the backend that is parsing your file and looking for appropriate completion candidates.
Now, I'm not much of an elisp hacker, so I don't know if it could be done better with the current backend. But if Guile will allow people to write new extensions that can more easily do operations in the background without blocking the UI, I'm all for it.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 12, 2014 6:14 UTC (Sun) by k8to (guest, #15413) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 21:25 UTC (Thu) by Jandar (subscriber, #85683) [Link]
[1] since the 'b' within the word "blob" means binary I don't use the abomination "binary blob".
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 23, 2014 15:33 UTC (Thu) by nye (guest, #51576) [Link]
Trivia time:
Although it is sometimes playfully backronymed, 'blob' is not actually an acronym, but a standard English word (and for the benefit of non-native speakers who may be unaware: it's actually a fairly common one in widespread use).
Also, when it was first backronymed, the 'b' did not stand for 'binary', but for 'basic'.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 24, 2014 15:01 UTC (Fri) by hirnbrot (guest, #89469) [Link]
So the word is at least older than UNIX.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 24, 2014 17:06 UTC (Fri) by tialaramex (subscriber, #21167) [Link]
Two modern dictionaries I have both suggest the word first appeared in the 15th century. If so, that's before Early Modern English stabilised. So as far as the language we use today is concerned, the word "blob" has always existed although its exact meaning has drifted of course.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 24, 2014 18:30 UTC (Fri) by Jandar (subscriber, #85683) [Link]
When was the last time someone mentioned "blob" on LWN (with or without the prefix "binary") where it hadn't referred to a Binary Large OBject?
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 24, 2014 19:15 UTC (Fri) by bfields (subscriber, #19510) [Link]
http://web.archive.org/web/20110723065224/http://www.cval...
>>>The term blob, as Jim would point out, is not an acronym for anything.
>>
>>It's not Binary Large OBject?
>
>No, that was a marketing anti-acroynm invented after the fact because
>somebody thought 'blob' was unprofessional.
Blob's a common English word whose everyday meaning ("an indeterminate mass or shape", says google) already works just fine for describing opaque data.
The fact that I haven't seen anyone on LWN capitalize it like an acronym suggests none of us are thinking of the made-up acronym.
blob
Posted Oct 24, 2014 20:48 UTC (Fri) by tialaramex (subscriber, #21167) [Link]
All that's happened here is that you were walking around with a glitch in your internal dictionary. Happens to everybody. Humans mostly learn words not by purchasing a big book of definitions and reading it, but by inferring their meaning from context. Robert Browning got the idea this way from an old poem he'd read that "twat" meant some sort of headgear for senior nuns. So when he needed a word that rhymed nicely with "bats" in a religious context he picked "twats" and now generations of English students get to laugh at his error when they read his poetry at undergraduate level. You are not likely to be so unlucky.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 22:56 UTC (Thu) by ch (guest, #4097) [Link]
[1] CMU Common Lisp was public domain (because of Fahlman's wisdom). This is in a large sense the original OSS license that existed for a while before lawyers required keeping copyright in order to enforce CYA clauses.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 9, 2014 22:59 UTC (Thu) by ch (guest, #4097) [Link]
An embedded polyglot VM in Emacs is probably the answer.
Posted Oct 10, 2014 16:24 UTC (Fri) by bkuhn (subscriber, #58642) [Link]
I've admittedly never been fully convinced at RMS' claim that we could compile all other languages down to Guile. I have a master's degree specifically focusing on multi-lingual VMs, and I'm somewhat convinced this is all much harder than it looks. (Admittedly, my expertise on this is now antiquated by years of focusing on licensing and politics instead. ;)
FWIW, if I were going to approach this problem anew, I'd be interested from a technical perspective if it's possible to embed Vert.x in Emacs and write an Elisp implementation for Vert.x. Sadly, it seems that Vert.x has switched to a GPL-incompatible license, so I guess this is a non-starter for licensing reasons. I don't get why they abandoned the Apache license.
An embedded polyglot VM in Emacs is probably the answer.
Posted Oct 12, 2014 6:16 UTC (Sun) by k8to (guest, #15413) [Link]
Somehow I assume any compute model can be represented in almost any VM, though it may require a certain level of ridiculousness to make it go. Maybe I'm wrong in my assumption?
An embedded polyglot VM in Emacs is probably the answer.
Posted Oct 13, 2014 18:03 UTC (Mon) by bkuhn (subscriber, #58642) [Link]
That's the tough part: most JVM ports are simply reimplementations of an interpreter in Java, or at the very least, a very heavy-weight runtime library for the language in Java.
Similarly, you're going to face that with Guile. However, if Guile's own VM has gotten versatile now, that might work well.
The worst possible solution (?)
Posted Oct 10, 2014 16:47 UTC (Fri) by b7j0c (subscriber, #27559) [Link]
Hopefully this doesn't turn into a huge distraction for emacs...I accept that elisp basically sucks, but I don't want the limited energy available for emacs hacking being dedicated to a change that won't matter and may fracture the community.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 10, 2014 16:58 UTC (Fri) by NalaGinrut (guest, #61470) [Link]
Others, such as Monnier, doubted whether real integration with other languages would be possible. Some, like Daniel Colascione, argued that attempting to support outside languages popular at any given moment would be a waste of effort.
-------------------------------------end---------------------------------
Yes, it's not so easy to integrate other languages on a Scheme platform, especially non-Lisp ones. Elisp could be a relatively easy one, but others are challenging.
But to be fair, the Guile community lacks multi-language development experience with the new compiler tower, because it hasn't had the chance. So progress could be slow and unstable. But what if it gets through that difficult period?
I think some people are concerned that multi-language support may make Scheme even less popular (almost no one uses it in the real world), since people could use other popular languages on the Guile platform.
Well, that is a poor reason to be afraid. I have confidence. If Scheme can deliver multi-language support in real practice, then Scheme will have proved its own power. People will judge it reasonably.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 13, 2014 15:21 UTC (Mon) by mtk (subscriber, #804) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 13, 2014 21:31 UTC (Mon) by alogghe (subscriber, #6661) [Link]
Currently I'm quite happily using https://github.com/nsf/gocode for example within Emacs.
Most language support in the other major editors is going this route as well.
That being said, it's likely that the FFI that Guile brings to Emacs would be of tremendous value in adding bits of polyglot language support.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 14, 2014 2:05 UTC (Tue) by lsl (guest, #86508) [Link]
Plugins for Emacs or Vim doing exactly that are no longer a question for the future. They have existed for several years now, using e.g. libclang for C/C++/ObjC or the mentioned gocode for the Go language. There's no reason all this code needs to be part of the text editor.
Those editor plugins mostly tend to work by either being glue around some library, by querying some code oracle server over some IPC mechanism or by just shelling out to some external program. The latter is more than adequate for a big number of cases (e.g. code reformatting) and has the advantage of being nice, simple and general.
Using frontends of actual production compilers also avoids the issue of your IDE disagreeing with the compiler, which seems to be a common problem among IDE users. I think QtCreator is moving in the same direction for exactly this reason.
Useful tools for working with code don't need to be part of an editor or IDE and IMHO they shouldn't be. Build them so they're useful on their own. Those who want to interact with them through their favorite text editor can write the necessary integration code.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 14, 2014 2:16 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 15, 2014 16:58 UTC (Wed) by nix (subscriber, #2304) [Link]
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 15, 2014 18:04 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]
It's mostly used for simple completions and nothing else. IntelliJ products can use their semantic model for smart code refactoring and inspection.
Oh, and CEDET's parser is pitifully inadequate even for plain C, it has no hope of ever understanding C++ completely.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 21, 2014 16:30 UTC (Tue) by nix (subscriber, #2304) [Link]
Obviously getting it right for C++ is impossible -- but you need a C++ compiler to understand C++ properly. Actually you need something *better*, because you want to be able to parse incomplete, partially written, and syntactically invalid code as much as possible. That sort of error recovery is *hard*.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 21, 2014 16:36 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]
Pure C is also not that easy, because lots of C++'s syntactic ambiguity actually comes from C.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 23, 2014 1:34 UTC (Thu) by cmccabe (guest, #60281) [Link]
Details here: http://eli.thegreenplace.net/2007/11/24/the-context-sensi...
C++ has an undecidable grammar.
I can parse C with yacc. I could never hope to parse C++ with anything but a hand-written parser.
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 23, 2014 11:41 UTC (Thu) by jwakely (guest, #60262) [Link]
int Foo (int i = T<1, int>::i);
(See http://open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#325 for other examples like that)
The future of Emacs, Guile, and Emacs Lisp
Posted Oct 24, 2014 14:56 UTC (Fri) by taylanub (guest, #99527) [Link]
I think this is wrong. As far as I know, all Elisp functions (whether C subrs or Elisp defuns), including the string API, are Guile procedure objects in Guile-Emacs, just as Scheme procedures are. And Elisp strings are just one more data type alongside Scheme strings. So working with Elisp strings in Guile-Emacs just means that the Guile VM applies procedures that are Elisp functions to objects that are Elisp strings; no encoding or decoding is involved.
(It could be that the Scheme string type gets benefits from being a native type or so, but I don't think that makes much of a difference either.)
IFF however you want to pass Elisp strings to a Scheme API (which naturally expects Scheme strings, not Elisp strings), you need to convert. Monnier's concern about Scheme errors getting caught by Emacs's interactive debugger comes into consideration here; indeed if you cause some error in the current (early) version of Guile-Emacs, you can get some rather horrific error messages.
It needs work to get a seamless experience, but I'm optimistic. The benefits we will be getting are huge IMO.
> Since Emacs is primarily a text editor, historically it has been forgiving about incorrectly encoded characters, in the interest of letting users get work done—it will happily convert invalid sequences into raw bytes for display purposes, then write them back as-is when saving a file.
> But Guile has other use cases to worry about, such as executing programs which ought to raise an error when an invalid character sequence is encountered.
I think that whole topic was blown out of proportion. What we want is simply that Guile's normal string API works with valid UTF-8 only by default, but there are knobs you can play with --or an extra API-- to allow badly encoded data. Emacs will then use those knobs/API to keep its current behavior. There is no fundamental issue here; Guile will just have to support both use-cases and Emacs will use the right one for its purposes.
> So Guile-Emacs picked one of those as being Elisp's nil, so as long as you stay all within Elisp things work just fine (presumably), but as soon as some Scheme gets into the picture you might get new values which are similar to nil but aren't `eq' to it.
The last part is wrong and I'm partly to blame for spreading that claim. Here's the correction:
According to Templeton, Elisp's `eq' will consider Scheme null, Scheme false, and Elisp nil to all be `eq' to each other. The current Guile-Emacs behavior is just a bug.
So when writing Elisp code, you have *no* issue whatsoever.
The same doesn't go for Scheme though; `eq?', `eqv?', and `equal?' all distinguish between the three objects. So Guile Scheme code, if it interacts with Elisp code, has to be careful. This doesn't affect vanilla Emacs users using Guile-Emacs, and neither does it affect vanilla Guile (Scheme) users; it only affects those who want to go on to adventures and extend or control Emacs via Scheme code.