Sunday, December 30, 2012

Introducing Flect

The Flect project is an attempt to implement a pragmatic, functional systems programming language as a real-world alternative to C and C++ in both freestanding and hosted programming environments.

The project was created for a variety of reasons described on the introduction page. This page also describes the language at a high level and the features intended in the long run.

The language specification can be found here.

Note that Flect is very much a work in progress and is by no means ready for actual use.

Upcoming CI fleet changes

We're planning to make some changes to the CI fleet over the course of the next week or so.

First, the FreeBSD and Solaris machines will be taken down.

We have not had much luck porting the D toolchain to Solaris, and since there has been little demand for it, we have decided to drop the effort. Worse yet, LLVM doesn't build on Solaris anymore.

The FreeBSD machines are being taken down because they have been a pain to keep running in the CI fleet. Most of the time, they cannot maintain a stable connection to the master node, which has resulted in many failed builds that weren't real failures. It is also clear that there is only a very small FreeBSD user base.

So, having said this, all plans to support Solaris are dropped. We will maintain FreeBSD support - just not in the CI infrastructure.

Second, the OS X machine will be getting an upgrade to Mountain Lion at some point. This means that projects such as LDC will be executed on the OS X machine in the near future.

Lastly, the Fedora/x86 machine has been shut down, but we plan to set up a new machine to take its place soon.

Sunday, June 10, 2012

libgc-d version 1.1 released

Version 1.1 of libgc-d has been released. This is a simple binding library for the libgc library. It has been tested with MCI's interpreter on Linux and OS X and should be ready for production use.

Highlights of this release:
  • Support for building with D 2.0 versions of the LDC compiler.
  • Minor fixes to the build script.
  • GC initialization fixes in the test suite.
  • Added a helper function for marking pointer reachability.
  • Added helper functions to hide/reveal pointers for disappearing links.
Archives can be downloaded here.
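
The pointer hiding mentioned in the highlights can be illustrated with a small C sketch. It assumes the common approach of storing the bitwise complement of an address, so that a conservative scan no longer recognizes the value as a pointer and a disappearing link does not keep its target alive. The names hidden_pointer, hide_pointer, reveal_pointer, and check_hidden_pointer are hypothetical and are not the libgc-d or libgc API.

```c
#include <assert.h>
#include <stdint.h>

/* A pointer stored in disguised form (hypothetical type name). */
typedef uintptr_t hidden_pointer;

/* Disguise a pointer by complementing its bits. */
hidden_pointer hide_pointer(const void *p) {
    return ~(uintptr_t)p;
}

/* Recover the original pointer from its disguised form. */
void *reveal_pointer(hidden_pointer h) {
    return (void *)(~h);
}

/* Sanity check: hiding is reversible and always changes the bit pattern. */
void check_hidden_pointer(const void *p) {
    hidden_pointer h = hide_pointer(p);
    assert(reveal_pointer(h) == p);
    assert(h != (uintptr_t)p);
}
```

A weak reference would store the hidden value and reveal it only after confirming that the collector has not cleared the link.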

libffi-d version 1.1 released

Version 1.1 of libffi-d has been released. This library is a wrapper around the C library libffi which is useful for dynamic invocation of native functions. Since libffi-d provides a more 'D-like' wrapper around the C library, as opposed to just being a binding, there is a certain amount of logic in libffi-d that is actually quite error-prone. It has been battle-tested in MCI's interpreter on Windows, Linux, and OS X.

Highlights of this release:
  • Support for building with D 2.0 versions of the LDC compiler.
  • Some minor fixes to the build script.
  • The Visual D project files are now only for VS 11 and Visual D 0.3.32.
Archives can be downloaded here.

Wednesday, June 6, 2012

Nightly Arch Linux package for MCI available

We now maintain an Arch Linux PKGBUILD for nightly MCI builds in the AUR. Note that, despite what the package name might suggest, it is only updated once a week on average. It downloads a nightly MCI package and builds it with a debug configuration and with stripping disabled. It supports both 32-bit and 64-bit x86.

To install the package, just do:

yaourt -S mci-nightly

Note that, being a nightly package, it is by no means stable.

Update: This package is no longer available. A new package is available which builds directly from Git. Thus, you can do:

yaourt -S mci-git

You may wish to remove the old package first:

yaourt -R mci-nightly

Friday, June 1, 2012

CI server maintenance

The primary CI server is currently undergoing maintenance, which means that ci.lycus.org, nightlies.lycus.org, and api.lycus.org will be down for about a day. The maintenance primarily consists of a partition resize operation which should leave significantly more free space for the Jenkins instance.

Update: OK, we're back online. Note, though, that all existing nightly packages have been wiped (but they will still be built).

Thursday, April 26, 2012

libgc-d version 1.0 released

Version 1.0 of libgc-d has been released. This is a simple binding library for the libgc library. It has been tested with MCI's interpreter on Linux and OS X and should be ready for production use.

Archives can be downloaded here.

libffi-d version 1.0 released

Version 1.0 of libffi-d has been released. This library is a wrapper around the C library libffi which is useful for dynamic invocation of native functions. Since libffi-d provides a more 'D-like' wrapper around the C library, as opposed to just being a binding, there is a certain amount of logic in libffi-d that is actually quite error-prone. It has been battle-tested in MCI's interpreter on Windows, Linux, and OS X.

Archives can be downloaded here.

Wednesday, April 18, 2012

Itanium machine acquired

We have just acquired an HP Integrity rx4640 AB370A machine thanks to a donation from Tony Young. We expect to start porting MCI to Itanium as soon as GDC (the GNU D Compiler) is updated to the DMD 2.059 front end.

Monday, March 12, 2012

Support for user header data

It's very common for languages to need to attach some sort of information to objects in memory. At first glance, doing this with MCI seems easy: Just insert an extra hidden field on all types. Problem solved!

... Except, what about arrays and vectors?

It's perfectly normal to want to attach some language-specific type information to arrays and vectors in addition to plain objects. For instance, your language may want to keep information about the array's specific encoding, or perhaps some element type qualifiers need to be stored.

For this reason, we've added an extra field to the header of all RuntimeObject instances. It's a word-sized field which can hold any reference type (i.e. a plain reference, an array, or a vector). It can be accessed through the new field.user.addr, field.user.get, and field.user.set instructions.

This design does mean that every managed object carries an entire word of extra memory. This new field can be useful, but in cases where you don't need it, it just sits there eating memory for no good reason.

Unfortunately, there's no good way to get rid of the field when a program doesn't use it. It would introduce a lot of ABI compatibility issues to let programs disable it (and also complicate a lot of MCI code). We do recognize that on 32-bit machines, that extra word can actually make a difference, but the world is obviously moving towards 64-bit architectures these days, and so we've chosen to simply rely on the vastly increased address space in those.
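
To make the cost concrete, here is a minimal C sketch of an object header with a word-sized user field. The struct layout, the field names (type_info, gc_bits, user), and the helper functions are hypothetical illustrations, not MCI's actual RuntimeObject definition. The helpers only mirror what field.user.set and field.user.get are described as doing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object header; MCI's real RuntimeObject layout may differ. */
typedef struct object_header {
    const void *type_info; /* assumed: pointer to type information */
    uintptr_t gc_bits;     /* assumed: one word reserved for the GC */
    void *user;            /* the word-sized user field */
} object_header;

/* The same header without the user field, for comparison only. */
typedef struct object_header_no_user {
    const void *type_info;
    uintptr_t gc_bits;
} object_header_no_user;

/* Rough analogue of field.user.set: store a reference in the header. */
void header_user_set(object_header *h, void *value) {
    assert(h != NULL);
    h->user = value;
}

/* Rough analogue of field.user.get: load the reference from the header. */
void *header_user_get(const object_header *h) {
    assert(h != NULL);
    return h->user;
}

/* Extra bytes each managed object pays for the user field. */
size_t header_user_overhead(void) {
    return sizeof(object_header) - sizeof(object_header_no_user);
}
```

Under this assumed layout, the overhead is one pointer-sized word per object on common platforms, whether or not the program ever touches the field.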

Thursday, March 1, 2012

Introducing libgc-d

We've just published libgc-d on GitHub (and it is building on the CI server as well)!

This is a bare-bones binding to the libgc library (also known as the Boehm-Demers-Weiser GC). libgc is a conservative garbage collector targeting applications written in languages such as C, C++, D, etc. We're using libgc-d in MCI as we now support libgc as a garbage collector option on all non-Windows platforms.

Saturday, February 25, 2012

CI server now building for 32-bit x86

We've just added build jobs for 32-bit x86 on the Lycus CI server. We're looking for x86 machines running Windows (7 or Server 2008) and OS X (Lion) that we can run Jenkins slave instances on. If you have such systems available and are willing to let us use them to run libffi-d and MCI builds on, please drop us an email! We also welcome BSD systems (FreeBSD and OpenBSD in particular). AIX, Solaris, and Hurd machines are also welcome, although these are given low priority as far as porting MCI goes.

Our plan is still to port MCI to ARM, PowerPC, MIPS, and Itanium. For ARM, we already have an Efika MX (with support for hardware-assisted floating-point) that we need to set up somewhere. If anyone wants to donate power and an Internet connection for this machine, it would be greatly appreciated! As far as MIPS goes, we have our eyes on the RouterStation Pro. Obtaining it is easy enough, but again, we need somewhere to have it running. For PowerPC, we're planning to get one of the older Mac machines (since anything non-Apple seems to be way too expensive, ironically). We don't have any plans for Itanium yet. Getting Itanium hardware is virtually impossible for us since we lack funding of any kind.

Friday, January 20, 2012

Changes to support precise GC

When we originally designed MCI's ISA, we didn't consider the fact that precise garbage collection would require a stricter type system and a well-defined memory layout. In order for precise garbage collection to work, all GC-tracked objects in the heap must carry some sort of type information. In the previous memory layout, objects would sometimes not carry a header (which contains type information and GC bits), thus forcing the GC to be conservative. This is clearly bad, since precise GC is one of the most important features of modern garbage-collected systems.

Other virtual machines, such as the CLR and the JVM, have a distinct object reference type. This type cannot be converted (e.g. to a pointer), nor can it be dereferenced. This is ideal, because with these two constraints, we can enforce that GC-tracked objects must always have a header. On the other hand, such a type severely limits what unsafe code can do. Whether this is a good or a bad thing is arguable, but it does nonetheless reduce the number of languages that can target the MCI. For instance, the D programming language lets you cast an object to void* if you so desire. Such things are, of course, type-unsafe and cannot be expected to work with a precise GC. We don't see any immediate reason to support language features like this one.

So, the following changes have been made:
  • We've introduced the reference type. This is a type specification similar to a pointer. The difference is that it may have at most one indirection, and the element type must be a structure type. The object a reference refers to is guaranteed to have a header. If we have a structure type Foo, the syntax for a reference to it would be Foo&.
  • Arrays and vectors are now GC-tracked. This is another necessary change to avoid conservative scanning. If arrays and vectors remained native data types, we wouldn't be able to scan them precisely. This means that, in practice, arrays and vectors work similarly to references, though they allow some more operations, like conversions.
  • Arrays now know their dynamic length. If we didn't make this change, we wouldn't be able to reliably scan arrays, as we couldn't possibly know their size. This would mean that any object inserted into an array would have to be pinned (more on pinning later), resulting in terrible GC performance and heap fragmentation (not to mention complex code).
The introduction of reference types means that field.get, field.set, and field.addr have been changed to accept these types. Additionally, we've introduced a new instruction, array.len, which fetches the length of an array. Perhaps more importantly, mem.* instructions have been completely revised: We've fused the mem.gcalloc, mem.gcnew, and mem.gcfree instructions with mem.alloc, mem.new, and mem.free. Since references are now distinct, there's no longer a need for the GC variants of those instructions.
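
The following C sketch shows why a stored length matters for precise scanning. It is an illustration under stated assumptions: the object and ref_array types, the marked flag, and all function names are hypothetical and do not reflect MCI's real memory layout or collector. Only the idea that an array carries a header and a length (what array.len would report) comes from the changes described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical managed object: a header reduced to a single mark flag. */
typedef struct object {
    int marked; /* set by the collector when the object is reachable */
} object;

/* Hypothetical managed array of references that records its own length. */
typedef struct ref_array {
    object header;      /* arrays are GC-tracked, so they carry a header */
    size_t length;      /* dynamic length of the array */
    object *elements[]; /* the references stored in the array */
} ref_array;

/* Allocate a zero-filled array with room for `length` references. */
ref_array *ref_array_new(size_t length) {
    ref_array *a = calloc(1, sizeof(*a) + length * sizeof(a->elements[0]));
    if (a != NULL) {
        a->length = length;
    }
    return a;
}

/* Rough analogue of array.len. */
size_t ref_array_len(const ref_array *a) {
    assert(a != NULL);
    return a->length;
}

/* Mark the array and each non-null element exactly once.
   Returns how many elements were newly marked. */
size_t ref_array_mark(ref_array *a) {
    size_t newly_marked = 0;
    assert(a != NULL);
    a->header.marked = 1;
    for (size_t i = 0; i < a->length; i++) {
        object *o = a->elements[i];
        if (o != NULL && !o->marked) {
            o->marked = 1;
            newly_marked++;
        }
    }
    return newly_marked;
}
```

Without the length field, the loop bound would be unknown, and the collector would have to fall back on guessing or on pinning every stored object.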

Now that arrays know their dynamic length, you might wonder if we're planning to allow using them in arithmetic and logic operations like vectors; after all, this seems like a natural thing to do, since optimizing these operations is relatively trivial. We haven't made up our minds about this just yet. The reason these things are allowed with vectors is that the compiler can statically unroll the operations because the vector's length is known. This cannot be done with arrays, and therefore, the benefits of allowing SIMD-style operations on them might not be worthwhile.

On to pinning. Pinning an object has two implications: The object will be considered reachable by the garbage collector until unpinned (i.e. it will never be collected until unpinned), and it cannot be moved by a copying or compacting GC. Both of these things are performance issues, but they are a necessary evil. When using a precise GC, we cannot pass an object to native code without pinning it. Since the GC has absolutely no knowledge of external code and its usage of managed objects, there is no way it would be able to do correct reachability analysis (not even conservatively). Another issue that copying and compacting collectors face is that after moving an object, they would have to update all external references. This is not practical at all, and again for the same reason: The collector has no knowledge about the external code's usage patterns. In other words, objects have to be pinned when passed to external code, and unpinned when the external code has finished running. This does, of course, require knowledge of the external code's implementation details, but there is no way around this.

The MCI provides two instructions for doing the above: mem.pin and mem.unpin. Both work with references, arrays, and vectors.
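
The two implications of pinning can be sketched in C as follows. This is a toy model, not MCI's implementation: the managed_object type, the use of a pin count (rather than a single flag), and every function name here are assumptions made for illustration. The functions object_pin and object_unpin stand in for mem.pin and mem.unpin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical header state relevant to pinning. */
typedef struct managed_object {
    bool reachable;     /* result of the collector's reachability analysis */
    unsigned pin_count; /* greater than zero while external code may use it */
} managed_object;

/* Stand-in for mem.pin: the object must stay alive and stay in place. */
void object_pin(managed_object *o) {
    assert(o != NULL);
    o->pin_count++;
}

/* Stand-in for mem.unpin: releases one earlier pin. */
void object_unpin(managed_object *o) {
    assert(o != NULL);
    assert(o->pin_count > 0);
    o->pin_count--;
}

/* A pinned object counts as reachable, so it is never collected. */
bool object_may_collect(const managed_object *o) {
    assert(o != NULL);
    return !o->reachable && o->pin_count == 0;
}

/* A copying or compacting collector must leave pinned objects in place. */
bool object_may_move(const managed_object *o) {
    assert(o != NULL);
    return o->pin_count == 0;
}
```

In this model, code that hands an object to external code would call object_pin before the call and object_unpin once the external code no longer uses the object.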

Another reason that we had to make these changes is that the interpreter and the JIT compiler have to share the same memory layout. If they didn't, GCs would have to special-case execution with the interpreter, which is clearly horrible design.

Long story short: With these changes, we should be able to support precise garbage collection in the MCI, at the negligible cost of reducing the number of type-unsafe operations that will work as "expected".

Saturday, January 14, 2012

Looking for PowerPC and Itanium machines

We're looking to port MCI to PowerPC and Itanium as soon as we can, but unfortunately, neither is cheap hardware to acquire. If anyone could provide us with hardware, or just access to boxes for developing on, please contact us!