Noordstar Blog

As can be read in my first blog post, I intend to build an Elm-like language with some unique design choices – and the community has taught me some valuable things!

TLDR: My design resembles what some cutting-edge programming languages are doing. I have a functional proof-of-concept! I'm not content with the syntax yet.

My idea isn't original – but it seems new

One of the most intriguing parallels to selective mutability I've discovered is the “resurrection hypothesis” mentioned in a 2019 paper called Counting Immutable Beans. The resurrection hypothesis states that many objects die just before an object of the same kind is created.

The map function is a great example of this:

type Tree a = Leaf a | Node (Tree a) (Tree a)

map : (a -> b) -> Tree a -> Tree b
map f tree =
    case tree of
        Leaf x ->
            Leaf (f x)

        Node tree1 tree2 ->
            Node (map f tree1) (map f tree2)

If the language is purely immutable, then you might use twice the necessary memory: you would traverse the tree, build a new tree with an identical structure, and then discard the old one. But imagine that the update function of our MUV-system looks like this:

type alias Model = { name : String, tree : Tree Int }

update : Int -> Model -> Model
update n model =
    { model | tree = map (\x -> x * 2) model.tree }

Most immutable languages would duplicate the tree and discard the old one, effectively doubling memory usage. But with selective mutability, the tree could be updated in-place.

There are also other programming languages developing similar ideas. The research language Koka uses Perceus, an algorithm that offers reference counting in a way that avoids a garbage collector. Similarly, Neut does what it calls static memory management, finding malloc/free pairs of matching sizes to optimize around the resurrection hypothesis.
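
To make the reuse idea concrete, here is a minimal C sketch of what recycling a dead, reference-counted cell into a new constructor of the same size could look like. It is purely illustrative and not taken from Koka or Neut; the names drop_for_reuse and make_leaf are made up for this example.

#include <stdlib.h>

/* A reference-counted tree node, roughly mirroring the Tree type above. */
typedef struct Node {
    int          refcount;
    int          tag;        /* 0 = Leaf, 1 = Node */
    struct Node *left;       /* children, used when tag == 1 */
    struct Node *right;
    int          value;      /* payload, used when tag == 0 */
} Node;

/* Drop one reference. If the count hits zero, hand the dead cell back to
 * the caller so it can be reused for a fresh node of the same size,
 * instead of being freed and immediately re-allocated. (A full version
 * would also drop the children of a dead Node; omitted for brevity.) */
static Node *drop_for_reuse(Node *n) {
    if (n != NULL && --n->refcount == 0) {
        return n;            /* dead cell: reuse its memory */
    }
    return NULL;             /* still shared: caller must allocate */
}

/* Build a Leaf, preferring a reusable cell over a fresh allocation. */
static Node *make_leaf(Node *reusable, int value) {
    Node *n = (reusable != NULL) ? reusable : malloc(sizeof(Node));
    if (n == NULL) {
        return NULL;         /* allocation failed */
    }
    n->refcount = 1;
    n->tag      = 0;
    n->left     = NULL;
    n->right    = NULL;
    n->value    = value;
    return n;
}

With something like this in place, a map over the tree can consume the Leaf it just matched on and write the new Leaf into the same cell, which is exactly the pattern the resurrection hypothesis predicts.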

A proof-of-concept works – for now

As an experiment, I have built a proof-of-concept transpiled version. A major challenge with catching memory problems is that most operating systems aren't built for it. From my understanding, C, LLVM and Rust typically rely on the OS to manage the stack and heap. If there's an overflow, or some other problem, the OS terminates the program and reports a segmentation fault. Not very helpful!

As a result, I have designed my own stack/heap system in a C program. Similar to a VM, the code runs in a single block of memory that's assigned to the program on startup. It functions reliably, regardless of available memory size.
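
The actual implementation has more bookkeeping, but its core looks roughly like the sketch below; arena_init and arena_alloc are illustrative names rather than my real API.

#include <stddef.h>
#include <stdlib.h>

/* One contiguous block, claimed once at startup. All allocations the
 * transpiled program makes come out of this block, so running out of
 * memory is a normal, reportable condition instead of a segfault. */
typedef struct {
    unsigned char *base;
    size_t         size;
    size_t         used;
} Arena;

static int arena_init(Arena *a, size_t size) {
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Returns NULL on exhaustion; the caller can then fall back to a
 * default value instead of crashing. */
static void *arena_alloc(Arena *a, size_t n) {
    if (a->size - a->used < n) {
        return NULL;                /* out of memory: report, don't die */
    }
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

Because arena_alloc reports exhaustion by returning NULL instead of letting the OS kill the process, the generated code gets the chance to fall back to a default value.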

For now, this snippet represents the decompiled version of the file low.c:

main =
    h "Kaokkokos shall prevail by the hands of Alkbaard!"

h : String -> String
h x = f (g x)

f : String -> String
f = String.upper

g : String -> String
g = Console.print -- currently an identity function with side-effects

This proof-of-concept shows that a memory-aware runtime is feasible, though Mem.withDefault handling is still pending implementation.

Memory-aware language design might need some changes

There are two major challenges with a design that uses => operations: it is a difficult concept to understand, and it limits the language's ability to compile to environments where memory cannot be managed.

As a result, it might be rewarding to design the types in a Mem module that is only usable in environments where memory CAN be managed. This has several downsides to be considered, but the confusing => operation might not necessarily be the better option either. For example, consider the following two functions:

-- Example 1
foo1 : Foo => Foo -> Foo
foo1 x =
    (\y -> bar x y )
        |> Mem.withDefault identity

-- Example 2
foo2 : Foo -> Foo => Foo
foo2 x y =
    bar x y
        |> Mem.withDefault defaultFoo

The two functions are fundamentally different, offering different guarantees in different scenarios. While foo1 guarantees to return a Foo -> Foo function once a single Foo has been inserted, foo2 guarantees to return a Foo value once two Foo values have been inserted.

This guarantee sounds relatively reasonable when you look at the code, but how much effort would it cost to write a fully memory-aware function?

foo : Foo => Foo => Foo
foo x =
    (\y -> bar x y |> Mem.withDefault defaultFoo )
        |> Mem.withDefault identity

This is a rather unappealing way to write code, and quite difficult to read. This might need some reworking.

Conclusion

I have learned a few more concepts and development seems to be going rather well! I am encountering fewer hurdles than expected, and the design seems manageable.

As with the previous post, I am very much open to ideas. Let me know if you have any thoughts to share!

#foss #functional #languagedesign



Introduction

As a programmer who has experienced the elegance of writing Elm, I’ve often wished for a language that extends Elm’s core philosophy beyond the browser. While many programming languages emphasize type safety, immutability, and purity, few address memory safety as a core language feature.

What if we designed a programming language where memory failures never crash a program? Where aggressive dead code elimination produces highly optimized output? And where every function is guaranteed to be pure and immutable?

This article outlines a conceptual framework for such a language—its principles, challenges, and potential optimizations.


Core Principles

1. Functional, Pure & Immutable

Everything in the language is a function. Functions are pure, meaning they always return the same output for the same input, and immutability is enforced throughout. Even variables are just functions with zero arguments.

This ensures strong guarantees for compiler optimization and program correctness.

2. Side-Effects Managed by the Runtime

Like Elm, side-effects cannot be executed directly by user code. Instead, side-effects must be passed to the runtime for execution. This delegates responsibility to the runtime designers and allows the compiler to assume that all side-effects are managed safely.

3. Memory Safety as a Core Language Feature

This language ensures programs never crash due to memory exhaustion. A special memory-safe module (Mem) allows functions to specify default return values in case of memory failure:

add : Int -> Int => Int
add x y =
    x + y
        |> Mem.withDefault 0

Mechanism

  • The => syntax signals a memory-safe function.
  • Mem.withDefault 0 ensures a fallback return value in case of failure.
  • Default values are allocated at startup to prevent mid-execution failures.

By guaranteeing upfront memory allocation, runtime failures are prevented once the runtime passes the initial startup phase.
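
Sketched at the runtime level (with hypothetical names, not a committed design), the mechanism could boil down to reserving and filling a slot for every default during startup, and falling back to that slot when a later allocation fails:

#include <stddef.h>
#include <string.h>

/* One slot per Mem.withDefault in the source program. */
typedef struct {
    void  *default_value;    /* built while startup memory is guaranteed */
    size_t size;
} MemSafeSlot;

/* Startup: copy the default into memory reserved up front. If even this
 * fails, the program refuses to start instead of failing mid-execution. */
static int register_default(MemSafeSlot *slot, void *reserved,
                            const void *value, size_t size) {
    if (reserved == NULL) {
        return 0;            /* startup allocation failed */
    }
    memcpy(reserved, value, size);
    slot->default_value = reserved;
    slot->size = size;
    return 1;
}

/* Later, wherever the source used Mem.withDefault: take the freshly
 * allocated result if it exists, otherwise the pre-built default. */
static void *value_or_default(void *fresh, const MemSafeSlot *slot) {
    return fresh != NULL ? fresh : slot->default_value;
}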


Handling Dynamic Data Structures

Since the language enforces immutability, dynamically sized data structures must be created at runtime. If memory limits are reached, functions must define fallback strategies:

  • Return the original input if allocation fails.
  • Return a default value specified by the developer.

Ideally, memory exhaustion can be explicitly handled with a dedicated return type:

type Answer = Number Int | OutOfMemory

fib : Int => Answer
fib n =
    (case n of
        0 -> Number 1
        1 -> Number 1
        _ ->
            case (fib (n - 1), fib (n - 2)) of
                (Number a, Number b) -> Number (a + b)
                _ -> OutOfMemory
    )
        |> Mem.withDefault OutOfMemory

Extreme Dead Code Elimination

The compiler aggressively removes unused computations, reducing program size. Consider:

type alias Message =
    { happy : String
    , angry : String
    , sad : String
    , mood : Bool
    }

toText : Message -> String
toText msg =
    if msg.mood then msg.happy else msg.angry

main =
    { happy = "I am happy today."
    , angry = "I am extremely mad!"
    , sad = "I am kinda sad..."
    , mood = True
    }
    |> toText
    |> Mem.withDefault "Ran out of memory"
    |> Console.print

Optimization Process

  1. Since mood is always True, the else branch is never used.
  2. The function simplifies to toText msg = msg.happy.
  3. The .angry, .sad, and .mood fields are removed.
  4. Message reduces to type alias Message = String.
  5. The toText function is removed as a redundant identity function.

Final optimized output:

main = Console.print "I am happy today."

While this may require a lot of computation at compile time, all of these optimizations seem like fair assessments to make.


Compiler-Assisted Mutability for Performance

While immutability is enforced, the compiler introduces selective mutability when safe. If an old value is provably unused, it can be mutated in place to reduce memory allocations.

Example:

type alias Model = { name : String, age : Int }

capitalizeName : Model -> Model
capitalizeName model =
    { model | name = String.capitalize model.name }

Normally, this creates a new string and record. However, if the previous model.name isn't referenced anywhere else, the compiler mutates the name field in place, optimizing memory usage.
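
A rough C-level sketch of that decision could look as follows; it is illustrative only, and the compiler could just as well prove uniqueness statically and drop the runtime check.

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* A reference-counted, heap-allocated string. */
typedef struct {
    int   refcount;
    char *data;              /* NUL-terminated */
} Str;

/* Capitalize the first character. If this is the only reference to the
 * value, mutate it in place; otherwise copy it first so other holders
 * of the old value never observe the change. */
static Str *capitalize(Str *s) {
    Str *target = s;
    if (s->refcount > 1) {
        size_t len = strlen(s->data);
        target = malloc(sizeof(Str));
        if (target == NULL) {
            return NULL;     /* caller falls back to a default value */
        }
        target->refcount = 1;
        target->data = malloc(len + 1);
        if (target->data == NULL) {
            free(target);
            return NULL;
        }
        memcpy(target->data, s->data, len + 1);
        s->refcount--;       /* this call no longer references the old value */
    }
    if (target->data[0] != '\0') {
        target->data[0] = (char)toupper((unsigned char)target->data[0]);
    }
    return target;
}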


Compiler & Debugging Considerations

For effective optimizations, the compiler tracks:

  • Global variable usage to detect always-true conditions.
  • Usage patterns (e.g., optimizing predictable structures like Message).
  • External data sources, which are excluded from optimizations.

To aid debugging, the compiler could provide:

  • Graph-based visualization of variable flow.
  • Debugging toggles to disable optimizations selectively.

Conclusion: A New Paradigm for Functional Memory Safety?

Most languages handle memory safety through garbage collection (Java, Python), manual management (C, C++), or borrow-checking (Rust). This language proposes a fourth approach:

Memory-aware functional programming

By making memory failures a core language feature with predictable handling, functional programming can become more robust.

Would this approach be practical? The next step is to prototype a minimal interpreter to explore these ideas further.

If you're interested in language design, memory safety, and functional programming, I’d love to hear your thoughts!

#foss #functional #languagedesign

