GEP-20
Abstract
This GEP extends Groovy’s parens-form multi-assignment with rest
bindings (*t) at the tail, head, or middle of the declarator list,
and map-style key destructuring (name: n). The extensions are
strictly additive — every program valid in Groovy 4 / 5 today
compiles with identical semantics — and each new shape uses an
unparseable token sequence in the existing grammar, so the proposal
claims previously-unused syntax rather than reinterpreting any
existing form.
The proposal targets Groovy 6.x.
Motivation
Groovy’s multi-assignment in def (…) has been stable since Groovy
1.6. Today’s grammar supports only positional binding:
def (a, b, c) = list
def (int x, String y) = pair
Three idioms are common in modern languages but unavailable in Groovy:
- Tail rest binding: def (h, *t) = list separates the head from the remainder. Common in functional code (recursive list traversal, pipeline-style processing of streams and iterators).
- Map-style destructuring: def (name: n, age: a) = person extracts named properties or map keys directly, without intermediate variables.
- Rest in non-tail positions: def (*f, last) = list and def (l, *m, r) = list are useful when the boundaries are at the ends rather than the middle.
Adding these to the parens form keeps the grammar familiar (no new bracket form, no marker words) while filling the gaps.
Design principles
- Strictly additive: every program valid in Groovy 4 / 5 compiles with identical semantics. New forms use grammar that fails to parse today.
- No new keywords: the new shapes use only existing tokens (*, :). def, var, and val are also accepted before a binder for symmetry with switch case patterns and the bracket-form grammar in GEP-19, but no marker is required: def (a, b) = … works without any inner marker.
- _ stays an identifier: existing idioms like def (_, y, m) = Calendar.instance keep working without warning. This proposal introduces no wildcard semantics for _.
- Reuse Groovy’s existing tokens: the rest marker is *, matching Groovy’s spread operator. The map form uses key: value syntax matching Groovy map literals.
- Sized contracts only when needed: forms that can be lowered streaming-style (tail rest) accept iterators and unbounded sources via an iterator() fallback. Forms that genuinely need to know the RHS size (head rest, middle rest) document the sized-RHS contract and fail fast against non-conforming input.
Features
Existing positional form (recap)
For reference, def (a, b, c) = rhs evaluates rhs once into a
synthetic local then assigns each declared variable from
rhs.getAt(0), rhs.getAt(1), rhs.getAt(2). This is unchanged. Any
RHS supporting getAt(int) continues to work — List, String,
arrays, ranges, iterators (via DGM’s getAt(Iterator, int)), and any
user-defined class providing the operator.
Type ascriptions on positional bindings (def (int a, String b)) are
likewise unchanged.
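As a quick illustration of the recap above, the following sketch exercises the existing positional form on today's Groovy. The variable names are arbitrary and only the current, pre-GEP syntax is used.

```groovy
// Positional multi-assignment as it works today: binder i receives rhs.getAt(i).
def (a, b, c) = [10, 20, 30]             // List
assert [a, b, c] == [10, 20, 30]

def (first, second) = 'hi'               // String: getAt(int) yields one-character Strings
assert first == 'h' && second == 'i'

def (int x, String y) = [1, 'one']       // typed binders
assert x == 1 && y == 'one'

def (p, q, r) = [1, 2]                   // short RHS: the surplus binder is null
assert r == null
```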
Using _ for unwanted slots (recap)
For many versions, _ has been a valid variable name in Groovy.
It is used by things like REPLs or in script variable bindings to store the last result,
and has been used by convention for results you don’t care about.
In Groovy 5.0 (GROOVY-10943), this was extended to allow multiple _ binders in
the same declarator list without a "duplicate variable" error, following the
convention across Scala, Rust, Kotlin, and Python.
GEP-20 adds nothing to it. The new rest and map-style binders
inherit the same convention; see _ and other identifiers for the
cross-form matrix. For completeness, we’ll recap the existing feature.
A binder named _ is the conventional "I don’t care about this slot"
marker, useful when the RHS yields more elements than you intend to
consume:
def (_, year, month) = Calendar.instance // discard era; keep year, month
def (_, _, day, _) = Calendar.instance // discard era, year, month, ...
Reading _ within scope behaves differently across compilation
modes. In dynamic Groovy, _ is a normal local; the slot is bound
and the value is reachable, but by convention typically not used.
Under @TypeChecked / @CompileStatic, reading _ is reported
as The variable [_] is undeclared.
Binder markers (def, var, val)
def, var, and val may appear before any binder; all are equivalent to
omitting a type:
def (var a, var b) = [1, 2] // same as: def (a, b) = [1, 2]
def (def a, var b, int c) = [1, 2, 3] // mix and match
def (name: var n, age: val a) = person // also in map-style
Modifier propagation
Modifiers on the outer declaration propagate to every binder, including the new rest and map-style binders introduced below:
final (a, *t) = list // both `a` and `t` are final
final (name: n, age: a) = person // both `n` and `a` are final
Tail rest binding
A single *ident at the end of the declarator list captures the
remaining elements:
def (h, *t) = list
def (a, b, c, *rest) = list
The rest binding is well-defined for any RHS that supports either
getAt(IntRange) or iterator(). The compiler picks one of three
lowerings, in dispatch-precedence order:
// Path A — RHS is a java.util.stream.Stream:
def __rhs = rhs
def __it = __rhs.iterator()
def h = __it.hasNext() ? __it.next() : null
def t = StreamSupport.stream(
Spliterators.spliteratorUnknownSize(__it, 0),
false
).onClose { __rhs.close() }
// Path B — RHS supports getAt(IntRange):
def __rhs = rhs
def h = __rhs.getAt(0)
def t = __rhs.getAt(1..-1)
// Path C — RHS iterable, neither Stream nor range-indexable:
def __it = rhs.iterator()
def h = __it.hasNext() ? __it.next() : null
def t = __it
Path A is the dedicated branch for Stream<T>. Although Stream has a
getAt(Stream, IntRange) extension, that overload materialises
eagerly into a List<T> — defeating the GEP-20 guarantee that
unbounded sources never silently materialise. Stream therefore
bypasses Path B and gets its own lazy iterator-wrap path.
Path B is defined by capability: any non-Stream RHS with a
resolvable getAt(IntRange) participates, regardless of class. The
rest binder’s type tracks the actual return type of that overload —
it is not hard-coded to List<T>. So the rest of a String slice is
a String; the rest of a BitSet slice is a BitSet; the rest of a
user class whose getAt(IntRange) returns the same class is that
same class.
Path C is the fallback for anything else iterable. It’s the key
enabler for unbounded sources: given an infinite or lazy iterator,
def (h, *t) = source returns the head and an iterator the caller
can keep pulling from — there is no materialisation.
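The Path B and Path C lowerings can be tried by hand on today's Groovy. The sketch below is only an emulation written with existing method calls, not compiler output; the infinite iterator `naturals` is a hypothetical source built for the example.

```groovy
// Path B by hand: head via getAt(int), rest via getAt(IntRange).
def list = [1, 2, 3, 4]
def listHead = list.getAt(0)
def listRest = list.getAt(1..-1)
assert listHead == 1 && listRest == [2, 3, 4]

def text = 'hello'
assert text.getAt(0) == 'h'
assert text.getAt(1..-1) == 'ello'       // the rest of a String is a String

// Path C by hand: an infinite iterator (hypothetical source) is never materialised.
def counter = 0
def naturals = [hasNext: { true }, next: { ++counter }] as Iterator
def head = naturals.hasNext() ? naturals.next() : null
def rest = naturals                       // the already-advanced iterator
assert head == 1
assert rest.take(3).toList() == [2, 3, 4] // the caller keeps pulling lazily
```

Note that the hand-written Path B slice uses 1..-1 directly, so it assumes at least two elements; the proposal's size-aware handling of shorter inputs is not reproduced here.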
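| RHS type | getAt(IntRange) | Path | Type of t |
|---|---|---|---|
| Stream<T> | ✗* | A | Stream<T> (fresh stream wrapping the advanced iterator) |
| List, array, Range | ✓ | B | List<T> |
| String, GString | ✓ | B | String |
| BitSet | ✓ | B | BitSet |
| user class with getAt(IntRange) | ✓ | B | declared return type of that getAt(IntRange) |
| anything else iterable (Set, Iterator, custom Iterable, primitive streams) | ✗ | C | Iterator<T> |
(*) Stream’s ✗ reflects "has getAt(IntRange) but is steered away
from Path B by design". See the prose above.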
Path A preserves the Stream pipeline: t.filter { … }.map { … }
.collect(toList()) continues to work after the head is bound. The
returned Stream is sequential and reports no spliterator
characteristics (capturing characteristics from the original requires
reading its spliterator, which is mutually exclusive with the iterator
path used for head extraction). The new Stream’s onClose handler
delegates to the original’s close(), so a typed source such as
Files.lines(p) is closed when the rest binder is closed:
Files.lines(path).withCloseable { src ->
def (header, *body) = src
body.filter { !it.isBlank() }.forEach { … }
} // closing src closes the original; closing body would also close it
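The Path A wrap can be reproduced with plain java.util.stream calls on today's Groovy. This is a hand-written sketch of the lowering described above, not the compiler's output; the in-memory source and the `closed` flag are assumptions made so the close propagation is observable.

```groovy
import java.util.stream.Collectors
import java.util.stream.Stream
import java.util.stream.StreamSupport

def closed = false
Stream<String> src = Stream.of('header', 'a', '', 'b').onClose { closed = true }

// Pull the head through the iterator, then wrap what is left in a fresh sequential Stream.
def iter = src.iterator()
def header = iter.hasNext() ? iter.next() : null
def body = StreamSupport.stream(Spliterators.spliteratorUnknownSize(iter, 0), false)
        .onClose { src.close() }          // closing the rest closes the original

def kept = body.filter { !it.isEmpty() }.collect(Collectors.toList())
body.close()

assert header == 'header'
assert kept == ['a', 'b']
assert closed                              // the original's close handler ran
```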
Notes on Stream-shaped RHS
- Already-consumed Stream: a Stream that has already had a terminal operation invoked throws IllegalStateException when destructuring pulls its iterator. This is unchanged from any other Stream usage.
- Parallel Stream: the wrapped tail is sequential. Re-parallelising the tail (t.parallel()) is a one-call opt-in.
- Primitive streams (IntStream, LongStream, DoubleStream): these are not subtypes of Stream<T>, so they fall through to Path C and the rest is an Iterator<Integer> / Iterator<Long> / Iterator<Double>. To get Path A semantics, call .boxed() first: def (h, *t) = intStream.boxed() gives t : Stream<Integer>.
- Resource ownership: with onClose chaining, the rest binder becomes the canonical owner of the source’s resources. Avoid closing both the original and the rest, or holding the original alongside the rest, to prevent double-close in user code.
- Single-rest degenerate case: def (*s) = stream has no head slots to extract, so it short-circuits to a plain assignment (s = stream). s is the original Stream, with the same identity, characteristics, parallelism, and onClose handler intact. The "fresh stream wrapping the advanced iterator" wording in the table only applies when at least one head slot is bound.
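Two of the Stream caveats can be checked directly against the JDK today. The sketch below uses only standard java.util.stream calls; the variable names are illustrative.

```groovy
import java.util.stream.IntStream
import java.util.stream.Stream

// Primitive streams are not Stream<T>; boxed() converts them.
def ints = IntStream.range(0, 3)
assert !(ints instanceof Stream)
assert ints.boxed() instanceof Stream

// A Stream that already ran a terminal operation rejects iterator().
def used = Stream.of(1, 2, 3)
used.count()
def failure = null
try {
    used.iterator()
} catch (IllegalStateException e) {
    failure = e
}
assert failure != null
```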
Type ascriptions on the rest binder
The rest binder may carry a type ascription, mirroring the existing positional form:
def (h, List<Integer> *t) = [1, 2, 3, 4] // Path B, t : List<Integer>
def (c, String *cs) = "hello" // Path B, cs : String
def (h, BitSet *t) = someBitSet // Path B, t : BitSet
def (h, Iterator<Integer> *t) = someSet // Path C, t : Iterator<Integer>
def (h, Stream<Integer> *t) = someStream // Path A, t : Stream<Integer>
def (l, List<Integer> *m, r) = list // typed middle rest, m : List<Integer>
The declared type must be compatible with the path the RHS selects.
For Path B the rest type is whatever the RHS’s getAt(IntRange)
returns (a supertype is also accepted) — List<T> for lists / arrays /
ranges, String for String / GString, BitSet for BitSet,
the user-declared return type for custom classes. For Path A it is
Stream<T> (or a supertype). For Path C it is Iterator<T> (or a
supertype). Under @CompileStatic a mismatch is reported at compile
time; in dynamic mode it surfaces as a GroovyCastException when the
value is coerced. Untyped rest is the recommended default; ascribe a
type when you want a tighter element type or to fail fast if the RHS
shape changes.
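The Path B types listed above follow from what getAt(IntRange) returns today, which can be observed without the new syntax. In this sketch the `Deck` class is a hypothetical user type introduced only to show a custom return type.

```groovy
// Hypothetical user class whose getAt(IntRange) returns its own type.
class Deck {
    List cards
    def getAt(int index) { cards.getAt(index) }
    Deck getAt(IntRange range) { new Deck(cards: cards.getAt(range)) }
}

def fromList = [1, 2, 3, 4].getAt(1..-1)
def fromString = 'hello'.getAt(1..-1)
def bits = new BitSet()
bits.set(0)
bits.set(2)
def fromBits = bits.getAt(1..2)
def fromDeck = new Deck(cards: ['A', 'K', 'Q']).getAt(1..-1)

assert fromList instanceof List
assert fromString instanceof String
assert fromBits instanceof BitSet
assert fromDeck instanceof Deck           // the slice keeps the user class
```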
Edge cases:
- Empty RHS: def (h, *t) = [] gives h = null and t = [] (Path B); on Path C, t is an exhausted iterator.
- Exhausted iterator: def (h, *t) = exhaustedIter gives h = null, t = exhaustedIter.
- RHS that supports neither protocol: compile error under @CompileStatic; MissingMethodException at runtime in dynamic mode.
Rest binding in head and middle positions
A *ident in non-tail position requires a sized, indexable RHS:
def (*front, last) = list
def (l, *middle, r) = list
def (a, b, *m, y, z) = list
These forms lower via indexed access. Positional slots use getAt(int)
(with negative indices for tail-side fixed slots); the rest slot uses
getAt(IntRange). The RHS contract is getAt(int) plus
getAt(IntRange) — capability-based, not class-based. List, array,
String, GString, Range, BitSet, and any user class declaring
both methods are eligible. Stream is not (its getAt(IntRange)
materialises eagerly, conflicting with the GEP’s no-silent-
materialisation rule).
// def (l, *m, r) = rhs
def __rhs = rhs
def m = __rhs.getAt(1..-2) // emitted before any negative-index call
def l = __rhs.getAt(0)
def r = __rhs.getAt(-1)
The getAt(IntRange) call is emitted before any negative-index
getAt(int) call. This is load-bearing for safety: DGM’s
getAt(Iterator, int) materialises the iterator on negative indices
(toList(self) then index from the end), which would silently hang on
an infinite source. By emitting the IntRange call first, iterators
fail fast with MissingMethodException (no getAt(Iterator, IntRange)
overload exists in DGM) before any materialisation can happen — the
same clean failure mode def (a, b) = 42 already produces today
against a non-indexable RHS.
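The middle-rest lowering and its fail-fast behaviour can be emulated with existing calls. The sketch below mirrors the emission order described above (IntRange slice first) and relies on the document's statement that DGM has no getAt(Iterator, IntRange) overload.

```groovy
// def (l, *m, r) = rhs, written out by hand.
def rhs = [1, 2, 3, 4, 5]
def m = rhs.getAt(1..-2)                  // IntRange slice is taken first
def l = rhs.getAt(0)
def r = rhs.getAt(-1)
assert [l, m, r] == [1, [2, 3, 4], 5]

def word = 'hello'
assert [word.getAt(0), word.getAt(1..-2), word.getAt(-1)] == ['h', 'ell', 'o']

// An iterator has no getAt(IntRange), so the first emitted call fails before anything is consumed.
def source = [1, 2, 3].iterator()
def failure = null
try {
    source.getAt(1..-2)
} catch (MissingMethodException e) {
    failure = e
}
assert failure != null
assert source.next() == 1                 // nothing was pulled from the iterator
```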
Constraints:
- At most one rest binding per parens form. Multiple rest bindings (def (a, *b, *c, d)) are a compile error.
- RHS sizes smaller than the total number of fixed slots are tolerated: getAt(int) returns null for out-of-bounds positions and the rest slot becomes an empty collection.
Map-style destructuring
key: ident pairs in the declarator list bind via property access:
def (name: n, age: a) = person
def (host: h, port: p) = config
Lowering:
def __rhs = person
def n = __rhs.name
def a = __rhs.age
The protocol is Groovy’s normal property access (obj.name), so the
form works on any RHS responding to the named property via the MOP:
- Map values: Groovy’s Map MOP makes m.name equivalent to m['name'].
- Java beans: calls the getter, e.g. getName().
- GroovyObject: dispatches via getProperty('name').
- Any user-defined class providing the property by any of the above routes.
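Because the lowering is ordinary property access, the same two statements work against each kind of source today. In this sketch the `Person` class is a hypothetical bean, and Expando stands in for a GroovyObject that answers getProperty dynamically.

```groovy
class Person {                            // hypothetical bean for illustration
    String name
    int age
}

def sources = [
    [name: 'Ada', age: 36],               // Map: rhs.name reads the 'name' key
    new Person(name: 'Ada', age: 36),     // bean: rhs.name calls getName()
    new Expando(name: 'Ada', age: 36)     // GroovyObject: rhs.name goes through getProperty('name')
]

sources.each { rhs ->
    // Hand-written equivalent of: def (name: n, age: a) = rhs
    def n = rhs.name
    def a = rhs.age
    assert n == 'Ada' && a == 36
}
```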
Type ascriptions pin or coerce binding types:
def (name: String n, age: int a) = person
Failure modes match expression-position property access:
| RHS shape | Property absent | Result |
|---|---|---|
| Map | Key not in map | Binding receives null |
| Bean / GroovyObject | No matching property | MissingPropertyException |
| Static type lacks the property | n/a | Compile error under @CompileStatic |
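The dynamic failure modes are those of ordinary property access, so they can be observed today. The `Account` class below is a hypothetical bean used only for this sketch.

```groovy
class Account {                           // hypothetical bean with a single property
    String owner
}

// Map: a missing key simply reads as null.
assert [owner: 'Ada'].balance == null

// Bean: a missing property raises MissingPropertyException.
def failure = null
try {
    new Account(owner: 'Ada').balance
} catch (MissingPropertyException e) {
    failure = e
}
assert failure != null
```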
Map-key restrictions
class is a reserved keyword and so cannot appear as a destructuring
key — def (class: c) = m is a syntax error. Identifiers like
metaClass parse, but m.metaClass resolves to the map’s MetaClass
via Groovy’s standard MOP, not to a key named metaClass. Users
needing either key fall back to subscript: def c = m['class'].
Mixing positional and map
Mixing positional and map-style declarators within a single def (…)
is a compile error in 6.x:
def (h, name: n, *t) = obj // error in 6.x
Disallowing this leaves the option open for a future spec; permitting it now would lock in semantics before use cases emerge.
_ and other identifiers
The "discard" convention recapped under
Using _ for unwanted slots (recap) applies uniformly across every
new form introduced in this proposal. _ is a regular identifier
throughout — no wildcard semantics, no special parser node, no
behavioural change beyond the pre-existing repeat-binder exemption
(GROOVY-10943, 5.0):
def (_, y, m) = Calendar.instance // pre-existing
def (_, *t) = list // _ binds to the head
def (h, *_) = list // _ binds to the rest
def (*_, last) = list // _ binds to the front
def (l, *_, r) = list // _ binds to the middle
def (name: _, age: a) = person // _ binds to person.name
def (_, _, day) = Calendar.instance // multiple _ in one list
_ is otherwise treated identically to any other identifier under
both compilation modes: the slot is bound, no deprecation warning is
emitted, and reads of _ follow the same dynamic-vs-typed
asymmetry described under Using _ for unwanted slots (recap):
reachable in dynamic mode, reported as undeclared under
@TypeChecked / @CompileStatic.
Compilation
Grammar additions
Two local additions to the multi-assignment production:
- In the declarator list, allow an optional leading * on at most one identifier (with optional type ascription).
- Add a key: ident (or key: Type ident) alternative to the declarator list, exclusive of positional declarators within a single def (…).
Empty declarator lists (def () = expr) remain a syntax error, as
today. No changes outside the multi-assignment production. No new
keywords. No reserved identifiers.
Lowering summary
| Shape | Lowering | RHS contract |
|---|---|---|
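| def (a, b) = rhs | Today’s lowering, unchanged | getAt(int) |
| def (h, *t) = rhs | Path A (Stream), Path B (getAt(IntRange)), or Path C (iterator()) | getAt(IntRange) or iterator() |
| def (*f, last) = rhs, def (l, *m, r) = rhs | Indexed access, with the IntRange slice emitted first | getAt(int) and getAt(IntRange) |
| def (name: n) = rhs | Property access | Property dispatch via MOP |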
Static type checking
Under @CompileStatic and @TypeChecked:
- Path A vs Path B vs Path C for tail rest is decided at compile time from the RHS’s declared type. Path B is determined by capability: STC probes rhs.getAt(IntRange) and accepts the receiver as Path B if the call resolves. Stream is the one deliberate exception (its getAt(Stream, IntRange) materialises eagerly, so it routes to Path A instead).
- Bindings receive their natural narrowed type. The tail rest’s type is whatever getAt(IntRange) actually returns: List<T> for lists / arrays / ranges; String for String / GString; BitSet for BitSet; the user-declared return type for custom classes. For Path A it is Stream<T>; for Path C (anything else iterable) it is Iterator<T>. For Map<String, V>, map-style bindings are V; for beans, bindings are the property’s declared type.
- If neither protocol is available on the RHS’s static type, the compiler reports an error.
Implicit type of the rest binding
An untyped rest binding (*t, *f, *m) takes the inferred type
shown in the Tail rest binding table. Head and middle rest are
restricted to Path B — a non-indexable RHS (Set, Iterator, Stream)
in those positions is a static type error.
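The inferred rest types can be approximated today by letting the static compiler check hand-written equivalents. The `RestTypes` class below is a hypothetical helper; its declared return types compile only because getAt(IntRange) already resolves to those types under @CompileStatic.

```groovy
import groovy.transform.CompileStatic

@CompileStatic
class RestTypes {                         // hypothetical helper, one overload per path
    // Path B on a List: the slice is statically a List<Integer>.
    static List<Integer> restOf(List<Integer> xs) { xs.getAt(1..-1) }

    // Path B on a String: the slice is statically a String.
    static String restOf(String s) { s.getAt(1..-1) }

    // Path C on a Set: the rest is the advanced Iterator<Integer>.
    static Iterator<Integer> restOf(Set<Integer> xs) {
        Iterator<Integer> iter = xs.iterator()
        if (iter.hasNext()) iter.next()
        return iter
    }
}

assert RestTypes.restOf([1, 2, 3]) == [2, 3]
assert RestTypes.restOf('hello') == 'ello'
```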
Dynamic dispatch
In dynamic mode, Path A vs Path B vs Path C for tail rest is selected
at runtime by MultipleAssignmentSupport.tailRest:
- List / CharSequence / array: fast-path dispatch via getAt(IntRange) with size-aware empty-slice handling.
- Stream: Path A (lazy iterator wrap, onClose chained).
- Any other receiver with a resolvable getAt(IntRange) (e.g. BitSet, user custom classes): MOP dispatch via getAt(IntRange).
- Otherwise: Path C (return the already-advanced iterator).
Head and middle rest go through nonTailRestSlice, which always
dispatches via the MOP. Failure modes are runtime exceptions
consistent with expression-position dispatch:
- Non-indexable sources without a getAt(IntRange) extension (iterators, sets, custom Iterables) fail fast with MissingMethodException.
- Stream head/middle rest in dynamic mode fails fast with IllegalArgumentException("reverse range") from StreamGroovyMethods.getAt(Stream, IntRange). Under @CompileStatic, head/middle rest from Stream is rejected at compile time.
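The tail-rest dispatch order can be sketched as an ordinary function on today's Groovy. `tailRestSketch` is a hypothetical stand-in written for illustration, not the real MultipleAssignmentSupport.tailRest; arrays are omitted for brevity, and take(0) is an assumed substitute for the size-aware empty-slice handling.

```groovy
import java.util.stream.Collectors
import java.util.stream.Stream
import java.util.stream.StreamSupport

// Hypothetical sketch of the dispatch order; returns [head, rest].
def tailRestSketch(rhs) {
    if (rhs instanceof Stream) {                                  // Path A
        def iter = rhs.iterator()
        def head = iter.hasNext() ? iter.next() : null
        def rest = StreamSupport.stream(Spliterators.spliteratorUnknownSize(iter, 0), false)
                .onClose { rhs.close() }
        return [head, rest]
    }
    if (rhs instanceof List || rhs instanceof CharSequence) {     // Path B, size-aware
        def rest = rhs.size() > 1 ? rhs.getAt(1..-1) : rhs.take(0)
        return [rhs.getAt(0), rest]
    }
    try {                                                         // Path B via the MOP
        def rest = rhs.getAt(1..-1)                               // slice first, so iterators are not consumed
        return [rhs.getAt(0), rest]
    } catch (MissingMethodException ignored) {                    // Path C
        def iter = rhs.iterator()
        def head = iter.hasNext() ? iter.next() : null
        return [head, iter]
    }
}

def (h1, t1) = tailRestSketch([1, 2, 3])
assert h1 == 1 && t1 == [2, 3]
def (h2, t2) = tailRestSketch(Stream.of(1, 2, 3))
assert h2 == 1 && t2.collect(Collectors.toList()) == [2, 3]
def (h3, t3) = tailRestSketch([1, 2, 3] as Set)
assert h3 == 1 && t3.next() == 2
```

Catching MissingMethodException to detect a missing getAt(IntRange) is a simplification: it would also swallow the same exception thrown from inside a user-defined getAt.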
RHS evaluation
The RHS expression is evaluated exactly once into a synthetic local. All bindings derive from that single evaluation. Order of binding assignment is left-to-right of declared positions, except that in non-tail-rest forms the IntRange call is emitted before any negative-index call (see Rest binding in head and middle positions).
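The single-evaluation rule already holds for the existing positional form, which makes it easy to observe. In this sketch `produce` is a hypothetical RHS with a side effect that counts how often it runs.

```groovy
def evaluations = 0
def produce = { evaluations++; [1, 2, 3] }   // RHS with an observable side effect

def (a, b, c) = produce()

assert [a, b, c] == [1, 2, 3]
assert evaluations == 1                       // three bindings, one evaluation
```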
Compatibility
Backwards compatibility
Every program valid in Groovy 4 / 5 compiles with identical semantics. The new shapes use grammar that does not parse today:
| Form | Parses today? |
|---|---|
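| def (a, b, c) = list | Yes, unchanged |
| def (int x, String y) = pair | Yes, unchanged |
| def (_, y, m) = Calendar.instance | Yes, _ remains an ordinary identifier |
| def (h, *t) = list | No, claimed by this proposal |
| def (*f, last) = list | No, claimed |
| def (l, *m, r) = list | No, claimed |
| def (name: n, age: a) = person | No, claimed |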
No syntactic ambiguity is introduced: *ident and key: ident in
declarator position both fail to parse today, so claiming them as new
grammar does not change the meaning of any existing program.
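One way to check these claims against a particular Groovy installation is to ask the running compiler. The `parses` closure below is a hypothetical probe; on a Groovy 4 or 5 runtime the document's claim implies the last two forms report false, but the result depends on the version in use.

```groovy
import org.codehaus.groovy.control.CompilationFailedException

// Hypothetical probe: true when the running Groovy can compile the snippet.
def parses = { String source ->
    try {
        new GroovyShell().parse(source)
        return true
    } catch (CompilationFailedException ignored) {
        return false
    }
}

def forms = [
    'def (a, b) = [1, 2]',
    "def (int x, String y) = [1, 'one']",
    'def (h, *t) = [1, 2, 3]',
    'def (name: n, age: a) = [name: 1, age: 2]'
]
forms.each { form -> println "${parses(form)}\t${form}" }
```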
_ is not deprecated
This proposal explicitly does not deprecate _ as an identifier in
any context. Existing idioms (def (_, y, m) = Calendar.instance,
(0..n).each { _ -> … }, etc.) continue to compile with no warning.
Forward compatibility
The parens form claims grammar that is unparseable today and uses
tokens (*, key:) that have no other meaning in declarator position.
This leaves the unrelated def […] bracket-literal grammar entirely
free for any future pattern-matching or destructuring proposal that
wants to use it, with no risk of conflicting reinterpretation.
Excluded and deferred features
The following forms were considered and intentionally left out of this proposal. Some are deferred to a future spec (typically alongside GEP-19 pattern matching); others are not planned.
| Feature | Status | Rationale |
|---|---|---|
| Bracket form | Deferred | Out of scope for the parens-form additions; would belong with a dedicated bracket-pattern proposal |
| Wildcard _ | Not planned for parens form | Existing identifier semantics preserved; any future wildcard form belongs in a separate grammar (e.g. bracket-form patterns) |
| Mixing positional and map-style in one def (…) | Not planned for 6.x | Semantics not obvious; defer pending concrete use cases |
| Multiple rest bindings (def (a, *b, *c, d)) | Not planned | Inherently ambiguous; no language ships this |
| Nested patterns (e.g. …) | Deferred | Both bracket-form and recursive-parens shapes are currently unparseable, so either remains claimable. Choice of syntax should fall out of a dedicated pattern-grammar proposal (GEP-19) rather than being settled here. |
| Head/middle rest against a non-indexable RHS | Not planned | No clean semantics; iterators / streams fail with … |
| Map destructuring with renaming via the … | Deferred | |
| Default values (e.g. …) | Deferred | Useful for map-style destructuring on partial inputs; can be added later without grammar collision |
| Per-binder modifiers (e.g. …) | Deferred | Modifiers belong on the outer declaration: … |
Preserving Stream characteristics / parallelism through Path A |
Deferred |
The Path A tail wrap ( |