GEP-19
Abstract
This GEP extends Groovy’s switch expression with structural pattern
matching for lists, maps, and types, building on the record patterns
introduced in 5.0 and aligning with Java’s pattern-matching trajectory
(JEPs 440, 441, 456, 507).
The proposal adds list patterns ([var h, var... t]), map patterns
([name: var n, var... rest]), type patterns with binding (String s),
guard clauses (when), and a unified wildcard form (_ inside record
patterns; var _ elsewhere). The same pattern grammar is also accepted
in def […] = … bracket-form declarations, providing a consistent
destructuring grammar between switch case labels and assignment for
users who want it. The everyday parens form (def (…) = …) is
covered separately by GEP-20 and ships in Groovy 6.x.
All existing switch semantics are preserved: legacy isCase matching
(constants, ranges, regex, classes, collections, maps, closures) compiles
exactly as it does today. Pattern matching is opt-in via the binding
markers that distinguish a structural pattern from a legacy case label.
Motivation
Groovy’s switch already exceeds Java’s classic switch through the
isCase protocol — case labels can be classes, ranges, regex patterns,
collections, and closures. Groovy 5.0 added record patterns aligned with
Java 21’s JEP 440. What remains absent is structural decomposition of
the most common data shapes: lists, maps, and arrays.
Languages that have shipped this — Scala, Rust, Swift, Kotlin (limited), JavaScript (via destructuring), Python (via PEP 634), and Jactl — show that destructuring in switch / match is one of the most-used pattern matching idioms once available. Without it, the canonical functional expression of recursive list algorithms remains awkward in Groovy:
// Without structural matching
def qsort(list) {
    if (list.size() <= 1) return list
    def h = list[0]
    def t = list.drop(1)
    qsort(t.findAll { it < h }) + [h] + qsort(t.findAll { it >= h })
}
// With structural matching (this proposal)
def qsort(x) {
    switch (x) {
        case [] -> []
        case [var h, var... t] -> qsort(t.findAll { it < h }) + [h] + qsort(t.findAll { it >= h })
    }
}
Java is moving in the same direction. Beyond what has shipped, draft
JEPs cover deconstructor methods for arbitrary classes, primitive
patterns in switch are in preview (JEP 507), and array patterns are
under discussion. Aligning Groovy’s surface syntax with Java’s where
they overlap, and lowering through a uniform internal representation,
keeps Groovy forward-compatible without implementing the feature twice.
Design principles
- No legacy regression: every program valid under Groovy 6 compiles with identical semantics in 7. The disambiguation between legacy isCase and structural patterns is parser-decidable and based solely on the presence of binding markers.
- Java alignment where syntax overlaps: when guards, type patterns, record patterns, and _ unnamed pattern syntax match Java verbatim. Rest bindings reuse Groovy’s existing varargs syntax (Type... ident), the form Java is most likely to adopt for variadic deconstructor components.
- Groovy-native where Java has no analogue: list and map literal patterns ([…], [k: v]) have no Java counterpart because Java has no list or map literals. Groovy can lead here without future drift risk. In all binding positions, var and def are interchangeable (matching Groovy’s existing local-variable convention and GEP-20), and rest slots accept the shortcut ... (or ... ident) for var... _ (or var... ident).
- Shared bracket-form grammar with assignment: the pattern grammar is accepted by def […] = expr declarations as by case […] -> labels. The parens form (def (…) = …) covered by GEP-20 remains the canonical surface for everyday destructuring; the bracket form is the opt-in bridge to switch pattern grammar.
- Forward-compatible lowering: list and map patterns desugar through an internal Deconstructable strategy. When Java’s deconstructor JEP ships, surface forms like case List.of(var h, var... t) -> slot in as additional spellings of the same lowering, with no architectural change needed.
Features
Type patterns with binding
A type pattern matches if the switch value is an instance of the named type, binding a new local variable in the case body:
switch (obj) {
    case String s -> s.toUpperCase()
    case Integer i -> i * 2
    case Number n -> n.doubleValue()
    default -> null
}
The bound name is narrowed to the declared type within the case body
and any when guard. @CompileStatic and @TypeChecked see the
narrowed type via the same flow analysis already used for record
patterns.
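To make the binding and narrowing concrete, the sketch below hand-writes roughly what the switch above is described as doing, using only syntax that already runs on current Groovy. The method name describe is an assumption made for illustration, and the code approximates the semantics rather than showing the compiler's actual output.

```groovy
// Hand-written approximation of the type-pattern switch above.
// `describe` is a hypothetical name used only for this illustration.
def describe(Object obj) {
    if (obj instanceof String) {
        String s = (String) obj      // binding, narrowed to String
        return s.toUpperCase()
    }
    if (obj instanceof Integer) {
        Integer i = (Integer) obj
        return i * 2
    }
    if (obj instanceof Number) {     // reached only for numbers that are not Integer
        Number n = (Number) obj
        return n.doubleValue()
    }
    null                             // default arm
}
```

Arm order matters here just as in the switch: the Integer test has to precede the Number test, otherwise the more general arm would capture every integer.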
Record patterns
Record patterns (introduced in Groovy 5.0) deconstruct record-typed
values positionally. Component positions accept nested patterns,
including the unnamed pattern _:
record Point(int x, int y) {}
record Line(Point start, Point end) {}
switch (obj) {
    case Point(int x, int y) -> "$x,$y"
    case Point(int x, _) -> "x=$x"
    case Line(Point(_, _), Point p2) -> "ends at $p2"
}
Bare _ is permitted inside Type(…) because record-pattern
component grammar is not expression grammar — there is no collision
with _ as an identifier in expression position.
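As a rough illustration of what a nested record pattern checks, the sketch below spells out the Line(Point(_, _), Point p2) arm by hand. It assumes a Groovy version with record support; the method name endOfLine and the exact result string are choices made for this example, not part of the proposal.

```groovy
record Point(int x, int y) {}
record Line(Point start, Point end) {}

// Hand-written approximation of: case Line(Point(_, _), Point p2) -> ...
// `endOfLine` is a hypothetical name; it returns null when the pattern does not match.
def endOfLine(Object obj) {
    if (!(obj instanceof Line)) return null
    Line line = (Line) obj
    // Point(_, _): the component must be a Point, but nothing is bound
    if (!(line.start() instanceof Point)) return null
    // Point p2: type test plus binding
    if (!(line.end() instanceof Point)) return null
    Point p2 = line.end()
    "ends at (${p2.x()}, ${p2.y()})".toString()
}
```

The unnamed components contribute only a type test on the enclosing Point; no accessor result is stored for them.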
List patterns
List patterns destructure List, array, and Iterable values
structurally. Element positions accept literals, type patterns,
var/def bindings, nested patterns, and a single rest binding:
switch (xs) {
    case [] -> "empty"
    case [...] -> "non-empty list (any shape)"
    case [var only] -> "single: $only"
    case [var h, var... t] -> "h=$h, t=$t"
    case [def h, ... t] -> "h=$h, t=$t (using shortcuts)"
    case [Integer h, var... t] -> "int head $h"
    case [var first, var... middle, var last] -> "$first..$last"
    case [1, var x, ...] -> "starts with 1, then $x"
}
var and def are interchangeable in any binding position. The
shortcut ... is accepted for var... _ (and ... ident for
var... ident); see Rest bindings below.
Element count rules:
- Without rest: a pattern of n elements matches values of exactly n elements.
- With one rest: a pattern of n fixed elements plus a rest binding matches values of at least n elements. Rest may appear in any single position. Multiple rest bindings at the same level are a compile error.
Type rules:
- For List<T> input, untyped element bindings are inferred as T, the rest binding as List<T>.
- For T[] input, the rest binding is inferred as T[].
- For other Iterable<T> input, the iterable is materialised as a list once for matching.
- For Object input, bindings fall back to Object and List.
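The element count rules can be sketched in current Groovy as a small helper. Everything here is hypothetical (the name matchList, its parameters, and the shape of its result); it only illustrates how a pattern with a rest slot in any single position splits a list, under the assumption that the value is already a List.

```groovy
// Hypothetical helper, not part of the proposal: matches `value` against a list
// pattern with `before` fixed elements ahead of the rest slot and `after` behind it.
// Returns the bound slices, or null when the element-count rule rejects the value.
Map matchList(List value, int before, int after, boolean hasRest) {
    int fixed = before + after
    boolean sizeOk = hasRest ? value.size() >= fixed : value.size() == fixed
    if (!sizeOk) return null
    int restEnd = value.size() - after
    return [prefix: value.subList(0, before),
            rest  : hasRest ? value.subList(before, restEnd) : null,
            suffix: value.subList(restEnd, value.size())]
}
```

Under these assumptions, [var first, var... middle, var last] corresponds to matchList(xs, 1, 1, true): a two-element list matches with an empty middle, while a one-element list does not match.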
Map patterns
Map patterns destructure Map values by key. Keys are compile-time
constant expressions; values are arbitrary patterns:
switch (m) {
    case [name: var n, age: var a] -> "person $n, $a"
    case [type: 'circle', radius: var r] -> "circle r=$r"
    case [name: String n, ... rest] -> "named $n; others=$rest"
    case [name: def n, ...] -> "any named map (others discarded)"
}
Map pattern semantics are open: a pattern matches if all named keys are present and their value patterns match. Extra keys in the map are ignored unless captured by a rest binding. Closed semantics — "exactly these keys" — is expressed via a guard:
case [name: var n] when ((Map) m).size() == 1 -> ...
The rest binding var... rest in a map pattern binds a Map of the
entries not matched by named keys.
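The open-matching rule and the map rest binding can be approximated in current Groovy as follows. The helper name matchMap and its result shape are assumptions for illustration; the sketch checks key presence only and leaves value patterns out.

```groovy
// Hypothetical helper, not part of the proposal: open matching of the named keys,
// with the unmatched entries collected the way a rest binding would see them.
Map matchMap(Object value, List keys) {
    if (!(value instanceof Map)) return null
    Map map = (Map) value
    // every named key must be present; extra keys are allowed
    if (!keys.every { map.containsKey(it) }) return null
    return [bound: map.subMap(keys),
            rest : map.findAll { key, val -> !keys.contains(key) }]
}
```

A closed match ("exactly these keys") then amounts to additionally requiring that rest is empty, which is the same condition the size() guard above expresses.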
Empty literals
The empty list literal [] and empty map literal [:] in case-label
position are always treated as patterns matching empty collections of
the appropriate kind:
case [] -> "empty list"
case [:] -> "empty map"
The legacy isCase semantics for these — never matching anything,
because [].contains(x) and [:].get(x) are always false / null —
have no practical use, so claiming the pattern interpretation removes
no functionality.
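The legacy behaviour described above can be checked on current Groovy by calling isCase directly, which is the method a classic case label is asked to evaluate. This is a small demonstration, not part of the proposal.

```groovy
// Today's switch asks the case label: label.isCase(switchValue).
assert ![].isCase([])      // list isCase delegates to contains(); [] contains nothing
assert ![].isCase('x')
assert ![:].isCase('k')    // map isCase looks the value up as a key; [:] has no keys
assert ![:].isCase([:])
```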
Wildcards
The unnamed pattern matches any value without binding it. The form depends on context:
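| Position | Form | Reason |
|---|---|---|
| Inside Type(…) record patterns | _ | No expression-grammar collision; matches Java 22 (JEP 456) |
| Inside […] list and map patterns | var _ (or def _) | Bare _ parses as an identifier in expression position |
| Rest discard | var... _ or ... | See Rest bindings below |
| Top-level case label | | Bare |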
Rest bindings
Rest bindings collect remaining elements into a single binding. The
canonical form is var... ident (or def... ident / Type... ident),
reusing Groovy’s existing varargs token sequence, the same shape as
int... args in method parameter lists:
case [var h, var... t] -> ... // canonical
case [def h, def... t] -> ... // equivalent (var/def interchangeable)
case [var h, Integer... t] -> ... // typed rest
case [var h, var... _] -> ... // discarded rest, canonical
This spelling also matches the shape Java is most likely to adopt for
variadic deconstructor components, which would keep a future
case List.of(var h, var... t) surface form (if Java specifies it)
consistent with the list-literal form.
... shortcut
The triple-dot ... is accepted as a shortcut wherever var... (or
def...) appears, reflecting that the leading var / def is
ceremonial once ... has signalled the rest position:
case [var h, ... t] -> ... // shortcut for `var... t`
case [var h, ...] -> ... // shortcut for `var... _`
case [...] -> ... // matches any list
Both ... ident and bare ... flip a […] from legacy to pattern
interpretation on their own. The ... token has no expression-position
meaning today (it is reserved only for varargs in method declarations
and enhanced-for index variables), so claiming it as pattern grammar
reinterprets no existing program.
Bare ... matches any list (including empty), since the rest can
absorb zero elements. It pairs naturally with case [] ->:
case [] -> "empty"
case [...] -> "non-empty (the empty case is matched above)"
The typed form Integer... t is not further shortened to
... t, because the type ascription carries semantic content (a
runtime element-type check) that bare ... would discard.
For reference, GEP-20’s parens-form def (…) = … uses *ident
and * for the same role: they are the parens-form analogue of
var... ident and var... _ (or the ... shortcut). The *
spelling is not adopted inside […] patterns here because *
collides with list-literal spread in expression position; see
Excluded and deferred features for the relationship between the
two and the conditions under which the parser could in principle
accept the * spelling inside an already-disambiguated pattern.
Guards
when guards apply to patterns:
case Integer i when i > 0 -> "positive"
case [var h, var... t] when t.size() > 5 -> "long list head=$h"
case Point(int x, int y) when x == y -> "diagonal point"
Guards may reference any binding from the same pattern. A guard is evaluated once, after its pattern has matched; if the guard fails, matching continues with the next arm.
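The evaluation order can be illustrated with a hand-written approximation in current Groovy. The method name classify and the second, unguarded Integer arm are assumptions added for this example; they show what happens when the guard of the first arm fails.

```groovy
// Hand-written approximation of two arms:
//   case Integer i when i > 0 -> "positive"
//   case Integer i            -> "non-positive integer"
// `classify` is a hypothetical name used only for this illustration.
def classify(Object obj) {
    if (obj instanceof Integer) {
        Integer i = (Integer) obj
        if (i > 0) return 'positive'   // guard runs once, after the type test succeeds
        // guard failed: no result yet, so matching moves on to the next arm
    }
    if (obj instanceof Integer) {
        return 'non-positive integer'
    }
    'other'                            // default arm
}
```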
Patterns in instanceof
Type patterns and record patterns extend to instanceof, mirroring
Java:
if (obj instanceof String s) {
    println s.length()
}
if (point instanceof Point(int x, int y)) {
    println "$x, $y"
}
List and map patterns are not valid in instanceof because they have
no type at the head. To test "is this value shaped like …", use a
switch expression with a single pattern arm and a default arm, returning a boolean.
Disambiguation rule
A […] or [k: v, …] case label is parsed as a structural
pattern if and only if at least one of the following holds:
- The literal is empty ([] or [:]).
- Some element (or value, in maps) is a binding form:
  - var <ident> / def <ident>: e.g. var h or def h (interchangeable)
  - var _ / def _: unnamed binding
  - <Type> <ident>: e.g. Integer h
  - <Type> _: type-checked unnamed binding
  - var... <ident> / def... <ident>: rest binding
  - ... <ident>: rest binding (shortcut for var... <ident>)
  - <Type>... <ident>: typed rest binding
  - var... _ / def... _: rest discard
  - ...: rest discard (shortcut for var... _)
- Some element is a nested pattern (record pattern with at least one unambiguous binding form among its components, nested list pattern, nested map pattern).
Otherwise, the label retains its legacy isCase semantics — exactly
today’s behaviour.
The rule is parser-local and does not depend on surrounding scope. Every binding form listed above currently fails to parse as a Groovy expression in list-literal or map-literal value position, so claiming them as pattern grammar does not change the meaning of any program valid in Groovy 6.
The bare identifier _ continues to parse as an identifier in
expression position — case [_] -> therefore retains its legacy
meaning. Users wanting a single-element wildcard pattern write
case [var _] ->.
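The legacy reading of [_] can be observed on current Groovy, where _ is an ordinary identifier. The variable names below are chosen for this illustration only.

```groovy
def _ = 42                   // `_` is an ordinary identifier in current Groovy
def label = [_]              // a list literal containing the value of `_`
assert label == [42]
assert label.isCase(42)      // legacy case [_] matches values contained in the list
assert !label.isCase([42])   // it does not mean "any single-element list"
```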
Bracket-form assignment
In Groovy 7.0, def […] = expr accepts the same pattern grammar as
switch case labels. This is the bridge between switch and assignment
destructuring:
def [var h, var... t] = list // canonical
def [def h, ... t] = list // equivalent (shortcuts)
def [Integer h, var... t] = list // typed head
def [var first, var... middle, var last] = list // rest in middle
def [name: var n, age: var a] = person // map destructuring
def [Point(int x, int y), ... rest] = list // nested record pattern
The bracket form supports the full switch pattern grammar: nested
patterns, type bindings with narrowing, wildcard _ (via var _
or def _), typed rest, and record patterns. A binding marker
(var, def, a type, or ... for rest) is required inside […]
for the same disambiguation reason as in case labels: without
one, the literal would parse as today’s legacy list literal.
The parens form (def (…) = …) is a separate, simpler grammar
covered by GEP-20 and shipped in Groovy 6.x. It remains canonical for
everyday destructuring (positional bindings, simple rest, map-style
keys) and does not require var markers because the surrounding def
already declares the names. The two forms coexist:
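| Form | Capabilities |
|---|---|
| Parens (def (…) = …) | Positional bindings, simple rest with *ident, map-style keys |
| Bracket (def […] = …) | Full pattern grammar: nested patterns, wildcard _ (via var _ or def _), type bindings with narrowing, typed rest, record patterns |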
Lowering for bracket-form assignment
The bracket form lowers via the same Deconstructable strategy as
switch case labels, with one accommodation: tail-rest forms accept
the same RHS contract as GEP-20’s parens form (getAt(IntRange) or
iterator() fallback), so iterators and unbounded sources work in
assignment context. List patterns in switch case labels do not
accept iterators — pattern matching against an iterator would
destructively consume it, which is surprising for a match operation.
The divergence between contexts is therefore:
- def [var h, var... t] = iter: accepted, uses iterator() fallback, t is the iterator (matching GEP-20’s parens form).
- case [var h, var... t] -> ... against an iterator: does not match; list patterns require List, array, or an Iterable that can be materialised non-destructively.
A failed match in a bracket-form declaration throws
IllegalArgumentException. Code that needs to handle a non-matching value should use switch instead.
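The assignment-context behaviour described in this section can be sketched by hand in current Groovy. The method name headTail is hypothetical, the exception messages are invented, and list.drop(1) stands in for the getAt(IntRange) contract mentioned above; treat it as an approximation of the described semantics.

```groovy
// Hand-written sketch of `def [var h, var... t] = rhs` in assignment context.
// `headTail` is a hypothetical name; it returns [h, t] so the result can be inspected.
List headTail(Object rhs) {
    if (rhs instanceof List) {
        List list = (List) rhs
        if (list.isEmpty()) {
            throw new IllegalArgumentException('pattern needs at least one element')
        }
        return [list[0], list.drop(1)]   // t is a List
    }
    Iterator iter = rhs.iterator()       // fallback for iterators and other sources
    if (!iter.hasNext()) {
        throw new IllegalArgumentException('pattern needs at least one element')
    }
    return [iter.next(), iter]           // t is the partially consumed iterator
}
```

In the iterator case the head has already been consumed when the function returns, which is the destructive effect that the switch context avoids by refusing to match iterators.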
Compilation
List and map patterns lower through an internal Deconstructable
strategy that performs:
- a type check (instanceof List, instanceof Map, instanceof T[], etc.),
- size or key checks for the named elements,
- component extraction (get(int), subList, containsKey/get, key-set difference for rest),
- binding assignment.
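The four steps can be traced by hand for a single arm, case [Integer h, var... t] -> ..., using current Groovy. The function name matchIntHead and its map-shaped result are assumptions for illustration; the real Deconstructable strategy is internal and not specified here, and arrays are simply viewed as lists rather than keeping a T[] rest binding.

```groovy
// Hand-written sketch of the four steps for: case [Integer h, var... t] -> ...
// `matchIntHead` is a hypothetical name; it returns the bindings, or null on no match.
Map matchIntHead(Object value) {
    // 1. type check (arrays are viewed as lists here for brevity)
    List list
    if (value instanceof List) {
        list = (List) value
    } else if (value instanceof Object[]) {
        list = Arrays.asList((Object[]) value)
    } else {
        return null
    }
    // 2. size check: one fixed element plus a rest slot needs at least one element
    if (list.size() < 1) return null
    // 3. component extraction, including the element's own type pattern
    def head = list.get(0)
    if (!(head instanceof Integer)) return null
    List tail = list.subList(1, list.size())
    // 4. binding assignment
    return [h: head, t: tail]
}
```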
Record patterns reuse the lowering already present in 5.0. When Java’s
deconstructor JEP ships, surface forms like
case List.of(var h, var... t) -> and
case Map.of("name", var n) -> are accepted as additional spellings
that lower to the same Deconstructable calls, with no re-architecture.
Implementation considerations:
- Switch dispatch on JDK 21+ should evaluate java.lang.runtime.SwitchBootstraps.typeSwitch as the dispatch path for arms whose patterns admit it, on parity with how Java compiles pattern switch.
- Bindings live in synthetic locals; the static type checker propagates narrowed types into them.
- For dynamic Groovy, bindings are typed Object and runtime checks dominate; for @CompileStatic, narrowed types let the JIT see through.
- Iterable inputs that are not List or array are materialised as a list once per match attempt; this is observable for iterators with side effects.
Java alignment
| Java feature | Status | Groovy alignment |
|---|---|---|
| Pattern matching for instanceof | Shipped | Type and record patterns valid in instanceof |
| Record patterns (JEP 440, 21) | Shipped | Already in Groovy 5.0; extended in this proposal |
| Pattern matching for switch | Shipped | Arrow-form |
| when guards | Shipped | Adopted verbatim |
| Unnamed patterns | Shipped | Adopted in Type(…) record patterns; var _ elsewhere |
| Primitive patterns (JEP 507) | Preview | Groovy already coerces; revisit when finalised |
| Deconstructors for arbitrary classes | Draft | |
| Array patterns | Discussed | If Java picks |
Excluded and deferred features
| Feature | Status | Rationale |
|---|---|---|
| * and *ident rest spelling inside […] patterns | Deferred | The triple-dot shortcut ( |
| Or-patterns ( | Deferred | Existing comma-separated case labels ( |
| Type-prefixed list patterns (e.g. | Deferred | Awaits clarity from Java’s array pattern direction; for now, type information flows from the switch input |
| Patterns in | Deferred | |
| Patterns in catch | Not planned | Multi-catch already covers type unions; structural matching of exception state is rare |
| Closed map patterns ("exactly these keys") | Not planned | Expressible via guard; dedicated syntax not warranted |
| Bare | Not planned | Conflicts with legacy |
| Exhaustiveness enforcement | Warn-only initially | Java errors for sealed-type expression switches; Groovy emits a warning in 7.0 and may escalate via opt-in flag in 7.x |
Compatibility
Backwards compatibility
Every program valid in Groovy 6 compiles with identical semantics in Groovy 7. The disambiguation rule is purely parser-local and is triggered only by syntactic forms that currently fail to parse:
| Form | Parses today (Groovy 6 with GEP-20)? |
|---|---|
| | Yes, legacy |
| | Yes, legacy |
| | Yes, legacy (list containing |
| | No, new, claims unused grammar |
| | No, new (Groovy-native equivalent) |
| | No, new ( |
| | No, new |
| | No, new (5.0 supported only |
| | No, new (bracket form) |
| | No, new (bracket form) |
| | Yes, covered by GEP-20 in Groovy 6.x |
| | Yes, covered by GEP-20 in Groovy 6.x |
Two […] forms have legacy semantics that never usefully match,
case [] -> and case [:] ->, and are reinterpreted as empty-list
and empty-map patterns respectively. No existing program depends on
these never-matching legacy forms.
_ semantics across forms
Wildcard _ semantics introduced in this proposal apply inside
[…] list and map patterns, and inside Type(…) record
patterns. The parens-form assignment def (…) = expr (covered by
GEP-20) continues to treat _ as a regular identifier indefinitely,
matching GEP-20’s explicit non-deprecation. Existing idioms such as
def (_, y, m) = Calendar.instance keep compiling unchanged with no
warning in Groovy 7.
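| Context | _ meaning |
|---|---|
| Parens-form assignment def (…) = expr (GEP-20) | Identifier, unchanged from today |
| […] list patterns (var _) | Wildcard, bind-and-discard |
| […] map patterns (var _) | Wildcard, bind-and-discard |
| Type(…) record patterns | Wildcard, bind-and-discard (Java-aligned) |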
This scoping means GEP-19 introduces no behavioural break for any
existing program: the wildcard semantics live in grammar ([…]
patterns, Type(…) patterns) that does not parse today.