Readable Scala Style

2021-03-02

The following is a somewhat opinionated style guide aimed at developers who come primarily from a Java background, want to benefit from the mainstream (also known as Future-based) Scala stack, and want to do so while optimizing their reward-to-effort ratio. This guide is well suited for teams that need a reasonable set of simple guidelines to follow, especially at the beginning, until everyone develops a good intuition for Scala.

Signal your intention with crystal clear names

When it comes to programming, it's better to be wrong than vague. This implies that you should document your intent in code as clearly as possible. For example, if you have a choice between calling a variable id or documentId and you know that you're dealing with documents, choose the latter. Generic names such as correlationId are OK in generic code, but using them in business logic is usually a bad idea. Likewise, avoid using data types as names, i.e., it's better to use updated or created rather than timestamp or date.
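
As a small illustration of the naming advice above, here is a sketch with two hypothetical case classes (VagueDocument and Document are made up for this example):

```scala
import java.time.Instant

// Vague: the names repeat the data types and say nothing about the role.
final case class VagueDocument(id: String, timestamp: Instant)

// Clear: each name states what the value means in the domain.
final case class Document(documentId: String, created: Instant, updated: Instant)
```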

Minimize the LOC measure by leaning towards longer lines

This is somewhat controversial, but when it comes to reading code, I much prefer something that doesn't spread across multiple screens. Of course, this doesn't mean that you should try to fit everything into a single line. Rather, try to group operations so that they make up a single logical action. For example, if you have a collection of elements and you need to get something from it through a sequence of method calls, feel free to keep them on the same line. If the level of abstraction throughout the method is as consistent as it should be, every line can be viewed as a single finished thought.

Strive for methods that mostly consist of single line val assignments

If a method mostly consists of val assignments, it forms a structure that is very easy to follow. Provided that you named your values and methods reasonably well, your Scala code should read like English. Occasional branching is OK, but it shouldn't distract the reader from the main idea, and the main flow should be easily evident.
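
A minimal sketch of what such a method might look like. OrderLine, Pricing and totalWithTax are hypothetical names chosen for illustration:

```scala
final case class OrderLine(price: BigDecimal, quantity: Int)

object Pricing {
  // Each line is one named step; the last expression is the result.
  def totalWithTax(lines: List[OrderLine], taxRate: BigDecimal): BigDecimal = {
    val subtotal = lines.map(line => line.price * BigDecimal(line.quantity)).sum
    val tax = subtotal * taxRate
    val total = subtotal + tax
    total
  }
}
```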

Keep nesting to a reasonable level

Scala offers a great many tools to avoid excessive nesting. Rather than building a pyramid of flatMaps, we can always write a for expression instead. Instead of writing multiple nested if-expressions, we can wrap everything in Try and then use a Failure with an application-specific exception to signal an erroneous condition. If we're only interested in the happy path and don't care which operation caused an error, we can, yet again, use a for expression.
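
To make the difference concrete, here is a sketch that expresses the same logic twice, first as a pyramid of flatMaps and then as a for expression. The Nesting object and its methods are hypothetical:

```scala
import scala.util.Try

object Nesting {
  def parse(raw: String): Try[Int] = Try(raw.trim.toInt)

  // A pyramid of flatMaps: each new step adds a level of nesting.
  def sumNested(a: String, b: String, c: String): Try[Int] =
    parse(a).flatMap { x =>
      parse(b).flatMap { y =>
        parse(c).map { z =>
          x + y + z
        }
      }
    }

  // The same logic as a flat for expression.
  def sumFlat(a: String, b: String, c: String): Try[Int] =
    for {
      x <- parse(a)
      y <- parse(b)
      z <- parse(c)
    } yield x + y + z
}
```

Both versions fail with the first parsing error they encounter, so the for expression only changes the shape of the code, not its behavior.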

Do not expose more than necessary

Even though Scala uses public visibility by default, it's still better to limit it whenever possible, as you would in Java or C#. If a method is only used by one other method, consider moving it inside. If a method or constant is only used within the class, it must be made private. If a public utility method is only used in one place, it's usually better to move it to where it's needed and, again, make it private. If a public utility method from a common module is only used within one service, it must be moved out of the common module. Also, check out Neal Ford's Stories Every Developer Should Know to learn about the dangers of inappropriate code reuse.
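
A short sketch of a private constant and a helper moved inside the only method that uses it. Slugs, slugify and MaxLength are hypothetical names:

```scala
object Slugs {
  // Only used within this object, so it is private.
  private val MaxLength = 40

  def slugify(title: String): String = {
    // Only slugify needs this helper, so it lives inside it.
    def normalize(raw: String): String =
      raw.trim.toLowerCase.replaceAll("[^a-z0-9]+", "-")

    normalize(title).take(MaxLength)
  }
}
```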

Do not keep mutable state in service classes

Most service classes are stateless collections of methods grouped together because they perform related business functions and probably have similar or overlapping dependencies. Because they are stateless, they can safely be instantiated as @Singletons without any concern about concurrency. This is exactly the opposite of what classical object-oriented programming tells us to do, and this is perfectly fine as we're not doing OOP here.

Avoid putting logic in case classes

Whereas service classes should contain only methods but not state, case classes should contain state but not logic. Again, this is in direct contrast to what OOP tells us to do. Also, keep in mind that if the case class in question resides in the commons package, everything that was said about keeping exposure to a minimum is still relevant here.

Always destructure tuples

It's easy enough to create a tuple, but it's not always easy to read code that is full of them. If you're calling zipWithIndex, for example, it's better to destructure the pair immediately, thereby avoiding cryptic calls to ._1 and ._2. Likewise, never return a tuple from a method, especially a public one. If the need arises, use a properly named case class (possibly defined within a service class to limit its scope).
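
The sketch below contrasts the two styles with zipWithIndex and shows a named case class in place of a returned pair. All names (Tuples, PageRange, pageRange) are made up for this example, and pages are assumed to be zero-based:

```scala
object Tuples {
  val names: List[String] = List("Ada", "Grace", "Linus")

  // Cryptic: the reader has to remember what _1 and _2 stand for.
  val crypticLabels: List[String] =
    names.zipWithIndex.map(pair => s"${pair._2 + 1}. ${pair._1}")

  // Clearer: destructure the pair right away.
  val labels: List[String] =
    names.zipWithIndex.map { case (name, index) => s"${index + 1}. $name" }

  // A named result instead of returning (Int, Int).
  final case class PageRange(first: Int, last: Int)

  def pageRange(page: Int, pageSize: Int): PageRange = {
    val first = page * pageSize
    val last = first + pageSize - 1
    PageRange(first, last)
  }
}
```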

Use for expressions for monadic sequencing only

In theory, for expressions can be used to express methods such as map, flatMap, filter and foreach, so in a way they behave like a mixture of the do notation from Haskell and list comprehensions from Python. In practice, filter and foreach are better left unsugared, especially when collections are involved. On the other hand, sequencing monadic types like Future, Try or IO with a for expression is a great way to make code more readable and reduce nesting.
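
For collections, the contrast might look like the following sketch (ForUsage is a hypothetical object). Both values hold the same result, but the second one makes the filter and the map explicit:

```scala
object ForUsage {
  val numbers: List[Int] = List(1, 2, 3, 4, 5, 6)

  // Sugared: works, but hides what is really a filter followed by a map.
  val sugared: List[Int] = for (n <- numbers if n % 2 == 0) yield n * n

  // Unsugared: the collection pipeline is explicit.
  val unsugared: List[Int] = numbers.filter(_ % 2 == 0).map(n => n * n)
}
```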

Use val whenever possible and only resort to var if necessary

Unlike Haskell, Scala has the var keyword and allows developers to define mutable variables. However, in an expression-centric language they are almost never needed. For the most part, you should only consider using a method-local var in the rare case of performance optimization. Everything else is better done with vals.
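
A minimal sketch of the same computation written with a method-local var and as a single expression (Totals and its methods are hypothetical):

```scala
object Totals {
  // Imperative version with a method-local var.
  def totalLengthVar(words: List[String]): Int = {
    var total = 0
    words.foreach(word => total += word.length)
    total
  }

  // Expression-based version: no mutation needed.
  def totalLength(words: List[String]): Int =
    words.map(_.length).sum
}
```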

Consider replacing while loops with @tailrec functions

Most iterations in Scala can be expressed in terms of library functions. However, sometimes there is a need to exit the iteration early and still return a result. This can be done either imperatively with while, var and breaks or with recursion. In many cases, recursion results in more understandable code, so take the time to try implementing it, but make sure that it can be annotated with @tailrec.
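
Here is a sketch of an early-exit iteration written as a tail-recursive helper. The Budget object and countWithinBudget are hypothetical; the method counts how many leading prices fit into a budget and stops at the first one that doesn't:

```scala
import scala.annotation.tailrec

object Budget {
  def countWithinBudget(prices: List[Int], budget: Int): Int = {
    // The compiler rejects @tailrec if the recursive call is not in tail position.
    @tailrec
    def loop(remaining: List[Int], spent: Int, count: Int): Int =
      remaining match {
        case price :: rest if spent + price <= budget =>
          loop(rest, spent + price, count + 1)
        case _ =>
          count // either the list is exhausted or the next price doesn't fit
      }

    loop(prices, spent = 0, count = 0)
  }
}
```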

Prefer enumeratum to the standard Enumeration

The standard pattern for implementing enums in Scala is described on StackOverflow and often used by Scala beginners. This approach is very limited and in fact inferior to the standard Java enum in almost every way. Always prefer enumeratum as a way of implementing enums in Scala 2.X, and only resort to using Enumeration to support legacy systems.

Consider using union enums with Either to signal expected errors

Java has the notion of checked exceptions, which is mostly viewed by the Java community as a design mistake. However, this concept allows developers to differentiate between expected and runtime errors. Even though all exceptions in Scala are essentially RuntimeExceptions, in many situations it is necessary to make expected errors part of the API. If you're not using advanced libraries such as ZIO, the best strategy is to use Either with a custom error type as the Left value and the normal return value as the Right. The custom error type should ideally be an algebraic data type, i.e., it should consist of non-intersecting concrete values. Passing Strings with an arbitrary error message in English as Left is almost always a bad idea.
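
A minimal sketch of this approach, assuming a hypothetical withdrawal operation with amounts expressed in cents. WithdrawError, Account and withdraw are illustrative names only:

```scala
// The error type is a closed set of non-intersecting cases.
sealed trait WithdrawError
object WithdrawError {
  case object InvalidAmount extends WithdrawError
  final case class InsufficientFunds(missing: Int) extends WithdrawError
}

object Account {
  import WithdrawError._

  // Expected errors are part of the signature; the caller has to handle Left.
  def withdraw(balance: Int, amount: Int): Either[WithdrawError, Int] =
    if (amount <= 0) Left(InvalidAmount)
    else if (amount > balance) Left(InsufficientFunds(amount - balance))
    else Right(balance - amount)
}
```

Because WithdrawError is sealed, the compiler can warn about a non-exhaustive match when the caller forgets one of the cases.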

Use Option only when the absence of value is expected

If properly used, Option completely eliminates the problem of NullPointerException in Scala code. Ideally, it should be used to signal to the caller that the value may or may not be present. Since Option is effectively part of the API, it forces the caller to make sure that both possibilities are covered. For example, findById methods of a low-level repository might return an Option to signal that the value might not be present in the database. However, if the entity is expected to exist for some other higher-level operation, its absence should be signalled with a failure rather than None.
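
The distinction could be sketched as follows. User, UserRepository, GreetingService and UserNotFoundException are hypothetical, and the in-memory Map stands in for a real database:

```scala
import scala.util.{Failure, Success, Try}

final case class User(userId: Long, name: String)
final case class UserNotFoundException(userId: Long)
  extends Exception(s"User $userId not found")

// Low level: absence is a normal outcome, so the result is an Option.
class UserRepository(users: Map[Long, User]) {
  def findById(userId: Long): Option[User] = users.get(userId)
}

// Higher level: the user is expected to exist, so absence becomes a failure.
class GreetingService(repository: UserRepository) {
  def greet(userId: Long): Try[String] =
    repository.findById(userId) match {
      case Some(user) => Success(s"Hello, ${user.name}!")
      case None       => Failure(UserNotFoundException(userId))
    }
}
```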

Be mindful of where your Future is running

Most methods on Future require an ExecutionContext that specifies in which thread pool this particular piece of code is going to run. Oftentimes, people simply define or import a global implicit context and forget about the problem altogether. Just as often, this creates a situation where all code, including blocking and long-lasting operations, runs on a single thread pool sized for CPU-bound work, depriving other tasks of live threads.
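
One possible way to address this is a dedicated pool for blocking work that is passed explicitly. The sketch below is an assumption, not a prescription: ReportService, the pool size of 4 and the Thread.sleep that stands in for a blocking call are all made up for illustration:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

object ReportService {
  // A separate pool reserved for blocking calls (the size is an arbitrary example).
  private val blockingContext: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

  // Stands in for a blocking JDBC or file system call.
  private def loadReportBlocking(reportId: Int): String = {
    Thread.sleep(50)
    s"report-$reportId"
  }

  // The context is passed explicitly, so it is obvious where the code runs.
  def loadReport(reportId: Int): Future[String] =
    Future(loadReportBlocking(reportId))(blockingContext)

  // The pool uses non-daemon threads, so it should be shut down on exit.
  def shutdown(): Unit = blockingContext.shutdown()
}
```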

Avoid accidental exception swallowing

Many monadic types such as Future or Try catch exceptions internally, and most of the time, this is exactly what you want. However, it's also very easy to accidentally swallow an exception completely, thereby preventing the caller from ever knowing that an error took place. Always be careful with error propagation when using methods like recover and be doubly suspicious when you encounter a nested Future.
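
The following sketch shows how a catch-all recover hides a real problem and how a narrower one avoids it. ConfigMissing, Ports and the default port 8080 are hypothetical:

```scala
import scala.util.{Failure, Try}

final case class ConfigMissing(key: String) extends Exception(s"Missing key: $key")

object Ports {
  def readPort(config: Map[String, String]): Try[Int] =
    config.get("port") match {
      case None      => Failure(ConfigMissing("port"))
      case Some(raw) => Try(raw.toInt)
    }

  // Swallows everything: a malformed value silently becomes the default.
  def portSwallowed(config: Map[String, String]): Try[Int] =
    readPort(config).recover { case _ => 8080 }

  // Recovers only from the expected case; other errors reach the caller.
  def portOrDefault(config: Map[String, String]): Try[Int] =
    readPort(config).recover { case ConfigMissing(_) => 8080 }
}
```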

Keep inheritance use to an absolute minimum

Using inheritance to avoid code duplication is a terrible idea in Java, and it's no different in Scala. If there's some logic that can be expressed as a pure function and reused by other services from the same module, it's better to extract it as a helper method. Likewise, if the functionality of a certain class needs to be extended, use composition, as the GoF suggested in "Design Patterns" and Joshua Bloch reasserted in "Effective Java".

Avoid unnecessary method overloading

Method overloading in Scala works exactly the same way as in Java, but this doesn't mean that it should be used just as often. While Java has to distinguish between primitives and reference types, Scala doesn't have this problem, and overloading is usually a bad idea as it makes code less readable. In general, there are many reasons to avoid overloading in Java and in Scala.

Use implicit parameters sparingly

The implicit keyword is one of the defining features of Scala 2.X, and there are several well-known implicit-specific "design patterns" such as type classes, type tags, extension methods, etc. However, when it comes to implicit parameters in the context of standard Scala, there's only one worth considering: Implicit Contexts. This pattern is particularly useful when you need to drag a single value of a specific type (the so-called "context") through a series of method calls, and it may sometimes be thought of as a modern-day version of a ThreadLocal variable. In most other cases, prefer explicit parameters.
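
A minimal sketch of the Implicit Context pattern in Scala 2 syntax. RequestContext, Audit and OrderService are hypothetical names:

```scala
final case class RequestContext(requestId: String)

object Audit {
  def describe(action: String)(implicit context: RequestContext): String =
    s"[${context.requestId}] $action"
}

object OrderService {
  // The context is declared once and flows into every nested call.
  def placeOrder(item: String)(implicit context: RequestContext): List[String] = {
    val reserved = Audit.describe(s"reserved $item")
    val charged = Audit.describe(s"charged for $item")
    List(reserved, charged)
  }
}
```

At the edge of the application, the caller would typically define a single implicit val of type RequestContext per request, or pass the context explicitly in a second argument list.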

Use lazy values sparingly

Marking a value as lazy means that its initialization will be postponed until the value is first used. One use case for lazy is a configuration parameter that is needed by only a subset of the functions defined by the service class. Another use case is compile-time dependency injection with libraries like MacWire. When it comes to local values, lazy makes code more difficult to reason about and should be avoided.
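
The first use case might be sketched like this. ExportService and its loadTemplate parameter are hypothetical; the template is only loaded if exportHtml is ever called:

```scala
class ExportService(loadTemplate: () => String) {
  // Evaluated on first access only, then cached for later calls.
  private lazy val template: String = loadTemplate()

  // Does not touch the template, so it never triggers the load.
  def exportCsv(rows: List[String]): String =
    rows.mkString("\n")

  def exportHtml(rows: List[String]): String =
    template.replace("{rows}", rows.mkString("<br>"))
}
```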

Use default parameters sparingly

Default parameters allow developers to skip passing them when calling a method. This saves a couple of keystrokes, but inevitably makes code more difficult to understand. When a method uses multiple default parameters, it's impossible to tell what will actually happen without looking at the method signature. Default parameters also make cross-module refactoring much harder and consequently invite insidious and hard-to-catch bugs. In general, there are better ways to make an API safer and more convenient to use, for example, applying the Factory Method design pattern.
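
As a sketch of the factory method alternative, assume a hypothetical ConnectionSettings class. Instead of defaults for port and useTls, the companion object offers named constructors that state the intent at the call site:

```scala
final case class ConnectionSettings(host: String, port: Int, useTls: Boolean)

object ConnectionSettings {
  // Each factory names the scenario instead of hiding it behind defaults.
  def localPlaintext(port: Int): ConnectionSettings =
    ConnectionSettings(host = "localhost", port = port, useTls = false)

  def remoteTls(host: String): ConnectionSettings =
    ConnectionSettings(host = host, port = 443, useTls = true)
}
```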

Take advantage of the type system

Scala has a great variety of tools for domain modeling, and you should strive to use them often. In particular, you should prefer case classes to tuples, and sum types or similarly modelled enumerations to String constants. Also, it's usually a good idea to model different states of your domain objects as separate types, as is usually recommended by the Functional DDD community. Consequently, try not to write methods that take and return values of the same type, as they almost always require the reader to look at the implementation to understand what they do. For more details, check out Scott Wlaschin's talk Domain Modeling Made Functional.
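
Modelling states as separate types could look like the sketch below. The order lifecycle (DraftOrder, PaidOrder, ShippedOrder) is a made-up example; note that each method takes one type and returns another, so the signature alone tells the reader what happens:

```scala
final case class DraftOrder(items: List[String])
final case class PaidOrder(items: List[String], paymentId: String)
final case class ShippedOrder(items: List[String], paymentId: String, trackingNumber: String)

object OrderWorkflow {
  def pay(order: DraftOrder, paymentId: String): PaidOrder =
    PaidOrder(order.items, paymentId)

  // Shipping an unpaid order does not compile: only a PaidOrder is accepted.
  def ship(order: PaidOrder, trackingNumber: String): ShippedOrder =
    ShippedOrder(order.items, order.paymentId, trackingNumber)
}
```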

Know when to use parentheses in methods

Parameterless methods can be declared with or without parentheses, but the decision is never arbitrary. For accessor-like methods (think "getters" from Java) which are also pure, parentheses shouldn't be used. This rule makes sense because in Scala methods and values share the same scope and can override each other. For pretty much everything else, parentheses should be used.
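
A brief sketch of the convention with two hypothetical classes: a pure accessor-like method declared without parentheses and a method that reads the system clock declared with them:

```scala
final case class Temperature(celsius: Double) {
  // Pure and accessor-like: no parentheses, could be replaced by a val.
  def fahrenheit: Double = celsius * 9 / 5 + 32
}

final class Stopwatch {
  private val startedAt: Long = System.nanoTime()

  // Reads the clock on every call, so it is not pure: parentheses.
  def elapsedNanos(): Long = System.nanoTime() - startedAt
}
```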

Use curly braces for non-trivial lambda expressions

If a function takes exactly one parameter, the developer calling it has a choice of using either parentheses or curly braces. This choice, however, is almost never arbitrary. If the parameter is not a function itself, use parentheses. If this parameter is a function, and you have a non-trivial lambda to pass, use curly braces with the body on the next line. If you're passing a method reference or a trivial lambda with underscores, use parentheses and keep everything on the same line.
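
The three cases might look like this in practice (Lambdas and its members are hypothetical):

```scala
object Lambdas {
  val prices: List[Int] = List(10, 25, 40)

  def twice(n: Int): Int = n * 2

  // Not a function argument: parentheses.
  val firstTwo: List[Int] = prices.take(2)

  // Method reference or trivial lambda: parentheses, same line.
  val doubled: List[Int] = prices.map(twice)
  val incremented: List[Int] = prices.map(_ + 1)

  // Non-trivial lambda: curly braces with the body on the next line.
  val labels: List[String] = prices.map { price =>
    val tier = if (price >= 25) "premium" else "basic"
    s"$tier:$price"
  }
}
```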

Avoid infix notation in regular method calls

In Scala, operators such as + or :: are defined as regular methods, just like map or flatMap. This means that it's possible (but not recommended) to use 1.+(2) instead of the more familiar 1 + 2. This also means that it's possible (and also not recommended) to use xs map increment instead of xs.map(increment). When dealing with regular methods outside of a specific DSL context, always prefer the latter approach and consider using automatic code formatters to enforce this rule.

Do not wrap for expressions in parentheses

The following rule is borrowed from PayPal's style guide. For expressions should not be wrapped in parentheses in order to chain them with recover, andThen, etc. Instead, extract the result of the for expression into its own variable and perform additional operations on that. This rule also fits nicely into Kent Beck's idea about giving meaningful names to intermediate values.
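
Here is a sketch of both variants using Try. Checkout, parse and the fallback value 0 are hypothetical and only serve to show the shape of the code:

```scala
import scala.util.Try

object Checkout {
  def parse(raw: String): Try[Int] = Try(raw.toInt)

  // Discouraged: the for expression is wrapped in parentheses to chain recover.
  def totalWrapped(a: String, b: String): Try[Int] =
    (for {
      x <- parse(a)
      y <- parse(b)
    } yield x + y).recover { case _: NumberFormatException => 0 }

  // Preferred: name the intermediate result, then operate on it.
  def total(a: String, b: String): Try[Int] = {
    val parsedTotal = for {
      x <- parse(a)
      y <- parse(b)
    } yield x + y

    parsedTotal.recover { case _: NumberFormatException => 0 }
  }
}
```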

Maximize percentage of pure code

Even though standard Scala with Future is intrinsically side-effecting, everyone should strive to limit the amount of side-effecting code and ideally push it to the edge of the program. While the latter might be challenging to do in a typically procedural layered architecture with repositories as low-level dependencies, one can always try to push side effects to the controller level and make the rest of the code return pure descriptions of what needs to be done.
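
One way to read "pure descriptions of what needs to be done" is a small data type of actions that the business logic returns and the controller later executes. This is only a sketch under that assumption; Action, SaveInvoice, SendEmail and Billing are made-up names:

```scala
// A closed set of things the edge of the program knows how to execute.
sealed trait Action
final case class SaveInvoice(customer: String, amount: Int) extends Action
final case class SendEmail(to: String, subject: String) extends Action

object Billing {
  // Pure: decides what should happen without doing any of it.
  def planInvoice(customer: String, amount: Int): List[Action] =
    List(
      SaveInvoice(customer, amount),
      SendEmail(customer, s"Invoice for $amount")
    )
}
```

The controller would then pattern match on each Action and perform the actual database write or email delivery, which keeps planInvoice trivially testable.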

When in doubt, prioritize readability

Sometimes rules may contradict each other, and it may not be obvious which one should win. When this happens, always consider the overall readability of the code first. If some unique Scala feature makes code easier to understand and change, consider using it. If not, resort to more traditional approaches from classic books like "Code Complete", "Clean Code" and "The Pragmatic Programmer".