class: center, middle # Algebras in Scala Ross A. Baker • `@rossabaker` ??? - notes here - see https://remarkjs.com/#1 --- # What is an algebraic structure? -- 1. A set -- 2. Operations on the set -- 3. Axioms on the operations --- # A set -- ℤ -- = { ..., -3, -2, -1, 0, 1, 2, 3, ... } -- = `scala.BigInt` -- When an algebra paper says says "set", a Scala developer might think "type". --- # Operations a + b -- ```scala def add(a: BigInt, b: BigInt) = a + b ``` -- When an algebra paper says "operations", a Scala developer might think "function". a --- # Axioms -- _Associativity_ : a + (b + c) = (a + b) + c -- ```scala import org.scalacheck._, Prop.forAll object AdditionSpec extends Properties("Addition") { property("associativity") = forAll {(a: BigInt, b: BigInt, c: BigInt) => add(add(a, b), c) == add(a, add(b, c)) } } ``` -- When an algebra paper says "axioms", a Scala developer might think "Scalacheck". --- # Axioms -- _Closure_ : a + b is in ℤ -- ```scala def add(a: BigInt, b: BigInt): BigInt = a + b ``` -- And sometimes, the axiom is already specified in the types. --- # An addition algebra in Scala ```scala object Addition { def add(a: BigInt, b: BigInt): BigInt = a + b } object AdditionSpec extends Properties("Addition") { import Addition.add property("associativity") = forAll {(a: BigInt, b: BigInt, c: BigInt) => add(add(a, b), c) == add(a, add(b, c)) } } ``` -- * The set is all values of type `BigInt` -- * The operation is `add` -- * Axiom: _associativity_: (a + b) + c = a + (b + c) -- * Axiom: _closure_, or all operations return a member of the set --- # An string concatentnation algebra in Scala ```scala object StringConcat { def concat(a: String, b: String): String = a + b } object StringConcatSpec extends Properties("StringConcat") { import StringConcat.concat property("associativity") = forAll {(a: String, b: String, c: String) => add(add(a, b), c) == add(a, add(b, c)) } } ``` -- * The set is all values of type `String` -- * The operation is `concat` -- * Axiom: _associativity_: (a + b) + c = a + (b + c) -- * Axiom: _closure_, or all operations return a member of the set --- # Let's parameterize it and give it a weird name to sound smart ```scala trait Semigroup[A] { def combine(a: A, b: A): A } ``` -- ```scala implicit val BigIntSemigroup: Semigroup[BigInt] = new Semigroup[BigInt] { def combine(a: BigInt, b: BigInt) = a + b } ``` -- ```scala implicit val StringSemigroup: Semigroup[String] = new Semigroup[String] { def combine(a: String, b: String) = a + b } ``` --- # Sums -- ```scala def sum(xs: List[BigInt]): BigInt = xs.foldLeft(BigInt(0))(_ + _) ``` -- ```scala def sum(xs: List[Double]): Double = xs.foldLeft(0.0)(_ + _) ``` -- ```scala def sum(xs: List[String]): String = xs.foldLeft("")(_ + _) ``` -- These are all the same: they are seeded with a "zero", and then fold left to combine values. --- # Monoid -- * A semi-group -- * Plus a null operation, `zero` -- * Additional axiom: _left identity_: 0 + a = a -- * Additional axiom: _right identity_: a + 0 = a --- # Monoid in Scala ```scala trait Monoid[A] extends Semigroup[A] { def zero: A } class MonoidSpec[A](name: String)( implicit M: Monoid[A], A: Arbitrary[A]) extends Properties(name) { property("associativity") = forAll {(a: A, b: A, c: A) => M.combine(M.combine(a, b), c) == M.combine(a, M.combine(b, c)) } property("left identity") = forAll { a: A => M.combine(M.zero, a) == a } property("right identity") = forAll {(a: A, b: A, c: A) => M.combine(a, M.zero) == a } } ``` --- # The standard library has a monoid algebra ```scala trait Numeric[A] { def plus(a: A, b: A) def zero: T def toDouble: String } ``` -- ```scala scala> List(BigInt(1), BigInt(2), BigInt(3)).sum res0: scala.math.BigInt = 6 ``` -- ```scala scala> List(0.1, 0.2, 0.3).sum res1: Double = 0.6000000000000001 ``` -- ```scala scala> List("a", "b", "c").sum
:19: error: could not find implicit value for parameter num: Numeric[String] List("a", "b", "c").sum ^ ``` -- `Numeric` has more operations `Monoid`, but `Monoid` should be sufficient to sum! --- # Monoid instances ```scala implicit val BigIntMonoid = new Monoid[BigInt] { def zero = BigInt(0) def combine(a: BigInt, b: BigInt) = a + b } implicit val DoubleMonoid = new Monoid[Double] { def zero = 0.0 def combine(a: Double, b: Double) = a + b } implicit val StringMonoid = new Monoid[String] { def zero = "" def combine(a: String, b: String) = a + b } ``` --- # Better sum ```scala def sum[A](xs: List[A])(implicit M: Monoid[A]): A = xs.foldLeft(M.zero)(M.combine) ``` -- ```scala scala> sum(List(BigInt(1), BigInt(2), BigInt(3))) res3: scala.math.BigInt = 6 ``` -- ```scala scala> sum(List(0.1, 0.2, 0.3)) res4: Double = 0.6000000000000001 ``` -- ```scala scala> sum(List("a", "b", "c")) res5: String = abc ``` --- # CommutativeMonoid -- * A monoid -- * Additional axiom: _commutative_: a + b = b + a -- ```scala trait CommutativeMonoid[A] extends Monoid[A] class CommutativeMonoidSpec[A](name: String)( implicit M: Monoid[A], A: Arbitrary[A]) extends MonoidSpec[A](name) { property("commutativity") = forAll {(a: A, b: A) => M.combine(a, b) == M.combine(b, a) } } ``` --- # Practical algebraic thinking -- * Suppose we have a `List[A]` too large to fit into memory -- * Combining `A` is commutative: we can do it in any order. -- * Combining `A` is associative: we can do any arbitrary subset first. -- * If we shuffle `List[A]` into smaller `List[A]` across worker nodes... -- * and reduce those smaller `List[A]` into a single `A` on the worker nodes... -- * and reduce those results in a final step... -- * we'll get the same `A` as if we'd reduced the original list! -- Algebraic reasoning tells us we can do a distributed reduce if `A` is a commutative monoid! -- (And the `Functor` algebra tells us we can do the map half, but let's move along.) ```scala ``` --- # Alternate interpreters Let's roll back to simpler words for a bit: ```scala object Addition { def add(a: Int, b: Int): Int = a + b } ``` What if we want to be able to show our work? --- # Algebraic data type We'll lift our language into an algebraic data type. ```scala sealed trait Expr case class Const(i: Int) extends Expr case class Add(a: Expr, b: Expr) extends Expr ``` -- Aside: why are they called algebraic data types? -- * Sealed traits are Scala's _sum types_. The number of possible `Expr`s is the sum of the number of `Const`s plus the number of `Add`s. -- * Case classes are Scala's _product types_. The number of possible `Add`s is the number of possible `a`s times the number of possible `b`s. -- * Function application is exponential. The number of pure implementations of `A => B` is the number of `B`'s raised to the number of `A`'s power. Yikes! -- * If your functions are not pure, your only remaining guarantees are death and taxes. --- # Interpreters ```scala def eval(expr: Expr): Int = expr match { case Const(i) => i case Add(a, b) => eval(a) + eval(b) } ``` -- In some other project, we can add another interpreter: ```scala def show(expr: Expr): String = expr match { case Const(i) => i.toString case Add(a, b) => s"(${show(a)}+${show(b)})" } ``` --- # Adding operations ```scala sealed trait Expr case class Const(i: Int) extends Expr case class Add(a: Expr, b: Expr) extends Expr case class Multiply(a: Expr, b: Expr) extends Expr ``` -- * If our interpreters are in the same project, we have a bunch of compiler warnings on unhandled case `Multiply`. -- * If our interpreters are not in the same project, and we didn't recompile, our pager is going off. --- # Object-oriented encoding ```scala trait Expr { def eval: Int } final class Const(i: Int) { def eval: Int = i } final class Add(a: Int, b: Int) { def eval: Int = a + b } ``` -- In some other project, we can add another operation: ```scala final class Negate(i: Int) { def eval = -i } ``` -- * We need to recompile all operations to add a new interpreter, like `show`. -- * And maybe `Expr` is a third-party depedency. -- * Or maybe they added it upstream and we didn't recompile `Negate`. --- # Everything is terrible. * This is called _the expression problem_. -- * We want to be able to add new operations to our algebra -- * We want to be able to add new interpretations of our algebra -- * We don't want to recompile existing code -- * We don't want our pager to go off (i.e., no casts) --- # Finally Tagless The name comes from a somewhat imposing [Haskell paper](http://okmij.org/ftp/tagless-final/JFP.pdf), and I wrote enough Haskell today. I'm here for the Scala tonight. -- ```scala trait ExprAlg[A] { def const(i: Int): A def add(a: A, b: A): A } ``` -- To use it, we pass the algebra as a parameter: ```scala def example[A](implicit alg: ExprAlg[A]) = { import alg._ add(add(const(1), const(2)), const(3)) } ``` --- # Adding interpreters ```scala implicit val EvalExpr = new ExprAlg[Int] { def const(i: Int) = i def add(a: Int, b: Int) = a + b } implicit val ShowExpr = new ExprAlg[String] { def const(i: Int) = i.toString def add(a: String, b: String) = s"(${a}+${b})" } ``` -- Summon them implicitly by type: ```scala scala> example[Int] res0: Int = 6 ``` -- Or explicitly. Tastes vary: ```scala scala> example(ShowExpr) res1: String = ((1+2)+3) ``` --- # On another slide, in another project ```scala trait MultExprAlg[A] { def mult(a: A, b: A): A } implicit val ShowMultExprAlg = new MultExprAlg[String] { def mult(a: String, b: String) = s"(${a}*${b})" } ``` -- We can compose algebras: ```scala def example2[A](implicit A: ExprAlg[A], M: MultExprAlg[A]) = { M.mult(A.const(6), A.add(A.const(4), A.const(3))) } ``` -- And interpret them without reopening the original project: ```scala scala> example2[String] res2: String = (6*(4+3)) ``` --- # Interpreting into F[_] -- * When most people say finally tagless, their return types are of kind `F[A]`. ```scala trait MemoryAlg[F[_]] { def store(i: Int): F[Unit] def recall: F[Int] } ``` -- * The `A` is the return type of the expression, and the `F` is the context in which it runs. -- * Good for more than grade-school math. -- * The set is `Int | Unit` -- * Operations are `store` and `recall` -- * Axiom: `store` followed by `recall` returns the argument to `store` -- * Axiom: `recall` followed by `recall` returns the same `Int` --- # Memory interpreter ```scala import cats.effect.IO import cats.effect.concurrent.Ref val memoryAlg: IO[MemoryAlg[IO]] = Ref.of[IO, Int](0).map { ref => new MemoryAlg[IO] { def recall = ref.get def store(i: Int) = ref.set(i) } } ``` --- # A calculator algebra in F[_] ```scala trait CalcAlg[F[_]] { def const(i: Int): F[Int] def add(a: Int, b: Int): F[Int] } import cats.Applicative, cats.implicits._ implicit def CalcAlg[F[_]: Applicative] = new CalcAlg[F] { def const(i: Int) = i.pure[F] def add(a: Int, b: Int) = (a + b).pure[F] } ``` --- # A calculator with a memory ```scala import cats.Monad def addTwo[F[_]](implicit F: Monad[F], calc: CalcAlg[F], mem: MemoryAlg[F]): F[Int] = for { a <- calc.const(2) b <- mem.recall c <- calc.add(a, b) _ <- mem.store(c) } yield c ``` -- ```scala scala> implicit val mem = memoryAlg.unsafeRunSync() mem: MemoryAlg[cats.effect.IO] = $anon$1@34d377b6 scala> addTwo[IO].unsafeRunSync() res3: Int = 2 scala> addTwo[IO].unsafeRunSync() res4: Int = 4 ``` --- # F-algebras * Some people call our last one an "F-algebra" -- * Mathematicians with blogs mean something like this: ```scala sealed trait ExprF[A] final case class ConstF[A](i: Int) extends ExprF[A] final case class AddF[A](a: A, b: A) extends ExprF[A] ``` -- * Not recursive: `AddF` does not refer to `ExprF`. -- * Types reflect the level of nesting of your expression: ```scala def oneDeep: ExprF[Int] = ConstF(1) def twoDeep: ExprF[ExprF[Int]] = AddF(oneDeep, oneDeep) def threeDeep: ExprF[ExprF[ExprF[Int]]] = AddF(twoDeep, twoDeep) ``` --- # Interpreting the F-algebra ```scala def eval1(e: ExprF[Int]): Int = e match { case ConstF(i) => i case AddF(a, b) => a + b } def eval2(e: ExprF[ExprF[Int]]): Int = e match { case ConstF(i) => i case AddF(a, b) => eval1(a) + eval1(b) } def eval3(e: ExprF[ExprF[ExprF[Int]]]): Int = e match { case ConstF(i) => i case AddF(a, b) => eval2(a) + eval2(b) } ``` -- What if I have four? --- # Fix We'll move the recursion over here: ```scala final case class Fix[F[_]](unfix: F[Fix[F]]) ``` -- Fix it all up: ```scala def const[A](i: Int): Fix[ExprF] = Fix(ConstF(i)) def add[A](a: Fix[ExprF], b: Fix[ExprF]): Fix[ExprF] = Fix(AddF(a, b)) def fixedUp: Fix[ExprF] = add(const(1), add(const(2), const(3))) ``` --- # It's getting late, I'm going to start waving my hands: `eval1` is an _initial algebra_: ```scala type Algebra[F[_], A] = F[A] => A val eval1: Algebra[ExprF, Int] = { case ConstF(i) => i case AddF(a, b) => a + b } ``` -- And F is for Functor ```scala import cats.Functor implicit val ExprFunctor: Functor[ExprF] = new Functor[ExprF] { def map[A, B](expr: ExprF[A])(f: A => B): ExprF[B] = expr match { case ConstF(i) => ConstF[B](i) case AddF(a, b) => AddF(f(a), f(b)) } } ``` --- # Catamorphism A Wikipedian once wrote, "Deeper category theoretical studies of initial algebras reveal that the F-algebra obtained from applying the functor to its own initial algebra is isomorphic to it." -- I don't know about all that, but the types line up: ```scala def cata[F[_]: Functor, A](fixed: Fix[F])(alg: Algebra[F, A]): A = { alg(fixed.unfix.map(cata(_)(alg))) } ``` -- Let's try it: ```scala scala> cata(fixedUp)(eval1) res5: Int = 6 ``` -- ```scala scala> cata[ExprF, String](fixedUp) { | case ConstF(i) => i.toString | case AddF(a, b) => s"(${a}+${b})" | } res6: String = (1+(2+3)) ``` --- # Optimization ```scala scala> val prog = add(const(1), add(const(2), const(0))) prog: Fix[ExprF] = Fix(AddF(Fix(ConstF(1)),Fix(AddF(Fix(ConstF(2)),Fix(ConstF(0)))))) scala> def optimize(e: ExprF[Fix[ExprF]]): Fix[ExprF] = | e match { | case AddF(Fix(ConstF(a)), Fix(ConstF(b))) => | Fix(ConstF(a + b)) | case other => | Fix(other) | } optimize: (e: ExprF[Fix[ExprF]])Fix[ExprF] scala> val optimized = cata(prog)(optimize) optimized: Fix[ExprF] = Fix(ConstF(3)) scala> cata[ExprF, String](optimized) { | case ConstF(i) => i.toString | case AddF(a, b) => s"(${a}+${b})" | } res7: String = 3 ``` --- # Other fun things to do with F-algebras -- * Anamorphisms -- * Hylomorphisms -- * Paramorphisms -- * Apomorphisms -- * Confuse your audience so they forgot their questions from the easy part of the presentation --- # Thanks! Code and slides at `indyscala/algebra` on GitHub ## Questions?