Functors - An Overview.

Higher-Kinded Types

When we program in a language like C# or Java, we often run into “concrete” types like intstring, and bool. The neat thing about concrete types is that we always know all the operations that we can perform on them (ints can be added, bools can be negated, and so on).
One step up on the abstraction ladder are “generic” types, like List<T>. A fancy term for generic types is “parametric polymorphism:” “parametric” because we’re working with a type parameter, and “polymorphism” because the parameter in question can have multiple (“poly-“) shapes (“-morphism”). So, we know what operations we can perform on the List itself (iterate over all its elements, Addan item, etc.), but we know absolutely nothing about the type represented by T. This gives us a lot more power of abstraction because we can write methods that manipulate these generic structures and have them guarantee to work no matter what type we eventually fill in for the type parameter.
In Haskell and Purescript, we’d write List<T> as List t: the name of the generic type ( List) comes first, and the name of the type parameter ( t) comes next, separated by a space. Haskell and Purescript know that t isn’t the name of a concrete type (like if we had written Int or String) because it starts with a lowercase letter.
We can go one step further and write a type like IEnumerable<T>, where our container type isn’t a concrete type but is only an interface. Now, we know only a little bit about the container type itself (specifically, that it has elements over which we can iterate), and we still know nothing about T. The number of operations we can perform on a value of type IEnumerable<T> is even smaller than those for List<T>. This limitation is actually a good thing because now we can pass in a Stack or a Queue to a method that takes an IEnumerable, for example, and the method will work as expected.

Usually, this is where we would have to stop because most languages don’t let us go any more abstract. However, Haskell and Purescript don’t have this restriction and support so-called “higher-kinded” types, where we can make both the internal type and the container type fully generic. If C# had syntactical support for this, it might look something like T1<T2>T1 could be IEnumerableQueue, etc., while T2 could be intstring, etc. Haskell and Purescript, however, do support this concept of higher-kinded types and use this syntax: f t (where f is the container type and t is the type parameter). Because f is lowercase, we know that it’s generic as well as t, and so can be any type that requires exactly one type parameter. For example, the following types would fulfill f:
  • List, which holds zero or more values of the same type.
  • Queue, which also holds zero or more values of the same type, but doesn’t allow random access.
  • Maybe (A.K.A. Option or Optional), which can contain either one value or be empty.
  • Promise, which eventually produces a value.
  • Redux store, which we can think of being parameterized by the type of the state it contains.
Whereas the following cannot be used for f:
  • Map, because that requires two type parameters, one for the key and one for the value.
  • string, because that doesn’t have any type parameters (another way of saying that is that the type string is already fully concrete).
  • Tuple, because that also requires two or more type parameters (depending on the number of elements it contains).
  • Redux reducer, which requires two type parameters, one for the message and one for the state.

Fun Fact!

In Haskell or Purescript, the higher-kinded type parameter (f in f t) is usually named f or m, while the type parameter it takes (t in f t) is usually named a(or b or c if more than one is needed).

Constraints

Granted, we really can’t do very much with a value of type f t because we intentionally don’t know very much at all about it. However, we can put constraints on our type parameters. In C#:
class Foo<T> where T : SomeConstraint


Now, T can’t be just anything but instead has some restrictions on what can be plugged in for it. After we’ve put one or more constraints on T, we now know more about what we can do with it. For example, if T is constrained to be an instance of IComparable, that means Foo will only accept types that support the CompareTo method, like int or char (but not, say, List<string>). In Haskell or Purescript, this type can be written SomeConstraint t => Foo t, which means that type tmust support the operations in SomeConstraint.
In Haskell and Purescript, we can also place constraints on our higher-kinded type parameters, like so: SomeConstraint f => f t. Now, we know that f supports whatever operations are included in SomeConstraint, and we can plug in any type that supports them. This is approximately analogous to the idea behind IEnumerable<T>, where the container type is abstract and so we can choose a specific type, like List or Queue, to make the type concrete. (We might write that example in Haskell and Purescript as IEnumerable f => f t.) We can also use more than one constraint on a single parameter or place constraints on more than one parameter at a time, like so:
(SomeConstraint f, SomeOtherConstraint f, AnotherConstraint t, YetAnotherConstraint t) => f t

Recap

What all this allows us to do is to have a system of specifying the structures of values, which will come in handy when we get to Functors below. Higher-kinded types allow us to focus on values that have general structure, and we can narrow the possibilities down with constraints.
Phew, that’s a lot to digest. Here’s a summary of how the different concepts (roughly) translate between C# and Haskell and Purescript.

Concrete Types

We know everything there is to know about these types.

C#

int


Haskell, Purescript

Int


Generic Types

We know everything there is to know about part of these types.

C#

Foo<T>


Haskell, Purescript

Foo t


Constraints

We know some things about various parts of these types.

C#

// Constraint on the "traditional" type parameter
Foo<T> where T : IBar
// Constraint on the container type itself
// Approximately:
IFoo<T>


Haskell, Purescript

-- Constraint on the "traditional" type parameter
IBar t => Foo t
-- Constraint on the container type
IFoo f => f t


Higher-Kinded Types

We only know about the very general shape of these types, but we can place constraints on them in order to do useful things with them.

C#

// Doesn't exist, but might look something like:
T1<T2>


Haskell, Purescript

f t


Functors

Lo and behold, we’ve made it to Functors!
We’ve been making statements about the “structures” of types without a clear definition, so here goes:

Structure

Rules about how you can use a type (i.e. the operations that can be performed on it). In an object-oriented (OO) language, this might be represented by interfaces; in a functional language, this might be represented by modules or typeclasses.
IEnumerable<T>, for example, has a different structure from IComparable<T>, because IEnumerables can be iterated over, sorted, etc., while IComparables can be compared.
Functor is a specific structure that supports exactly one operation: mapmap allows us to operate on value(s) inside a Functor (which we can think of as a container of some sort) without knowing (or caring) how the Functor is implemented. Each Functor has its own rules for how map works, but this general theme of manipulating contained values is the same.
You’ve already used map if you’ve ever used Array.prototype.map in JavaScript or LINQ’s IEnumerable.Select in C#, for example. You didn’t have to know how the container (the Array or specific IEnumerable) was implemented, since its implementation of map took care of that for you. All you needed to do was provide a function that takes an element of the list and returns something else, and the container handled the rest.
Without further ado, let’s take a look at map‘s type signature!

C#

public interface IFunctor {
 // There's no second parameter, because this method is on the `IFunctor` we want to map over
 IFunctor<TResult> Map<TResult>(Func<T, TResult> fn);
}


Haskell, Purescript

-- Here, we explicitly write the Functor to map over as the second parameter, because Haskell is functional rather than object-oriented
map :: Functor f => (a -> b) -> f a -> f b

map takes a function and a Functor and returns the Functor after the function has been applied to its element(s). The type of the element(s) inside the Functor is allowed to change.
So, Functors are really, really simple. Their only special “skill” is that they allow you to apply a function to what’s inside them.

What Can (and Can’t) Be a Functor?

So far, we’ve only seen List‘s implementation of Functor, but what else could be a Functor?
Queue can, because it can allow a function to be applied to all its elements. Maybe can also be a Functor: if there is no value, then an empty Maybe is returned; otherwise, a Maybe with its value modified by the provided function is returned. Promise can implement map by allowing you to modify the value that the Promise will eventually resolve to (e.g. somePromise.then(x => x + 1)).
So what can’t be a Functor? Well, a Redux store probably can’t. While it certainly allows its state to be modified, being a Functor would mean that the type of the state could change, since any function we use with map with must be allowed to change the type of the elements inside the Functor. We usually don’t want the type of our Redux state to change over time, so a Redux store isn’t a valid Functor.
Let’s see some quick examples of Functors in use. Here’s what it looks like to map over a list in JavaScript, C#, and Haskell and Purescript:

JavaScript

[1, 2, 3].map(x => x.toString()) // ["1", "2", "3"]


C#

new[] { 1, 2, 3 }.Select(x => x.ToString()).ToList() // new[] { "1", "2", "3" }


Haskell, Purescript

map show [1, 2, 3] -- ["1", "2", "3"]


Note that in the above example, the List started out as a List<int> but was converted to a List<string>. This is another important property of map that we’ll get to below.
And here’s what it looks like to map over JavaScript’s Promise type instead:

JavaScript

Promise.resolve(1)
// Think of `then` as `map`
.then(x => x.toString()) // resolves to "2"

In both cases, we’re handing map a function and each Functor is doing the rest for us in its own specific way.
As we can see, the concept of a Functor is intentionally very general, and this idea of very general abstractions is what makes category theory so powerful.

Example

To summarize: Functors allow us to treat “what to do” and “how to do it” as separate concerns. After a map implementation is written for a type, it can be used the same way with any function (as long as its parameters are the correct types, of course).
Now for a real-world example! Let’s pretend that we have a Maybe type of some sort in JavaScript, along with a corresponding fromMaybe function that takes a modifying function, a default value, and the Maybe to map over. If the Maybe contains a value, fromMaybe will return a modified version of that value; if, however, the Maybe does not contain a value, fromMaybe will return the default value that it was passed.
As an example, we might traditionally work with possibly nonexistent values like so:
const result = thingThatMightFail();
return result.value
 ? result.value + otherValue
 : 0;


The issue with this approach, however, is that we can forget to check that result is null or undefined(‘cannot read property of undefined,’ anyone?). If, however, we use some sort of Maybeobject, we might be able to rewrite the above using fromMaybe as follows:
const result = thingThatReturnsAMaybe();
return fromMaybe(
 v => v + otherValue, // Function to apply if we have a value
 0, // Default value
 result // Maybe object
);


Now, if we forget to use fromMaybe on our result, we’ll catch it when we try to actually use this value. Granted, this isn’t as useful in JavaScript as it is in a statically-typed language like C# where we’d be able to catch such errors at compile-time, but using Maybe like this everywhere we’d use null or undefined still allows us to reason about our code and know what can return a value that might not exist (instead of just hoping that we remembered to check for null or undefinedeverywhere we use a potentially nonexistent value).
If we want to run multiple operations in sequence on a value that might not exist, we might think to do it like this:
const json = someAjaxRequest();
let result = defaultUser;
if (json) {
 const id = extractId(json);
 const user = getUserById(id);
 result = upgradeUser(user);
}


However, with Maybe, we can write it differently:
Note:R.pipe is a function from the Ramda.js library that “composes” functions: it takes several functions and chains them together to create a new function (for more details, see the docs).
Also, for clarity, psuedo Flow function type annotations are included.
const maybeJson: Maybe<JSON> = someAjaxRequest(); // someAjaxRequest: () => Maybe<JSON>
const result: User = R.pipe(
 (maybeJson: Maybe<JSON>) => map((extractId: (JSON) => Id), maybeJson),
 (maybeId:   Maybe<Id>)   => map((getUserById: (Id) => User), id),
 (maybeUser: Maybe<User>) => fromMaybe(
 (upgradeUser: (User) => User),
 (defaultUser: User),
 maybeUser
 )
)(maybeJson);


And here’s the same example, but this time we’ll remove unnecessary Flow type annotations and take advantage of a technique called currying in order to see what the same code might look like in a real-world situation:
const maybeJson: Maybe<JSON> = someAjaxRequest();
const result: User = R.pipe(
 map(extractId),
 map(getUserById),
 fromMaybe(upgradeUser, defaultUser)
)(maybeJson);


So, let’s see what we gain from using Functors. As opposed to in the first, ‘traditional’ example, we didn’t have to manually handle the case where no value was returned. The Maybe Functor itself was able to decide how to do so, and all we needed to worry about was giving map the functions that we’d like to perform if a value does happen to exist. We can use any functions we want for each step in the pipeline, as long as they have the correct types (namely, the return type of each one must be the type of the next one’s parameter).
Functors are incredibly common in practice, and a good rule-of-thumb is to ask yourself whenever you begin to write a function that operates on a data structure if the function could be made general enough to operate on a Functor instead. For example, Haskell/Purescript functions like replace and zipcould be defined to work with lists specifically, but they’re instead implemented to only require that their parameters are Functors; as such, they work on many different structures for free! They don’t need to know anything about how each of those Functors works on the inside because maphandles the details for them.

Laws

Finally, Functors have two associated “laws” to ensure that the Functor will behave as one would expect. Don’t worry, you won’t go to jail if you get these laws wrong (although it is certainly an unpleasant surprise when a Functor is “unlawful”).

Law #1: Identity

Mapping the identity function over a Functor has no effect.
Fancy pseudo-Haskell way to say it:
map id = id


This one’s pretty self-explanatory: if your function doesn’t actually do anything to the element(s) inside the Functor, then it would certainly be surprising if you got back a different result than what you passed in.

JavaScript Example

map(x => x, [1, 2, 3]) // [1, 2, 3]


Law 2: Composition

Multiple maps in a row can be composed into one without any changes in the overall behavior.
Fancy pseudo-Haskell way to say it ((.) is the function composition operator):
map f . map g = map (f . g)

JavaScript Example

We could write our earlier JSON-manipulation example like so if we combine the two-in-a-row maps into a single map, with the new function to use being the composition of the two steps in our pipeline.
Reference:

Comments

Popular Posts