Why you should use YapDatabase

One managed context too far...

Very recently, someone submitted an issue to a GitHub project which I've open sourced. I saw the email title from GitHub, and immediately thought I'd got some feedback on one of the two projects I've open sourced in recent weeks. Exciting!

But, nope! It was for a project which is now 4 years old; that's so old it predates Swift! It was written in Objective-C, before ARC. So, all joking aside, it is quite old for an unmaintained legacy project. I'm amazed that it was being used, and let's just say it was humbling scanning through the code.

Why is this relevant? Well, the project is my own implementation of Apple's NSFetchedResultsController, but with support for post-fetch filters. Huh? So, let's say you want to display the data in a table, but you want to filter the data by some metric, or allow the user to do this. Using Apple's tools, the standard approach would be to model your data entities in Core Data and then use a Fetched Results Controller (FRC) to query the datastore by your metric. Unfortunately, when using the common SQLite storage type, this isn't always possible. Hence, I wrote my own FRC replacement class which does this.

Okay... but why is this relevant? Because one of the projects I've open sourced, TaylorSource, is a Swift framework to support just this sort of thing, but with significantly less code and fewer compromises. The most significant change is to not use Core Data, and in this post, I'll discuss why.

NSManagedObject

Core Data provides tools to model your application domain. It is designed with data identity as a central axiom. By this I mean that an object fetched from Core Data in one thread is consistent with the same object fetched in a different thread. Initially, this sounds great and is exactly what you might want and expect. Changes to the object in one thread will update in-place in other threads. However, even in a trivial application, this is actually not desired behaviour. Writing asynchronous or multithreaded code is notoriously hard to do without errors. In order to mitigate such errors, it is easiest if you can reason about what is changing in each thread; however, this is very tricky to do with Core Data. It is why most of the changes to Core Data over the last 5 years have been around making this less painful, with the introduction of bespoke queues and nested contexts.

These days, in our post-Swift world, we strive for immutability in our models and prefer data values over identity classes. Unfortunately such an architecture isn't possible using Core Data. The design of Swift itself indicates that Apple has learnt from the design of Core Data, and I expect that if they designed a new persistence layer today it would be very different.

There are many "features" of Core Data, such as the modelling tool, properties, relationships and migrations. These are all possible because entities (models in Core Data parlance) are NSManagedObject classes. Your Person entity is actually just an NSManagedObject, and the name attribute can be accessed via key-value coding. There are two problems with this. The first, a minor issue, is the lack of strong typing without synthesising classes. The second, and most significant, is the base type, NSManagedObject. Its job is to represent the entity inside a store; it maintains any relationships, performs serialisation and in general supports Core Data features. So, to use Core Data, you must inherit from this base class. It is the first step on the road to tightly coupling the application domain to the persistence layer. This is easily the biggest issue with Core Data, and so it also defines our first two requirements in a persistence layer.

  • No inheritance from a base type.
  • Support value types like struct and enum.
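As a rough illustration of those two requirements, here is a minimal Swift sketch of a model which inherits from nothing and is built purely from value types. The `City` and `Continent` types, and the `with(population:)` helper, are hypothetical names of my own, not part of any framework.

```swift
// Hypothetical model types for illustration; not part of any framework.
enum Continent: String {
    case europe, asia, africa
}

struct City: Equatable {
    let name: String
    let population: Int
    let continent: Continent

    // "Changing" a value returns a new copy instead of mutating shared state.
    func with(population newPopulation: Int) -> City {
        return City(name: name, population: newPopulation, continent: continent)
    }
}

let london = City(name: "London", population: 8_600_000, continent: .europe)
let grown = london.with(population: 9_000_000)
// `london` is an immutable value, so it is untouched by the "update".
```

Because every property is a `let` and the type is a struct, there is no shared identity to reason about: each piece of code holds its own copy.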

NSManagedObjectContext

I previously mentioned nested contexts. These are NSManagedObjectContexts (MOCs), which are essentially windows into the datastore through which objects are made available. Creating a MOC requires jumping through a lot of hoops: supplying either a parent context, or an NSPersistentStoreCoordinator, which in turn requires quite a bit of setup code, usually referred to as the Core Data stack. All of this boilerplate leads to really poor design patterns, exacerbated by Apple's templates which dump this into the application delegate. In turn this leads to increased global state; at worst, novice developers will resort to importing their AppDelegate class into their controllers and model classes.

NSFetchRequest and NSFetchedResultsController

Getting objects out of Core Data is no easy task either. Because Core Data manages an object's identifier (NSManagedObjectID) itself inside the base class, it is not possible to access objects directly unless you already have the object identifier, which is created internally by Core Data. Therefore reading objects out of the store requires executing a fetch request. Yet more boilerplate.

// Create the fetch request for the entity.
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];

// Edit the entity name as appropriate.
NSEntityDescription *entity = [NSEntityDescription entityForName:@"City" inManagedObjectContext:self.managedObjectContext];
[fetchRequest setEntity:entity];

// Set the batch size to a suitable number.
[fetchRequest setFetchBatchSize:20];

// Edit the sort key as appropriate.
NSSortDescriptor *sortDescriptor = [[NSSortDescriptor alloc] initWithKey:@"population" ascending:NO];
NSArray *sortDescriptors = [[NSArray alloc] initWithObjects:sortDescriptor, nil];    
[fetchRequest setSortDescriptors:sortDescriptors];

And the above doesn't even have any filtering; it simply fetches every City entity, sorted by population, loading them in batches of 20. An NSFetchedResultsController is built on a fetch request too, with the addition that it will inform its delegate when objects change, which does make it useful for driving table views. However, configuring a table view in such a way adds at least 150 lines of code to the view controller.

In summary, Core Data requires non-trivial boiler-plate setup code, tightly couples data models to the persistence layer, and is not especially feature rich to boot. There must be a better way. Luckily we have YapDatabase.

Using YapDatabase

Our first requirement was to not require inheritance from a base type. This is to avoid tight coupling to the persistence layer. YapDatabase lets you use bog-standard NSObject classes. Implement NSCoding and everything will work without any additional configuration. Custom serialization is also possible, which opens up the use of frameworks such as Mantle or FastCoding. YapDatabase operates as a key-value store, and all that is required is the data archive of models. How that data is supplied is up to the developer, with NSCoding being the easiest and default.
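To make the NSCoding point concrete, here is a minimal Swift sketch of such a model. `Person` echoes the entity mentioned earlier; the `identifier` property and the `roundTrip` helper are my own assumptions for illustration. The round trip through NSKeyedArchiver merely stands in for whatever a store does with the archived data; nothing here calls YapDatabase.

```swift
import Foundation

// A plain NSObject subclass: no persistence-specific base class required.
final class Person: NSObject, NSSecureCoding {
    static var supportsSecureCoding: Bool { return true }

    let identifier: String
    let name: String

    init(identifier: String, name: String) {
        self.identifier = identifier
        self.name = name
        super.init()
    }

    // Decoding fails (returns nil) if either property is missing.
    required init?(coder: NSCoder) {
        guard
            let identifier = coder.decodeObject(of: NSString.self, forKey: "identifier") as String?,
            let name = coder.decodeObject(of: NSString.self, forKey: "name") as String?
        else { return nil }
        self.identifier = identifier
        self.name = name
        super.init()
    }

    func encode(with coder: NSCoder) {
        coder.encode(identifier as NSString, forKey: "identifier")
        coder.encode(name as NSString, forKey: "name")
    }
}

// Hypothetical helper: archive to Data and back, as a store might.
func roundTrip(_ person: Person) -> Person? {
    guard let data = try? NSKeyedArchiver.archivedData(
        withRootObject: person, requiringSecureCoding: true) else { return nil }
    return try? NSKeyedUnarchiver.unarchivedObject(ofClass: Person.self, from: data)
}

let restored = roundTrip(Person(identifier: "p1", name: "Ada"))
```

The model knows how to turn itself into data and nothing more, which is what keeps it decoupled from whichever persistence layer ends up holding that data.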

This means that if an app starts off persisting model data in NSUserDefaults or directly to disk, transitioning to YapDatabase is trivial and requires no changes to the models. This is because the persistence layer (YapDatabase, Core Data, Realm etc) is decoupled from the model classes.

YapDatabase does require NSObject classes, which means that pure Swift classes, structs or enums cannot be stored directly. However, it is relatively straightforward to work around this, and my framework, YapDatabaseExtensions, has got you covered.
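One common way to bridge that gap, sketched below under my own assumptions, is to pair each value type with a small NSObject wrapper which does the archiving on its behalf. The `Coordinate` and `CoordinateArchiver` names and the two free functions are hypothetical; this shows the general technique only and should not be read as the API of YapDatabaseExtensions.

```swift
import Foundation

// A plain Swift value type: it cannot adopt NSCoding itself.
struct Coordinate: Equatable {
    let latitude: Double
    let longitude: Double
}

// Hypothetical wrapper class which archives the struct on its behalf.
final class CoordinateArchiver: NSObject, NSSecureCoding {
    static var supportsSecureCoding: Bool { return true }

    let value: Coordinate

    init(_ value: Coordinate) {
        self.value = value
        super.init()
    }

    required init?(coder: NSCoder) {
        guard coder.containsValue(forKey: "latitude"),
              coder.containsValue(forKey: "longitude") else { return nil }
        value = Coordinate(
            latitude: coder.decodeDouble(forKey: "latitude"),
            longitude: coder.decodeDouble(forKey: "longitude"))
        super.init()
    }

    func encode(with coder: NSCoder) {
        coder.encode(value.latitude, forKey: "latitude")
        coder.encode(value.longitude, forKey: "longitude")
    }
}

// Wrap on the way in...
func archive(_ coordinate: Coordinate) -> Data? {
    return try? NSKeyedArchiver.archivedData(
        withRootObject: CoordinateArchiver(coordinate), requiringSecureCoding: true)
}

// ...and unwrap on the way out.
func unarchive(_ data: Data) -> Coordinate? {
    let archiver = try? NSKeyedUnarchiver.unarchivedObject(
        ofClass: CoordinateArchiver.self, from: data)
    return archiver?.value
}

let greenwich = Coordinate(latitude: 51.5, longitude: -0.125)
let recovered = archive(greenwich).flatMap(unarchive)
```

The struct stays free of any Foundation baggage; only the wrapper knows about NSCoding, and the rest of the app never needs to see it.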

What about boiler-plate?

There isn't really a Core Data stack equivalent with YapDatabase. Create a database with a path to its location on disk. As for managed object contexts, YapDatabase is designed using transactions, which are performed inside connections. Connections are created using the database, and support block based methods for reading or writing objects inside transactions. This makes concurrency a breeze, especially as YapDatabase treats your objects as immutable, and indeed features some design guidelines to enforce this in Objective-C.
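To show the shape of that block-based style without pulling in the library, here is a toy in-memory model. `ToyConnection` and its `read` and `readWrite` methods are hypothetical names of my own and are not YapDatabase's API; the sketch only illustrates why funnelling all access through transaction blocks makes concurrency easier to reason about.

```swift
import Foundation

// Toy in-memory stand-in for a database connection; NOT YapDatabase's API.
final class ToyConnection {
    private var storage: [String: String] = [:]
    private let queue = DispatchQueue(label: "toy.connection")

    // A read block receives a copy of the store (dictionaries are value types),
    // so nothing it does can alter the underlying data.
    func read<T>(_ block: ([String: String]) -> T) -> T {
        return queue.sync { block(storage) }
    }

    // Write blocks run one at a time on the serial queue.
    func readWrite(_ block: (inout [String: String]) -> Void) {
        queue.sync { block(&storage) }
    }
}

let connection = ToyConnection()
connection.readWrite { store in
    store["london"] = "London"
}
let name = connection.read { store in store["london"] }
```

All mutation happens inside a write block and all reads see a self-contained snapshot, so there is never a half-updated object to worry about.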

Extensions

So, hopefully it's clear that creating models and accessing them is better and easier, but what about the advanced features we might want in a persistence layer?

YapDatabase supports the concept of extensions, of which there are many built in. A YapDatabaseView can monitor the database for objects which pass a block based group filter and issue notifications. Combined with sorting, group filtering, group sorting, mappings and persistent caching, YapDatabaseView is so much more than NSFetchedResultsController. Additionally, Filtered Views and Search Results View (from the full text search extension) allow you to filter or search such database views, enabling very powerful datasources. See my open source framework, TaylorSource, for examples of this.
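The idea of grouping and sorting with blocks can be sketched in a few lines of plain Swift. `ToyView`, `Place` and `sections(for:)` are hypothetical names for illustration and are not the YapDatabaseView API; unlike the real extension, this toy recomputes everything on demand and neither persists its results nor issues change notifications.

```swift
// Hypothetical model; any value type would do.
struct Place {
    let name: String
    let country: String
    let population: Int
}

// A toy "view": the grouping closure decides which group (if any) an item
// belongs to, and the sorting closure orders the items inside each group.
struct ToyView<Item> {
    let grouping: (Item) -> String?      // returning nil filters the item out
    let sorting: (Item, Item) -> Bool    // true if the first item comes first

    // Sections are returned in alphabetical order of group name.
    func sections(for items: [Item]) -> [(group: String, items: [Item])] {
        var groups: [String: [Item]] = [:]
        for item in items {
            guard let group = grouping(item) else { continue }
            groups[group, default: []].append(item)
        }
        return groups.keys.sorted().map { key in
            (group: key, items: groups[key, default: []].sorted(by: sorting))
        }
    }
}

let places = [
    Place(name: "London", country: "UK", population: 8_600_000),
    Place(name: "Bath", country: "UK", population: 90_000),
    Place(name: "Paris", country: "France", population: 2_200_000),
    Place(name: "Lyon", country: "France", population: 500_000),
]

// Group by country, drop places under 100,000 people, largest first.
let view = ToyView<Place>(
    grouping: { $0.population >= 100_000 ? $0.country : nil },
    sorting: { $0.population > $1.population })

let sections = view.sections(for: places)
```

Here the filter is just another closure, which is exactly the sort of post-fetch filtering that is awkward to express with a fetch request.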

Other extensions are available for managing relationships, using secondary indexes, and even syncing with CloudKit.

Approaching Relationships

Core Data explicitly models relationships as an attribute of the entity, forcing the inverse relationship and supporting one-to-one and one-to-many. This is handy, as it means that if an object is deleted, any objects which depend on it can also be deleted, which is called cascade deletion. YapDatabase doesn't require anything like this. It is a key-value store, or more precisely a collection-key-value store, meaning that the developer must provide a collection name, conceptually like a bucket, and a key within the collection as the index for the value. Therefore the easiest way to reference other objects is to store their identifiers. For one-to-many relationships, it is easier to model this using the back reference, which is probably one-to-one. However, for many-to-many relationships, it is generally easier to take advantage of the Relationship extension, which will maintain the references automatically, and provide an API within a transaction to iterate over relationship objects.
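Here is a small sketch of the identifier approach, using a toy collection-key-value store built on nested dictionaries. `ToyStore`, `Author` and `Book` are hypothetical names of my own, and the store is a stand-in for illustration only, not YapDatabase or its Relationship extension.

```swift
// Hypothetical value types; relationships are stored as plain identifiers.
struct Author {
    let identifier: String
    let name: String
}

struct Book {
    let identifier: String
    let title: String
    let authorIdentifier: String   // back reference: each book has one author
}

// Toy collection-key-value store: collection name -> key -> value.
struct ToyStore {
    private var collections: [String: [String: Any]] = [:]

    mutating func set(_ value: Any, forKey key: String, inCollection collection: String) {
        collections[collection, default: [:]][key] = value
    }

    func value<T>(forKey key: String, inCollection collection: String) -> T? {
        return collections[collection]?[key] as? T
    }

    func values<T>(inCollection collection: String) -> [T] {
        return (collections[collection] ?? [:]).values.compactMap { $0 as? T }
    }
}

var store = ToyStore()
store.set(Author(identifier: "a1", name: "Ursula K. Le Guin"),
          forKey: "a1", inCollection: "authors")
store.set(Book(identifier: "b1", title: "The Dispossessed", authorIdentifier: "a1"),
          forKey: "b1", inCollection: "books")
store.set(Book(identifier: "b2", title: "The Lathe of Heaven", authorIdentifier: "a1"),
          forKey: "b2", inCollection: "books")

// One-to-one: follow the back reference from a book to its author.
let book: Book? = store.value(forKey: "b1", inCollection: "books")
let author: Author? = book.flatMap {
    store.value(forKey: $0.authorIdentifier, inCollection: "authors")
}

// One-to-many: find an author's books by filtering on the back reference.
let allBooks: [Book] = store.values(inCollection: "books")
let titles = allBooks.filter { $0.authorIdentifier == "a1" }.map { $0.title }.sorted()
```

The author never holds a list of books; the "many" side is derived from the back reference, so there is only one place to keep up to date.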

Wrapping up...

I've summarised much of this discussion in the table below. Although I've not discussed Realm, from my own experience I've found that it has largely the same design issues as Core Data, however, please comment if you've had different experiences.

Feature/Constraint          | Core Data                   | YapDatabase           | Realm
----------------------------|-----------------------------|-----------------------|--------------------------------------------------
Inheritance from base class | NSManagedObject             | NSObject              | RLMObject
Allow value types           | No                          | Possible              | No
Schema Migration            | Required                    | Developer has control | Required
Immutable Objects           | Not possible                | Yes                   | Not possible
Property Types              | Arbitrary with transformers | Arbitrary             | Bool, Int, Double, Float, String, NSDate, NSData
Relationships               | Built in/Required           | Developer has control | Built in/Required

Posted on May 9
Written by Daniel Thorpe