Keeping data in memory instead of using databases

Not quite what you are proposing, but several years ago (before cloud was pervasive) I worked on applications that used Apache Ignite as the backing in-memory database. It lets you run in-JVM or externally, as a collection of nodes, so you don’t end up with gigantic JVMs. Although it supports persisting data to disk, in our particular case the data was fetched by another process, and we used Kafka for distribution, which also gave us a nice way to re-hydrate the data on startup.

Our data was stored with a few select elements that we used for querying, and the bulk of the payload was serialized with Protobuf, which is faster and more compact than JSON. Note that Ignite shines as a distributed key-value map. Maybe this has changed, but although it supports SQL-style queries, performance was not ideal for very large data sets; we ended up implementing light support for where clauses ourselves.
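That access pattern can be sketched roughly like this, with a plain ConcurrentHashMap standing in for the distributed Ignite cache (the class and field names here are made up for illustration):

```kotlin
import java.util.concurrent.ConcurrentHashMap

// A cache entry keeps a few indexed fields for filtering, plus the
// opaque serialized payload (Protobuf bytes in the original setup),
// which is only deserialized on demand.
data class OrderEntry(
    val customerId: String,   // queryable field
    val status: String,       // queryable field
    val payload: ByteArray    // serialized blob
)

// Stand-in for the distributed Ignite cache: key -> entry.
val cache = ConcurrentHashMap<String, OrderEntry>()

// "Light where-clause support": filter on the indexed fields in memory
// instead of asking the store to run SQL over the full payload.
fun findByCustomer(customerId: String): List<OrderEntry> =
    cache.values.filter { it.customerId == customerId }

fun main() {
    cache["o1"] = OrderEntry("c1", "OPEN", byteArrayOf())
    cache["o2"] = OrderEntry("c2", "OPEN", byteArrayOf())
    println(findByCustomer("c1").size) // 1
}
```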

Anyway, my 2 cents.


I will try to explain with a SaaS app as an example. We run one instance per customer (company). We load at startup:

import com.fasterxml.jackson.core.type.TypeReference
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import java.io.File
import kotlin.system.exitProcess

val company: Company = try {
    jacksonObjectMapper().readValue(
        File("src/main/resources/company_${System.getenv("COMPANY_ID")}.json").readText(),
        object : TypeReference<Company>() {}
    )
} catch (e: Exception) {
    logError("Exception while loading company JSON: $e")
    exitProcess(1)
}

Adding a new user to the company:

company.users.add(User(email, hashedPassword, name))

Editing a user:

val user = company.users.first { it.id == params["id"] }
user.email = params["email"]
//and so on

Deleting a user:

company.users.removeIf { it.id == params["id"] }

No need to save explicitly. The changes are in memory and get saved when the app is restarted, plus every 10 seconds as a backup:

val json = jacksonObjectMapper().writeValueAsString(company)
File("src/main/resources/company_${company.id}.json").writeText(json)
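The 10-second backup loop could be wired up with kotlin.concurrent.fixedRateTimer. In this sketch the serializer is passed in as a lambda so the snippet stays dependency-free (in the setup above it would be `{ jacksonObjectMapper().writeValueAsString(company) }`); the temp-file rename is an extra safety step, not something the original post does:

```kotlin
import java.io.File
import kotlin.concurrent.fixedRateTimer

// Periodic snapshot: serialize the whole in-memory state every 10 seconds.
// Writing to a temp file and renaming it into place avoids leaving a
// half-written file behind if the process dies mid-write.
fun startBackupTimer(companyId: String, serialize: () -> String) =
    fixedRateTimer(name = "backup", daemon = true, period = 10_000L) {
        val target = File("company_$companyId.json")
        val tmp = File("company_$companyId.json.tmp")
        tmp.writeText(serialize())
        tmp.renameTo(target)
    }
```

The returned `java.util.Timer` can be cancelled on shutdown after one final synchronous save.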

I had the displeasure of working with https://prevayler.org for two years; it is an in-memory database based on Java serialization. Server startup took about 20 minutes, and shutdown took the same. The servers required a special kind of very expensive RAM to ensure data was not lost. There was also no CI (what are tests? code reviews?), no CD (copy .java files to the server over ssh, compile them there and manually run the jar), and no task management like Jira (e-mails were the way), but that’s a different story. Never again.


That being said, it was 10000 times faster than SQL-based databases. With current caching techniques the difference probably isn’t so big.

Thanks for sharing this experience. I’m not talking about an in-memory database, though, but about keeping the data in regular Kotlin code (in constants and variables), with no DB layer.

This is exactly how Prevayler works. You have a singleton “Repository” class which contains just collections, objects, etc. You can access it for reading just like regular Java/Kotlin objects, which is why it is so fast. It uses a command pattern to modify data, which ensures the ACID properties because every command is serialized to a journal before it is applied. This is actually quite smart.
Like this:

object Repository {
    val users: MutableList<User> = mutableListOf()
}
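A rough sketch of that command idea (simplified, not the actual Prevayler API; all names are made up): every mutation is a command object that is appended to a journal before it touches the in-memory state, so replaying the journal at startup rebuilds everything.

```kotlin
import java.io.File

data class User(val id: String, var email: String)

// Mirrors the singleton Repository from the post above.
object Repository {
    val users: MutableList<User> = mutableListOf()
}

// Every mutation goes through a command; the journal is the durable record.
interface Command {
    fun execute()
    fun serialize(): String   // real Prevayler uses Java serialization here
}

class AddUser(private val id: String, private val email: String) : Command {
    override fun execute() { Repository.users.add(User(id, email)) }
    override fun serialize() = "ADD $id $email"
}

object Journal {
    private val file = File("journal.log")
    fun apply(cmd: Command) {
        file.appendText(cmd.serialize() + "\n")  // durable first...
        cmd.execute()                            // ...then mutate in memory
    }
}

fun main() {
    Journal.apply(AddUser("u1", "a@example.com"))
    println(Repository.users.size) // 1
}
```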

I see. Then it’s similar to my solution.

I agree. This is a silly idea that will bite you in the ass very soon.


We will see. It’s pretty easy anyway to change to a document database and save the objects as JSON there. I will let you know how this goes (if my app grows (praying to Jesus)).

I don’t think this is true. The problem here is not how you store the data underneath. It doesn’t really matter whether you use JSON or some binary format, a document database or local files. The main problem is that you don’t control when you save and load data. Normally, when using databases, we make changes inside transactions, which provide isolation, atomicity and data consistency. But the code has to cooperate with the process: it has to know when to start and commit the transaction, and all data loading and saving has to be explicit. To fix the problem you would need to rework all of the code that touches the data in the DB.
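The lost-update problem being described can be demonstrated with several threads doing a read-modify-write on shared state; the lock here is a stand-in for the cooperation a transaction would normally provide (all names hypothetical):

```kotlin
import kotlin.concurrent.thread

var balance = 0
private val lock = Any()

// A poor man's "transaction": every read-modify-write of the shared
// state must go through the same lock, and every caller has to
// remember to do so; nothing enforces it, which is the point above.
fun deposit(amount: Int) = synchronized(lock) {
    val current = balance        // read
    balance = current + amount   // write: without the lock, two threads
                                 // can read the same 'current' and one
                                 // deposit is silently lost
}

fun main() {
    val workers = (1..4).map { thread { repeat(1000) { deposit(1) } } }
    workers.forEach { it.join() }
    println(balance) // 4000 with the lock; typically less without it
}
```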


This is really a good reason for doing dependency inversion for your persistence. Even if you’re not going total Clean Architecture, a simple Repository pattern interface would protect from switching to another storage solution.

This is the part that makes red flags go up for me. I would argue that creating a quick and straightforward custom storage solution and skipping databases is fine… only if you protect your code from knowing about it. I take this quote to mean you allow your code to know that it’s loading/saving JSON.

Another example of a simple and quick storage solution: Java supports reading and writing Properties files, which are essentially just a Map<String, String>. I see no problem using this as your backing persistence in the beginning, because it is hidden behind the interface and is interacted with only via model classes.

I’d be concerned to see any mention of the underlying persistence choice infecting much of the code (which could also happen when using ORM libraries). I’m not as concerned about making use of DB features or saving on an interval.
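A minimal sketch of that repository idea, with java.util.Properties as the throwaway backing store (the interface and class names are invented): the rest of the code only ever sees UserRepository, so swapping in a real database later means writing one new class.

```kotlin
import java.io.File
import java.util.Properties

data class User(val id: String, val email: String)

// The only persistence API the rest of the code is allowed to see.
interface UserRepository {
    fun save(user: User)
    fun find(id: String): User?
}

// Quick-and-dirty backing store: a Properties file (string -> string).
// Replacing it with a database later is one new class implementing
// UserRepository; callers never change.
class PropertiesUserRepository(private val file: File) : UserRepository {
    private val props = Properties().apply {
        if (file.exists()) file.inputStream().use { load(it) }
    }
    override fun save(user: User) {
        props.setProperty(user.id, user.email)
        file.outputStream().use { props.store(it, null) }
    }
    override fun find(id: String): User? =
        props.getProperty(id)?.let { User(id, it) }
}
```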

It should be pretty easy with my use case:

get("/") {
  val userSession = getUserSessionFromDb(call)
  userSession.doWhateverYouWantAndItWillAllGetStoredAutomatically()
  saveUserSessionToDb(userSession)
}

For SaaS apps it should also be pretty easy: users can simply wait their turn. If two requests come in at the same time, the second can wait until the first finishes.
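That wait-your-turn idea could be sketched with a single-threaded executor per company instance (the helper name is invented); every request that touches shared state is queued and executed alone:

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// One single-threaded executor per company instance: requests are queued
// and run one at a time, so if two arrive at once, the second simply
// waits for the first to finish.
val companyExecutor = Executors.newSingleThreadExecutor()

fun <T> withCompanyTurn(block: () -> T): T =
    companyExecutor.submit(Callable { block() }).get()

fun main() {
    var counter = 0
    val requests = (1..100).map { Thread { withCompanyTurn { counter++ } } }
    requests.forEach { it.start() }
    requests.forEach { it.join() }
    println(counter) // 100: the increments never interleave
}
```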

So, how did it go?


FYI: I’m currently working on an in-memory store:

I started work on it 3 years ago but then left it abandoned for two years. Currently I’m reworking it heavily. I will create a new release sometime in the next 4-8 weeks. That release should be good enough for an introduction to a wider audience.

Manfred