I was set on the task of finding out what would be good for the database. Either using 1 table for everything, 1 table per user, or 1 database per user.
Here is my assessment on this.
approach 1: 1 table for everything.
In this approach you would need to hang a CustomerId
to everything. In your queries you need to filter everything through that CustomerId
, or do that right before sending back the data to the client.
This is the simpelest way to design a database,
here are some cons:
- when you do an upgrade, every client is down. If something goes wrong, every client is screwed.
- it is much more difficult to throttle resources. I.e., if one client is hammering the database, it is harder to give them more resources separate from everyone else.
- When a customer wants a roleback of a table / all data, you have to remove all data by
CustomerId
and insert the backup back in. And you have to do this fast, cause if a user modifies something whilst this is busy, you’ll run into problems. - if one customer wants to stay on an older version, you can’t really do that.
It also might get really big after a while. But it should be able to handle that.
approach 2: every customer has it’s own tables
pros:
- a customer can decide, i’ll stay on an older version.
cons:
- you have to update all tables manually per costumer (unless you write a script for it)
- the company wants to grow, at first you’ll be able to manage, but when you grow it’ll become a problem. Here is a good way to explain it:
"Your business is growing, and as it grows, you are adding new table for every new user. If you continue like this, there will be so many tables that your operating system will collapse. Your database is doomed. It's like if your doctor tells you that if you don't quit smoking, you will die in six months. The same for your database. If you don't quit creating tables, your database will die unpleasantly. Quit smoking, now!"
this is basically the one thing i think is not an option, the other options are a valid thing to choose. but this isn’t.
approach 3: every customer has it’s own database
this could also come with a seperate server per customer, but that seems like a waste of recourses.
in this approach you’d have to choose a customur before you login, this will give you the correct connectionstring
, and then you can login, and with every api call you’d have to give your connectionstring
. then the server would send you to the correct data via the connectionstring
.
pros
- a customer can decide, i’ll stay on an older version.
- there is no way of accidentally sharing data or modifying data from other customers.
- if there were to be a leak, it’s only one client and not all. (though this shouldn’t happen anyways.)
- don’t need to filter for clients
- you can use normal backup mechanisms
- customer could run their own server if they would want to
cons
- you have to update all databases manually (unless you write a script for it.)
- having multiple very small databases is not needed when there is not much data, it’s a bit overkill.
approach 4: the collective (extra idea from myself)
There is a bonus strategy:
You take the CustomerId
from approach 1 and 2, and combine it with the connection string of approach 3.
you create a database that holds every customer, i’ll call it the adressDB
for now. this holds the connectionstring
and CustomerId
by every customer.
Then you make 1 database, let’s call this one datastorageDB
. All data goes in here. No seperate tables either. every record has its CustomerId
attatched, so that you can find the records that are only yours. So this is basically approach 1.
But then why the connectionstring
? Well, in the adressDB
all connectionstring
’s will point towards datastorageDB
.
The reason you want to do this is so that if a customer get’s too big, or wants to stay on an older version, or other reason, you can create a new database, and replace the connectionstring
that points to datastorageDB
with a connectionstring
that points to the new database.
Pros:
- this way you are prepared for the future, where you might suddenly need multiple databases. This way you don’t need to change much code when that happens.
- for as long as you don’t need it you can just use one database, which is easier to handle and probably cheeper. (i don’t know how this is with mongoDB)
- you can support older versions in this method too.
- customer could run their own server if they would want to
- the Pros of solution 1 and 3.
cons:
- when you are only using 1 database, it will feel like overengineering. (not really a bad con honestly)
- it’s going to take longer to build this structure.
- you have to update all databases manually (unless you write a script for it). but it’ll be less bad than solution 3, since you’ll have way less databases.
conclusion
I personally think 4 is the best, but otherwise use 3.
notes
- Grabbing a lot of data from a db and filtering it in the backend is a bad practice.
- Databases are optimised on filtering, so avoiding filters is not neccecary.