Multi-Tenant: Database Per Tenant or Shared?

  Рет қаралды 21,158

CodeOpinion

CodeOpinion

Күн бұрын

Пікірлер: 110
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
Our software supports database per tenant, but in production we've yet to use that feature. Instead of adding filter by tenant to all the queries, we created row level security on a MS-SQL database. Each Tenant has its own SQL USER. Each database tabale uses security policy to apply a predicate at the database level. That predicate will check the SQL USER running the query and apply a predicate to filter out any rows that are not for that tenant. This has the advantages that the developers are no longer responsible for applying the tenant filter in the code, and that any SQL injection attacks will automatically be restricted to just that tenant. Obviously this is only a practical solution when you have a low number of tenants.
@awright18
@awright18 3 ай бұрын
I just came here to say that exact thing :)
3 ай бұрын
Wow, I had no idea that you could do that! Interesting!
@lukassinkus6162
@lukassinkus6162 3 ай бұрын
didn't know this existed, very interesting approach. what's your flow for onboarding new customers? do you have automated scripts that create these users with these special permissions? how do you store these user credentials?
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
@@lukassinkus6162 We have a SQL USER per tenant, not per user. Given that we've got a few customers with a large user base, creating the SQL USER is a trivial manual task. In our SAAS system all users authenicate via their tenant's AzureAD system and we then receive a claim principle with their details. Those include a tenant id. We use that to lookup the SQL USER which is then stored in a server side session for that user. All subsequent database queries use this SQL USER in the connection string.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Thanks for the comment! Yes, DB user with db level filtering is a great option if you can do it, such with SQL Server.
@maple.everything
@maple.everything 3 ай бұрын
For our SaaS product, we're using a hybrid model. Consumers are, writ-large, on one of several regionally-pooled APIs & DBs. For our important enterprise customers, we have dedicated environments for each. This gives them complete isolation and better performance guarantees. We can also move customers from one pool or environment to another in cases where a shared customer is promoted to an enterprise customer - this is where globally unique ids can save you. The enterprise fees are commensurate with the development of both the hybrid model and the resources they're consuming in their environment. I can tell you one thing for certain: Infrastructure as Code helps a ton if you're doing either pooled DB & API, DB & API per tenant, or both simultaneously. It makes spinning up those extra environments and deploying changes much, much faster and more reliable.
@logantcooper6
@logantcooper6 3 ай бұрын
We have a shared database for all tenants and use query filters. Works well enough. We plan to squeeze as much life out of this setup as we grow before we decide to take on any more complexity.
@alex_chugaev
@alex_chugaev 3 ай бұрын
That’s agile way, I like it 👍🏻
@JoeEnos
@JoeEnos 3 ай бұрын
If you're going with database-per-tenant, I would just make sure you seriously take the time upfront to build (or buy) proper sync and validation tools between each database. Easy to skip if you're just starting with a handful of databases, but that number creeps up, and executing a schema change in a thousand databases, or running 999 schema comparisons manually isn't a lot of fun.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Absolutely. The amount of operational overhead can be a nightmare.
@ankeshkapil3129
@ankeshkapil3129 3 ай бұрын
but maintaining so much data in a single table is also a nightmare
@elpe21
@elpe21 3 ай бұрын
We've got over 300 databases on-premises. Migrations are part of the system , dbs are not in sync ( don't have to be ). We've also have postgres versions ranging from 9.2 to 16 ;)
@JoeEnos
@JoeEnos 3 ай бұрын
@@elpe21 Ouch, that's just absurd.
@elpe21
@elpe21 3 ай бұрын
@@JoeEnos Well, we update customers one by one / when they ask for it. If you think 'MS Windows', they also have multiple version running on computers , also people update the system when they want ;) Often our customers do not have good connection so we can't move them to cloud ( and our system is not cloud-friendly yet )
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
One of the advantages for database per tenant is if you need to do a database restore. This allows a single tenant to be affected.
@JacobGlanz1234
@JacobGlanz1234 3 ай бұрын
I would argue. It’s much easier for me to maintain a replica than a DB per tenant. All the cloud managed DBs also offer managed replicas
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
@@JacobGlanz1234 The problem is that if we ever manage to corrupt a tenant's data and need to roll back then that database restore would affect all tenants. Separate DB per tenant also helps if we have off board a customer and have to delete their data due to GDPR requirements. As stated on a previous comment, our software supports DB per tenant, but we're not using that feature yet with any of our tenants.
@malehernandez1975
@malehernandez1975 3 ай бұрын
It is worth noting that a multi-tenant system can, more or less cheaply, support a single tenant on a new instance, e.g. if you have some tenants with gigantic amounts of data, or expect some with legal restrictions needing an on premise instance. The reverse path can be more complicated, if you finally realize that you can't maintain a lot of instances.
@qj0n
@qj0n 3 ай бұрын
Yes! That's why i always suggest starting with single DB. It's much easier to both switch to db per tenant or to hybrid approach. This is the most flexible way
@ulrichborchers5632
@ulrichborchers5632 3 ай бұрын
Very nice topic. Great overview of important considerations and options. There are so many variations. Database versioning. API versioning. Backwards compatibility. Containerized versus shared resources. Single instances or Cluster. Split per schema on the same host versus different tables in the same schema versus one distict daemon or machine per tenant ... production, staging and testing environments ... split by tenant in the user-facing data center but mirror into a shared data structure ...
@CodeOpinion
@CodeOpinion 3 ай бұрын
Thanks for all the video ideas! 😀
@shawn_bullock
@shawn_bullock 3 ай бұрын
At all the banks I've worked at, and where I'm at now (a data analytics company) we always use database per tenant. It's the right thing for us, we cannot risk a query defect leaking data, or a problem with one tenant affecting others, or a query-heavy tenant affecting the CPU/RAM of other tenants. Also if we have to restore (which we've done once or twice over the years) it only affects a single tenant. The downside is migrations and cost. But it greatly simplifies our application architecture and compliance audits.
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
There's pros and cons for each method. You have to choose what's important for YOUR business model.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Thanks for the comment. Data leaking is probably the comment/subject that gets brought up the most. The other which I totally forgot to bring up is if you are in a situation where you have cross tenant communication. Maybe a separate video one day.
@repotranstech
@repotranstech 3 ай бұрын
For banks and fintech it makes sense.
@tibba69
@tibba69 3 ай бұрын
Yes please do a video about cross tenant communication/data sharing
@bitaligners
@bitaligners 3 ай бұрын
I think the hybrid approach is valid: 1 only database to maintain and a schema dedicated for each tenant. Sometimes it's possible to spread schemas along 2 or more db instance to tackle noisy neighbors or performance issues. Schemas ensure isolation without applying any query filters. it's like to select the right database but without the burden to manage many databases.
@CodeOpinion
@CodeOpinion 3 ай бұрын
From an infrastructure operational angle, yes less to manage. From a code operations the migrations/updates to those schemas is what needs to usually be more involved in automation during deployment. Especially if you have a single code/app release that are hitting those different schemas. (eg, add a new column to a large table that is going to lock the table, rolling that out to each schema on the same DB has a bunch of perf/availability implications).
@ranulpallemulle5447
@ranulpallemulle5447 3 ай бұрын
At my previous job, we tried to do database per tenant for a SaaS product and one issue we ran into was the actual provisioning of database. We tried to create the database “on-demand” - i.e when a user signed up as a new organization. The problem was we did this on azure with mssql databases so naturally it was slow and added complexity to the sign up process (showing the user a “please wait” screen, azure API requests with preconfigured ARM templates, retries on db creation failure, attempting to use a pool of databases and expanding the pool when needed, managing elastic pools, etc). Not to mention the cost aspect and migration rollouts. Should have gone with something like RLS or query filter
@N8ontv
@N8ontv 3 ай бұрын
I inherited a db per tenant architecture. With CI/CD and automation in place many of the downsides mentioned are easily addressed other than cost. Our real complexity came from also having tenant-agnostic data that all tenants nevertheless need access to.
@CodeOpinion
@CodeOpinion 3 ай бұрын
I never brought that topic up, as well as the topic of cross tenant communication, which I wanted to. Likely another video down the road.
@jackwright517
@jackwright517 3 ай бұрын
Thank you so much this is ultra informative! Im working on a multi tenanted system right now, and the timing is impeccable!
@CodeOpinion
@CodeOpinion 3 ай бұрын
Glad it was helpful!
@prashanthb6521
@prashanthb6521 Күн бұрын
For HPC workloads, its always better to have multiple databases, hopefully on different machines, even if it takes more maintenance.
@ziftytodd
@ziftytodd 3 ай бұрын
Also consider data privacy and retention laws vary by country, so you might want to have a shared multi-tenant database for each unique region/country that your clients are in. Reduces complexity from one db per client, but makes dealing with each unique set of privacy laws easier.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Great point!
@cdarrigo
@cdarrigo 3 ай бұрын
Resources for tenant, including database and potentially API, also have an impact for provisioning new tenants. If you have a single database and API for all your tenants, provisioning a new tenant is pretty trivial. If every tenant gets their own same resources, then you have to provision and potentially seed all those resources every time you deploy new tenant. Just something else to consider
@ThingEngineer
@ThingEngineer 3 ай бұрын
Great video, you covered a lot of variations. Multi-Tenant with Supabase is great when using their auth and Postgres functions. There is not need to send user or tenant info because it is part of the session and accessible from the function call, they also scale amazingly.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Nice, thanks for the info!
@ITManPH
@ITManPH Ай бұрын
Thanks Man! really need this video for a project.
@trashmail8
@trashmail8 3 ай бұрын
Besides using a UUID for the customerID, wouldn't it also be a good idea to also use a UUID for the tenantID? If you ever want to migrate a tenant from database A to database B, you might run the risk that a tenantID "1" already exists in database B.. In other words, making both the tenantID and customerID a UUID, you can move data around freely without running the risk of using duplicate keys or data leaking between tenant instances.
@lukassinkus6162
@lukassinkus6162 3 ай бұрын
My guess would be that tenants are globally unique and are probably coming from an external source, or that it just didn't make sense to mention it for brevity
@iaminterestedineverything
@iaminterestedineverything 3 ай бұрын
Some database storage engines such as InnoDB don't do well with uuid, as they can cause excessive page IO if they happen to be the clustered index due to their randomness
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
@@lukassinkus6162 It's probably a good idea to use a UUID for all records in case you decide to migrate from a DB per tenant to a shared DB.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Yes I likely would make them globally unique, however you likely store the tenants in some type of primary DB which also knows which DB they are associated with. So even then it could be a UUID/GUID but could also be something simple. Also note, as someone else mentioned, DBs like MySQL/InnoDB don't do well with UUIDs as primary keys (unless that's changed?)
@steve-wright-uk
@steve-wright-uk 3 ай бұрын
@@iaminterestedineverything Using UUIDs with clustered indexes is a known performance problem. However UUIDv7 gets around this problem by having a time component that means that they are naturally sorted in the order they are generated. The upcoming .Net 9 will have ths natively, but they are Nuget packages that can generate them now.
@qj0n
@qj0n 3 ай бұрын
Single db is our approach and we limit the issue with noisy neighbours by using tenantId and PartitionKey. This way is azure storage when data is split between nodes, each tenant should be on one node, so they only affect this node
@CodeOpinion
@CodeOpinion 3 ай бұрын
Ya partitioning by tenant at storage, if you db supports it, is nice.
@IndigoDalliance
@IndigoDalliance 3 ай бұрын
This is interesting. Do you guys use PartionKey on Cosmos Db? I don't know anything about it and want to know more.
@devbert1880
@devbert1880 3 ай бұрын
i love this channel. a topic that would really intrest me: how to deal with identity and access mangement stuff in a multi-tenant scenario - without cloud-lockin :)
@CodeOpinion
@CodeOpinion 3 ай бұрын
I'll add it to my list of topic ideas! Thanks
@georgehelyar
@georgehelyar 3 ай бұрын
For relational, shared database, horizontally sharded on tenant id with something like citus. Currently doing this in prod for 11b req/mo on 100k tenants. Would use nosql but the data in that database is quite relational with a lot of m:m. Considered graph db but most graph db are just an inefficient layer on top of nosql/rdbms. In general if you have a lot of databases, they are still on the same server, so they still share resources - separate databases really just saves you from having to put the tenant id in every table and use composite keys, but the cost is in the schema management.
@anhcoder
@anhcoder 3 ай бұрын
Split to multiple for better querying, re-indexing.
@marna_li
@marna_li 3 ай бұрын
I have multiple tenants per DB based on the query filter method. But I realized there are cases you want it separate. So I could built some infrastructure to opt-out for separate databases based om customer requirements. Provided that they don't want separate instances of the app all together. I have seen that some companies do that because of data requirements and so they don't upgrade all customers to newer versions without them paying. This is kind of the pre-Cloud and pre-SAAS method. I just feel like it must be a hell to manage with branching and patching bugs across specific codebases for each customer.
@jprince1993
@jprince1993 3 ай бұрын
A tenant per collection for Document store, makes me a feel uneasy. I like the clear separation that it gives, however for many tenants with many collections...it gets messy quick. Other methods are either having the tenant ID on each document in a collection OR a tenant shard key.
@Philinmiami
@Philinmiami 3 ай бұрын
We have a database per tenant model. In our use case, the system could be split into micro services, because 1/2 of the business logic and tables are exactly the same for all tenants. However, due to the needs of each tenant, we would have too many variations in the tables, and we need to do quite extensive join operations. We looked into a hybrid (multi-tenant for the 1/2 that's similar) and tenant per database, but since most of our business deals with joining from the "core" tables to the tenant tables, the test queries we ran were not promising. In our case, since we know exactly how the complexity runs, we have a git repo that is "core" repo, that every tenant is forked from and upstreams from, so in the few cases that we have to change things in the core tables, we just run the git upstream fetch commands. One thing to note, is that in our industry, data leakage is 100% not acceptable to any of our clients, and we'd probably lose them as an account (and this would be our best case scenario).
@orianatoubia681
@orianatoubia681 Ай бұрын
I love your channel! Thank you
@nafg613
@nafg613 3 ай бұрын
Shared postgres cluster, database per tenant. A single program instantiating an application object instance per tenant for one project, for another separate Cloud Run per tenant managed by Pulumi
@JacobGlanz1234
@JacobGlanz1234 3 ай бұрын
On SQL server you can do a RLS, so a query without a Tenant ID doesn’t return anything. You can even configure to bypass RLS with db admin users
@CodeOpinion
@CodeOpinion 3 ай бұрын
Ya, row level with a specific user is a great option if your DB supports it. Or choose a DB that does it before you begin if that's the route your going to take. 😀
@hoareg2
@hoareg2 3 ай бұрын
In some cases, tenants might have different versions, e.g 8.7 vs 8.6 of the application and DB so sharing the DB between them is not feasible. Also some tenants might have some features which are not available for others. In short, it looks to me that having a DB for each tenant is more complicated but more flexible.
@CodeOpinion
@CodeOpinion 3 ай бұрын
features by tenant can often be done by feature flags. Yes, flexibility comes with complexity.
@appstratum9747
@appstratum9747 3 ай бұрын
Depending on the RDBMS, schema per tenant may be of interest. To all intents and purposes, when you take advantage of the RDBMS security model, you effectively have a "logical" database that you can isolate with its own associated users. You can migrate individual schemas on a per tenant basis. This is just another option to consider in your toolkit of solutions. As ever, whether this makes sense depends on what you're doing, who you're doing it for, what data you're doing it with and what traffic you're dealing with from your various tenants. Like the other approaches mentioned by Derek, this isn't a "one case fits all" solution.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Ya it's not a once case fits all for sure, as you clearly know! Ya I've also noticed people doing "table" per tenant which similar idea as schema per. Schema per tenant can be helpful early on for costs and operational maintenance for sure. As always, context is king.. and that context can change.
@dinov5347
@dinov5347 3 ай бұрын
We have a notion of tenant (network) and just in the presentation we use automatic filtering on the tenant. The biggest issue is how to create indexes. Do you always index by the tenant id first? Or secondary? Or use db partitioning. Our system has a mixture of the above and quite frankly they are all have pros and cons. What do you think is the best approach of optimizing queries?
@GabeHandle
@GabeHandle 3 ай бұрын
Regardless of your choice, Lumbergh is gonna need you to go ahead and come in on Saturday... thanks!
@CodeOpinion
@CodeOpinion 3 ай бұрын
Glad someone noticed it!
@JacobGlanz1234
@JacobGlanz1234 3 ай бұрын
Have you ever heard of a SaaS product doing DB per tenant besides for legal reasons?
@awright18
@awright18 3 ай бұрын
I've seen various countries requiring data to be on their soil.
@trashmail8
@trashmail8 3 ай бұрын
@@awright18 In that case you can still have a database that's shared by multiple tenants, but with multiple data center locations hosting this shared database and assigning tenants based on geographical/legal zones to a particular data center.
@awright18
@awright18 3 ай бұрын
@@trashmail8 true
@ChaseFreedomMusician
@ChaseFreedomMusician 3 ай бұрын
I have. SQLite is fantastic for this.
@awright18
@awright18 3 ай бұрын
I once worked at a company that would have various versions (even custom versions) of schema per tenant and they used separate databases, and servers for this.
@zonegamma8197
@zonegamma8197 3 ай бұрын
very interesting thanks
@CodeOpinion
@CodeOpinion 3 ай бұрын
It's a giant topic I could spend 1hr on.
@guhfeitoza
@guhfeitoza Ай бұрын
Hey, i'm developing a web application (ERP) with a database that will be exactly the same for multiple companies to get and store data. Should I continue with a single database for them all or should i create a database for each client i get? For now, the web application will be mostly to show up data, nothing heavy. Thx
@nightshade427
@nightshade427 3 ай бұрын
another benefit of db per tenant is that they can geolocate near the tenant
@lukassinkus6162
@lukassinkus6162 3 ай бұрын
then you probably also need to deploy your app close to them as well, since what matters is distance to the server not just the db
@nightshade427
@nightshade427 3 ай бұрын
yep I'm usually using edge for compute and keep db close to tenant, this way compute and db are close to users
@CodeOpinion
@CodeOpinion 3 ай бұрын
Ya, great point.
@GrowsDrummer
@GrowsDrummer Ай бұрын
Hello, thank you for the video. We are debating our approach for a multi tenant application. We will have some big tenants with sensitive data and some smaller tenants. The way would be a mixed environment with multiple schemas in the same db server. Now here comes a big problem, a system admin should be able to create form templates to be available for all tenants. What do you think would be the best approach to make these templates available? Is it possible for a tenant with his own schema to also access a “global schema”? Or is it possible to save the template as a “system admin” on all schemas? Could it be done with writing a view on the db accessing all schemas? Rel Db is Postgres. How would you consider keeping the tenant_id nullable so you can set default values for everyone? Possible use case get the default templates and the tenant specific templates. Thanks in advance.
@CodeOpinion
@CodeOpinion Ай бұрын
Ya for sure can have global schema, but the issue, which is similar if tenants need to cross communicate is if they live on different instances, you aren't doing any single DB connection interaction. So you must treat things separately still.
@GrowsDrummer
@GrowsDrummer Ай бұрын
@@CodeOpinion how would you solve this issue? Using some sort of script generator and importing templates via script on all schemas?
@mohammadtoficmohammad3594
@mohammadtoficmohammad3594 3 ай бұрын
Thank you
@CodeOpinion
@CodeOpinion 3 ай бұрын
You're welcome
@kubrakivan1579
@kubrakivan1579 3 ай бұрын
I have read all the comments but I don’t see about one database and separate schemas for each tenant. Isn’t it a solution? Can it work such a way? As I know PostgreSQL database can do it. Please give me your professional thoughts.
@CodeOpinion
@CodeOpinion 3 ай бұрын
Yup absolutely can do that as well. I mention it in other videos about having a single database instance but different schemas on that instance. Also helpful for ownership over data per service. In this case tenant, so different purpose.
@ChaseFreedomMusician
@ChaseFreedomMusician 3 ай бұрын
Hot Take: if you have a multi-tenant shared relational DB do not enforce FK. Have the keys but don't enforce them it will make your life so much better for splitting and changing data.
@mohammedfalih8713
@mohammedfalih8713 3 ай бұрын
it also risky
@ChaseFreedomMusician
@ChaseFreedomMusician 3 ай бұрын
@@mohammedfalih8713 Is it?
@mhsn27mhsn10
@mhsn27mhsn10 3 ай бұрын
What is about Ai ?
@CodeOpinion
@CodeOpinion 3 ай бұрын
what about it?
@mhsn27mhsn10
@mhsn27mhsn10 3 ай бұрын
@@CodeOpinion make video if you have time and give us your opinion about ai and how you can use it
@alex_chugaev
@alex_chugaev 3 ай бұрын
Is “noisy neighbor” that big issue? I mean you can auto-scale instances (computing and database) in many cases.
@CodeOpinion
@CodeOpinion 3 ай бұрын
It depends if you want to auto-scale or not. Often times the noisy neighbor in my experience is because of just bad code/query that isn't running as intended. Or over time data has grown that's causing the issue. Often seems like its in the dark corner of the app nobody goes to then all of a sudden calls something that murders the DB.
Screaming Architecture: Not Driven By Entities
9:22
CodeOpinion
Рет қаралды 7 М.
Multi-tenant Architecture for SaaS
11:07
CodeOpinion
Рет қаралды 124 М.
Support each other🤝
00:31
ISSEI / いっせい
Рет қаралды 81 МЛН
Beat Ronaldo, Win $1,000,000
22:45
MrBeast
Рет қаралды 158 МЛН
DTOs & Mapping : The Good, The Bad, And The Excessive
11:03
CodeOpinion
Рет қаралды 18 М.
Microservices with Databases can be challenging...
20:52
Software Developer Diaries
Рет қаралды 113 М.
The Correct Way to Run Database Migrations in EF Core
19:51
Nick Chapsas
Рет қаралды 36 М.
DRY principle is why your codebase sucks?
8:02
CodeOpinion
Рет қаралды 10 М.
Microservices are Technical Debt
31:59
NeetCodeIO
Рет қаралды 698 М.
Proxy vs Reverse Proxy vs Load Balancer | Simply Explained
13:19
TechWorld with Nana
Рет қаралды 239 М.
Vertical Slice Architecture isn't technical
8:19
CodeOpinion
Рет қаралды 7 М.
Multi-tenant architecture in 20 minutes
18:56
Carmel Hinks Saxby
Рет қаралды 114 М.
Support each other🤝
00:31
ISSEI / いっせい
Рет қаралды 81 МЛН