Deep Dive Into the Repository Design Pattern in Python

72,295 views

ArjanCodes

Days ago

In this video, I’ll take a closer look at the repository design pattern in Python. This is a very useful pattern that allows you to keep your data storage separate from your data operations.
💡 Get my FREE 7-step guide to help you consistently design great software: arjancodes.com/designguide
🔥 GitHub repository: git.arjan.codes/2024/repository
💻 ArjanCodes Blog: www.arjancodes.com/blog
✍🏻 Take a quiz on this topic: www.learntail.com/quiz/rvgics
Try Learntail for FREE ➡️ www.learntail.com/
🎓 Courses:
The Software Designer Mindset: www.arjancodes.com/mindset
The Software Architect Mindset: Pre-register now! www.arjancodes.com/architect
Next Level Python: Become a Python Expert: www.arjancodes.com/next-level...
The 30-Day Design Challenge: www.arjancodes.com/30ddc
🛒 GEAR & RECOMMENDED BOOKS: kit.co/arjancodes.
👍 If you enjoyed this content, give this video a like. If you want to watch more of my upcoming videos, consider subscribing to my channel!
Social channels:
💬 Discord: discord.arjan.codes
🐦Twitter: / arjancodes
🌍LinkedIn: / arjancodes
🕵Facebook: / arjancodes
📱Instagram: / arjancodes
♪ Tiktok: / arjancodes
👀 Code reviewers:
- Yoriz
- Ryan Laursen
- Dale Hagglund
🎥 Video edited by Mark Bacskai: / bacskaimark
🔖 Chapters:
0:00 Intro
0:37 Repository code example
4:13 About the pattern
8:04 Better software testing
9:36 Warnings and Caveats
11:24 Final thoughts
#arjancodes #softwaredesign #python
DISCLAIMER - The links in this description might be affiliate links. If you purchase a product or service through one of those links, I may receive a small commission. There is no additional charge to you. Thanks for supporting my channel so I can continue to provide you with free content each week!

Comments: 108
@yurykliachko1815 4 months ago
This is a good pattern when your entity (Post in this case) is stored partially in different storages (SQL DB + cloud storage, SQL DB + NoSQL DB, etc.); it hides all the complexity. Thank you for this guide!
@ArjanCodes 4 months ago
Glad you enjoyed the topic!
@FernandoCordeiroDr 4 months ago
I had to use this pattern recently. I was working on a Django app that had to work with both MongoDB and Postgres' PGVector. I created a repository for each and then a factory function that, based on environment variables, determines which repo to use. These repos are then used inside the methods of normal Django models. The main benefit is that adding an integration with another vector database is just a matter of creating a new repository.
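A minimal sketch of the factory approach described in this comment, under stated assumptions: the `VectorRepository` interface, the two stand-in backends, the `make_repository` function and the `VECTOR_BACKEND` variable name are all hypothetical, and the backends keep data in memory instead of talking to MongoDB or PGVector.

```python
from __future__ import annotations

import os
from abc import ABC, abstractmethod
from typing import Mapping


class VectorRepository(ABC):
    """Hypothetical interface shared by every vector-store backend."""

    @abstractmethod
    def add(self, key: str, vector: list[float]) -> None: ...

    @abstractmethod
    def get(self, key: str) -> list[float]: ...


class _DictBackedRepository(VectorRepository):
    """Shared in-memory storage so the sketch runs without any database."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def add(self, key: str, vector: list[float]) -> None:
        self._vectors[key] = list(vector)

    def get(self, key: str) -> list[float]:
        return self._vectors[key]


class FakeMongoRepository(_DictBackedRepository):
    """Stand-in for a MongoDB-backed repository (assumption, no driver used)."""


class FakePgVectorRepository(_DictBackedRepository):
    """Stand-in for a PGVector-backed repository (assumption, no driver used)."""


# Registry of known backends; a new integration only needs a new entry here.
_BACKENDS: dict[str, type[VectorRepository]] = {
    "mongo": FakeMongoRepository,
    "pgvector": FakePgVectorRepository,
}


def make_repository(env: Mapping[str, str] | None = None) -> VectorRepository:
    """Pick a repository based on the (hypothetical) VECTOR_BACKEND variable."""
    env = os.environ if env is None else env
    backend = env.get("VECTOR_BACKEND", "pgvector")
    try:
        return _BACKENDS[backend]()
    except KeyError:
        raise ValueError(f"Unknown VECTOR_BACKEND: {backend!r}") from None
```

Passing the environment as an optional mapping is a design choice for this sketch: callers use `make_repository()` in production code, while tests can pass a plain dict without touching `os.environ`.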
@notead 4 months ago
Hey Arjan, I just want to say thank you. I was able to land a job in data engineering thanks to your course and your videos on design patterns. Seeing your approach to building applications finally made it click for me that learning a language is the "easy" part, and that understanding _how to think about systems_ not only makes me a better developer - but is a super important, generalizable skill that goes beyond just programming. Maybe that's obvious for many, but I am really grateful for that insight.
@ArjanCodes 4 months ago
It's an absolute pleasure hearing about your success story and your learning journey, thank you for letting me be part of it! Best of luck :)
@rohailtaimourInc 4 months ago
Hi @arjancodes, I’ve been really enjoying your videos and specifically how you always focus on how to test the code you demonstrate. Thank you for your content. I was wondering if you can cover testing functions that are decorated? They pose an interesting challenge and I didn’t find it to be straightforward to test such use cases
@markasiala6355 4 months ago
I also have used this pattern without knowing it simply by focusing on decoupling and dependency injection. I have an abstract data class and an abstract FileIO class. Gives me flexibility on how I load data into the class or write it out. This helps me track changes in the data when I compare versions of the output data (e.g., I read in data from a user-friendly Excel file but write it out to pipe-delimited text, JSON, or YAML output where a simple diff tells me what changed).
@adjbutler 4 months ago
I love your pattern videos! (I will even allow you to make up your own patterns) Or do PART 2, 3, 4 on previous patterns! Your videos are amazing! Thank you
@alexandarjelenic2880 4 months ago
Or combining patterns, or more examples of solving the same issue with various approaches.
@notead 4 months ago
I agree! It would also be really cool to see more videos of him refactoring projects into using design patterns, especially hearing him discuss why he makes certain choices, the considerations and thoughts that cross his mind when making them.
@dadoo94000 4 months ago
Thank you Arjan. I use this pattern in FastAPI: endpoint layer > service layer (logic etc.) > repository layer, with FastAPI dependencies between these layers. I like it. In practice, I often need more than simple CRUD operations and add them to my repository layer. I don't think that's a good idea; maybe we should create these different methods in the services. But I like this pattern. When I need to call an external API, I also create a repository for that. For me, repo = access to data.
@Djellowman 4 months ago
Great no-nonsense video!
@axeldelsol8503 4 months ago
This pattern is also very useful when you are wrapping an API offering CRUD routes for resources. Great video!
@Nalewkarz 4 months ago
It's much more suited for your use case than his.
@prinsniels 4 months ago
I use the pattern a lot, but in a more general way. I tend to write things on the basis of interfaces; combining that with dependency injection makes things easy to test and allows for composable programs and great flexibility. I tend to stay away from ORMs. For me they add an extra layer of complexity to programs, and in analytics it quickly ends in writing straight SQL through your ORM, so cutting out the middleman seems wise then 😅 Thanks for the video!
@oscarmulin114 4 months ago
Agree with avoiding ORMs 100%.
@bachkhoahuynh9110 4 months ago
In data-centric applications, you can stay away from ORMs, but if your team uses an object-oriented domain model, ORMs are especially useful.
@mhotzel 4 months ago
I always use this kind of repository, but I didn't know that I was following a pattern 😀. Thank you.
@ArjanCodes 4 months ago
Glad the video was helpful!
@rrwoodyt 4 months ago
I like the separation and abstraction. It would have been interesting to see you make a class that could handle a generic dataclass, but that's beyond the scope of what you were trying to show. Maybe next time...
@SeliverstovMusic 4 months ago
I use a repository on top of SQLAlchemy. I have a base repo class with CRUD functions. For every table, I create a new subclass, and all CRUD operations become available for the table. Magic =)
@barefeg 4 months ago
Awesome. Maybe follow-ups could be how to define filters in your get_all that are not tied to SQL (e.g. specification pattern), as well as handling transactions with the unit of work pattern.
@obsidiansiriusblackheart 4 months ago
I find like most patterns, I have used this before but didn't know the name. Thanks for this awesome video! Your channel really helps me better understand coding and jargon in the field (I have ~10 years coding xp and 6/7 years work xp)
@ArjanCodes 4 months ago
I'm really happy to hear that these types of videos have been useful! :)
@ajflorido 3 months ago
Using this pattern with SQLAlchemy, you can load different models dynamically and use the same repo to get the data for different DB engines. For example, we have models for Postgres, Oracle and MySQL where some column definitions in SQLAlchemy are quite different, and we load the correct model dynamically within the repo itself, so you can also decouple this pattern one step further for different engines. Thanks Arjan!
@VashdyTV 4 months ago
Great guide as always!
@ArjanCodes 4 months ago
Thank you so much!
@basedmuslimbooks 4 months ago
I love this - can you expand your repository design patterns to other databases? MongoDB is something I'm struggling with. Or graph databases.
@sandeshgowdru8869 4 months ago
Thanks a lot for making videos. I was looking for an architecture for getting data from multiple sources and was looking into a combination of factory, strategy, etc., but this pattern is perfect for my needs. Once again, thanks a lot for sharing this....
@ArjanCodes 4 months ago
I'm glad this video was helpful for your current objectives :)
@dankprole7884 4 months ago
I use this for reading and writing dataframes. csv, parquet or pickle, local storage or s3. Same interface 😊
@nightcrawer 4 months ago
Hey Arjan! Thanks for the post. In more complex applications using DDD, is it good practice to separate the domain from the models? My repository returns a model, and my model knows how to convert itself into a domain object.
@davidmasipbonet2508 4 months ago
Why do you need create_table to be a classmethod?
@2006pizzaboy15 4 months ago
You can also look at the Unit of Work pattern that often goes hand in hand with Repository.
@devilslide8463 4 months ago
I particularly appreciate the ease of mocking this repository. It's very convenient for testing the logic of services that utilize the repository class.
@ArjanCodes 4 months ago
I'm glad you enjoyed this design pattern!
@dalenmainerman 4 months ago
I actually used this one unintentionally. At the earliest stages of a project, all data was stored in a bunch of CSV files (not my idea, not my decision). Implementing all data-related operations with this pattern allowed me to migrate to the real database almost effortlessly.
@edgeeffect 4 months ago
"not my idea, not my decision" ... is the (sad) story of our lives!
@sharkpyro93 4 months ago
I worked on a project for a nationwide editorial and magazine publishing company, and they used some Excel sheets as a DB. It was miserable.
@Naej7 4 months ago
@sharkpyro93 95% of the world's data is stored in Excel sheets…
@dalenmainerman 4 months ago
@sharkpyro93 I have to work with Google Sheets as a DB on my current project. Annoying af, trying to teach my colleagues to use real databases, wish me luck
@sharkpyro93 4 months ago
@dalenmainerman Why do I feel like I know what your colleagues look like?
@BradleyBell83 4 months ago
Any reason why ABC was used as opposed to Protocol?
@aimbrock 4 months ago
Talk about coming full circle here... I went searching for this Protocol package you mention and found a blog article claiming that Protocol is better and everyone should abandon ABC. In that same blog article he links to an ArjanCodes video that maybe answers your question: kzbin.info/www/bejne/rqfFZpt9gdR-ZqM Having just discovered Repository Pattern and Unit of Work and now ABC and Protocol I have no perspective to offer but I thought it funny.
@muzafferckay2609 4 months ago
Implementing the repository pattern is not about switching from SQL to NoSQL or vice versa. It decouples the business logic from the persistence layer, which can be an ORM or raw SQL. As you mentioned, implementing the repository pattern limits querying, updating, and so on. It is too hard to provide all the features that an ORM does. For example, you have to define comparison operators such as in, greater than, less than, etc. Your get method should take a set of relational fields to fetch, as it is going to be used in different places. You have to define or, and, and more complex queries. Basically, you have to define your own query language step by step as you need it, and translate your queries to the ORM. Otherwise you have to define too many different get methods for querying.
@Naej7 4 months ago
It is about switching. By decoupling the business logic from the persistence layer, you can switch the persistence class (one for SQL, one for NoSQL).
@broomva 4 months ago
And how would you abstract away the SQL queries in the repository definition, so that different types of repository implementations could be made by changing something like a SQL objects template?
@ChrisBNisbet 4 months ago
Hmm, do the tests you showed us test anything other than the mock class you created for the purpose of adding tests?
@joelffarthing 4 months ago
Imagine an application function or use case that depends on a repository; You can inject the 'fake' Repository in your test instead of the version that uses a real database. That way, you have something that implements the expected interface, but doesn't actually require a real database. He talked about this but didn't actually show an example. Architecture Patterns With Python is a great book that goes over this and other patterns in detail.
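The idea in this reply can be sketched as follows. This is an illustration under assumptions, not the code from the video: the `Post` fields, the `publish_post` use case and the `InMemoryPostRepository` fake are hypothetical names, and the interface is reduced to two methods. The point is that the test exercises the business rule in `publish_post`, while the fake only replaces the database.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Post:
    # Assumed fields, chosen only for this sketch.
    id: int
    title: str
    content: str


class PostRepository(Protocol):
    """Reduced, hypothetical repository interface."""

    def get_all(self) -> list[Post]: ...

    def add(self, post: Post) -> None: ...


class InMemoryPostRepository:
    """Fake repository: same interface, no database required."""

    def __init__(self) -> None:
        self._posts: dict[int, Post] = {}

    def get_all(self) -> list[Post]:
        return list(self._posts.values())

    def add(self, post: Post) -> None:
        self._posts[post.id] = post


def publish_post(repo: PostRepository, title: str, content: str) -> Post:
    """Hypothetical use case; the rule under test lives here, not in the repo."""
    cleaned = title.strip()
    if not cleaned:
        raise ValueError("A post needs a non-empty title")
    post = Post(id=len(repo.get_all()) + 1, title=cleaned, content=content)
    repo.add(post)
    return post


def test_publish_post_stores_cleaned_post() -> None:
    # The fake is injected in place of a repository backed by a real database.
    repo = InMemoryPostRepository()
    post = publish_post(repo, "  Hello  ", "First post")
    assert post.title == "Hello"
    assert repo.get_all() == [post]
```

A test runner such as pytest would collect `test_publish_post_stores_cleaned_post` automatically; it can also be called directly as a plain function.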
@ChrisBNisbet 4 months ago
@joelffarthing Yep, I get all that.
@SkielCast 4 months ago
In this case you have only a couple of columns, but using row_factory = sqlite3.Row and casting to dict would have allowed you to use the **kwargs syntax, which is especially handy in this case. Maybe the code could be a little easier to follow that way.
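A short sketch of what this suggestion could look like with the standard-library `sqlite3` module. The `posts` table layout, the `Post` dataclass and the `fetch_posts` helper are assumptions made for the example; only `sqlite3.Row` and `row_factory` come from the comment.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass


@dataclass
class Post:
    # Field names must match the selected column names for ** unpacking.
    id: int
    title: str
    content: str


def fetch_posts(conn: sqlite3.Connection) -> list[Post]:
    """Build Post objects by unpacking each row as keyword arguments."""
    rows = conn.execute(
        "SELECT id, title, content FROM posts ORDER BY id"
    ).fetchall()
    # sqlite3.Row exposes keys(), so dict(row) maps column names to values.
    return [Post(**dict(row)) for row in rows]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become name-addressable
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, content TEXT)"
)
conn.execute(
    "INSERT INTO posts (title, content) VALUES (?, ?)", ("Hello", "World")
)
posts = fetch_posts(conn)
```

Listing the columns explicitly keeps the unpacking stable if the table later gains a column that `Post` does not have.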
@edgeeffect 4 months ago
It's one of my favourite patterns and I so dearly wish we had it in the awful legacy app we've got at work.
@FolkOverplay 4 months ago
Is there a special reason why the tests were not refactored to use parameterize?
@bachkhoahuynh9110 4 months ago
The repository pattern is not about switching from SQL to NoSQL. We call this switching effect the persistence ignorance principle. The main thing to consider when using the repository pattern is that you want to decouple domain logic from infrastructure logic. A repository is usually backed by an ORM because, when you use raw SQL, you eventually implement some of an ORM's features, such as change tracking, proxies for lazy loading, ... I only use raw SQL for complex queries.
@user-pz3wg6ch9b 4 months ago
Why a classmethod when it's not accessing any class variable and not returning the class? It could be a staticmethod, right?
@TheOnlyEpsilonAlpha 4 months ago
I used it lately without knowing that I used it and without the decorators. Wrote a file handling py for crud that way without specifying the content so it could be used to handle c.r.u.d.operations and is not stuck to a specific content type
@klmcwhirter 4 months ago
A common misconception of design patterns is that the concrete implementations need to have the same method signatures (or implement the same interface). That simply is not true! The spirit of the Repository pattern is to decouple storage from business logic. If the storage strategy changes, then the business logic layer should not have to change. See the Open-Closed principle for details. That is hard to accomplish if every Repository in your code base has the same set of method signatures, i.e., implements the same interface (er, Protocol as you have taught us). The methods should implement a business function required by the layer calling into the storage layer instead. First of all, you NEVER should embed SQL statements in today's world. That is a huge design smell that will never pass code review in an enterprise context. Second, the functions in a Repository class should be elegantly "callable" from the business layer and not just implement CRUD methods. That is a wrong usage of the Repository design pattern. It is a misconception that an OR/M provides a Repository - that is not true! Session management and Repository implementations are different concerns and do not belong together. Except in "Hello, World" examples I guess. Don't do that. It just does not work when you have a complex data concept involving hundreds of tables. Yep, those are normal in real world use cases. Please think about the place where you may need to move functionality from a database to an API. That will provide you with a correct mental model about the Repository pattern. just encapsulate the behavior needed for the underlying operational storage mechanism. At the end of the day, it is a specialized kind of an Adapter. I love your content @ArjanCodes, please keep doing what you are doing. But this one could have been presented better. I, as an educator myself, realize there are compromises that need to be made to simplify introduction of (potentially) new concepts. 
But you went too far this time. Sorry.
@johnabrossimow 4 months ago
I wrote a class to access the filepaths in the project repository my app creates.
@Jakub1989YTb 4 months ago
Why classmethods if you are not using them to create "instances" of the class? Didn't you mean staticmethods? This is very misleading.
@tihon4979 4 months ago
Cool! What about Unit of work pattern? ;)
@edgeeffect 4 months ago
Yeah.... this is all starting to sound a little bit like "Doctrine" ... but that's PHP???????
@Vijay-Yarramsetty 4 months ago
thanks
@SkielCast 4 months ago
Wouldn't it make more sense for the PostRepository methods to take a Post object rather than kwargs? That way we could leverage typing.
@vikingthedude 4 months ago
This looks like the strategy design pattern, applied to storage. Here, the SQLite storage is a specific strategy. Another strategy could be a remote storage. Am I understanding this right?
@Naej7 4 months ago
I understand what you mean, but I can’t really say it is exactly the same thing. It does use the same mechanism though, which is essentially dependency injection
@thomaseb97 4 months ago
Most patterns are conceptually similar, at least within the same category. There is very little difference between them; they just tackle somewhat specific tasks. If it's easier for you to imagine it as the strategy pattern, go for it.
@CottidaeSEA 4 months ago
The repository is one of my favorites, because I really don't like it when database queries are tightly coupled with logic or the entities themselves.
@Nalewkarz 4 months ago
You are not limited to Python 3.12 with this. You can also do it with older versions: just use T = TypeVar("T") and then inherit from Generic[T] in the repository. But I'll allow myself some criticism. This pattern is not very useful without a stricter Port/Adapter pattern, where the repository is the implementation of a concrete interface. For simple CRUD you can just go with an ORM; it's just not worth the effort.
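The pre-3.12 spelling mentioned in this comment can be sketched like this. The method set is a simplified assumption for illustration (it does not mirror the repository from the video), and `InMemoryRepository` is a hypothetical implementation added only to make the example runnable.

```python
from abc import ABC, abstractmethod
from typing import Dict, Generic, List, TypeVar

T = TypeVar("T")  # explicit type variable, required before Python 3.12


class Repository(ABC, Generic[T]):
    """Generic abstract repository using the older Generic[T] syntax."""

    @abstractmethod
    def get(self, id: int) -> T: ...

    @abstractmethod
    def get_all(self) -> List[T]: ...

    @abstractmethod
    def add(self, id: int, item: T) -> None: ...


class InMemoryRepository(Repository[T]):
    """Hypothetical implementation that keeps items in a dict."""

    def __init__(self) -> None:
        self._items: Dict[int, T] = {}

    def get(self, id: int) -> T:
        return self._items[id]

    def get_all(self) -> List[T]:
        return list(self._items.values())

    def add(self, id: int, item: T) -> None:
        self._items[id] = item


# A type checker treats this repository as holding str values only.
names: InMemoryRepository[str] = InMemoryRepository()
names.add(1, "Arjan")
```

On Python 3.12 and later, the same base class can be declared as `class Repository[T](ABC):` without creating the `TypeVar` by hand.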
@Naej7 4 months ago
I use it every day, because I need an InMemory version for my tests.
@PietroBrunetti 4 months ago
If I remember correctly, I saw it in the Cosmic Python book.
@peterlogg5576 3 months ago
Is there a reason to make the parent `Repository` an `ABC` rather than a `Protocol`? We generally use `ABC` for our repositories but I'm curious if there's a reason not to implement it as a Protocol instead?
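For readers weighing the two options raised in this question, here is a small sketch of the practical difference. It does not claim to explain the choice made in the video; the class names and the single `get` method are hypothetical.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class AbstractRepo(ABC):
    """Nominal typing: implementations must inherit from this class."""

    @abstractmethod
    def get(self, id: int) -> str: ...


@runtime_checkable
class RepoProtocol(Protocol):
    """Structural typing: any object with a matching get() qualifies."""

    def get(self, id: int) -> str: ...


class DictRepo:
    """Inherits from nothing, yet satisfies RepoProtocol by its shape."""

    def __init__(self, data: dict[int, str]) -> None:
        self._data = dict(data)

    def get(self, id: int) -> str:
        return self._data[id]


class IncompleteRepo(AbstractRepo):
    """Forgets to implement get(); the ABC rejects it at instantiation."""


repo = DictRepo({1: "hello"})
matches_protocol = isinstance(repo, RepoProtocol)  # checks method presence only
matches_abc = isinstance(repo, AbstractRepo)  # requires explicit inheritance
```

With the ABC, a missing method surfaces as a `TypeError` when the subclass is instantiated. With the Protocol, a mismatch is mainly caught by a static type checker, and `runtime_checkable` only verifies that the method names exist, not their signatures.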
@feldinho 4 months ago
Using this pattern, how would you deal with N+1 problems? Imagine you have posts with authors; it's easy to get an author with no posts or a post with no authors, but what about retrieving both together? Calling each other's repository would lead to infinite recursion while using joins in both repos would lead to duplicate logic. How would you solve this?
@NotNullReference 4 months ago
You can add as many methods as you need. The repository pattern is just a way to abstract and decouple the data access logic from the business logic. So, if you need the Posts and Authors, you create a query that holds both items. Which repository you put this method in depends on the "dependent" side of the query, because there is a difference between saying:
- "I need the author of this book": meaning that you need a BookWithAuthor class, with the author as the dependent side, so a join from book to author with a where on BookId
- "I need the books of this author": meaning that you need an AuthorBook class, with the book as the dependent side, so a join from author to book with a where on AuthorId
Whichever your case is, you create the method on the strong side: in the first case that is the BookRepository, in the second the AuthorRepository.
@Nalewkarz 4 months ago
It's not very useful without the rest of the hexagonal architecture building blocks. Basically, you just won't do it the way you think. You must have some facade like "use cases" or a "service", then pack database objects into entities, which in that case would be aggregates because they will consist of two different related types of objects. Just imagine a DAO with prefetched related objects. I can recommend a very good book about such an implementation: "Implementing the Clean Architecture" by Sebastian Buczynski.
@bentosalvador336 4 months ago
Hey man, your question is very common, but it is also very simple to answer. A repository is NOT made for "queries" or for fetching data performantly. A repository should be used to persist the state of an "aggregate". In my view, you should have only the methods necessary to retrieve the aggregate; then you modify it and send it back to the repository, asking it to persist it. To avoid this kind of confusion about how to use repos, take a look at the CQRS concept.
@sarveshsawant7232 4 months ago
Great
@ArjanCodes 4 months ago
Thanks!
@luscasleo 4 months ago
I didn't know that it's possible to declare generic classes using brackets like that, without even needing to declare the TypeVar T. Which Python version is that?
@maephisto 4 months ago
3.12
@Naej7 4 months ago
The newest version 😉
@MrLotrus 4 months ago
It hurts when adding transactions
@maephisto 4 months ago
Around 5:40 we see that the new class Repository is a generic class that returns T (via get), list of T (via get_all). But why the add and update methods have no notion of T? Why do we go away from the generics and we introduce the **kwargs: object? I was expecting an add method which takes as an input a T.
@aflous 4 months ago
It allows more flexibility in the sense that you would not be tied to only use Post as an argument for these methods
@maephisto 4 months ago
@aflous Well... So why not return an object with arbitrary fields from get and get_all, then? My point is: some methods are specialized for T, others are not. And I don't think that's right. I understand flexibility, but either everything is flexible, or nothing is and it's all based on T.
@aflous 4 months ago
@maephisto When you perform a get or get_all, you don't really need to specify any other info, and you expect to get an object of the same type (or a list of objects of that type). For other methods like update, you need to at least specify some other info, like the data you want to supply for the update.
@maephisto 4 months ago
Not fully convinced. When you add, you add T. But the example with update makes sense. Thanks
@jwcnmr 3 months ago
Design Patterns are usually "discovered." Has this pattern been described elsewhere?
@loicquivron3872 4 months ago
The thumbnail looks so cursed
@Naej7 4 months ago
Right? It has a "made with AI" vibe
@brainforest88 4 months ago
Tip: Never use SELECT * in a SQL query in code. It bites back. I worked for 25 years developing DB applications in Oracle (PL/SQL). Looking at SQLAlchemy queries exhausts the L1 cache in my brain. I'm used to writing my SQL straight. It's easier to understand, and I doubt I can do everything I need with an ORM, so why start with it in the first place?
@dankprole7884 4 months ago
Agreed, I use them both so infrequently I want to double my chances of remembering so I just use sql
@plato4ek 4 months ago
9:05 So, in essence, you create a mock and put this mock under test. This is not a proper way to do testing.
@xtunasil0 4 months ago
It's the standard way to work with the Java Spring framework
@thepaulcraft957 4 months ago
Saw it every day during an internship and now I am sick of it
@Naej7 4 months ago
So since you’ve seen code every day, you’re now sick of code as well?
@xiggywiggs 4 months ago
@Naej7 some days... yeah lol
@thepaulcraft957 4 months ago
@Naej7 No, but it was completely overused and made simple things much too complicated
@Naej7 4 months ago
@thepaulcraft957 Probably not, it’s used for a reason (often tests)
@thepaulcraft957 4 months ago
@Naej7 But for many things you could use simple DTOs instead of full repositories. Testing is a good point though.