Git Annex Is The Coolest Program You've Never Heard Of

31,022 views

DistroTube

A day ago

Comments: 77
@aacolive 2 years ago
0:23 Git Annex
1:20 Sync files between local folders
3:28 Turn the directory into an Annex repository
5:40 Git annex add
7:12 Getting both annex repos in touch (like a Tinder match)
8:46 Sync files between repos via link
10:04 Getting the real file (--content)
11:26 Using GitLab as another example
@tejing2001 2 years ago
Git-annex is absolutely amazing. You really have only scratched the surface. My personal favorite thing you can do with it is distribute large files intelligently through repos on external drives, aka sneakernet. Each repo keeps track of what other repos have and need during syncs, so you can set the external drive to "need" anything that the other side of the transfer needs but doesn't yet have, and then `sync --content` will load exactly what needs to be carried to the other side, automatically. Once it gets to the other side, the content will of course be loaded by the target, but it will also be cleared from the external drive, since the target repo now has it and it no longer meets the criteria.
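The transfer-drive rule described in this comment can be modeled with plain set arithmetic. This is only a toy sketch in Python, not git-annex's actual preferred-content machinery: the function names `keys_to_carry` and `keys_to_drop` and the file names are made up for illustration, and each repository is reduced to a set of file keys.

```python
def keys_to_carry(source_has, target_has, target_wants):
    """Keys the drive should load: wanted by the target, missing there, present at the source."""
    return (target_wants - target_has) & source_has


def keys_to_drop(drive_has, target_has):
    """Keys the drive can clear because the target now holds its own copy."""
    return drive_has & target_has


# Hypothetical inventories; a real repository tracks these in its location log.
source = {"a.iso", "b.mkv", "c.tar"}
target = {"a.iso"}
wanted = {"a.iso", "b.mkv", "c.tar"}

drive = keys_to_carry(source, target, wanted)  # load the drive at the source
print(sorted(drive))                           # ['b.mkv', 'c.tar']

target |= drive                                # plug the drive in at the target
drive -= keys_to_drop(drive, target)           # the drive empties itself again
print(sorted(drive))                           # []
```

The point of the model is that the drive never needs a hand-written file list: what it carries follows from what each side has and wants.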
@ChristianF2 2 years ago
Pro
@123anonymous456 2 years ago
Where can I find a detailed article or video showcasing this use case?
@griof 2 years ago
A very good use of git-annex is machine learning and data science. In my company we use git-annex to synchronize machine learning models and huge data sets that git can't handle. We store them with whatever cloud provider and download them locally on demand. Super useful.
@nathanielsabanski 2 years ago
git annex really needed a modern quickstart guide. Best video you've ever done mate.
@hydejel3647 2 years ago
One of those programs which you probably haven't heard of but which is incredibly useful: entr. It watches a file (or files/directory) and, if any change is detected, it runs an arbitrary command. Super useful.
@cafkafk 2 years ago
The timing of this was so perfect. I was just about to dive into the docs of this, but, overwhelmed, I ended up opening YouTube to procrastinate and... Godspeed DT, godspeed.
@0x007A 2 years ago
I am curious what use case git-annex addresses that rsync does not provide? If the data source is offline or otherwise inaccessible and you need the referenced "bigfile", what happens when you try to pull the full content of the referenced document from another computer?
@rsmith31416 2 years ago
As far as I understand, git annex keeps an inventory of your files without actually storing their contents unless you request them on demand. This is useful for huge files (100GB+) that cannot be stored easily but where a version history of the repository is still needed across computers. In short, it is not a backup solution. It is, for the most part, a distributed synchronization system for metadata which can also retrieve the actual files if you really need them.
@MagyarUS 2 years ago
The big one is you can version control the files.
@leftaroundabout 2 years ago
Yes - Git Annex really is a program more people should be using! But two notes of caution:
1. `git annex sync` has a habit of trying to be too clever, performing automatic multi-way merges. For experienced Git users who use commands such as `rebase`, this can lead to weird and undesired behaviour, perhaps inexplicably rolling back changes. I personally usually use the more manual `git annex copy` or `git annex get` commands to prevent this.
2. The ability to version-control lots of big files can make it tempting to use it for _lots and lots_ of automatically generated data files, on the assumption that the repo can still be checked out on a resource-limited machine (by just not fetching the big files). However, because each file is represented by a symlink, which file systems don't store very efficiently, even such a repo can end up taking a significant amount of extra space and in particular inodes. One workaround is to pack the data files into fewer, bigger `.tar` archives and only put those in Git Annex, though this isn't a great solution either - because Git Annex is not based on diffs but always on entire files, replacing one of the files in an archive would mean a new copy of the whole archive needs to be stored.
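The archive trade-off in the second point is easy to see with a toy whole-file store. The sketch below is a Python model, not git-annex code: `store` and `stored_bytes` are hypothetical helpers, the "archive" is just a byte concatenation standing in for a `.tar`, and SHA-256 is used as the content key for illustration.

```python
import hashlib


def store(objects, data):
    """Keep a whole-file copy under a key derived from the content."""
    key = hashlib.sha256(data).hexdigest()
    objects[key] = data
    return key


def stored_bytes(objects):
    """Total bytes held by the store."""
    return sum(len(blob) for blob in objects.values())


# Two versions of three data files; only the third one changes.
members_v1 = [b"A" * 1000, b"B" * 1000, b"C" * 1000]
members_v2 = [b"A" * 1000, b"B" * 1000, b"c" * 1000]

# Tracking each file separately: unchanged files share one stored copy.
per_file = {}
for member in members_v1 + members_v2:
    store(per_file, member)

# Tracking one packed archive per version: every version is stored in full.
as_archive = {}
store(as_archive, b"".join(members_v1))
store(as_archive, b"".join(members_v2))

print(stored_bytes(per_file))    # 4000
print(stored_bytes(as_archive))  # 6000
```

Packing saves symlinks and inodes, but each changed archive costs its full size again, which is the tension the comment describes.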
@dany08011 2 years ago
One of those game-changing apps for me is noisetorch. It's great for suppressing background noise when using the mic, and it works really well.
@shatterstone3045 2 years ago
Unrelated, but I just want to say that I did my first successful installation of vanilla Arch in a VM without archinstall yesterday!
@gladwinmohlamonyane4033 2 years ago
Congrats dude 🎉
@gladwinmohlamonyane4033 2 years ago
I used to do that too, but I build my env for work so I use the archinstall command even though I dislike how it handles partitioning my drive. Mainly because I will def add a lot of bloat 🤣
@tejderha 2 years ago
What is the difference from Git LFS?
@DataLad 2 years ago
Git LFS uses an all-or-nothing centralized approach. Git-annex is a truly decentralized network. It excels even when not all data live (or can live) in the same place.
@DataLad 2 years ago
Also, git-annex can use Git LFS as *one* of the many special remote types it supports.
@tejderha 2 years ago
@DataLad thank you 👍🏻
@DevDungeon 2 years ago
Have you used jigdo before? In Debian, it's the `jigdo-file` package. It is used for downloading massive files in pieces. Debian uses it for downloading the full set of packages, for example the 19-DVD set that contains every single package. I have an offline copy of that and of Wikipedia just in case the internet ever goes away. It's about 170G of storage for both the full Debian repo and Wikipedia combined, which can fit on a single thumb drive. Just try to think about how many minds and man-hours and how much information that is. In a way, you could consider that thumb drive one of the most valuable items in the entire world.
@toxiccan175 2 years ago
That's awesome. I've always wanted to accomplish something similar. How do you navigate Wikipedia offline? Are there any other sites you've archived?
@rahilarious 2 years ago
I've seen jigdo mentioned on Debian but never made an effort to understand it. Maybe a video showcasing it would be helpful.
@rmcellig 2 years ago
Excellent Derek!!! 🙂
@DaraulHarris 2 years ago
Git annex is how I used to manage my music, using various free warehouses.
@Kludgedean 2 years ago
I like the sound of that, could you give me a heads up on where I can acquire these my dude? :) Here's hoping
@DaraulHarris 2 years ago
Essentially, use the rclone remote to store your music on Dropbox, Google Drive, etc. It can encrypt them, too. I had made a longer comment, but I don't see it :/
@rafeu2288 2 years ago
@DaraulHarris I think that YouTube sometimes eats comments that include external links; that may be why your comment disappeared. It did the same to me for a time, but now I copy-paste my comments into a file before sending them, so that I can tweak them if YouTube doesn't like the links or whatever. Good luck with that. :)
@squ34ky 2 years ago
How does this compare to git-lfs?
@huantian 2 years ago
Yeah, exactly. As soon as I saw this I immediately thought it was similar to LFS. LFS seems better supported than annex, though.
@leftaroundabout 2 years ago
The main difference is that with Git Annex, checking out the contents of large files is by default separate from checking out the repository at a given commit. This makes it a bit more difficult to use than git-lfs, but it also gives more fine-grained control over where the contents are actually stored, which is particularly useful when you have huge repos and still want to be able to quickly make clones on a resource-limited machine. Also, Git Annex repositories can be used without having the Git Annex program at all (it's really just standard symlinks under version control; you can view the contents and in principle even copy around the actual files manually).
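The "just standard symlinks" idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not git-annex's real on-disk layout: the `objects` directory, the bare SHA-256 key, and the helper names `annex_add`, `drop_content` and `is_present` are all invented for illustration, and it needs a platform that allows creating symlinks.

```python
import hashlib
import os
import tempfile


def annex_add(repo, name, data):
    """Store data under a content key and leave a symlink at repo/name."""
    key = hashlib.sha256(data).hexdigest()
    objects = os.path.join(repo, "objects")
    os.makedirs(objects, exist_ok=True)
    with open(os.path.join(objects, key), "wb") as handle:
        handle.write(data)
    link = os.path.join(repo, name)
    # Relative target, so the link survives moving the whole directory.
    os.symlink(os.path.join("objects", key), link)
    return link


def drop_content(link):
    """Delete the stored content but keep the symlink as a placeholder."""
    target = os.path.join(os.path.dirname(link), os.readlink(link))
    os.remove(target)


def is_present(link):
    """True when the symlink resolves to real content on this machine."""
    return os.path.exists(link)


with tempfile.TemporaryDirectory() as repo:
    link = annex_add(repo, "bigfile.bin", b"payload")
    print(os.path.islink(link), is_present(link))  # True True
    drop_content(link)
    print(os.path.islink(link), is_present(link))  # True False
```

The dangling link in the last line is what a clone without the large content looks like in this model: the name and key are there, and the bytes can be fetched or copied in later.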
@AndersJackson 2 years ago
I followed and used git-annex from when it was developed. It was even better when it had Jabber support, but it is still amazing.
@krishenbhatti 2 years ago
I still haven't figured out Git completely.
@DistroTube 2 years ago
You and me both. :D
@alexandrosvangelatos9979 2 years ago
Same... Especially since GitHub changed the login process. I can't get in via the terminal :/
@gamerking64 2 years ago
Maybe a Git tutorial soon?! Would be very helpful.
@lawrencedoliveiro9104 2 years ago
Nobody has. My philosophy with regard to open-source tools is, I figure out bits of them as I need them. This is how I work with Emacs, Git, Bash, the Python library, CMake, Debian packaging ... everything.
@drishalballaney 2 years ago
@DistroTube can you please make more videos on git basics?
@sergeynikiforov8012 6 months ago
Can anybody tell me how to configure the shell like in this video?
@TecnocraciaLTDA 2 years ago
Why use git to sync folders on your own system when there is rsync? Git would only be necessary if you want to track the changes made to those files and have the option to roll back...
@yarikoptic 2 years ago
Well -- I might tend to overuse git, e.g. keeping all my configs in git, and using etckeeper (from the same Joey who develops git-annex) for keeping /etc under git. One use case for git-annex could be "on-demand throw away rsync". E.g. if I want to try something out, I would create a local clone, 'git annex get' only needed files, "experiment", see the result, and possibly bring it back. With rsync, it would be a bit more cumbersome.
@theodorealenas3171 2 years ago
Hey, that looks awesome! But what if someone wants to pull your big file from GitLab? It's on your local machine, so your machine needs to be online when the other person wants your big file, no? It does sound proper, because GitLab gets rid of bloat, but is it convenient enough to use with GitLab? Some comments talk about use cases of git annex, and they convinced me it's a useful tool, but I want to know more about the online aspect.
@brunoais 2 years ago
Same question
@tatotick8513 2 years ago
I have been using git-annex for 10 years now, and it is my most trusted way of storing all of my data. I consider myself a git-annex "expert", so I'm happy to help anyone who needs some help with it.
@theodorealenas3171 2 years ago
So you can link to a big file that's been deleted and it's only kept in .git? Is it a way to link to an old version of a big file?
@FARDEENKHANQWE123 2 years ago
Creative thumbnail there, buddy...
@cheako91155 2 years ago
RCS, it's git for a single file... Great for /etc, or if you want to replace something in /usr/bin with a wrapper script for testing.
@brunoais 2 years ago
For what you did, I'd usually end up using git-lfs. Also, were the git-annex files actually uploaded to GitLab?
@123anonymous456 2 years ago
TBH, I would have preferred a short demo with actual huge files created using dd, which then get modified slightly and synchronized over a low-bandwidth connection.
@haj2.025 2 years ago
You're on the edge of audio feedback today.
@jfftck 2 years ago
I had to use Git Annex many years ago, but I haven't had a use for it in quite a while.
@Flackon 2 years ago
You rarely hear of git-annex because people use programs that supersede it, like Git LFS, rclone, restic or what have you (depending on the use case).
@leftaroundabout 2 years ago
But git annex unifies those use cases in a way that is more powerful than any of those tools, even when they're combined.
@tatotick8513 2 years ago
The reason git-lfs is better known is purely marketing. Git annex doesn't have the marketing power of git lfs, which is backed by one of the largest corporations in the world. In actual functionality, git lfs is unusable once you have gotten used to git annex.
@Flackon 2 years ago
@tatotick8513 Do self-hosted git services have good support for git annex? Gogs and GitLab have built-in support for git lfs out of the box, which is a big factor in my adoption (not to mention GitHub itself, of course).
@tatotick8513 2 years ago
@Flackon It used to be fully supported in git-annex but was deprecated due to not having a sufficient volume of users. However, the important thing is that this only means that the files cannot be stored in GitLab. It can still be used to sync the git repo. I personally self-host a Gitea server for the git repo and a MinIO server for keeping track of the files. It doesn't practically change much, since `git annex sync --content` magically works once the setup has been completed in either case.
@academicalisthenics 1 year ago
I am currently using git lfs for this kind of stuff. Is there any meaningful difference between those?
@Fanaro 2 years ago
Maybe compare it to git-LFS?
@anantgupta1188 2 years ago
Git Annex is a game changing program as it is a haskell program*
@leonardonovara9348 2 years ago
Hello DT, please, make a video about Lazygit, a tool that I didn't know existed but now I think I need it.
@PicyPoe 2 years ago
Terminal looks so sexy. What theme is that?
@shastro6939 2 years ago
Nice
@DistroTube 2 years ago
Thanks!
@THIRSTYGNOME 2 years ago
It's neat, but I don't think I would use Git that way. I would rsync or sftp "large" files around.
I think it would be more impressive if it took a large file, chunked it into smaller archive chunks in Git, and recreated the original file after running a git pull. This could allow you to version control larger archives, get around git size limits, and help with upload bandwidth saturation.
Say you have a 5 GB encrypted tar.gz file. The program would turn it into, say, 10 small 500 MB files. You could then run the tool on the other end to get the large file back. Now that could be cool, but it might go against the terms of service of GitLab/GitHub/etc.
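The split-and-reassemble idea proposed in this comment is simple to sketch. The Python below is a hypothetical illustration only, not a feature of Git or of any tool named in the thread: `split_chunks` and `join_chunks` are invented names, and the example uses a few kilobytes in place of a 5 GB archive.

```python
import hashlib


def split_chunks(data, chunk_size):
    """Cut data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_chunks(chunks, expected_sha256):
    """Reassemble the pieces and check them against the original checksum."""
    data = b"".join(chunks)
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("reassembled file does not match the original")
    return data


# Stand-in for the 5 GB archive: 5000 bytes cut into 500-byte pieces.
original = bytes(range(250)) * 20
checksum = hashlib.sha256(original).hexdigest()

pieces = split_chunks(original, 500)
print(len(pieces))                              # 10
print(join_chunks(pieces, checksum) == original)  # True
```

Recording the checksum alongside the pieces lets the receiving side detect a missing or corrupted chunk before using the rebuilt file.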
@leftaroundabout 2 years ago
Git Annex actually uses rsync under the hood, but it saves you the trouble of having to remember which file to copy from where to where.
@WilliamLDeRieuxIV 2 years ago
Git was never meant to be a cloud storage solution -- no wonder why it doesn't do well with large files (which really shouldn't be in your repo anyway).
@LionKing-qp1lk 2 years ago
FIXED: git doesn't do well with large files (that's why they shouldn't be in your directory under git).
@auroradraco9974 2 years ago
Damn, looks sick
@alankjohn9263 2 years ago
Fill the comment section with all the other underrated cool programs, but of course FOSS only!!
@DeshierArchitecte 2 years ago
Meh, why not just use rsync?
@slavko5666 2 years ago
Anyone got an idea on how to sync game saves from Retroarch?
@vincentas1 2 years ago
soyface thumbnails always get a click from me
@ShaunakDe 2 years ago
Thanks for demoing this
@lanpartylandlord6123 2 years ago
quit it w the soyface thumbnails brooo! ur channel is better than that!