Crawl4AI - Crawl the web in an LLM-friendly Style

8,646 views

Unclecode

Days ago

Welcome to the detailed walkthrough of Crawl4AI v0.2.0! 🚀
In this video, I'll dive deep into the code base of Crawl4AI, our powerful web crawling tool designed for AI enthusiasts and developers. We'll explore all the new and exciting features that make this release a game-changer:
🕷️ Efficient web crawling to extract valuable data from websites
🤖 LLM-friendly output formats (JSON, cleaned HTML, markdown)
🌍 Supports crawling multiple URLs simultaneously
🌃 Replaces media tags with their ALT text
🆓 Completely free to use and open-source
📜 Execute custom JavaScript before crawling
📚 Chunking strategies: topic-based, regex, sentence, and more
🧠 Extraction strategies: cosine clustering, LLM, and more
🎯 CSS selector support
📝 Pass instructions/keywords to refine extraction
I explain all these features in detail in the video. No API key, signup, or other boring stuff required! 🌐
Check out the repo: [Crawl4AI on GitHub](github.com/unc...)
If you find this tool useful, please star the repo and leave a comment! Your feedback helps us improve and support the project.
Follow me on Twitter (X) for updates on my research on function-calling for LLMs and AI agents: x.com/unclecode
I appreciate your feedback and thoughts on this project.
#Crawl4AI #WebCrawling #AI #LLM #Colab #WebScraping #OpenSource #GitHub #OpenSourceAI

Comments: 13
@po6577 4 months ago
Love how excited you are about your project! Keep it up, man! Great project.
@unclecode788 4 months ago
Thanks! Will do!
@AWSFan 2 months ago
A very useful project, I must admit! Is it a recursive crawler? By recursive I mean truly recursive, not restricted to a depth threshold. Also, how different is this from FireCrawl in terms of functionality and other aspects? I can't wait to get started with this project and give it a shot! Thanks!
@plumpy8854 3 months ago
Hey man. I'll be honest: I'm new to data scraping and wanted to ask whether crawl4ai can be used to scrape data from TikTok. They have implemented some harsh measures with request rate limits and login requirements. From what I saw, crawl4ai has some login feature, but I just wanted to ask whether I'm going in the right direction. Otherwise it looks great.
@MikeLevin 3 months ago
Looks exciting. Have you considered a nix script?
@xinfeng3022 3 months ago
Is it possible to put up a prebuilt Docker image, including the 'models'? I had a problem downloading the models during the Docker build. Thanks!
@unclecode788 3 months ago
I will work on that. I'm also trying to have a version without the model dependency.
@carlosa.villanuevacampoy931 3 months ago
Really cool, man! Can I crawl all accessible subpages from a main page, so that I crawl 2 levels in total?
@unclecode788 3 months ago
You can send multiple links, so first crawl the main page, then get the links and send them again. However, I will soon release the ability to set the depth and get a cool result for that.
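For anyone wanting to try that two-pass approach in the meantime, here is a minimal sketch using only the Python standard library. It is not Crawl4AI's own API: `fetch_html` is a hypothetical callable standing in for whatever returns a page's HTML (for example, the `html` field of a crawl result), and `extract_links` and `crawl_two_levels` are illustrative names.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url: str, html: str) -> List[str]:
    """Return absolute, de-duplicated http(s) links in document order."""
    parser = LinkCollector()
    parser.feed(html)
    seen, links = set(), []
    for href in parser.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def crawl_two_levels(start_url: str, fetch_html: Callable[[str], str]) -> Dict[str, str]:
    """Fetch the start page, then each page it links to (2 levels in total).

    fetch_html is a hypothetical stand-in: plug in any function that maps
    a URL to its HTML.
    """
    pages = {start_url: fetch_html(start_url)}
    for link in extract_links(start_url, pages[start_url]):
        if link not in pages:
            pages[link] = fetch_html(link)
    return pages
```

The second pass is just a loop here; since the tool accepts multiple URLs, the extracted links could presumably be sent as one batch instead.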
@fieldcommandermarshall 4 months ago
WHAT HAPPENED TO THE FLUTE UNCLE CODE
@unclecode788 4 months ago
Hahahaha!! Ok, ok, message received
@bitcoinquickbytes 4 months ago
I got a result object. How do I parse it?
@unclecode788 4 months ago
Result is an object like this:

    class CrawlResult(BaseModel):
        url: str
        html: str
        success: bool
        cleaned_html: str = None
        markdown: str = None
        extracted_content: str = None
        metadata: dict = None
        error_message: str = None

So you can access these properties directly (cleaned_html, markdown, extracted_content), or dump the model into a Python dictionary using result.model_dump().
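As a rough illustration of working with those fields, the sketch below uses a stand-in dataclass that mirrors the field names quoted above, so it runs without Crawl4AI or pydantic installed. `CrawlResultStandIn` and `summarize` are hypothetical names, and the stand-in's `model_dump()` only imitates the pydantic method; with a real result you would call the same attributes on the object the crawler returns.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CrawlResultStandIn:
    """Stand-in with the same field names as the CrawlResult quoted above."""
    url: str
    html: str
    success: bool
    cleaned_html: Optional[str] = None
    markdown: Optional[str] = None
    extracted_content: Optional[str] = None
    metadata: Optional[dict] = None
    error_message: Optional[str] = None

    def model_dump(self) -> dict:
        # Imitates pydantic's model_dump() for the purposes of this sketch.
        return asdict(self)


def summarize(result) -> str:
    """Return the most processed text available, or the error message."""
    if not result.success:
        return f"Crawl failed for {result.url}: {result.error_message}"
    # Prefer markdown, then cleaned HTML, then the raw HTML.
    return result.markdown or result.cleaned_html or result.html


result = CrawlResultStandIn(
    url="https://example.com",
    html="<h1>Hi</h1>",
    success=True,
    markdown="# Hi",
)
print(summarize(result))             # attribute access
print(sorted(result.model_dump()))   # field names via the dictionary form
```

Checking `success` first matters because the optional fields default to None, so a failed crawl would otherwise fall through to empty content.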