* (00:00) Introduction to Hacker News and the problem of finding specific information
* (00:28) Creating a script to scrape Hacker News posts and comments
* (00:58) Embedding the data into a semantic search system
* (01:32) Searching for sentiment analysis on a specific post
* (02:12) Running the Haiku model and a local model through Ollama
* (02:42) Overview of the process from scraping to embeddings
* (03:18) Using the embeddings to answer user questions
* (03:54) Code overview and setup
* (04:30) Getting information from Hacker News and adjusting the system
* (05:03) Using the sentence transformer and adjusting the system message
* (05:37) Chunk sizes and the Vault file
* (06:08) Testing the system with a "needle in a haystack" search
* (06:41) Finding a specific comment and getting more information
* (07:16) Searching for posts with specific keywords
* (07:56) Analyzing the data with Claude 3 Sonnet
* (08:25) Running the program and getting the results
* (09:03) Results: topic, URL, most emotional comments, and YouTube video ID
* (09:43) Early stages of the project and future updates
* (10:16) Conclusion and public repo

**Key Takeaways**
---------------
🔍 Hacker News can be overwhelming, but AI can help find specific information
💻 Scripting and scraping can gather data, which can be embedded for semantic search
🤔 Sentiment analysis and user questions can be answered with context
📊 Code is available and adjustable for different sources and needs
🔮 Claude 3 Sonnet can analyze data for key topics and emotional responses

**Benefits**
------------
✔ Solves the problem of finding specific information on Hacker News
✔ Uses AI for semantic search and sentiment analysis
✔ Adjustable code for different sources and needs
✔ Analyzes data for key topics and emotional responses

**Cons**
----------
❌ Early stages of the project, needs more work
❌ Prompt and desired output need refinement
❌ Limited to Hacker News, could be expanded to other sources

**Score**
---------
- 8/10 for solving a specific problem with AI
- 7/10 for code and adjustability
- 6/10 for early stages and room for improvement

**Like or Dislike**
-------------------
👍 Like for the creative solution and potential for expansion and improvement.
@hope42 7 months ago
You picked a great topic that time and killed it. A wonderful use case to be reused in the use case to be reused in the use case haha. 😮
@ShaunPrince 7 months ago
I think Hacker News has an RSS feed, so there is no need to scrape.
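To make the RSS suggestion concrete, here is a minimal sketch that parses RSS 2.0 text using only the standard library. The sample XML is made up for illustration, and `parse_rss` is a hypothetical helper, not something from the video's repo. The front-page feed is served at `https://news.ycombinator.com/rss`; as far as I know it lists stories rather than comment text, so comment-level analysis would probably still need another source.

```python
import xml.etree.ElementTree as ET

def parse_rss(xml_text):
    """Return a list of {title, link} dicts, one per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]

# Made-up sample feed (assumption); a real feed would be downloaded first,
# for example with urllib.request.urlopen on the feed URL.
SAMPLE = """<rss version="2.0"><channel><title>Hacker News</title>
<item><title>Example post</title><link>https://example.com/a</link></item>
<item><title>Another post</title><link>https://example.com/b</link></item>
</channel></rss>"""

entries = parse_rss(SAMPLE)
for entry in entries:
    print(entry["title"], entry["link"])
```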
@sirinongorur5676 7 months ago
great stuff, need more of this. Keep up the good work!
@Brkcln 7 months ago
Good one. When I have time, I will try this too.
@coinboybit7281 5 months ago
Wondering if it would be better to chunk it by each JSON item/post instead of every 1000 characters, since the data is already in JSON format?
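The two chunking strategies in this question can be compared with a small sketch. Everything here is an assumption for illustration: the sample posts, their field names, and both helper functions are hypothetical and not taken from the video's code. Only the 1000-character size comes from the comment.

```python
import json

def chunk_by_chars(text, size=1000):
    """Fixed-size chunking: cut the text every `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def chunk_by_item(items):
    """Item-based chunking: one chunk per JSON item, so no post is split."""
    return [json.dumps(item, ensure_ascii=False) for item in items]

# Hypothetical scraped data; the real scraper's fields may differ.
posts = [
    {"title": "Post A", "comments": ["first comment", "second comment"]},
    {"title": "Post B", "comments": ["x" * 1500]},
]

by_item = chunk_by_item(posts)
by_chars = chunk_by_chars(json.dumps(posts, ensure_ascii=False))

print(len(by_item), "item chunks;", len(by_chars), "fixed-size chunks")
```

Item-based chunks keep a post and its comments together, but a long thread produces a chunk well over 1000 characters, which may exceed what the embedding model handles well. Fixed-size chunks stay small but can cut a comment in half.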
@gumshoe9496 7 months ago
Very good project!
@watchdog163 7 months ago
Looking forward to grabbing it from the public repo.
@lokeshart3340 7 months ago
Did you get it? If you did, please reply. Is the speech-to-speech one ready too? Please help.
@inout3394 7 months ago
Thx
@micbab-vg2mu 7 months ago
nice:)
@user--------- 7 months ago
It's not entirely clear what exactly the script is doing here. Is it looking for some facts in all the comments?