Python Tutorials for Digital Humanities

On this channel, I provide tutorials for working with Python in a digital humanities project. I design my videos and tutorials for humanists who have no coding experience. I am a medieval historian by trade, but I create my videos with all humanists in mind. If you want to interact with the videos in more dynamic ways, check out my website, www.PythonHumanities.com. On that site, I host live coding exercises and quizzes. It is still a work in progress and will be complete during the Summer of 2020. I post 1-10 videos per week, so check back frequently.

✅Be my Patron: www.patreon.com/WJBMattingly

Best way to do Table Detection in 2024 with TF-ID - Quick Tutorial with Code

5:54

5 months ago

What is Synthetic Data and How do we Generate it?

18:02

5 months ago

What is Semantic Searching? (NLP Concepts)

7:31

6 months ago

Mastering Streamlit in 2024: Creating Interactive Applications with Input Widgets (02)

5:43

8 months ago

Best way to do Named Entity Recognition in 2024 with GliNER and spaCy - Zero Shot NER

5:01

9 months ago

Best Way to Transcribe Audio and Video with Python and Whisper-MLX ASR #datascience

12:10

10 months ago

Shoud I learn NLP in 2024? #datascience #machinelearning #ai

8:46

1 year ago

Streamlit in 2024 Tutorial - 01 - The Basics - The different ways to write data into an app

7:46

1 year ago

Understanding Pandas in Python: A Comprehensive Overview in 3 Minutes

3:13

1 year ago

The Best Way to Build a RAG System with Python - Verba from Weaviate - Quick Tutorial

7:00

1 year ago

What is RAG or Retrieval-Augmented Generation #RAG #NLP #MachineLearning #AI #GPT #techtalk

4:52

1 year ago

How to use GPT Builder from OpenAI in ChatGPT

5:44

1 year ago

How to Create a Python Package or Library and Upload to PyPi with Twine in 5 Easy Steps #python

12:35

1 year ago

How to Easily Find Keywords in a Document with KeyBERT in Python

7:17

1 year ago

Training a spaCy SpanCat Model to Annotate in Texts more quickly in Prodigy | SpanCat 03

9:14

1 year ago

How to Prepare Annotations in Prodigy for Training a SpanCat Model in spaCy (Part 2 | SpanCat) #nlp

12:22

1 year ago

SpanCat with spaCy on Real Data | Part 01 - The Project and Cultivating Data for Annotation

12:08

1 year ago

LatinCy | How to use spaCy for Latin NLP in Python #nlp #spacy

7:35

1 year ago

How to use the Map Function in Python for Beginners - Intermediate Python with Free Textbook

7:17

1 year ago

When to use NER, EntityRuler, SpanCat, or SpanRuler in spaCy

10:49

1 year ago

What are Named Tuples in Python? #python #datascience #data #programming #digitalhumanities

13:10

1 year ago

Easy Tutorial for Zip and Enumerate in Python

5:47

1 year ago

Using Lists in a spaCy 3 Pattern for EntityRuler, SpanRuler, or Matcher (spaCy Quick Tip) v. 3.5

3:35

1 year ago

Fuzzy Matching with spaCy 3.5 (spaCy 3.5 update)

6:44

1 year ago

Case Insensitive Matching in Python with .Casefold()

7:33

1 year ago

The Easiest Way to do Coreference Resolution with spaCy with spaCy-Experimental

11:54

1 year ago

Topic Modeling with LeetTopic - Transformer Topic Modeling that Generates a Bokeh App (EASY!)

16:42

2 years ago

Easily Make DataFrame App with Streamlit Pandas (Only 2 lines of Python!)

12:45

2 years ago

How to Serialize (Save) spaCy Doc Containers to Disk with DocBin and Pickle

13:23

2 years ago

Comments

@rChandan_Singh 21 hours ago

There is not a single method explained for a non-English corpus

@AIAgents-k1h 2 days ago

Good explanation. NotebookLM offers the same functionality, based on RAG.

@ShainaD843 3 days ago

This video is so intriguing!! I'm very new at trying to figure this out. Is there a way to link items based on a system id or name using several different SQL Server Tables? Connecting all items with 2 or 3 degrees of separation from that search number?

@AnaximandroAndrade 4 days ago

Great explanation!

@cloudtech273 7 days ago

RAG=Customized responses?

@Kookoobi. 8 days ago

What if say Charles was already there and they wanted to change the dates? @python-programming

@PranithaAnnajigari 8 days ago

I couldn't find any such packages on internet or GPT ( as I wanted to understand more)

@LaLa-ww9su 11 days ago

I enjoyed the 9 modules. I learned a lot!

@pierrebotha2823 11 days ago

I'm building a RAG studio voice assistant.

@arjunp5840 13 days ago

Thank you for making these videos. One of the best videos series I found.

@YuktaAshok 13 days ago

Can you please share the github repo

@ASMAEELAANOUNI 14 days ago

your steps are all clear, thank you for this simplicity

@ManuelUgarte-p9h 14 days ago

Nice

@ManuelUgarte-p9h 14 days ago

Nice.

@Daniel-fl4si 14 days ago

A dumb question: this model works only for English texts, right?

@Bartyron 16 days ago

I got an error with finding tesseract in the command prompt because it was not run in administrator mode.

@hamzasarwar2656 17 days ago

I guess we cannot append the json file as it contains single object only. The above method will fail as the file gets larger and then we won't be able to load the large file in python memory.
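One common way around the limitation described in this comment is the JSON Lines layout, where each record is its own JSON object on its own line. The sketch below is only an illustration and is not taken from the video: the file name, the helper names `append_record` and `iter_records`, and the sample records are all hypothetical. It shows appending without rewriting the file and reading back one record at a time, so the full file never has to be loaded into memory.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one JSON object as a single line (JSON Lines layout)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def iter_records(path):
    """Yield records one at a time instead of loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Hypothetical demo data written to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "records.jsonl")
append_record(demo_path, {"name": "Charles", "year": 800})
append_record(demo_path, {"name": "Louis", "year": 814})
loaded = list(iter_records(demo_path))
```

The trade-off is that the file as a whole is no longer one valid JSON document, so it has to be read line by line rather than with a single `json.load` call.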

@HamzaMahmood1261 17 days ago

thank you for being honest that this is not enough this is just a start point

@ishaanthtm 19 days ago

Kernel dies on windows 11 while running get_similar_images method

@mkazigwa3783 19 days ago

❤

@thedevmachine 19 days ago

The documents are not sent to the LLM. Only the results of your query.

@jannik93 16 days ago

Can you explain how the database and its documents are searched for this query information?

@thedevmachine 16 days ago

@jannik93 Embeddings are created from the document chucks and stored in the vector database. When you ask a question, the question is also converted to an embedding, and the vector database is searched with it for matching results, ranked by distance. The database returns the matching text from the documents, which is passed to the LLM together with the question and instructions on how the LLM should handle it.

@jannik93 16 days ago

@thedevmachine Thank you! To check my understanding: you mean "document chunks"? Are the embeddings created by the retrieval model or the AI?

@thedevmachine 16 days ago

@jannik93 You read the document as text and create chunks from it. Then you create an embedding from each chunk and store it in the vector database. When you ask a question, that question is also converted to an embedding, so you can run a search and get one or more results, depending on your question. The results contain the relevant chunks as text, and you pass the original question and those results, with a prompt, to an LLM to generate a natural response.

@jannik93 16 days ago

@thedevmachine I think I got it now. Thank you very much!
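The flow described in this thread (chunk the document, embed each chunk, embed the question, search by similarity, pass only the matching chunks to the LLM) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the setup from the video: the `embed` function is a plain word-count vector standing in for a real embedding model, the "vector database" is just a Python list, and every function name and the sample text are hypothetical. No LLM is called; the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, store, k=1):
    """Return the text of the k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


document = (
    "Charlemagne was crowned emperor in Rome in 800. "
    "The Treaty of Verdun divided the Carolingian empire in 843."
)
# The list of (chunk, vector) pairs plays the role of the vector database.
store = [(chunk, embed(chunk)) for chunk in chunk_text(document, size=8)]
question = "When was Charlemagne crowned emperor?"
top_chunks = retrieve(question, store, k=1)
prompt = build_prompt(question, top_chunks)
```

Only `prompt`, which holds the retrieved chunk and the question, would be sent to the LLM. The full document never leaves the store, which is the point made at the top of the thread.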

@waelhussein4606 21 days ago

Great videos, thanks! It’s important to understand that IDF reduces the weight of common words that frequently appear in most documents within the corpus, as these words contribute little to document classification. Conversely, it highlights less common words, making them more important for distinguishing the documents in which they appear.
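The effect described in this comment can be checked with a small calculation. The sketch below uses the plain, unsmoothed form idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. That formula choice is an assumption for illustration: libraries often apply smoothed variants, and the video may use a different one. The three-sentence corpus and the function name are hypothetical.

```python
import math


def inverse_document_frequency(documents):
    """Compute idf(t) = log(N / df(t)) for every term in the corpus."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        # Use a set so each term counts at most once per document.
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical toy corpus.
corpus = [
    "the king signed the charter",
    "the bishop read the charter",
    "the monk copied the psalter",
]
idf = inverse_document_frequency(corpus)
```

Here "the" appears in every document, so its weight is log(3/3) = 0. "charter" appears in two of the three and gets a small weight, while "psalter" appears in only one and gets the largest weight.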

@waelhussein4606 21 days ago

Thanks for the videos. I love your energy and clarity. I would love to learn more about how you determined the keywords for each cluster.

@Tinytots199 25 days ago

I can't remove now??

@paulkabiito8355 27 days ago

Beautiful, more of these please

@TheMrWARLORD 28 days ago

How is the vector database generated ?

@beautifulmind684 29 days ago

mark

@ysyvon 1 month ago

Thank you for this video series! I was a Computer Science and Mathematics Major, but realised that the pure science wasn't really my passion, so I switched my major to History, but I still love computer science and especially Python, so I am still getting my minor in Comp Sci and Linguistics. The Digital Humanities is really right up my alley, having this as a resource is so helpful!

@lyhs0219 1 month ago

Actually, RAG consists of two parts. You presented the query part, but the other part is about breaking the document down into chunks and generating vector embeddings from them.

@traveling-historian 1 month ago

I mentioned those aspects. I didn't cover chunking because of the issue of timing in a short (60 seconds). Chunking is a pre-step in RAG systems and not an essential one. Some documents don't require chunking, such as Tweets for example. I did mention vector databases.

@talielnussen3867 1 month ago

I legit thought this was shells on sand at the beach

@niflag 1 month ago

numpy has a different linear algebra library by default on conda vs pip. Also, downloading a wheel file is not a 'tedious task;' compiling a wheel is.

@SteveGiome 1 month ago

share.streamlit shows "You do not have access to this app or it does not exist". Is it pay-only now, or just temporarily disabled?

@niflag 1 month ago

So quiet

@icns01 1 month ago

A superb explanation...thank you.

@Fun-jb2yn 1 month ago

hey can u pls make a OCR for extracting data from a blood report pls pls

@markomarjanovic8348 1 month ago

You are a fantastic teacher!! Love the way you conceptualized this and the way you explain things!

@AoyamaIchome 1 month ago

French revolution happened in 1789

@HesamFarsiDehnavi 1 month ago

Amazing

@rikii7237 1 month ago

fifty_points to gryffindor

@rikii7237 1 month ago

Thank you SO MUCH for keeping these videos short and straight to the point.

@kishork2211 1 month ago

A very clear and simple explanation of RAG. Another question: when you want to use this in a real business case, how do you feed business data into it?

@rubo111 1 month ago

Concise.

@EranM 1 month ago

then came gpt and the rise of LLMS pissing on those models...

@jcwade 1 month ago

Don't combine unneeded tech in a video about another one; hold control variables constant and simple.

@saranshtiwari8543 1 month ago

Hey YouTube, why am I seeing this after 2 years? Recommend videos like these as soon as they get uploaded!

@uswakhan3050 1 month ago

How do I apply OCR to text in a different language?

@grgr1467 1 month ago

hi ! where can i find the source file you used?

@geetharagiphani6670 1 month ago

Could you please share the text file that's been used in this video?

@JennaHasm 2 months ago

I LOLed so much at your channel's description.

@johnyoung8848 2 months ago

In the previous video, trying to run the model gives: "The kernel appears to have died. It will restart automatically." Running Python 3.10.2, installed via conda, with the latest spaCy, etc. Any ideas?