#5 Read and process multiple text files in Python

24,328 views

Data Skills for Everyone

A day ago

Comments: 27
@saud5133 2 years ago
Hello. How do I read a binary file, such as a Metastock file, using Python?
@pradnyakasar614 2 years ago
How do I find the unique word count of multiple text files at the same time?
@smurfk7678 6 months ago
How do I interpolate multiple .nc files in a Python Jupyter notebook and get a single file in .nc format? The current resolution of my gridded data is 0.25*0.25 degree and I want to interpolate it to 0.01*0.01 degree. Can you help me with code?
@hopelopez83 3 years ago
How do you generate a comparison report of multiple files while reading each line of all the text files?
@dataskillsforeveryone8205 3 years ago
I am not exactly sure what you mean by a comparison report. It could be as simple as comparing the length of the content of each file, or something more sophisticated, in which case you may need some advanced Python packages. For the simple case of, say, computing some statistics on the content of each file and reporting them, you could use a list comprehension in combination with some functions to achieve that.
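As a rough sketch of the simple case described above, the following builds one statistics record per file with a list comprehension. The helper names file_stats and folder_report and the particular statistics are assumptions for illustration, not code from the video; input_folder is the folder name used in the video.

```python
from pathlib import Path


def file_stats(path):
    """Return simple statistics for one text file (hypothetical helper)."""
    text = Path(path).read_text(encoding="utf-8")
    words = text.split()
    return {
        "file": Path(path).name,
        "lines": len(text.splitlines()),
        "words": len(words),
        "unique_words": len({w.lower() for w in words}),
    }


def folder_report(input_folder):
    """Build one statistics record per .txt file using a list comprehension."""
    return [file_stats(p) for p in sorted(Path(input_folder).glob("*.txt"))]


if __name__ == "__main__":
    # "input_folder" is an assumed location; point it at your own directory.
    for row in folder_report("input_folder"):
        print(row)
```

Comparing files then comes down to comparing the records, for example by sorting the list on "words" or "unique_words".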
@1UniverseGames 2 years ago
Can you add one thing here? We read the .txt files from a directory; now I want to run each .txt file and save the output to a RESULT_Dir folder. How can I do that? That is: given an input directory of .txt files, our code reads each .txt file one after another for 2 minutes and then saves the result of scanning or parsing each file to a different directory. Each resulting .txt file will be different, since each input file contains different information. I have learned how to get the input files and read them, but I want to learn the next part. Can you help, please?
@dataskillsforeveryone8205 2 years ago
Yes, that should work the same way. You can begin by setting the output folder, just as we did for the input directory called input_folder in the code. Then, within the same loop where you are reading the individual files, you can do some processing with the data and write it out to the destination folder. Or you can choose to carry out some bulk operation on all the files and write everything out to disk afterwards. There are many ways to go about it.
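A minimal sketch of the first option (process and write inside the same loop) might look like this. The function names and the upper-casing step are placeholders assumed for illustration; replace process_text with whatever scanning or parsing you need.

```python
from pathlib import Path


def process_text(text):
    """Placeholder processing step (an assumption): upper-case the content."""
    return text.upper()


def process_folder(input_folder, output_folder):
    """Read each .txt file, process it, and write the result under the same name."""
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the destination if missing
    written = []
    for src in sorted(Path(input_folder).glob("*.txt")):
        result = process_text(src.read_text(encoding="utf-8"))
        dest = out_dir / src.name  # same file name, different directory
        dest.write_text(result, encoding="utf-8")
        written.append(dest)
    return written
```

Calling process_folder("input_folder", "RESULT_Dir") would then leave the input files untouched and create one result file per input file in RESULT_Dir.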
@olumidebenjami4 2 years ago
You are a life saver
@32334694 2 years ago
Great video sir, thanks! I have 2 folders of IMDb reviews, negative and positive, which I labelled 0 and 1 respectively. I stored each in a separate list of tuples, added the 2 lists together, used random.shuffle to mix them up, and split the result into train and test sets at a 75/25 ratio. My question is: how do I read the train set txt files and create a vocab using word_tokenize, and then compute the tfidf of that vocab? Thanks in advance.
@dataskillsforeveryone8205 2 years ago
It would be great if you could share your code so that it is clearer what you are trying to do. It seems to me that you have already read these files into memory. I would like to see more clearly what you have done up to the point where you use word_tokenize.
@hosamho3119 1 year ago
طططط
@gopikishan1028 2 years ago
Thank you, it worked for me.
@dataskillsforeveryone8205 2 years ago
You're welcome!
@ganeshsrivatsakalahasti6813 3 years ago
Hi sir. I have a few doubts regarding a few concepts and would kindly request your help. How may I reach you, please?
@dataskillsforeveryone8205 3 years ago
Hi Sir, thank you for linking up. I would be glad to hear your question and provide any help that I can. I'm not sure yet how I can connect with you privately on KZbin. If you know how, I'll be glad to learn :)
@syedrehman5818 2 years ago
How can I preprocess (nltk, stopwords, tokenization) all text files in a folder (database)?
@dataskillsforeveryone8205 2 years ago
Dear Syed, thanks for reaching out. I would love a bit more clarification. But depending on what you are trying to achieve, you might either read in your files and process them one by one and join the results at the end, or read and join all the files and then process them together for a single output. It depends
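The two options can be sketched side by side as follows. To keep the example self-contained it uses a plain regular expression and a tiny hand-written stop-word set as stand-ins for the nltk tokenizer and stop-word list; those stand-ins and all function names here are assumptions for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Tiny illustrative stop-word set (an assumption); a real project would use a fuller list.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def preprocess(text):
    """Lower-case the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def per_file_then_join(folder):
    """Option 1: process each file separately, then combine the results."""
    counts = Counter()
    for path in sorted(Path(folder).glob("*.txt")):
        counts.update(preprocess(path.read_text(encoding="utf-8")))
    return counts


def join_then_process(folder):
    """Option 2: concatenate all files first, then process them once."""
    paths = sorted(Path(folder).glob("*.txt"))
    combined = "\n".join(p.read_text(encoding="utf-8") for p in paths)
    return Counter(preprocess(combined))
```

For simple word counts both routes give the same totals; processing file by file keeps memory use lower and lets you keep per-file results, while joining first is convenient when only the combined output matters.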
@SHASHANKRUSTAGII 2 years ago
How do I get rid of this error? 'utf-8' codec can't decode byte 0xc9 in position 9: invalid continuation byte
@dataskillsforeveryone8205 2 years ago
The error suggests that you are using the utf-8 encoding to open a file that was saved with a different encoding. This may call for a trial-and-error approach, passing different encodings to open. But if you know the encoding of the document, use that straight away in the call to open. The other option is to open the file in binary mode and then read it in. If you are reading the file with pandas, you could also ignore this error and read in the rest of the data.
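One way to sketch the trial-and-error approach and the binary-mode option is shown below. The function names and the list of candidate encodings are assumptions for illustration; if you know the real encoding, pass only that one.

```python
def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Try each encoding in turn and return (text, encoding_used)."""
    for enc in encodings:
        try:
            with open(path, encoding=enc) as fh:
                return fh.read(), enc
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    # Last resort: decode as UTF-8 and replace undecodable bytes with U+FFFD.
    # (Not reached with the default list, because latin-1 accepts every byte.)
    with open(path, encoding="utf-8", errors="replace") as fh:
        return fh.read(), "utf-8 (with replacement)"


def read_bytes(path):
    """Binary mode never raises a decode error; decoding is left to the caller."""
    with open(path, "rb") as fh:
        return fh.read()
```

Keep in mind that a successful decode does not prove the guess was right: latin-1 in particular accepts any byte sequence, so it is worth checking the returned text by eye.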
@SHASHANKRUSTAGII 2 years ago
@dataskillsforeveryone8205 Thank you. It helped me.
@rohitjangra6939 2 years ago
It returns a blank list while reading the json from a folder.
@dataskillsforeveryone8205 2 years ago
Hi Rohit, it's not very clear what you mean. If you can provide more information, that would be great
@ajaxx627 3 years ago
Please, I have a problem with some work. I was given a list of about 200 different words, and I'm meant to write code that generates random groups of 3 words. E.g. wordlist = [a, b, c, d, e, ..., z]
Output should be:
a, d, z
c, o, x
and so on. Please, how do I do it?
@dataskillsforeveryone8205 3 years ago
If you are interested in selecting 3 random words until all words are used up, here is one approach using a Python list:
1. Use random.sample to randomly pick 3 words (the words will still be in the original list after picking).
2. Remove the words that have been picked from the original list.
3. Add the picked words to another list so you won't repeat their selection.
4. Keep repeating the procedure until you have exhausted the original list.
Something like this (it tracks positions in the list rather than the words themselves):

    import random

    words = []  # add your words here
    num_to_pick = 3
    sampled = []  # every word picked so far, in pick order
    indices = list(range(len(words)))  # positions not picked yet
    while indices:
        # the last group may be smaller if the list length is not a multiple of 3
        k = min(num_to_pick, len(indices))
        random_indices = random.sample(indices, k=k)
        selected_words = [words[j] for j in random_indices]
        sampled.extend(selected_words)
        for r in random_indices:
            indices.remove(r)  # never pick this position again
        print(selected_words)
@ajaxx627 3 years ago
@dataskillsforeveryone8205 Tysm 😊 I already did it tho but I appreciate. God bless 💚
@DoUKnowMee 2 years ago
For me it's just reading 1 random file in the folder.
@hosamho3119 1 year ago
حط