Join My AI Career Program www.nicolai-nielsen.com/aicareer Enroll in My School and Technical Courses www.nicos-school.com
@theuser810 1 year ago
The repository link is not in the description
@axelanderson2030 2 years ago
For anyone who is getting poor results:
1. The dataset is small, so a random split might not be representative of the problem. For example, the training set might contain a much higher percentage of one digit than another.
2. You can use OpenCV for preprocessing. Morphological transformations that remove noise can improve performance immensely.
3. To avoid overfitting, I found that a Gaussian noise layer can help. It makes the data harder to learn and therefore harder to overfit.
Hope this helps!
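Point 1 above can be checked before training. This is only a minimal sketch, assuming the labels are plain strings (for example the captcha file names without the extension). The label values and the helper names `char_frequencies` and `largest_gap` are made up for illustration and are not from the video.

```python
from collections import Counter


def char_frequencies(labels):
    """Return the relative frequency of each character across all labels."""
    counts = Counter("".join(labels))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def largest_gap(train_labels, val_labels):
    """Find the character whose share differs most between the two splits."""
    train_freq = char_frequencies(train_labels)
    val_freq = char_frequencies(val_labels)
    chars = set(train_freq) | set(val_freq)
    gaps = {
        ch: abs(train_freq.get(ch, 0.0) - val_freq.get(ch, 0.0)) for ch in chars
    }
    # Sort first so that ties are resolved alphabetically and reproducibly.
    worst = max(sorted(gaps), key=gaps.get)
    return worst, gaps[worst]


# Hypothetical labels, only for illustration.
train_labels = ["22d5n", "2b827", "2cegf"]
val_labels = ["7wyp4", "7y2x4"]

worst_char, gap = largest_gap(train_labels, val_labels)
print(f"largest imbalance: {worst_char!r} differs by {gap:.3f}")
```

If one character shows a large gap, reshuffling with a different seed or splitting per character class is probably worth trying before changing the model.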
@kalifardiansyah5863 1 year ago
I have a question: how can I avoid misdetected characters, especially between two similar characters? For example, the letter Z is detected as 2, the letter S as 5, the letter I as 1, etc.
@axelanderson2030 1 year ago
@@kalifardiansyah5863 you may require more training data, or a larger CNN architecture
@HarshpreetSingh-jz2lf 1 year ago
I tried it with 60000 images and used morphological techniques, but it still isn't accurate; val_loss just doesn't go below 14
@axelanderson2030 1 year ago
@@HarshpreetSingh-jz2lf do you have a class imbalance in the dataset? Is the model built correctly? Is the data preprocessed correctly? I can't help you if you don't provide any context except for "it no work"
@souhailel-ghayam4714 3 years ago
Hey, thank you very much for this beautiful explanation of the code and the philosophy behind OCR with LSTM and a CTC layer. Can you please verify whether the code still works? I was running it and it worked, but now it doesn't. I think there is a problem in mapping characters to numbers and numbers back to their original characters with layers.experimental.preprocessing.StringLookup. I tried to run it in Google Colab, but when I visualize the data it doesn't show the correct label text. I would be very thankful if you could verify it and suggest a fix for this mapping problem.
@NicolaiAI 3 years ago
Thank you very much for watching! The code should not depend on anything and should be working every time, hmm 🤔
@nadyasudusinghe2213 2 years ago
Hi, I'm getting the same error. Did you find the solution?
@traderdaniel4749 2 years ago
Same here. I used only digits as labels, so I removed "char_to_num" and "num_to_char"
@benoitd94 1 year ago
Do you think I can use your code to decode the digits of my water counter?
@NicolaiAI 1 year ago
Maybe you can try EasyOCR for that!
@megistone 1 month ago
I finally ended up with this working configuration:

images = sorted(map(str, list(data_dir.glob("*.png"))))
labels = [img.split(path.sep)[-1].split(".png")[0] for img in images]
vocab = sorted(set("".join(labels)))
max_length = max(len(label) for label in labels)
char_to_num = StringLookup(vocabulary=vocab, mask_token=None, num_oov_indices=0, oov_token="[UNK]")
num_to_char = StringLookup(vocabulary=char_to_num.get_vocabulary(), invert=True, mask_token=None, num_oov_indices=0, oov_token="[UNK]")

And the rest of the code is like in the video.
@mortezarisan3261 1 month ago
Hello, do you have the captcha code for this video? Could you please send it to me?
@user-kw9cu 28 days ago
Thank you
@megistone 24 days ago
@@mortezarisan3261 if u mean model code, yes:

train_model = build_train_model(vocab)
train_model.summary()
early_stopping = EarlyStopping(monitor="val_loss", patience=early_stopping_patience, restore_best_weights=True, min_delta=1e-5)
history = train_model.fit(train_dataset, validation_data=validation_dataset, epochs=epochs, callbacks=[early_stopping], verbose=1)
prediction_model = get_prediction_model(train_model)
compile_prediction_model(prediction_model)
prediction_model.summary()
____
def decode_batch_predictions(pred, num_to_char):
    results = ctc_decode(pred, tf.ones(pred.shape[0]) * pred.shape[1], "greedy")[0][0][:, :]
    return [tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8").replace(num_to_char.oov_token, "") for res in results]

def build_train_model(vocab: list) -> Model:
    input_img = Input(shape=(img_width, img_height, 1), name="image")
    labels = Input(name="label", shape=(None,), dtype="float32")
    x = Conv2D(32, (3, 3), activation="relu", kernel_initializer="he_normal", padding="same", name="Conv1")(input_img)
    x = MaxPooling2D((2, 2), name="pool1")(x)
    x = Conv2D(64, (3, 3), activation="relu", kernel_initializer="he_normal", padding="same", name="Conv2")(x)
    x = MaxPooling2D((2, 2), name="pool2")(x)
    new_shape = ((img_width // 4), (img_height // 4) * 64)
    x = Reshape(target_shape=new_shape, name="reshape")(x)
    x = Dense(64, activation="relu", name="dense1")(x)
    x = Dropout(.2)(x)
    x = Bidirectional(LSTM(128, return_sequences=True, dropout=.25))(x)
    x = Bidirectional(LSTM(64, return_sequences=True, dropout=.25))(x)
    x = Dense(len(vocab) + 1, activation="softmax", name="out2vec")(x)
    output = CTCLayer(name="ctc_loss")(labels, x)
    # Define the model
    model = Model(inputs=[input_img, labels], outputs=output, name="ocr_model")
    model.compile(Adam())
    return model

def get_prediction_model(train_model: Model) -> Model:
    return Model(inputs=train_model.get_layer(name="image").output, outputs=train_model.get_layer(name="out2vec").output)

def compile_prediction_model(prediction_model: Model):
    prediction_model.compile(Adam())
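For anyone wondering what the greedy decoding step in decode_batch_predictions does: it takes the most likely class at every time step, merges consecutive repeats, and drops the CTC blank. Below is a plain-Python sketch of that idea, not the Keras implementation. It assumes the blank is the last class index, which would match the extra unit in Dense(len(vocab) + 1). The function name `greedy_ctc_decode` and the toy probabilities are hypothetical.

```python
def greedy_ctc_decode(probabilities, blank_index):
    """Decode one sequence: argmax per step, collapse repeats, drop blanks.

    probabilities: list of time steps, each a list of class probabilities.
    """
    best_path = [
        max(range(len(step)), key=step.__getitem__) for step in probabilities
    ]
    decoded = []
    previous = None
    for index in best_path:
        # Keep a class only when it starts a new run and is not the blank.
        if index != previous and index != blank_index:
            decoded.append(index)
        previous = index
    return decoded


# Toy example with vocabulary ["a", "b"]; class 2 is assumed to be the blank.
vocab = ["a", "b"]
blank = len(vocab)
toy_output = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, merged with the previous step)
    [0.1, 0.1, 0.8],  # blank (separates the two runs of "a")
    [0.6, 0.3, 0.1],  # a
    [0.2, 0.7, 0.1],  # b
    [0.1, 0.8, 0.1],  # b (repeat, merged)
]

indices = greedy_ctc_decode(toy_output, blank)
text = "".join(vocab[i] for i in indices)
print(text)
```

The blank between the two runs of "a" is what lets CTC output a doubled character, which is why captchas with repeated characters need enough time steps.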
@EnsignerTV 3 years ago
thanks a lot !
@NicolaiAI 3 years ago
Thanks for watching!
@ehsanroshan7068 2 years ago
Hi Nicolai, thanks for the great explanation. Could you please explain how to measure accuracy?
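One possible way to measure accuracy, once the predictions have been decoded back to strings, is to report both whole-captcha accuracy and per-character accuracy. This is a minimal sketch and not something shown in the video. The function names and sample strings are hypothetical, and the per-character score compares positions one to one, which is a simplification (an edit distance would handle insertions and deletions better).

```python
def exact_match_accuracy(predictions, targets):
    """Fraction of captchas where the whole predicted string is correct."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


def character_accuracy(predictions, targets):
    """Fraction of target characters predicted correctly at the same position."""
    total = sum(len(t) for t in targets)
    if total == 0:
        return 0.0
    correct = 0
    for p, t in zip(predictions, targets):
        correct += sum(pc == tc for pc, tc in zip(p, t))
    return correct / total


# Hypothetical decoded predictions and ground-truth labels.
predictions = ["2b827", "7wyp4"]
targets = ["2b827", "7wyq4"]

print("exact match:", exact_match_accuracy(predictions, targets))
print("per character:", character_accuracy(predictions, targets))
```

Whole-captcha accuracy is the stricter number: a single wrong character makes the entire prediction count as a miss.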
@adepusairahul7375 10 months ago
Where is the repository link? I am not able to find it in the description.
@hsnhsynglk 3 years ago
## Preprocessing
# Mapping characters to integers
char_to_num = layers.experimental.preprocessing.StringLookup(
    vocabulary=list(characters), mask_token=None
)
# Mapping integers back to original characters
num_to_char = layers.experimental.preprocessing.StringLookup(
    vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True
)
@badihaboulhosn8178 2 years ago
Thanks, thought i was the only one!
@syedmuzammilahmed6872 1 year ago
Thanks Man
@UZMAALFATMI 10 months ago
thanks so much!
@GuyJustCool 3 years ago
Dear Coding Lib! I'm here with the Captcha project! It seems that turning shuffle on messes with the splitting function and produces an incorrect split. I have yet to find a solution and would really appreciate it if you looked into it! If shuffle is off, it works well. Another person pointed out the bug: the labels end up on the wrong images.
@HassanKhan-ei2wh 1 year ago
## Preprocessing
# Mapping characters to integers
char_to_num = layers.experimental.preprocessing.StringLookup(
    vocabulary=list(characters), mask_token=None
)
# Mapping integers back to original characters
num_to_char = layers.experimental.preprocessing.StringLookup(
    vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True
)
@syedmuzammilahmed6872 1 year ago
@@HassanKhan-ei2wh Thanks Man
@syedmuzammilahmed6872 1 year ago
@@HassanKhan-ei2wh When I add the num_oov_indices=0 parameter to the StringLookup code, the model training code works, but it puts labels on the wrong images. So I removed num_oov_indices, and now my model training code with early stopping is not working. Is there any solution for this?
@megistone 1 month ago
@@syedmuzammilahmed6872 Just add num_oov_indices=0 to num_to_char as well; it helped me
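A possible explanation for why this helps, as far as I understand the lookup layers: any out-of-vocabulary slots come first in the index table and the vocabulary follows, so if only the forward lookup drops the OOV slot, the inverse lookup is off by one position. The snippet below is a plain-Python emulation of that index layout, not Keras code. `build_lookup`, `encode`, `decode`, and the four-character vocabulary are hypothetical.

```python
def build_lookup(vocab, num_oov_indices):
    """Emulate the index layout: OOV slots first, then the vocabulary."""
    tokens = ["[UNK]"] * num_oov_indices + list(vocab)
    char_to_index = {ch: i + num_oov_indices for i, ch in enumerate(vocab)}
    return tokens, char_to_index


def encode(label, char_to_index):
    """Map each character of a label to its integer index."""
    return [char_to_index[ch] for ch in label]


def decode(indices, tokens):
    """Map integer indices back to characters using the inverse table."""
    return "".join(tokens[i] for i in indices)


vocab = ["2", "5", "S", "Z"]

# Forward lookup without an OOV slot: "2" -> 0, "5" -> 1, "S" -> 2, "Z" -> 3.
_, forward = build_lookup(vocab, num_oov_indices=0)
encoded = encode("Z25", forward)

# Inverse table built with the same setting, and one built with one OOV slot.
matching_tokens, _ = build_lookup(vocab, num_oov_indices=0)
shifted_tokens, _ = build_lookup(vocab, num_oov_indices=1)

print(decode(encoded, matching_tokens))  # round trip is intact
print(decode(encoded, shifted_tokens))   # every character is shifted by one
```

In this emulation the mismatched pair turns "Z25" into a different string, which would look exactly like labels landing on the wrong images in the visualization step.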
@alexmoruz1993 2 years ago
Hi Nicolai, I was wondering whether there is a way to feed wider images with text into this kind of network, or to have a dynamic input size?
@omkarmestry4117 1 year ago
I'm trying to run this code but I'm getting an error like "InvalidArgumentError: Graph execution error". Can anyone help with this?
@şulemeşe-z7w 9 months ago
Can I extract text from images, by the way? My final project is to extract text from images, but I can't code. I need help, please.
@syedmuzammilahmed6872 1 year ago
Hi Nicolai, when I add the "num_oov_indices" = 0 parameter to the StringLookup code, the model training code works, but it puts labels on the wrong images in the visualization part, before creating and training the model. So I removed "num_oov_indices", and now my model training code with early stopping is not working. The code stops in the very first epoch. Is there any solution for this?
@coconutnut21 1 year ago
Can I use this model for license plates?
@user-kw9cu 2 years ago
Can you provide the library versions you used?
@abhisekseal8044 2 years ago
Hi, I am a beginner in this field. I've watched your video and implemented this code. It's working fine, but I need to test a single captcha image. How can I do that? I tried, but the prediction was not good. Please help me out if you can. 🥺
@GODS_CODM 2 years ago
Have you found the answer to this?
@chelvanchelvam4332 3 years ago
Is it suitable for a text recognition task?
@NicolaiAI 3 years ago
Yes if u just train it on what u want to recognize
@chelvanchelvam4332 3 years ago
@@NicolaiAI Thank you I will try.
@tricialamjingyi 2 years ago
Hi, how can I make this work for captchas that have 6 digits per picture? Currently it's 5 digits in your example. I know I need to change something in the model, but I can't seem to figure it out :( The error I keep getting is: cannot add tensor to batch. Number of elements does not match. Shapes are: [tensor]: [5] [batch]: [6]. What should I change, or how do I work out what I need to change?
@arslanmushtaq9774 2 years ago
Did you find the solution?
@kentoky6568 1 year ago
Hello, in my case I tried changing the dataset to images with 4 characters and the model adapted to all 4, which would mean that you need to train a model for each different length.
@aryangupta2051 1 year ago
hey did you fix it?
@aryangupta2051 1 year ago
@@arslanmushtaq9774 hey did you fix it?
@prathamshah5521 8 months ago
Hey, I am not getting accurate results. I checked your GitHub, and for some reason the labels aren't matching the captchas during testing. What would you recommend I do?
@LucasDM4 3 months ago
Fix the code / fix the labels
@alokthakur3298 9 months ago
Can anyone provide me with the code?
@Cordic45 2 years ago
Sir, why can't we use regular object detection to detect the numbers?
@Konnits 1 year ago
Hi! I'm trying the code but I'm getting an error while training: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [5], [batch]: [6]. Can anyone help me fix this?
@arpittalmale6468 1 year ago
same bro
@lanasillomaster7034 10 months ago
I was replicating this project with another dataset i made and got that error because I forgot a letter when labelling a file
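Since the batch error above complains about 5 versus 6 elements, one cheap check is to look for files whose label length differs from the rest. This is a sketch under the assumption that each label is the file name without its extension, as in the video's dataset. `find_odd_labels` and the file names are hypothetical.

```python
from collections import Counter
from pathlib import PurePath


def find_odd_labels(filenames):
    """Return (most common label length, [(filename, label), ...] outliers)."""
    labels = [(name, PurePath(name).stem) for name in filenames]
    if not labels:
        return 0, []
    length_counts = Counter(len(label) for _, label in labels)
    expected = length_counts.most_common(1)[0][0]
    odd = [(name, label) for name, label in labels if len(label) != expected]
    return expected, odd


# Hypothetical file names; the last one is missing a character.
filenames = ["2b827.png", "7wyp4.png", "abc12.png", "x3k9.png"]

expected_length, outliers = find_odd_labels(filenames)
print("expected label length:", expected_length)
for name, label in outliers:
    print(f"check {name}: label {label!r} has {len(label)} characters")
```

If the dataset really does mix lengths on purpose, the labels would presumably need padding to a common length before batching instead.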
@hendrywijaya1017 3 years ago
Excuse me bro, I have an issue when I run the build_model() function after the CTC loss. It happens on line 43, at x = layers.Reshape(target_shape = new_shape, name='reshape')(x)

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
73
74 # Call the function to build the model
---> 75 model = build_model()
76 model.summary()

in build_model()
41 # floor division yields the result of a division with remainder
42 new_shape = ((img_width // 4), (img_height // 4) * 64)
---> 43 x = layers.Reshape(target_shape = new_shape, name='reshape')(x)
44 x = layers.Dense(64, activation='relu', name='dense1')(x)
45 x = layers.Dropout(0.2)(x)

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
975 if _in_functional_construction_mode(self, inputs, args, kwargs, input_list):
976 return self._functional_construction_call(inputs, args, kwargs,
--> 977 input_list)
978
979 # Maintains info about the `Layer.call` stack.

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in _functional_construction_call(self, inputs, args, kwargs, input_list)
1113 # Check input assumptions set after layer building, e.g. input shape.
1114 outputs = self._keras_tensor_symbolic_call(
-> 1115 inputs, input_masks, args, kwargs)
1116
1117 if outputs is None:

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in _keras_tensor_symbolic_call(self, inputs, input_masks, args, kwargs)
846 return tf.nest.map_structure(keras_tensor.KerasTensor, output_signature)
847 else:
--> 848 return self._infer_output_signature(inputs, args, kwargs, input_masks)
849
850 def _infer_output_signature(self, inputs, args, kwargs, input_masks):

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in _infer_output_signature(self, inputs, args, kwargs, input_masks)
886 self._maybe_build(inputs)
887 inputs = self._maybe_cast_inputs(inputs)
--> 888 outputs = call_fn(inputs, *args, **kwargs)
889
890 self._handle_activity_regularization(inputs, outputs)

/usr/local/lib/python3.7/dist-packages/keras/layers/core.py in call(self, inputs)
537 # Set the static shape for the result since it might lost during array_ops
538 # reshape, eg, some `None` dim in the result could be inferred.
--> 539 result.set_shape(self.compute_output_shape(inputs.shape))
540 return result
541

/usr/local/lib/python3.7/dist-packages/keras/layers/core.py in compute_output_shape(self, input_shape)
528 output_shape = [input_shape[0]]
529 output_shape += self._fix_unknown_dimension(input_shape[1:],
--> 530 self.target_shape)
531 return tf.TensorShape(output_shape)
532

/usr/local/lib/python3.7/dist-packages/keras/layers/core.py in _fix_unknown_dimension(self, input_shape, output_shape)
516 output_shape[unknown] = original // known
517 elif original != known:
--> 518 raise ValueError(msg)
519 return output_shape
520
---------------------------------------------------------------------------

And this is the error message:
ValueError: total size of new array must be unchanged, input_shape = [50, 50, 64], output_shape = [50, 768]
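The two shapes in that ValueError can be reproduced with plain arithmetic, which may help locate the mismatch. This sketch assumes two 2x2 poolings and 64 filters, as in the model from the video. The guess that the Input layer received a 200x200 image while img_width=200 and img_height=50 were used for new_shape is only an inference from the numbers in the traceback, and `reshape_check` is a hypothetical helper.

```python
def reshape_check(input_width, input_height, img_width, img_height,
                  filters=64, pools=2):
    """Compare the conv output size with the Reshape target size."""
    factor = 2 ** pools
    # Shape that actually reaches the Reshape layer (without the batch axis).
    conv_out = (input_width // factor, input_height // factor, filters)
    # Shape requested via new_shape in build_model().
    target = (img_width // factor, (img_height // factor) * filters)
    conv_size = conv_out[0] * conv_out[1] * conv_out[2]
    target_size = target[0] * target[1]
    return conv_out, target, conv_size == target_size


# Assumed situation behind the traceback: a 200x200 input, but 200x50 constants.
print(reshape_check(200, 200, img_width=200, img_height=50))

# Consistent case: the Input layer and the constants both use 200x50.
print(reshape_check(200, 50, img_width=200, img_height=50))
```

Under these assumptions the first call gives (50, 50, 64) against (50, 768), the same pair as in the error, so checking that the Input shape and the resize step both use img_width and img_height would be the first thing to try.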
@bbtvines 3 years ago
How do I implement it??? You just read all the docs
@NicolaiAI 3 years ago
Hi, 80% of the video is implementation
@creatur 3 years ago
@@NicolaiAI I have a single captcha and I trained my model. So how can I solve that captcha?
@NicolaiAI 3 years ago
What do u mean by single captcha? In the video they are passed through the model one by one too
@creatur 3 years ago
@@NicolaiAI 😔😔😔 I'm a noob with TF. I wanted to make an API that receives a captcha as base64, solves it, and sends back the captcha response
@GODS_CODM 2 years ago
@@NicolaiAI i want to input a single CAPTCHA and I want the model to predict it
@ZainAbdin-e7s 1 year ago
How to crack 6 digits and characters captcha
@aryangupta2051 1 year ago
hey did you get a method?
@traderdaniel4749 2 years ago
Does anyone else have the same error?

File "C:\Users\user\PycharmProjects\ocr_gas\ocr.py", line 135, in call *
    label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

ValueError: slice index 1 of dimension 0 out of bounds. for '{{node ocr_model_v1/ctc_loss/strided_slice_2}} = StridedSlice[Index=DT_INT32, T=DT_INT32, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1](ocr_model_v1/ctc_loss/Shape_2, ocr_model_v1/ctc_loss/strided_slice_2/stack, ocr_model_v1/ctc_loss/strided_slice_2/stack_1, ocr_model_v1/ctc_loss/strided_slice_2/stack_2)' with input shapes: [1], [1], [1], [1] and with computed input tensors: input[1] = , input[2] = , input[3] = .

Call arguments received by layer "ctc_loss" " f"(type CTCLayer):
• y_true=tf.Tensor(shape=(None,), dtype=float32)
• y_pred=tf.Tensor(shape=(None, 50, 12), dtype=float32)