How do you apply BERT's magic to languages beyond just English? Why isn't it always as simple as re-training BERT on text from your language?
In four Colab Notebooks with a video walkthrough, this tutorial explains, implements, and compares several approaches, all built in PyTorch with huggingface/transformers.
A model trained on 100 different languages must have a pretty strange vocabulary. Let's see what's in there!
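As a quick taste of that, here is a minimal sketch of how you might peek inside multilingual BERT's vocabulary with huggingface/transformers. This is illustrative, not the notebook code itself; it assumes the `bert-base-multilingual-cased` checkpoint from the Hugging Face model hub.

```python
# A quick peek at multilingual BERT's shared WordPiece vocabulary
# (a sketch, assuming the 'bert-base-multilingual-cased' checkpoint).
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')

# One vocabulary (roughly 120k WordPiece tokens) is shared across all
# of the languages the model was pre-trained on.
print('Vocabulary size:', tokenizer.vocab_size)

# Sample every 10,000th token. Latin, Cyrillic, CJK, and other scripts
# all live in the same index space, which is what makes it so strange.
vocab = list(tokenizer.get_vocab().keys())
for i in range(0, len(vocab), 10000):
    print(i, vocab[i])
```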
NLP Base Camp Members have complete access to this tutorial and all of my NLP content!
These Notebooks can be easily modified to run on any of the 15 languages included in the XNLI benchmark (see the code sketch after this list):
1. Arabic
2. Bulgarian
3. German
4. Greek
5. English
6. Spanish
7. French
8. Hindi
9. Russian
10. Swahili
11. Thai
12. Turkish
13. Urdu
14. Vietnamese
15. Chinese
(Note: the Monolingual Notebook requires finding a BERT model pre-trained on your language.)
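To give a sense of what "modifying for your language" might look like, here is a sketch that swaps in a different XNLI language and loads a monolingual model. It assumes the Hugging Face `datasets` library (whose `xnli` dataset has one config per language) and uses `DeepPavlov/rubert-base-cased` purely as an example of a community-contributed Russian BERT; neither is necessarily what the notebooks use.

```python
# Sketch: pick an XNLI language and load matching data and a monolingual
# model (assumes the Hugging Face 'datasets' library; names illustrative).
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Any of the 15 XNLI configs: ar, bg, de, el, en, es, fr, hi,
# ru, sw, th, tr, ur, vi, zh.
lang = 'ru'

# Each XNLI example is a (premise, hypothesis, label) triple.
xnli = load_dataset('xnli', lang)
print(xnli['validation'][0])

# For the monolingual approach, substitute whatever BERT model the
# model hub has for your language. XNLI has 3 classes: entailment,
# neutral, and contradiction.
model_name = 'DeepPavlov/rubert-base-cased'  # example Russian BERT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=3)
```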