CS5720 - Week 7
Using Pre-trained Word Embeddings
Popular Pre-trained Embeddings
Google Word2Vec
300-dimensional vectors, ~3M-word vocabulary
Trained on the Google News corpus

Stanford GloVe
Available in 50d, 100d, 200d, and 300d variants
Trained on Common Crawl, Wikipedia, and Twitter corpora

Facebook FastText
Uses subword (character n-gram) information
Pre-trained vectors for 157 languages

Domain-Specific
BioWordVec, Law2Vec
Specialized vocabularies for biomedical and legal text

All four families load the same way; see the sketch below.
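As a quick illustration (not from the slide itself), one way to load these is gensim's downloader; the dataset names below assume the current gensim-data catalogue, so check api.info() if they have changed:

# Minimal sketch: loading pre-trained vectors via gensim's downloader.
# Dataset names assume the gensim-data catalogue; list them with api.info().
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-100")     # GloVe, Wikipedia+Gigaword, 100d
print(glove.most_similar("university", topn=3)) # nearest neighbours by cosine

# The other families load the same way (large downloads):
# w2v = api.load("word2vec-google-news-300")        # Google News Word2Vec
# ft  = api.load("fasttext-wiki-news-subwords-300") # FastText with subwords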
Integration Techniques
Feature Extraction: use the pre-trained vectors as fixed input features
Fine-tuning: continue training the vectors on your task data
As Embedding Layer: initialize an Embedding layer from the pre-trained matrix
Out-of-Vocabulary Words: handle tokens missing from the pre-trained vocab (sketch below)
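To make the OOV point concrete, here is a minimal fallback sketch; `vectors` stands for any word-to-vector mapping (e.g., a dict or a gensim KeyedVectors), and the function name and fallback choice are illustrative assumptions, not a fixed API:

# Minimal OOV-handling sketch, assuming `vectors` maps word -> np.ndarray.
import numpy as np

def lookup(word, vectors, dim=300):
    # Token is in the pre-trained vocabulary: use its vector directly.
    if word in vectors:
        return vectors[word]
    # OOV fallbacks: subword models (FastText) compose a vector from
    # character n-grams, so they rarely reach this point. For word-level
    # models, one common choice is a shared random "UNK" vector:
    rng = np.random.default_rng(0)   # fixed seed -> same UNK vector each call
    return rng.normal(scale=0.1, size=dim).astype("float32")

# Toy usage: "university" hits the table, "qwerty123" falls back to UNK.
toy = {"university": np.ones(300, dtype="float32")}
print(lookup("university", toy)[:3], lookup("qwerty123", toy)[:3])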
Typical Workflow
1. Load Embeddings: download and load pre-trained vectors
2. Map Vocabulary: align the vectors with your dataset's vocab
3. Initialize Model: set up the embedding layer
4. Train Model: fine-tune or freeze

Steps 1-2 are sketched below; the Keras snippet that follows covers step 3.
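A sketch of steps 1-2 under simple assumptions: the GloVe file name and the toy word_index are placeholders for your own download and tokenizer vocabulary.

# Step 1: load pre-trained vectors from a local GloVe text file
# (file name is an assumption; use whichever GloVe file you downloaded).
import numpy as np

embedding_dim = 300
embeddings_index = {}
with open("glove.6B.300d.txt", encoding="utf-8") as f:
    for line in f:
        word, *coefs = line.split()
        embeddings_index[word] = np.asarray(coefs, dtype="float32")

# Step 2: map onto your dataset vocabulary (toy word_index for illustration;
# a Keras Tokenizer's word_index has exactly this word -> integer-id shape).
word_index = {"the": 1, "university": 2, "qwerty123": 3}
vocab_size = len(word_index) + 1                 # +1 for padding index 0
embedding_matrix = np.zeros((vocab_size, embedding_dim), dtype="float32")
for word, i in word_index.items():
    vec = embeddings_index.get(word)
    if vec is not None:                          # OOV rows stay all-zero
        embedding_matrix[i] = vec

# vocab_size and embedding_matrix now plug into the Keras snippet below.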
# Quick example: Using GloVe with Keras
from tensorflow.keras.layers import Embedding

embedding_layer = Embedding(
    input_dim=vocab_size,          # vocabulary size from the mapping step
    output_dim=300,                # must match the pre-trained vector size
    weights=[embedding_matrix],    # matrix built in the workflow above
    trainable=False                # freeze pre-trained weights
)
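Freezing the layer (trainable=False) is the safer choice for small datasets, since it keeps noisy gradients from distorting the pre-trained vectors. With a larger task dataset, setting trainable=True and fine-tuning with a low learning rate often improves accuracy.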
Prepared by Dr. Gorkem Kar