Word Embedding Generation of Sanskrit Shlokas: A Hybrid Approach

Machine Learning
Python

This project generates word embeddings for Sanskrit shlokas using a hybrid approach that combines traditional linguistic knowledge with modern computational techniques. Sanskrit, an ancient Indo-Aryan language, holds significant cultural and religious importance, and shlokas (metrical verses) serve as fundamental units of expression in many of its texts. The methodology involves collecting a diverse dataset of Sanskrit shlokas and preprocessing them through tokenization, sandhi splitting, and normalization. Traditional linguistic features, covering morphological, phonological, and semantic aspects, are then extracted from the shlokas and integrated with modern natural language processing (NLP) techniques. The project aims to contribute to the preservation and understanding of Sanskrit language and literature while demonstrating the potential of hybrid approaches in computational linguistics.
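
The preprocessing and feature-integration steps described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the function names, the three hand-crafted features, the co-occurrence counts standing in for a learned embedding model, and the choice to integrate by simple concatenation are all hypothetical. Sandhi splitting is left as a placeholder because it needs a dedicated analyzer.

```python
import re
import unicodedata

# Danda (U+0964) and double danda (U+0965) mark verse boundaries in Devanagari.
DANDA_PATTERN = re.compile("[\u0964\u0965]")


def normalize(text: str) -> str:
    """NFC-normalize the text and replace verse punctuation with spaces."""
    text = unicodedata.normalize("NFC", text)
    return DANDA_PATTERN.sub(" ", text)


def tokenize(text: str) -> list[str]:
    """Whitespace tokenization only.

    Assumption: a real pipeline would run a sandhi splitter here; this
    sketch does not attempt to split fused words.
    """
    return normalize(text).split()


def linguistic_features(token: str) -> list[float]:
    """Hypothetical surface features standing in for the morphological and
    phonological features mentioned in the abstract."""
    return [
        float(len(token)),                          # length in code points
        1.0 if token.endswith("\u0903") else 0.0,   # ends in visarga
        1.0 if token.endswith("\u0902") else 0.0,   # ends in anusvara
    ]


def cooccurrence_vectors(sentences: list[list[str]], window: int = 2) -> dict[str, list[float]]:
    """Count-based distributional vectors (a stand-in for a trained model)."""
    vocab = sorted({tok for sent in sentences for tok in sent})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = {tok: [0.0] * len(vocab) for tok in vocab}
    for sent in sentences:
        for i, tok in enumerate(sent):
            start = max(0, i - window)
            stop = min(len(sent), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[tok][index[sent[j]]] += 1.0
    return vectors


def hybrid_embeddings(shlokas: list[str]) -> dict[str, list[float]]:
    """Assumed integration step: concatenate hand-crafted features with the
    distributional vector for each token."""
    sentences = [tokenize(s) for s in shlokas]
    distributional = cooccurrence_vectors(sentences)
    return {tok: linguistic_features(tok) + vec for tok, vec in distributional.items()}


# Opening line of the Bhagavad Gita, used purely as sample input.
SAMPLE = "धर्मक्षेत्रे कुरुक्षेत्रे समवेता युयुत्सवः ।"
```

Calling `hybrid_embeddings([SAMPLE])` returns one vector per token, with the three hand-crafted features first and the co-occurrence counts after them. Concatenation is only one possible way to combine the two feature sources; it is used here because it keeps both kinds of information visible in the resulting vector.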

Hybrid Approach for Sanskrit Shloka Word Embeddings