String Similarity Based on Phonetic in the Gujarati Language Using Gujsim Algorithm

EasyChair Preprint 2153
9 pages • Date: December 12, 2019

Abstract

Searching the top 10 search engines for “ગાન્ધીજી” or “ગાંધીજી” returns surprisingly different results for each spelling, even though both strings are correct in the Gujarati language. A string similarity algorithm is therefore useful for text mining applications. Basic string similarity compares the two strings character by character, but this may not give accurate results for Gujarati, a rich language whose writing styles vary in their use of matras, reph, vatu, and diacritics on simple and compound letters. The GUJSIM (GUJarati SIMilarity) algorithm is a hybrid approach to string similarity for the Gujarati language. The author compares 70 string pairs, and the GUJSIM algorithm gives a good percentage result.

Keyphrases: Gujarati language, phonetic, string distance, string similarity
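To illustrate the problem the abstract describes, the following minimal Python sketch compares the two spellings of "Gandhiji" character by character and then again after a simple phonetic normalization. It is not the GUJSIM algorithm, whose steps are not given in the abstract. The function names (`normalize`, `levenshtein`, `similarity`) and the single rule used here (a nasal consonant plus virama before a stop of its own class is treated as an anusvara) are assumptions made purely for illustration.

```python
import re

VIRAMA = "\u0ACD"    # Gujarati sign virama (halant)
ANUSVARA = "\u0A82"  # Gujarati sign anusvara

# Hypothetical rule table: each nasal consonant mapped to the range of
# stop consonants of its own class (velar, palatal, retroflex, dental, labial).
HOMORGANIC = {
    "\u0A99": "\u0A95-\u0A98",  # nga before ka, kha, ga, gha
    "\u0A9E": "\u0A9A-\u0A9D",  # nya before ca, cha, ja, jha
    "\u0AA3": "\u0A9F-\u0AA2",  # nna before tta, ttha, dda, ddha
    "\u0AA8": "\u0AA4-\u0AA7",  # na before ta, tha, da, dha
    "\u0AAE": "\u0AAA-\u0AAD",  # ma before pa, pha, ba, bha
}

_NASAL_CLUSTER = re.compile(
    "|".join(f"{nasal}{VIRAMA}(?=[{stops}])" for nasal, stops in HOMORGANIC.items())
)


def normalize(text):
    """Rewrite nasal + virama before a same-class stop as an anusvara."""
    return _NASAL_CLUSTER.sub(ANUSVARA, text)


def levenshtein(a, b):
    """Classic edit distance over Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Edit distance scaled to the range 0..1 by the longer string's length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


# The two spellings from the abstract, written as code points.
with_conjunct = "\u0A97\u0ABE\u0AA8\u0ACD\u0AA7\u0AC0\u0A9C\u0AC0"  # na + virama form
with_anusvara = "\u0A97\u0ABE\u0A82\u0AA7\u0AC0\u0A9C\u0AC0"        # anusvara form

print(similarity(with_conjunct, with_anusvara))                        # 0.75
print(similarity(normalize(with_conjunct), normalize(with_anusvara)))  # 1.0
```

Plain character comparison scores the pair at 0.75 because the conjunct spelling uses two code points where the other uses one. After the assumed normalization the two strings are identical. A real phonetic approach for Gujarati would need many more rules, for example for matras, reph, and vatu.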