Sri Lanka’s artificial intelligence journey is entering a new phase, one defined not by adoption but by architecture. With Chat2Find LLM, the country is moving beyond using global AI systems and is building a localized intelligence stack designed for its own legal systems, business intelligence, languages, and data realities. The aim is AI engineered to reflect Sri Lanka’s context rather than relying entirely on external models. Chat2Find is also a milestone in its own right: it is Sri Lanka’s first trilingual large language model, with deep training in Sinhala supported by a dataset exceeding one billion data points.
Chat2Find LLM is now available as open source on Hugging Face and on Lanka Data (a local repository).
Datasets
Intelligent AI Layer
At its core, Chat2Find is not structured as a single monolithic model. It operates as a layered system that combines the strengths of global foundation models with a deeply localized intelligence layer. The base layer provides general reasoning and language capabilities, while higher layers refine outputs to align with Sri Lankan legal terminology, business intelligence, taxation systems, and educational frameworks. This is further enhanced by a retrieval-driven design, allowing the system to access and integrate real-time data such as laws, gazettes, and regulatory updates. The result is an AI system that behaves less like a static chatbot and more like a continuously evolving knowledge engine.

A defining technical feature of Chat2Find is its reliance on retrieval-augmented generation. In a country where laws and policies frequently evolve through official publications, static training alone is insufficient. The system ingests structured and unstructured data, processes it into searchable formats, and retrieves relevant information at query time. This ensures that responses are grounded in actual documents, improving both accuracy and traceability. By combining vector-based semantic search with traditional keyword methods, the system can handle the complexity of legal and multilingual queries more effectively.
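The combination of semantic and keyword search described above can be illustrated with a small sketch. This is not Chat2Find's implementation, which the article does not detail. It assumes character trigram vectors as a toy stand-in for a real embedding model, simple term overlap as the keyword method, and reciprocal rank fusion as one common way to merge the two rankings. The function names, document identifiers, and sample texts are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Toy 'embedding': a bag of character trigrams (stand-in for a real model)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(max(len(text) - n + 1, 1)))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, doc):
    """Keyword signal: number of distinct query terms found in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def rank(scores):
    """Document ids ordered from highest to lowest score."""
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


def hybrid_search(query, docs, k=60, top_n=3):
    """Fuse the semantic and keyword rankings with reciprocal rank fusion."""
    query_vec = char_ngrams(query)
    semantic = {i: cosine(query_vec, char_ngrams(d)) for i, d in docs.items()}
    keyword = {i: keyword_score(query, d) for i, d in docs.items()}
    fused = Counter()
    for ranking in (rank(semantic), rank(keyword)):
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return [doc_id for doc_id, _ in fused.most_common(top_n)]


# Hypothetical sample corpus, for illustration only.
DOCS = {
    "gazette-1": "Value added tax rate amendment gazette",
    "act-2": "Companies act director duties",
    "circular-3": "Inland revenue circular on income tax filing",
}
```

Calling `hybrid_search("value added tax rate", DOCS)` returns document ids that a generation step could then cite, which is what makes the answers traceable to source documents.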
Language processing presents another major challenge that Chat2Find attempts to solve. Sri Lanka’s trilingual environment (Sinhala, Tamil, and English) requires models that can operate seamlessly across scripts and linguistic structures. Global AI systems often struggle in this area because their training data for Sinhala and Tamil is limited. Chat2Find addresses this through a combination of multilingual corpora, adapted tokenization, and cross-lingual retrieval techniques. This allows users to query in one language and receive accurate, context-aware responses, even when the source material exists in another. The practical outcome is broader accessibility, especially for users who do not work primarily in English.
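One small building block of a trilingual pipeline is recognizing which script a query is written in, so that the response language and any cross-lingual lookup can be chosen accordingly. The sketch below assumes detection by Unicode block (Sinhala occupies U+0D80 to U+0DFF and Tamil U+0B80 to U+0BFF). The function name and labels are hypothetical, and the article does not say whether Chat2Find does it this way.

```python
def detect_script(text):
    """Guess the dominant script of a query by counting code points per Unicode block."""
    counts = {"sinhala": 0, "tamil": 0, "latin": 0}
    for ch in text:
        code_point = ord(ch)
        if 0x0D80 <= code_point <= 0x0DFF:      # Sinhala block
            counts["sinhala"] += 1
        elif 0x0B80 <= code_point <= 0x0BFF:    # Tamil block
            counts["tamil"] += 1
        elif ch.isascii() and ch.isalpha():     # basic Latin letters (English)
            counts["latin"] += 1
    best = max(counts, key=counts.get)
    # Digits, punctuation, or other scripts only: report that nothing matched.
    return best if counts[best] > 0 else "unknown"
```

Script detection alone does not translate anything. It only tells the rest of the system which language the user wrote in, for example so that an answer drawn from an English gazette can be returned in Sinhala.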
On top of the core system, Chat2Find introduces domain-specific modules that transform general AI into structured, applied intelligence. Legal, tax, and business modules operate on curated datasets and tailored prompting strategies, enabling them to perform specialized tasks such as legal research, compliance guidance, and decision support. These modules are continuously updated to reflect regulatory changes, making them particularly relevant in professional and institutional settings.
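As a rough illustration of how domain modules with tailored prompts might be organized, the sketch below routes a question to a legal, tax, or business module by keyword overlap and fills in that module's prompt template. The module definitions, keywords, and template wording are all hypothetical. The article names the three domains but does not describe how questions are dispatched to them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainModule:
    name: str
    keywords: tuple
    prompt_template: str


# Hypothetical modules: keywords and prompts are illustrative assumptions.
MODULES = (
    DomainModule("legal", ("act", "gazette", "court", "law"),
                 "Answer as a legal research aid.\nSources:\n{context}\nQuestion: {question}"),
    DomainModule("tax", ("tax", "vat", "paye", "levy"),
                 "Answer as a tax compliance aid.\nSources:\n{context}\nQuestion: {question}"),
    DomainModule("business", ("company", "market", "invoice", "profit"),
                 "Answer as a business analysis aid.\nSources:\n{context}\nQuestion: {question}"),
)
GENERAL = DomainModule("general", (), "Sources:\n{context}\nQuestion: {question}")


def select_module(question):
    """Pick the module whose keywords overlap most with the question."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    best = max(MODULES, key=lambda m: len(words & set(m.keywords)))
    return best if words & set(best.keywords) else GENERAL


def build_prompt(question, context):
    """Return the chosen module name and its filled-in prompt."""
    module = select_module(question)
    return module.name, module.prompt_template.format(context=context, question=question)
```

Keeping each module as data (keywords plus a template) means a regulatory change can be handled by updating the curated sources or the template without touching the routing code.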
Another important dimension is the system’s ability to integrate real-time data. Unlike conventional LLM deployments that rely on periodic retraining, Chat2Find uses ongoing data pipelines to ingest new information. This includes government publications, financial updates, and other authoritative sources. The system processes, validates, and indexes this data incrementally, ensuring that it remains current. This transforms the platform into a live intelligence system rather than a static repository of knowledge.
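The process, validate, and index steps can be sketched as an index that accepts one record at a time, with no rebuild or retraining. The example assumes a minimal validation rule (a record must name its source and carry non-empty text), deduplication by SHA-256 content hash, and a simple in-memory inverted index. The class name, method names, and record fields are hypothetical.

```python
import hashlib


class IncrementalIndex:
    """In-memory index that grows one record at a time."""

    def __init__(self):
        self.documents = {}  # content hash -> record
        self.inverted = {}   # term -> set of content hashes

    def validate(self, record):
        """Assumed rule: a record needs a source and non-empty text."""
        return bool(record.get("source")) and bool((record.get("text") or "").strip())

    def ingest(self, record):
        """Validate, deduplicate, and index a single record."""
        if not self.validate(record):
            return "rejected"
        text = record["text"].strip()
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in self.documents:
            return "duplicate"
        self.documents[digest] = record
        for term in set(text.lower().split()):
            self.inverted.setdefault(term, set()).add(digest)
        return "indexed"

    def search(self, term):
        """Return every indexed record containing the term."""
        return [self.documents[d] for d in self.inverted.get(term.lower(), ())]
```

Because each call to `ingest` touches only the new record, a newly published notice becomes searchable as soon as it passes validation.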
From an infrastructure perspective, Chat2Find operates within the constraints typical of emerging markets. Limited access to large-scale compute resources necessitates a hybrid approach that balances cloud-based model inference with locally optimized retrieval systems. Efficiency becomes a key design principle, with strategies such as model routing, caching, and optimized token usage helping to control costs and latency. These considerations are crucial for making AI accessible at scale within Sri Lanka.
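Model routing and caching can be illustrated with a short sketch. It assumes two hypothetical model tiers, "small" and "large", a crude keyword and length heuristic for choosing between them, and Python's standard `functools.lru_cache` for reusing answers to repeated queries. The `call_model` function is a placeholder for a real inference call, and none of these names come from Chat2Find itself.

```python
from functools import lru_cache

# Counts placeholder model calls so the effect of caching is visible.
CALLS = {"small": 0, "large": 0}

# Assumed markers of queries that deserve the more capable (costlier) tier.
COMPLEX_MARKERS = ("act", "section", "tax", "compliance")


def choose_model(query):
    """Route long or domain-heavy queries to the large tier, the rest to the small one."""
    words = [w.strip("?.,!") for w in query.lower().split()]
    if len(words) > 20 or any(w in COMPLEX_MARKERS for w in words):
        return "large"
    return "small"


def call_model(tier, query):
    """Placeholder for a real inference call; returns a labelled dummy answer."""
    CALLS[tier] += 1
    return f"[{tier}] answer to: {query}"


@lru_cache(maxsize=1024)
def _cached_answer(normalized_query):
    return call_model(choose_model(normalized_query), normalized_query)


def answer(query):
    """Normalize case and whitespace so equivalent queries share one cache entry."""
    normalized = " ".join(query.lower().split())
    return _cached_answer(normalized)
```

Repeated or trivially different queries are served from the cache without a second model call, and simple queries never reach the larger tier, which is where the savings in cost and latency would come from.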
A notable strategic direction is the move toward an open and collaborative ecosystem. By aligning with open-weight models and encouraging developer participation, Chat2Find positions itself as a shared national platform rather than a closed proprietary system. This vision extends into the development of an AI marketplace, where local developers can build and distribute tools on top of the platform. Such an ecosystem has the potential to stimulate a domestic AI economy and reduce dependence on foreign platforms.
In terms of performance, Chat2Find does not aim to compete with global frontier models in sheer scale. Instead, it focuses on contextual precision, real-time relevance, and affordability. This trade-off reflects a deliberate design choice: prioritizing local usefulness over global generality. The system may not match the raw capabilities of the largest models, but it delivers greater value in scenarios that require deep understanding of Sri Lankan systems and languages.
Despite its promise, several technical challenges remain. Data quality is an ongoing concern, particularly given the fragmented nature of local datasets. The lack of standardized benchmarks for Sinhala and Tamil complicates evaluation, while maintaining accuracy in high-stakes domains such as law and taxation requires continuous refinement. Infrastructure limitations and funding constraints also pose challenges for long-term scaling.
Even with these hurdles, the broader significance of Chat2Find is clear. It represents a shift toward building AI systems that are not only technologically advanced but also contextually grounded. By combining retrieval-driven architecture, multilingual processing, domain-specific intelligence, and an open ecosystem approach, Chat2Find is laying the foundation for a sovereign AI layer in Sri Lanka.
Ultimately, its importance lies not in being the largest or most powerful model, but in being the most relevant to its environment. It shifts the goal of AI development in Sri Lanka from replicating global systems to creating intelligence that understands and serves the country at a fundamental level.
















