Posts
Reducing LLM Hallucinations: A Deep Dive into Reflection LLM and Vector Stores
Raymond Bernard
Senior Engineer and Solutions Architect specializing in NAS, SAN, and NVIDIA BasePOD H100. Passionate about Data Science, AI, and Open Source. LLM enthusiast driving innovative solutions in cloud and AI technologies.
September 8, 2024
Ray Bernard
ray.bernard@outlook.com
Video demo
Code
Large Language Models (LLMs) have become invaluable tools across various domains, from content creation to coding assistance.
read more
MoA + STaR = Better Open Source LLM, Part 3
Introduction
Artificial Intelligence (AI) has made significant strides, particularly with the advent of large language models (LLMs). These models, while powerful, often require innovative methods to harness their full potential, especially in complex reasoning tasks. This blog post explores two advanced methodologies, Mixture of Agents (MoA) and Self-Taught Reasoner (STaR), and how their integration can push the boundaries of AI capabilities.
Mixture of Agents (MoA) Methodology
The Mixture of Agents (MoA) methodology is designed to leverage the strengths of multiple language models by creating a collaborative framework.
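To make the idea concrete, here is a minimal sketch of a single MoA round, assuming an OpenAI-compatible endpoint (such as Ollama's) and hypothetical model names; the actual pipelines from this series are more elaborate:

```python
# Minimal sketch of one Mixture-of-Agents round. The endpoint and model
# names are assumptions (any OpenAI-compatible server works the same way).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

PROPOSERS = ["llama3", "mistral", "gemma"]  # hypothetical proposer models
AGGREGATOR = "llama3"                       # hypothetical aggregator model


def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content


def moa_round(prompt: str) -> str:
    # Layer 1: each proposer answers the question independently.
    drafts = [ask(m, prompt) for m in PROPOSERS]
    # Layer 2: an aggregator synthesizes the drafts into one answer.
    joined = "\n\n".join(f"Response {i + 1}:\n{d}" for i, d in enumerate(drafts))
    return ask(
        AGGREGATOR,
        "Synthesize these candidate answers into a single, accurate "
        f"response to the question.\n\n{joined}\n\nQuestion: {prompt}",
    )


print(moa_round("Explain why the sky appears blue."))
```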
read more
Open WebUI Mixture of Agents, Part 2
YouTube video: https://youtu.be/KxT7lHaPDJ4
Introduction
The Mixture-of-Agents (MoA) methodology has demonstrated state-of-the-art performance using open-source models, as detailed in my previous blog post. We have created two pipelines (Groq and Ollama) for Open WebUI. These pipelines serve as versatile, UI-agnostic, OpenAI-compatible plugin frameworks.
In this post, we demonstrate how MoA can be integrated into Open WebUI, an extensible, feature-rich, and user-friendly self-hosted WebUI designed to operate entirely offline. Open WebUI supports various LLM runners, including Ollama and OpenAI-compatible APIs; a skeleton of such a pipeline is sketched below.
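As a rough illustration (not the exact code from the repository), an Open WebUI pipeline is a Python class exposing a `pipe` method; the hook names and signatures below follow the Pipelines plugin convention and may differ across versions:

```python
# Skeleton of an Open WebUI pipeline (follows the Pipelines plugin
# convention; hook names and signatures may differ across versions).
from typing import Generator, Iterator, List, Union


class Pipeline:
    def __init__(self):
        self.name = "MoA Pipeline"  # name shown in the Open WebUI model list

    async def on_startup(self):
        # e.g. initialize Groq or Ollama clients here
        pass

    async def on_shutdown(self):
        pass

    def pipe(
        self,
        user_message: str,
        model_id: str,
        messages: List[dict],
        body: dict,
    ) -> Union[str, Generator, Iterator]:
        # Run the MoA layers here and return the aggregated answer
        # (or yield chunks to stream it back to the UI).
        return f"(aggregated answer for: {user_message})"
```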
read more
Mixture of Agents, Part 1
How Mixture-of-Agents Enhances Large Language Model Capabilities
Introduction
In recent years, Large Language Models (LLMs) have significantly advanced the field of natural language understanding and generation. These models, pre-trained on vast datasets and fine-tuned to align with human preferences, have demonstrated impressive capabilities. However, the rapid growth in the number of LLMs and their diverse strengths has presented a new challenge: how to effectively harness the collective expertise of multiple LLMs.
read more
Fine-Tuning LLaMa 3 8B Instruct on Intel Max Series GPUs
Fine-Tuning LLaMa 3 8B Instruct on Intel Max Series GPUs: An Exciting Journey
In this guide, we fine-tune the powerful LLaMa 3 8B Instruct model on a custom dataset using Intel Max Series GPUs, which offer compelling speed and cost-effectiveness for training large language models.
I successfully trained the LLaMa 3 8B Instruct model on my custom dataset, leveraging the HuggingFace ecosystem; a minimal sketch of the setup follows.
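For a sense of what this looks like in code, here is a minimal sketch of a single training step on the `xpu` device exposed by intel_extension_for_pytorch; the model ID is the real HuggingFace checkpoint, but the toy batch is an assumption, and a realistic 8B fine-tune would typically add PEFT/LoRA and a proper DataLoader:

```python
# Minimal single-step fine-tuning sketch on an Intel Max Series GPU.
# Importing intel_extension_for_pytorch registers the "xpu" device;
# the toy batch below is a stand-in for a real custom dataset.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.to("xpu")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative step; a real run would iterate a DataLoader over the
# custom dataset (and likely use PEFT/LoRA to fit an 8B model in memory).
batch = tokenizer("Q: What is MoA?\nA: Mixture of Agents.", return_tensors="pt").to("xpu")
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()
optimizer.step()
optimizer.zero_grad()
```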
read more