Posts
Reducing LLM Hallucinations: A Deep Dive into Reflection LLM and Vector Stores
Raymond Bernard
Senior Engineer and Solutions Architect specializing in NAS, SAN, and NVIDIA BasePOD H100. Passionate about Data Science, AI, and Open Source. LLM enthusiast driving innovative solutions in cloud and AI technologies.
September 8, 2024
Ray Bernard
ray.bernard@outlook.com
Video demo
Code
Large Language Models (LLMs) have become invaluable tools across various domains, from content creation to coding assistance.
read more
MoA + STaR = Better Open Source LLM, Part 3
Introduction
Artificial Intelligence (AI) has made significant strides, particularly with the advent of large language models (LLMs). These models, while powerful, often require innovative methods to harness their full potential, especially in complex reasoning tasks. This blog post explores two advanced methodologies, Mixture of Agents (MoA) and Self-Taught Reasoner (STaR), and how their integration can push the boundaries of AI capabilities.
Mixture of Agents (MoA) Methodology
The Mixture of Agents (MoA) methodology is designed to leverage the strengths of multiple language models by creating a collaborative framework.
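To make the idea concrete, here is a minimal sketch of a single MoA round, assuming an OpenAI-compatible endpoint (such as Ollama's) and hypothetical model names; the actual pipelines from this series are more elaborate:

```python
# Minimal sketch of one Mixture-of-Agents round. The endpoint and model
# names are assumptions (any OpenAI-compatible server works the same way).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

PROPOSERS = ["llama3", "mistral", "gemma"]  # hypothetical proposer models
AGGREGATOR = "llama3"                       # hypothetical aggregator model


def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content


def moa_round(prompt: str) -> str:
    # Layer 1: each proposer answers the question independently.
    drafts = [ask(m, prompt) for m in PROPOSERS]
    # Layer 2: an aggregator synthesizes the drafts into one answer.
    joined = "\n\n".join(f"Response {i + 1}:\n{d}" for i, d in enumerate(drafts))
    return ask(
        AGGREGATOR,
        "Synthesize these candidate answers into a single, accurate "
        f"response to the question.\n\n{joined}\n\nQuestion: {prompt}",
    )


print(moa_round("Explain why the sky appears blue."))
```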
read more
Open WebUI Mixture of Agents, Part 2
YouTube video: https://youtu.be/KxT7lHaPDJ4
Introduction
The Mixture-of-Agents (MoA) methodology has demonstrated state-of-the-art performance using open-source models, as detailed in my previous blog post. We have created two pipelines (Groq and Ollama) for Open WebUI. These pipelines serve as versatile, UI-agnostic, OpenAI-compatible plugin frameworks.
In this post, we demonstrate how MoA can be integrated into Open WebUI, an extensible, feature-rich, and user-friendly self-hosted WebUI designed to operate entirely offline. Open WebUI supports various LLM runners, including Ollama and OpenAI-compatible APIs; a skeleton of such a pipeline is sketched below.
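As a rough illustration (not the exact code from the repository), an Open WebUI pipeline is a Python class exposing a `pipe` method; the hook names and signatures below follow the Pipelines plugin convention and may differ across versions:

```python
# Skeleton of an Open WebUI pipeline (follows the Pipelines plugin
# convention; hook names and signatures may differ across versions).
from typing import Generator, Iterator, List, Union


class Pipeline:
    def __init__(self):
        self.name = "MoA Pipeline"  # name shown in the Open WebUI model list

    async def on_startup(self):
        # e.g. initialize Groq or Ollama clients here
        pass

    async def on_shutdown(self):
        pass

    def pipe(
        self,
        user_message: str,
        model_id: str,
        messages: List[dict],
        body: dict,
    ) -> Union[str, Generator, Iterator]:
        # Run the MoA layers here and return the aggregated answer
        # (or yield chunks to stream it back to the UI).
        return f"(aggregated answer for: {user_message})"
```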
read more
Mixture of Agents, Part 1
How Mixture-of-Agents Enhances Large Language Model Capabilities
Introduction
In recent years, Large Language Models (LLMs) have significantly advanced the field of natural language understanding and generation. These models, pre-trained on vast datasets and fine-tuned to align with human preferences, have demonstrated impressive capabilities. However, the rapid growth in the number of LLMs and their diverse strengths has presented a new challenge: how to effectively harness the collective expertise of multiple LLMs.
read more
Fine-Tuning LLaMa 3 8B Instruct on Intel Max Series GPUs
Fine-Tuning LLaMa 3 8B Instruct on Intel Max Series GPUs: An Exciting Journey
In this guide, we fine-tune the powerful LLaMa 3 8B Instruct model on a custom dataset using Intel Max Series GPUs, which offer compelling speed and cost-effectiveness for training large language models.
I successfully trained the LLaMa 3 8B Instruct model on my custom dataset, leveraging the HuggingFace ecosystem; a minimal sketch of the setup follows.
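For a sense of what this looks like in code, here is a minimal sketch of a single training step on the `xpu` device exposed by intel_extension_for_pytorch; the model ID is the real HuggingFace checkpoint, but the toy batch is an assumption, and a realistic 8B fine-tune would typically add PEFT/LoRA and a proper DataLoader:

```python
# Minimal single-step fine-tuning sketch on an Intel Max Series GPU.
# Importing intel_extension_for_pytorch registers the "xpu" device;
# the toy batch below is a stand-in for a real custom dataset.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.to("xpu")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative step; a real run would iterate a DataLoader over the
# custom dataset (and likely use PEFT/LoRA to fit an 8B model in memory).
batch = tokenizer("Q: What is MoA?\nA: Mixture of Agents.", return_tensors="pt").to("xpu")
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()
optimizer.step()
optimizer.zero_grad()
```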
read more