Research Projects
Latent Thought LLM
An experimental approach to teaching language models to reason in latent space while maintaining text generation capabilities.
Overview
This project explores a new approach to language model reasoning: the model is progressively trained to carry out its reasoning steps in latent space while still accepting text inputs and producing text outputs. The model learns to "think" using hidden state representations instead of token embeddings, which may enable more fluid and abstract reasoning.
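To make the mechanism concrete, here is a minimal sketch (not the project's actual code) of one latent reasoning step, assuming a Hugging Face causal LM: the final hidden state is appended directly as the next input embedding instead of sampling a token and re-embedding it. The model name, prompt, and step count are placeholders.

```python
# Sketch of a single "latent thought" loop, assuming a Hugging Face causal LM.
# All names below (model, prompt, step count) are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM that accepts inputs_embeds
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Question: what is 12 * 7?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(input_ids)             # (1, T, d)

num_latent_steps = 4  # how many reasoning steps stay in latent space
for _ in range(num_latent_steps):
    out = model(inputs_embeds=embeds, output_hidden_states=True)
    last_hidden = out.hidden_states[-1][:, -1:, :]            # (1, 1, d)
    # Feed the hidden state back as the next "thought" embedding,
    # bypassing the vocabulary entirely; this is why the project needs
    # to keep the embedding and hidden-state spaces aligned.
    embeds = torch.cat([embeds, last_hidden], dim=1)

# After the latent steps, ordinary token generation can resume by
# decoding from the logits at the final position.
next_token = out.logits[:, -1, :].argmax(dim=-1)
```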
Key Features
- Progressive transition from text-based to latent-space reasoning
- Dual-mode operation: text processing and latent thought chains
- Maintains alignment between embedding and hidden state spaces
- Specialized loss functions for both text prediction and latent space alignment (see the sketch after this list)
- Automatic answer extraction using the \boxed{} format (also sketched below)
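Two of the features above can be illustrated briefly. The sketch below shows one plausible form of the combined text-prediction and alignment objective, plus a simple \boxed{} answer extractor; the function names and the alignment weight are assumptions, not the project's actual implementation.

```python
# Illustrative sketch, assuming hidden states and token embeddings share a
# dimensionality. latent_thought_loss, align_weight, and extract_boxed_answer
# are hypothetical names, not the project's real API.
import re
import torch
import torch.nn.functional as F

def latent_thought_loss(logits, target_ids, thought_hidden, target_embeds,
                        align_weight=0.5):
    """Cross-entropy on text tokens plus a term that keeps latent
    'thoughts' close to the embedding space."""
    # Standard next-token prediction loss on the text portion.
    text_loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)), target_ids.view(-1)
    )
    # Alignment loss: pull hidden-state thoughts toward reference embeddings
    # (for example, embeddings of a supervised reasoning chain).
    align_loss = 1.0 - F.cosine_similarity(
        thought_hidden, target_embeds, dim=-1
    ).mean()
    return text_loss + align_weight * align_loss

def extract_boxed_answer(text: str):
    """Return the contents of the last \\boxed{...} span, or None."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", text)
    return matches[-1] if matches else None
```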
Research Implications
Reasoning in latent space means the model does not have to serialize every intermediate step into discrete tokens, which may allow more nuanced and efficient reasoning. The approach could lead to models that handle complex reasoning tasks with greater precision while generating fewer tokens and using fewer computational resources.
Nuanced Speech
A neural pipeline for preserving emotional characteristics in speech-to-speech conversion using Whisper and Kokoro.
Overview
This project aims to preserve and transfer speech characteristics (emotion, tone, emphasis) across a speech-to-text-to-speech pipeline by creating a bypass connection from the speech recognizer to the speech synthesizer. Carrying emotional features through the latent space, rather than forcing them through text alone, allows more natural and expressive speech synthesis.
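A rough sketch of the pipeline follows. The Whisper calls are from the openai-whisper package; the StyleEncoder module and the synthesize_with_style() hook on the Kokoro side are placeholders for this project's components, not an existing API, and the file name is illustrative.

```python
# Hypothetical pipeline sketch: Whisper features -> style vector -> TTS.
import torch
import torch.nn as nn
import whisper

class StyleEncoder(nn.Module):
    """Compresses Whisper encoder features into a fixed-size style vector
    intended to carry emotion, tone, and emphasis past the text bottleneck."""
    def __init__(self, in_dim=512, style_dim=128):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                  nn.Linear(256, style_dim))

    def forward(self, encoder_features):            # (T, in_dim)
        pooled = encoder_features.mean(dim=0)       # average over time
        return self.proj(pooled)                    # (style_dim,)

asr = whisper.load_model("base")
audio = whisper.load_audio("input.wav")             # placeholder file
mel = whisper.log_mel_spectrogram(whisper.pad_or_trim(audio)).to(asr.device)

with torch.no_grad():
    features = asr.encoder(mel.unsqueeze(0))[0]     # (T, d_model)
    text = asr.transcribe("input.wav")["text"]

style = StyleEncoder(in_dim=features.shape[-1])(features.float())

# The style vector would then condition the synthesizer, e.g.:
# wav = synthesize_with_style(text, style)          # hypothetical Kokoro hook
```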
Key Features
- Speech-to-text using OpenAI's Whisper
- Text-to-speech using Kokoro
- Emotional feature preservation through style encoder network
- Multi-voice training for style transfer
- STFT and style similarity loss functions for better audio quality (see the sketch after this list)
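For reference, here is a minimal single-resolution version of the two loss terms, assuming waveforms as 1-D tensors and style vectors from the style encoder; the window sizes, weighting, and function names are assumptions rather than the project's exact implementation (a multi-resolution STFT variant is common in practice).

```python
# Illustrative loss sketches; names and hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def stft_loss(pred_wav, target_wav, n_fft=1024, hop=256):
    """Spectral convergence plus log-magnitude distance between waveforms."""
    window = torch.hann_window(n_fft, device=pred_wav.device)

    def mag(x):
        spec = torch.stft(x, n_fft, hop_length=hop, window=window,
                          return_complex=True)
        return spec.abs().clamp(min=1e-7)

    p, t = mag(pred_wav), mag(target_wav)
    spectral_convergence = torch.norm(t - p) / torch.norm(t)
    log_magnitude = F.l1_loss(torch.log(p), torch.log(t))
    return spectral_convergence + log_magnitude

def style_similarity_loss(pred_style, target_style):
    """Cosine distance between predicted and reference style vectors."""
    return 1.0 - F.cosine_similarity(pred_style, target_style, dim=-1).mean()
```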
Research Implications
This research demonstrates how latent space can be used to capture and transfer nuanced emotional characteristics that are difficult to express in text alone. The approach shows how embeddings can encode rich information that would be lost in traditional text-based pipelines.
Join Our Research Team
We're looking for researchers and engineers passionate about pushing the boundaries of AI communication beyond traditional language models. If you're interested in latent space, embeddings, and the future of AI, we'd love to hear from you.