Run LLMs natively in Ruby with Rust + GPU support

from Rubyflow 2 days ago

Red Candle is a Ruby gem that runs large language models, including Llama 2 and Mistral, inside the Ruby process using Rust. It uses Magnus to bind Hugging Face's Candle crate, giving Ruby code direct access to the model without Python or a separate server. Features include chat completions, embeddings, reranking, and named entity recognition, with hardware acceleration through Metal and CUDA. It also loads models in both safetensors and quantized GGUF formats.

Red Candle is a Ruby gem that runs large language models such as Llama 2, Llama 3, Mistral, and Gemma directly within a Ruby process using Rust.

It uses Magnus to bind Hugging Face's Candle crate, so Ruby applications can call LLMs via FFI without Python or a separate server.

Key functionality includes chat completions, embeddings, reranking, and named entity recognition, with hardware acceleration through Metal and CUDA.

Red Candle loads both safetensors and quantized GGUF model formats; quantization reduces a model's memory footprint.
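
To make the resource point concrete, here is a rough back-of-the-envelope sketch in plain Ruby. It does not use Red Candle itself; the `weight_memory_gb` helper and the 7-billion-parameter figure are illustrative assumptions, and the estimate covers weights only, ignoring activations, the KV cache, and the per-block metadata that real quantized formats carry.

```ruby
# Rough weight-memory estimate: parameters * bits per weight / 8 bytes.
# Hypothetical helper for illustration; not part of any library.
def weight_memory_gb(param_count, bits_per_weight)
  bytes = param_count * bits_per_weight / 8.0
  bytes / 1_000_000_000.0 # decimal gigabytes
end

PARAMS_7B = 7_000_000_000 # assumed, illustrative model size

fp16_gb = weight_memory_gb(PARAMS_7B, 16) # unquantized 16-bit weights
q4_gb   = weight_memory_gb(PARAMS_7B, 4)  # idealized 4-bit quantization

puts format("16-bit weights: %.1f GB", fp16_gb)
puts format("4-bit weights:  %.1f GB", q4_gb)
```

Under these assumptions the 4-bit weights need about a quarter of the memory of the 16-bit ones, which is the kind of saving that lets a model fit on a laptop GPU.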


#ruby #machine-learning #language-models #rust #hugging-face

