For the longest time I put programming language dogma ahead of technical pragmatism, but recently the need for synthetic data generation steered me toward llama.cpp. I assumed the minimal operational complexity I enjoy with Elixir was somehow at odds with llama.cpp, but together they provided enough harmony to help me level up.
My end goal was to unlock access to bigger models like Mixtral-8x7B from Elixir so I could do data engineering with 24GB or less of VRAM. To my surprise, the happy path was simple and straightforward, and it supports Mistral, Mixtral, Gemma, and just about any flavor of Llama 2.
The clone and make process is simple enough:
git clone --depth=1 https://github.com/ggerganov/llama.cpp.git cpp
cd cpp
make clean && LLAMA_CUBLAS=1 make -j
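Before wiring this into Elixir, it can be worth a quick smoke test from iex to confirm the binary built and is reachable at the path you expect. This isn't part of the original walkthrough; the path below just follows from the clone step above, so adjust it for your machine.

# quick sanity check: confirm the freshly built binary runs (path follows the clone step above)
{output, status} = System.cmd("/home/hello/cpp/main", ["--help"], stderr_to_stdout: true)
IO.puts("exit status: #{status}")
IO.puts(output)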
Next, download a quantized GGUF of Mixtral-8x7B that works with llama.cpp.
def download do
  url =
    "https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf?download=true"

  dir = "/home/hello/cpp/models"
  path = "mixtral-instruct.gguf"
  full_path = Path.join([dir, path])

  File.mkdir_p!(dir)

  # stream the response body straight to disk instead of holding the whole file in memory
  Req.get!(url, into: File.stream!(full_path))
end
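One small nicety I'd suggest (not part of the original snippet): guard against re-downloading the multi-gigabyte file on repeat runs. The maybe_download/0 wrapper below is hypothetical and just checks the same path used above.

# hypothetical convenience wrapper: only fetch the model if it isn't already on disk
def maybe_download do
  full_path = "/home/hello/cpp/models/mixtral-instruct.gguf"

  if File.exists?(full_path) do
    {:ok, :already_downloaded}
  else
    download()
    {:ok, :downloaded}
  end
end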
Now, shell out to llama.cpp from Elixir using System.cmd/3.
def prompt(text) do
  ggml_exec = "/home/hello/cpp/main"

  System.cmd(ggml_exec, [
    # offload 20 layers to the GPU
    "-ngl",
    "20",
    "-m",
    "/home/hello/cpp/models/mixtral-instruct.gguf",
    # context size
    "-c",
    "2048",
    "--temp",
    "1.0",
    "--repeat_penalty",
    "1.1",
    # -1 means keep generating until the model emits an end-of-sequence token
    "-n",
    "-1",
    "-p",
    "<s>[INST]#{text} [/INST]"
  ])
  |> case do
    {cpp, 0} ->
      # llama.cpp echoes the prompt, so keep only the text after the closing [/INST] tag
      [_, result] = String.split(cpp, "[/INST]")
      result

    _other ->
      raise "BOOM"
  end
end
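Calling the function from iex is just:

# the return value is the raw completion text after the [/INST] marker
iex> reply = prompt("Write a haiku about pattern matching in Elixir")
iex> String.trim(reply)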
With this function you can prompt Mixtral-8x7B and generate synthetic data with ease!
def generate do
  "data.json"
  |> get_json()
  |> Enum.map(fn data ->
    topic = data["topic"]
    friend = data["friend"]

    instruction = """
    Imagine you are Toran, write a text message from Toran to #{friend} about #{topic}.
    """

    results = prompt(instruction)

    # guard against completions that aren't valid UTF-8
    result =
      if String.valid?(results) do
        results
      else
        nil
      end

    %{instruction: instruction, output: result}
  end)
  |> Enum.reject(&is_nil(&1.output))
  |> writejson("result.json")
end
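The get_json/1 and writejson/2 helpers aren't shown above; here is one minimal way to write them, assuming the Jason library and a data.json shaped like a list of objects with "topic" and "friend" keys.

# minimal sketch of the JSON helpers used above, assuming Jason is available
def get_json(path) do
  path
  |> File.read!()
  |> Jason.decode!()
end

def writejson(data, path) do
  File.write!(path, Jason.encode!(data, pretty: true))
end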
I've had great success with this simple approach because data engineering is a blend of extraction, cleaning, and now prompting and function calling to other LLMs!