Model Switching Made Easy

10/2024 | Aman Azad

Vercel's AI SDK is insane. It took me more work to set up my auth flow (hashing passwords, integrating a forgot-password flow, etc.) than it did to set up a model switcher for my AI generations.

Granted, I rolled my own auth (though I used a package for hashing passwords; how deep do you want to go, lol). But taking a step back, the AI SDK abstracts away just the right amount of complexity.

I'm building a side project with a v0-type workflow, where I have a chat window on the left and a one-shot generation window on the right. Both AI generation flows end up calling a similar bit of code:

import { bedrock } from "@ai-sdk/amazon-bedrock";
import { convertToCoreMessages, streamText } from "ai";

...

const result = await streamText({
  // model: openai("gpt-4o"),
  model: bedrock(ai_model),
  prompt,
});

return result.toDataStreamResponse();

A few imports from the AI SDK package, and an ai_model parameter that's just a string corresponding to a model ID on AWS Bedrock. The snippet above is called in both of my AI API routes, api/ai/generate and api/ai/chat.
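For reference, the wiring around that snippet is just a standard Next.js route handler. A minimal sketch of roughly what the generate route could look like (the exact request shape, prompt plus ai_model in the JSON body, is an assumption here):

// app/api/ai/generate/route.ts
// Minimal sketch of a route handler wrapping the snippet above.
// The request shape (prompt + ai_model in the JSON body) is an assumption.
import { bedrock } from "@ai-sdk/amazon-bedrock";
import { streamText } from "ai";

export async function POST(req: Request) {
  const { prompt, ai_model } = await req.json();

  const result = await streamText({
    model: bedrock(ai_model),
    prompt,
  });

  return result.toDataStreamResponse();
}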

The approved model IDs I just store in an object:

export const approvedModelIds = {
  claude3x5sonnet: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  claude3opus: "anthropic.claude-3-opus-20240229-v1:0",
  cohereCommandRPlus: "cohere.command-r-plus-v1:0",
  cohereCommandR: "cohere.command-r-v1:0",
  llama3x1x70b: "meta.llama3-1-70b-instruct-v1:0",
  mistral7b: "mistral.mistral-7b-instruct-v0:2",
  mistral8x7b: "mistral.mixtral-8x7b-instruct-v0:1",
  awsTitan: "amazon.titan-text-lite-v1",
  // llama321b: "meta.llama3-2-1b-instruct-v1:0", // Disabled, errors on call
  // gpt4o: "gpt-4o", // Disabled because call made to third party API
}
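Since ai_model ultimately comes from the client, it's worth checking the string against this list before handing it to bedrock(). A small sketch of that guard (the resolveModelId helper name is made up for illustration, not from the actual project):

// Sketch of a guard that only lets approved Bedrock model IDs through.
// resolveModelId is an illustrative helper name.
import { approvedModelIds } from "./approvedModelIds";

export function resolveModelId(candidate: string): string {
  const allowed = Object.values(approvedModelIds);
  if (!allowed.includes(candidate)) {
    throw new Error(`Model ${candidate} is not in the approved list`);
  }
  return candidate;
}

// Usage in the route handler:
// const result = await streamText({ model: bedrock(resolveModelId(ai_model)), prompt });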

And to stay in the Vercel ecosystem, I used some shadcn components to quickly put together this UI that renders out the models.

Model switcher
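The dropdown itself is basically shadcn/ui's Select mapped over that object. A rough sketch (the ModelSwitcher component, its onModelChange prop, and the import paths are illustrative, not the actual component):

// components/model-switcher.tsx
// Rough sketch of the dropdown; ModelSwitcher and onModelChange are
// illustrative names, the shadcn/ui Select pieces are the real components.
"use client";

import {
  Select,
  SelectContent,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from "@/components/ui/select";
import { approvedModelIds } from "@/lib/approvedModelIds";

export function ModelSwitcher({
  onModelChange,
}: {
  onModelChange: (modelId: string) => void;
}) {
  return (
    <Select onValueChange={onModelChange}>
      <SelectTrigger className="w-[260px]">
        <SelectValue placeholder="Pick a model" />
      </SelectTrigger>
      <SelectContent>
        {Object.entries(approvedModelIds).map(([name, id]) => (
          <SelectItem key={id} value={id}>
            {name}
          </SelectItem>
        ))}
      </SelectContent>
    </Select>
  );
}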

Full stack, from backend to UI, and with a little less than an hour of coding time, I'm now able to hot-swap models with a dropdown.

What's great here is that because I can quickly swap between models, I get the obvious benefit of testing internally which foundation model works best for my use case.

I'm also able to tease out the small bugs that come up when switching models. For example, the Mistral models don't support prompts with the system role, and certain Llama 3 models don't support on-demand throughput. Once a model does work (like the one commented out in the list above), I just uncomment it and I'm good to go.
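For the system-role quirk specifically, one workaround is to fold the system instructions into the user prompt for models that reject a separate system message; a sketch of that idea (not necessarily how the project handles it):

// Sketch: fold system instructions into the user prompt for models
// (like the Mistral ones here) that reject a separate "system" role.
// generateWithoutSystemRole is an illustrative helper, not from the project.
import { bedrock } from "@ai-sdk/amazon-bedrock";
import { streamText } from "ai";

export async function generateWithoutSystemRole(
  ai_model: string,
  system: string,
  prompt: string
) {
  return streamText({
    model: bedrock(ai_model),
    // Prepend what would have been the system message to the user prompt.
    prompt: `${system}\n\n${prompt}`,
  });
}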