小众AI

litellm
litellm - 简化大模型 API 调用的工具
将各种 AI 大模型和服务的接口,统一转换成 OpenAI 的格式,简化了在不同 AI 服务/大模型切换和管理的工作。此外,它还支持设置预算、限制请求频率、管理 API Key 和配置 OpenAI 代理服务器等功能。
  官网   代码仓

litellm能够将各种 AI 大模型和服务的接口,统一转换成 OpenAI 的格式,简化了在不同 AI 服务/大模型切换和管理的工作。此外,它还支持设置预算、限制请求频率、管理 API Key 和配置 OpenAI 代理服务器等功能。

liteLLM

主要功能

使用相同的输入输出格式调用100多个大型语言模型(LLMs)

  • 将输入转换为服务提供商的完成、嵌入和图像生成等端点所需的格式
  • 确保输出的一致性,文本响应始终位于[‘choices’][0][‘message’][‘content’]路径下
  • 在多个部署(如Azure、OpenAI等)之间实现重试/回退逻辑 - 路由器
  • 跟踪支出并为每个项目设置预算 - OpenAI代理服务器

使用说明

LiteLLM v1.0.0 now requires openai>=1.0.0. Migration guide here
LiteLLM v1.40.14+ now requires pydantic>=2.0.0. No changes required.

pip install litellm
from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["COHERE_API_KEY"] = "your-cohere-key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)

Call any model supported by a provider, with model=<provider_name>/<model_name>. There might be provider-specific details here, so refer to provider docs for more information

异步

from litellm import acompletion
import asyncio

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    response = await acompletion(model="gpt-3.5-turbo", messages=messages)
    return response

response = asyncio.run(test_get_response())
print(response)

liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)

from litellm import completion
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")

# claude 2
response = completion('claude-2', messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")

日志

LiteLLM exposes pre defined callbacks to send data to Lunary, Langfuse, DynamoDB, s3 Buckets, Helicone, Promptlayer, Traceloop, Athina, Slack

from litellm import completion

## set env variables for logging tools
os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["HELICONE_API_KEY"] = "your-helicone-auth-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["ATHINA_API_KEY"] = "your-athina-api-key"

os.environ["OPENAI_API_KEY"]

# set callbacks
litellm.success_callback = ["lunary", "langfuse", "athina", "helicone"] # log input/output to lunary, langfuse, supabase, athina, helicone etc

#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])

更多...


ai-financial-agent
探索人工智能在投资研究中的应用。
Meetily
一个 AI 驱动的会议助手,可捕获实时会议音频、实时转录并生成摘要,同时确保用户隐私。
CHRONOS
CHRONOS是一种新颖的基于检索的时间线摘要 (TLS) 方法,通过迭代提出有关主题和检索到的文档的问题来生成按时间顺序排列的摘要。