小众AI

GraphRAG
GraphRAG - Generating knowledge graphs with LLMs

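GraphRAG is an open-source project from Microsoft that uses a graph to enhance retrieval and generation. By combining knowledge graphs with graph machine learning, it improves the reasoning and question-answering performance of large language models on private datasets.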

[Figure: GraphRAG architecture diagram]

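Key Features

GraphRAG's main features include:

  • Knowledge graph construction: uses a large language model (LLM) to extract entities, relationships, and key claims from unstructured text and build a knowledge graph from them.
  • Enhanced retrieval: uses the semantic structure of the knowledge graph to improve retrieval for complex queries, returning answers that actually match the question.
  • Combined reasoning and generation: connects scattered pieces of information through shared attributes, synthesizes new insights from them, and generates natural, fluent answers.
  • Topic summarization: understands and summarizes the semantic concepts of a large dataset or a single long document as a whole, answering global questions such as "What are the main themes in the dataset?"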
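The idea of linking scattered facts through shared entities can be illustrated with a toy example. The sketch below is not GraphRAG's API or implementation: the triples are hypothetical hand-written data standing in for LLM-extracted relationships, and `build_graph` and `find_path` are names invented for this illustration. It only shows how a graph lets a query hop across facts that never appear together in one passage.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples, standing in for what an
# LLM extraction step might return from unstructured text.
TRIPLES = [
    ("Alice", "works_at", "Contoso"),
    ("Contoso", "acquired", "Fabrikam"),
    ("Fabrikam", "based_in", "Seattle"),
    ("Bob", "works_at", "Fabrikam"),
]


def build_graph(triples):
    """Index triples as an undirected adjacency map: entity -> [(relation, neighbor)]."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj].append((relation, subject))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for the shortest chain of entities from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbor in graph[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain of shared entities connects the two


graph = build_graph(TRIPLES)
print(find_path(graph, "Alice", "Seattle"))
# prints ['Alice', 'Contoso', 'Fabrikam', 'Seattle']
```

No single triple links Alice to Seattle; the answer only emerges by chaining three facts through the shared entities Contoso and Fabrikam.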

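Installation and Deployment

We recommend trying the Solution Accelerator package, which provides a user-friendly end-to-end experience with Azure resources.

Building from Source

Prerequisites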

Name                 Installation   Purpose
Python 3.10 or 3.11  Download       The library is Python-based.
Poetry               Instructions   Poetry is used for package management and virtualenv management in Python codebases.
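Before running poetry install, it can help to confirm that the active interpreter matches one of the listed versions. The following is a minimal sketch using only the standard library; the `SUPPORTED` set and the `is_supported` helper are names made up for this example and are not part of GraphRAG.

```python
import sys

# Versions listed in the prerequisites above.
SUPPORTED = {(3, 10), (3, 11)}


def is_supported(version_info=sys.version_info):
    """Return True when the interpreter's major.minor version is 3.10 or 3.11."""
    return (version_info[0], version_info[1]) in SUPPORTED


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    if is_supported():
        print(f"Python {current} is supported.")
    else:
        print(f"Python {current} is outside the listed 3.10/3.11 range.")
```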

Getting Started

Install Dependencies

# Install Python dependencies.
poetry install

Execute the Indexing Engine

poetry run poe index <...args>

Executing Queries

poetry run poe query <...args>

Azurite

Some unit and smoke tests use Azurite to emulate Azure resources. This can be started by running:

./scripts/start-azurite.sh

or by simply running azurite in the terminal if already installed globally. See the Azurite documentation for more information about how to install and use Azurite.

Lifecycle Scripts

Our Python package utilizes Poetry to manage dependencies and poethepoet to manage build scripts.

Available scripts are:

  • poetry run poe index - Run the Indexing CLI
  • poetry run poe query - Run the Query CLI
  • poetry build - This will build a wheel file and other distributable artifacts.
  • poetry run poe test - This will execute all tests.
  • poetry run poe test_unit - This will execute unit tests.
  • poetry run poe test_integration - This will execute integration tests.
  • poetry run poe test_smoke - This will execute smoke tests.
  • poetry run poe check - This will perform a suite of static checks across the package, including:
    • formatting
    • documentation formatting
    • linting
    • security patterns
    • type-checking
  • poetry run poe fix - This will apply any available auto-fixes to the package. Usually this is just formatting fixes.
  • poetry run poe fix_unsafe - This will apply any available auto-fixes to the package, including those that may be unsafe.
  • poetry run poe format - Explicitly run the formatter across the package.

Troubleshooting

“RuntimeError: llvm-config failed executing, please point LLVM_CONFIG to the path for llvm-config” when running poetry install

Make sure llvm-9 and llvm-9-dev are installed:

sudo apt-get install llvm-9 llvm-9-dev

and then in your bashrc, add

export LLVM_CONFIG=/usr/bin/llvm-config-9

“numba/_pymodule.h:6:10: fatal error: Python.h: No such file or directory” when running poetry install

Make sure you have python3.10-dev installed, or more generally python<version>-dev:

sudo apt-get install python3.10-dev

LLM call constantly exceeds TPM, RPM or time limits

GRAPHRAG_LLM_THREAD_COUNT and GRAPHRAG_EMBEDDING_THREAD_COUNT are both set to 50 by default. You can lower these values to reduce concurrency. Please refer to the Configuration Documents.
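To see why a lower thread count eases rate-limit pressure, consider the sketch below. It is an illustration only and does not reproduce how GraphRAG reads its settings: `thread_count` and `fake_llm_call` are hypothetical helpers, and only the variable name GRAPHRAG_LLM_THREAD_COUNT and its default of 50 come from the text above. The point is that the worker count bounds how many requests can be in flight at once.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def thread_count(var_name, default=50):
    """Read a positive worker count from the environment, else use the default."""
    raw = os.environ.get(var_name, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def fake_llm_call(prompt):
    """Hypothetical stand-in for a real LLM request."""
    return f"response to {prompt}"


# Lower the value before the workers start, e.g. from 50 down to 4.
os.environ["GRAPHRAG_LLM_THREAD_COUNT"] = "4"
workers = thread_count("GRAPHRAG_LLM_THREAD_COUNT")

# At most `workers` calls run concurrently, which caps the request rate.
with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(fake_llm_call, ["a", "b", "c"]))

print(workers, results)
# prints 4 ['response to a', 'response to b', 'response to c']
```

In a real setup the values would normally be set in the shell environment or project configuration rather than inside the script; the Configuration Documents describe the supported mechanism.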

